Andreas Fragner

Figma challengers

Figma was built for designers. And it was built exceptionally well for that demographic. But its dominance is now becoming a liability as UI work shifts both upstream and downstream.

Figma is the dominant software for UI/UX design today. It’s now being challenged by agent-first products like Pencil, Stitch and others — tools built to be used principally by and through agents. The designer’s role is to steer, not to operate the underlying tool. Understanding how these new kinds of products work under the hood — vs. something built for humans, like Figma — gives some interesting insights into the defensibility of both incumbents and challengers.


Pencil is one Figma challenger that’s been gaining traction lately. You interact with it primarily via Claude Code: you tell Claude what you want to design (e.g. screens for a new user flow), and it creates and mutates pen files using the Pencil MCP, rendering screens from them on a canvas. It takes screenshots to validate its work and iterates until it’s satisfied with the output or the user provides more input.

The file format is one innovation here: pen files are JSON-like and easy for LLMs to generate. They can be version-controlled and rendered in IDEs. The MCP tells Claude how those files work and provides methods to read and mutate them. The main method, batch_design, allows for atomic updates across multiple nodes in the design, which can be rolled back in one go.
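To make this concrete, here’s a rough sketch of what a pen file and a batch_design payload could look like. The shapes below are illustrative guesses, not Pencil’s actual schema:

```typescript
// Illustrative only: the real pen file schema and batch_design payload
// are Pencil's and may differ. The point is that the node tree is plain,
// JSON-like data an LLM can generate and a VCS can diff.
const penFile = {
  id: "screen-login",
  type: "frame",
  layout: { direction: "column", gap: 16, padding: 24 },
  children: [
    { id: "title", type: "text", content: "Welcome back", fontSize: 24 },
    { id: "email", type: "input", placeholder: "Email" },
    { id: "submit", type: "button", label: "Sign in" },
  ],
};

// batch_design applies several mutations atomically; the whole batch
// either lands or can be rolled back in one go.
const batch = {
  operations: [
    { op: "update", nodeId: "title", props: { fontSize: 28 } },
    { op: "insert", parentId: "screen-login", node: { id: "forgot", type: "link", label: "Forgot password?" } },
  ],
};
```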

The second innovation is the built-in feedback loop via screenshots. The MCP provides two methods: get_screenshot (a visual snapshot of a node or the whole canvas) and snapshot_layout (which returns the bounding boxes of nodes on the canvas). The latter lets the agent understand where things are positioned, which isn’t information it can easily infer from a PNG image.
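Again as a sketch, the two methods might have shapes along these lines (the return types are my assumption, not Pencil’s documented API):

```typescript
// get_screenshot: a rendered image of one node or the whole canvas,
// which the agent feeds to a multi-modal model for visual critique.
type Screenshot = { nodeId?: string; png: Uint8Array };

// snapshot_layout: bounding boxes the agent can reason about directly,
// e.g. to detect overlaps or misalignment, without inferring positions
// from pixels.
type LayoutSnapshot = {
  nodes: Array<{ id: string; x: number; y: number; width: number; height: number }>;
};
```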

Designing the MCP around batch operations turns out to be key to making the feedback loop work well. Multi-modal models are still fairly limited in visual intelligence: they often struggle to infer from visual output alone which specific operation caused the end result to deviate from expectations, so backtracking a whole batch of operations is often necessary to make progress [1]. The agent could in theory take smaller steps (down to single operations), but taking screenshots and feeding them into an image model at each step is inefficient and expensive. The feedback loop would be agonizingly slow as a result.
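Here’s a minimal sketch of that loop, assuming stand-in tool names (proposeBatch, critique, etc. are mine, not real Pencil MCP signatures):

```typescript
// Stand-in tool interface; the method names and signatures are
// illustrative, not the actual Pencil MCP.
interface DesignTools {
  proposeBatch(goal: string): Promise<object>;
  applyBatch(batch: object): Promise<string>; // returns a batch id
  rollbackBatch(batchId: string): Promise<void>;
  getScreenshot(): Promise<Uint8Array>;
  critique(goal: string, png: Uint8Array): Promise<{ ok: boolean }>;
}

async function designLoop(t: DesignTools, goal: string, maxIters = 10) {
  for (let i = 0; i < maxIters; i++) {
    // One batch of mutations per iteration: screenshotting after every
    // single operation would make the loop slow and expensive.
    const batchId = await t.applyBatch(await t.proposeBatch(goal));
    const verdict = await t.critique(goal, await t.getScreenshot());
    if (verdict.ok) return;
    // The model often can't tell which operation in the batch caused the
    // regression, so it backtracks the whole batch and tries again.
    await t.rollbackBatch(batchId);
  }
}
```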

Taken together, Claude + Pencil produces quality output overall. It still misses some obvious issues here and there (e.g. text overflow, inconsistent styling across screens), but nudging it to fix those is still faster than doing the designs yourself from scratch.

Claude knows a lot about UI design patterns, and there’s often no need to be very specific about what you want; it will often suggest layouts better than what I could have come up with, for example. Still, design involves judgement, and having the human in the loop turns out to be important. Auto-generating/one-shotting designs sort of works, but for great, polished UX you need to bring taste, product intuition and a deep understanding of your customer base. All things the models don’t have (yet).

Figma also has a bi-directional MCP now. The main method is use_figma, which takes a code parameter and executes arbitrary JavaScript against the Figma Plugin API; the code runs remotely in a Figma plugin sandbox. The approach is different from Pencil's batch_design: there are no explicit update/move/delete primitives that operate on a structured node tree. Instead, Claude writes and debugs JavaScript to implement a design. The files it operates on are hosted on Figma’s servers, so every read and write is a network call, whereas pen files are local. Taken together, this tends to make for slower iterations.
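For comparison, a use_figma call might look roughly like this (the wrapper shape is my assumption; the string passed as code is ordinary Figma Plugin API JavaScript):

```typescript
// Stand-in signature for the MCP call; not Figma's documented shape.
declare function useFigma(args: { code: string }): Promise<unknown>;

// The agent authors a small program instead of issuing structured ops.
await useFigma({
  code: `
    const frame = figma.createFrame();
    frame.resize(360, 640);
    await figma.loadFontAsync({ family: "Inter", style: "Regular" });
    const title = figma.createText();
    title.characters = "Welcome back";
    frame.appendChild(title);
    figma.currentPage.appendChild(frame);
  `,
});
```

Every such round trip executes against files on Figma’s servers, which is part of why iteration feels slower than with local pen files.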

I’ve used Claude + Figma and Claude + Pencil side by side on non-trivial design problems: user flows with multiple levels of nesting, mental-model ambiguity and other intricacies. Pencil was faster and more accurate, and required significantly less re-prompting. I eventually got things working with Figma as well, but the experience was more tedious. Claude obviously knows how to write JavaScript well, but like a human designer, it benefits from a more predictable and targeted tool. And it told me as much afterwards when I asked it to do a post-mortem of the parallel sessions [2].


Why didn’t Figma release a better MCP? The reason, I think, is simple: Figma was built for humans. And it was built exceptionally well for that demographic. But its dominance is now becoming a liability as UI work shifts both upstream (to product managers) and downstream (to engineers).

Figma has a large installed base and an apparent distribution advantage, and they may still provide a better agent experience down the line. But a distribution advantage isn’t worth as much if your existing customer base is being disintermediated and the user base broadens and shifts. Many of the people using Pencil et al. were never Figma pro users to begin with. They are product engineers like myself who had perhaps some grasp of Figma but always wished for something more tightly integrated with the codebase, and for a faster feedback loop that didn’t involve costly design handoffs. It makes sense for designs to live close to code now that agents can generate both. At a minimum this cuts down on synchronization issues and helps reduce a large class of translation errors.

The obvious challenge for the new breed of Figma competitors, on the other hand, is vertical integration by foundation model providers. And indeed, Claude Design is already out in research beta. I like Pencil and its fellow challengers, but it’s not obvious to me what’s defensible about their products, or indeed how they’re going to monetize. They’re subject to significant supplier power — you can’t use them without a foundation model. One option is to attempt to vertically integrate themselves; another is to bet on product focus and attention. Anthropic surely has the resources to build a similar offering, but does it have the leadership attention to build one of the same quality? History shows this is not always the case for incumbents.


Footnotes

[1] Not just because of limited visual intelligence, but also because of frequent API errors. In my experiments, the dominant class of errors Claude made turned out to be using the MCP incorrectly: syntax errors, unsupported operations, null errors. I’m guessing those will go away over time as Pencil optimizes the MCP.

[2] These post-mortems/self-analyses are insightful but must be taken with a grain of salt. Depending on how you frame the analysis, models will happily hallucinate themselves into an opinion that confirms your leading questions. Even a hint of preference will bias them.

Andreas Fragner

Using fewer parts

Fewer parts make for better software and better products.

The best-performing firms make a narrow range of products very well. The best firms’ products also use up to 50 percent fewer parts than those made by their less successful rivals. Fewer parts means a faster, simpler (and usually cheaper) manufacturing process. Fewer parts means less to go wrong; quality comes built in. And although the best companies need fewer workers to look after quality control, they also have fewer defects and generate less waste.

— Yvon Chouinard, Let My People Go Surfing

Chouinard’s observation applies to software products almost verbatim. Using fewer parts makes for better software: easier to maintain, easier to extend, better margins. But what does “fewer parts” mean? And how do you know which ones to remove?

Fewer parts means making parts reusable. A good design minimizes the number of components at constant functionality. That means avoiding duplication and making things reusable. If you can reimplement a system with a smaller number of components (functions, classes, services, etc.), that’s a sign the original solution was either over- or under-engineered: over-engineered because it introduced abstractions that weren’t necessary; under-engineered because it failed to identify reusable parts. It can be tempting to make fewer but larger components, but those almost always end up less reusable. You might have fewer functions in such a design, but you don’t have fewer parts.
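A toy illustration of the difference (the components here are mine, purely to show the shape of the trade-off):

```typescript
// More parts: two near-duplicate components, neither reusable.
function renderUserCard(name: string, email: string): string {
  return `<div class="card"><h3>${name}</h3><p>${email}</p></div>`;
}
function renderOrgCard(orgName: string, domain: string): string {
  return `<div class="card"><h3>${orgName}</h3><p>${domain}</p></div>`;
}

// Fewer parts: one reusable component, same functionality. Note this is
// smaller, not merely merged; a single bloated component that special-cases
// users and orgs would have fewer functions but not fewer parts.
function renderCard(title: string, subtitle: string): string {
  return `<div class="card"><h3>${title}</h3><p>${subtitle}</p></div>`;
}
```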

Fewer parts means fewer representations of the data. All else equal, the amount of logic required to support n representations of the same data scales like n², since every pair of representations potentially needs converters. It’s not uncommon for teams to maintain protobuf models, SQL schemas, OpenAPI specs, GraphQL schemas, etc., all to support a single product. They might have a source of truth that defines the “core” data models (e.g. in protobuf), but still end up spending a ton of bandwidth maintaining model converters and crafting migrations. Most people intuitively prefer fewer data representations; the challenge is that different applications typically need different views or derived properties of the data. That can lead to a proliferation of derived models which may not have strict one-to-one relationships with the original models.
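One way to keep the count down is to derive views from a single source of truth instead of maintaining parallel definitions. A sketch (the types are illustrative):

```typescript
// One source of truth for the data model...
interface User {
  id: string;
  email: string;
  passwordHash: string;
  createdAt: Date;
}

// ...with application-specific views derived from it, rather than
// maintained as independent models with handwritten converters. With n
// independent representations you can need on the order of n² converters;
// deriving views keeps the count linear in the number of views.
type PublicUser = Omit<User, "passwordHash">;        // API response shape
type NewUser = Pick<User, "email" | "passwordHash">; // insert payload
```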

Fewer parts means fewer languages and fewer tools. There is almost never a good enough reason to add another language to your stack; the increase in complexity and maintenance burden is consistently underestimated relative to the benefits. The same goes for databases: performance reasons are often not strong enough to justify adding a new type of DB to cater to your latest special use case.

Fewer parts means smaller teams. Smaller teams spend less time coordinating and more time building and owning things. In most start-ups, a small number of engineers (3-4) build the first iteration of the product, which ends up generating 80% of its lifetime value. It’s clearly possible to build complex things with a small, focused team. But as more money is raised, engineering teams balloon: they lose focus and add components that are not directly aligned with creating customer value. It’s Parkinson’s law at work. Companies perceive things to be mission-critical for the product, craft a budget based on that, the budget must be used once allocated, so more people are hired, who then produce yet more parts, and so on.

Fewer parts means fewer counterparties. Most things break at the boundaries (especially external ones). The greater the surface area, the riskier and harder to maintain a system becomes. Prefer to deal with a small number of high-quality vendors, and be prepared to pay a premium. The obvious interjection here is concentration risk: if a key vendor goes into administration or decides to drop the product you rely on, that might pose an existential risk to you. Such counterparty risk can indeed matter greatly and needs to be considered, but I’ve found that in practice it’s often more manageable than people think. There are SLAs and contractual notice periods, and the majority of counterparties will honor them, giving you time to adjust. If you do need to replace a vendor, you start out with a much clearer picture of the requirements and the scope of the integration, which cuts down on time-to-market.

If using fewer parts is a good idea, how come modern software production appears to be so bloated? Dozens of vendors, a stack that’s 7 layers deep and includes 4 languages, teams of 60+ developers, etc. feel like the norm. Clearly, companies believe they need this many parts to deliver value to customers; few people are deliberately trying to waste resources, after all. The problem is that people lose sight of what activities actually create value. As a company grows, a disconnect develops between the activities performed by its employees and the value delivered to customers. In a 10-person firm, everyone speaks to customers, everyone knows the value chain and everyone uses the product. In a 1,000-person firm, by definition, most employees have never spoken to customers and may work on parts of the system that are increasingly far removed from what the customer sees. This is one instance where great management can make a huge difference: in well-managed firms, management goes to great lengths to communicate the link between firm activities and value creation. The focus is on customers and the problems they face, rather than on process and efficiency gains. If you focus on serving your customers better, efficiency will take care of itself.

A few principles I follow to keep the number of parts small:

  1. Hire fewer but better people and pay them more.

  2. Work with fewer but better vendors and be willing to pay a premium. Be systematic about selecting them and understand the risks.

  3. Each project you decide to allocate resources to must have a 3-4 sentence description of how it creates value for customers. People often struggle with this if the work is abstract or far removed from what the customer sees (say, work on infrastructure) but I’ve found it’s always possible if the work is worth pursuing.
