A few recent conversations with friends building AI products have surfaced the same frustration: the underlying model keeps improving, but retention won't budge.

I've been sitting with that problem for a while.

Capability Is Not Value

The most common mistake AI product managers make is treating "model capability" as equivalent to "product value."

GPT-4 can write code, translate documents, summarize meetings. Those capabilities are real. But when you package them into a product, users almost always say the same thing: "This can do a lot of things, but none of them feel quite right."

The reason is simple: capability is a point. Workflow is a line.

Users aren't using "AI capabilities." They're completing a specific task — one with context, upstream steps, and downstream actions. When AI is just an isolated tool rather than something woven into the whole workflow, it doesn't matter how capable the model is. You end up with something users describe as "useful but forgettable."

Three Failure Modes

Failure mode one: confusing "can use" with "would use"

User research on AI products often shows high satisfaction scores. But ask the follow-up: "How many times did you use this feature last week?" The silence is usually informative.

"Can use" means the user understands what the product does. "Would use" means there's a specific situation where they reach for it without thinking. The gap between those two is where most AI products die.

Failure mode two: overestimating prompt skill

Prompt engineering is a skill. Most users don't have it and don't want to learn it. When your product requires a well-crafted prompt to return useful output, you've quietly transferred your product design problem to the user. That's a design failure, not a user failure.
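
One way to make that concrete: the product can assemble the prompt from context it already holds, so the user never has to craft one. Here is a minimal sketch in Python for an imagined meeting-summary feature, with hypothetical names (build_summary_prompt, call_model), not a recipe from any specific product:

    # Hypothetical sketch, not any real product's code: the feature assembles the
    # prompt from context the product already has, so the user only clicks
    # "Summarize" and never writes a prompt themselves.

    def build_summary_prompt(transcript: str, attendees: list[str],
                             next_meeting: str | None) -> str:
        # All of this context already lives in the product; the user supplies none of it.
        prompt = (
            "Summarize the meeting below in five bullet points.\n"
            f"Attendees: {', '.join(attendees)}\n"
        )
        if next_meeting:
            prompt += f"Flag anything that must be decided before {next_meeting}.\n"
        return prompt + "\nTranscript:\n" + transcript

    def summarize_meeting(transcript, attendees, next_meeting, call_model):
        # call_model is a stand-in for whatever model API the product actually uses.
        return call_model(build_summary_prompt(transcript, attendees, next_meeting))

The specific prompt doesn't matter. What matters is that the prompt-design work stays inside the product instead of being handed to the user.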

Failure mode three: "everything" as a positioning

"Our AI assistant can write emails, summarize meetings, generate code, analyze data…"

That positioning works in the early curiosity phase. But retention is built on habit, not curiosity. Habits require a specific, repeated use case — not an unbounded capability set.

A Framework I Use

When I'm making feature decisions on an AI product, I ask three questions:

  1. How did users do this before this feature existed? If the answer is "they didn't," you probably don't have a real market. If the answer is "they did it badly and slowly," that's a real opportunity.

  2. What is the friction cost? How many screens, concepts, or inputs does using this feature require? Every added step degrades retention. AI products should make users feel faster, not like they're being trained.

  3. What emotional state is the user in? If they're already overwhelmed, anything that requires learning will get abandoned. The best AI product wedge is the task users most want help with when they're exhausted.


Good AI products aren't the ones that do the most. They're the ones that do one thing better than any alternative.

That sounds straightforward. In practice, it requires an unusually deep understanding of how users actually work — and the discipline to say no to a long list of things that look impressive but don't fit.

Both are hard.