What Is Product-Market Fit, Really?

Product-market fit is what you have when the people who use your product would be genuinely upset to lose it, and they keep coming back without being chased. The most cited proxy is the Sean Ellis survey: ask users "how would you feel if you could no longer use this product?" and if 40% or more say "very disappointed," you supposedly have fit. It's a useful smoke signal, but it's a survey, not the thing itself. The real evidence of product-market fit lives in your retention curve: do your cohorts stop churning and settle onto a flat plateau, or do they bleed to zero? That plateau, rather than a 40% threshold, is the strongest evidence you can get.

The 40% test became famous because it gives a tired founder a single number to chase, and a single number is comforting. But fit is a behaviour, not an opinion. People can tell a survey they'd be "very disappointed" and then quietly stop logging in. The retention curve doesn't lie the way a survey does.

The short version

Product-market fit = a defensible match between what your product does and a market that demonstrably wants it, shown by users who retain (come back on their own) and, ideally, expand and refer. The Sean Ellis 40% test is one input. The harder evidence is:

A retention curve that flattens: cohorts reach a stable plateau instead of decaying to zero.
Cohort behaviour that holds or improves: newer cohorts retain at least as well as older ones.
Metrics that reinforce each other: retention, unit economics, and any viral loop point the same direction rather than fighting.

None of these is a single magic number. They're a shape and a set of relationships. That's the part most "is it fit yet?" guides skip.

Where the Sean Ellis 40% test comes from, and what it misses

Sean Ellis ran growth at Dropbox, LogMeIn and Eventbrite, and noticed that the companies that took off shared a survey pattern: roughly 40% of users said they'd be "very disappointed" without the product. Below that line, growth efforts mostly fizzled; above it, they compounded. So the 40% benchmark was an observation, not a law of physics.

That origin matters because it tells you exactly where the test is weak:

It's a stated preference, not a revealed one. What people say in a survey and what they do with their calendar are different data sets. Revealed behaviour (did they come back this week, unprompted?) outranks stated behaviour every time.
It's biased toward people still using the product. You're surveying survivors. The users who already churned aren't there to say they wouldn't miss you. (And yes, that means a healthy-looking 40% can sit on top of a leaky bucket.)
It's a snapshot. Fit isn't a moment you pass through; it's a state you hold or lose. A one-time survey can't show you whether last quarter's cohort is decaying.
The 40% line is directional, not sacred. It came from a handful of consumer-ish products. Your number depends on your market, your switching costs, and how you sampled. Treating 40% as a pass/fail gate is exactly the benchmark-as-target trap.

So run the survey: it's cheap and the open-text answers are gold for finding your most-loved use case. Just don't mistake the proxy for the thing it measures.
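To make the survivor-bias point concrete, here is a minimal Python sketch of how the score is computed and how far it can sit from the picture across everyone who ever signed up. The answer wording, the sample data, the signup count and the helper names (`sean_ellis_score`, `floor_across_signups`) are all illustrative assumptions, not part of any survey tool.

```python
from collections import Counter

VERY = "very disappointed"  # assumed exact wording of the top answer option


def sean_ellis_score(responses):
    """Share of non-blank responses that are 'very disappointed'."""
    answers = [r.strip().lower() for r in responses if r and r.strip()]
    if not answers:
        raise ValueError("no usable responses")
    return Counter(answers)[VERY] / len(answers)


def floor_across_signups(score, respondents, total_signups):
    """Lower bound on the 'very disappointed' share of everyone who signed up.

    Assumes, pessimistically, that nobody outside the respondents
    (churned or silent users) would have answered 'very disappointed'.
    """
    return score * respondents / total_signups


# Hypothetical data: 50 answers, all from users who are still active.
sample = (
    ["Very disappointed"] * 21
    + ["Somewhat disappointed"] * 19
    + ["Not disappointed"] * 10
)
score = sean_ellis_score(sample)
floor = floor_across_signups(score, respondents=len(sample), total_signups=400)

print(f"survey score: {score:.0%}")          # clears the 40% line
print(f"floor across signups: {floor:.1%}")  # far lower once churned users count
```

The floor is deliberately pessimistic. The true figure lies somewhere between the two numbers, and that gap is exactly what a survey of survivors cannot show you.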

What product-market fit actually looks like in the data

1. The retention curve flattens

This is the single clearest signal, and it's the one the survey can't give you. Plot the percentage of a cohort still active over time. Three shapes are possible:

Decay to zero: retention keeps dropping month after month and never stops. No fit. You're acquiring users into a bucket with no bottom.
Decay, then a plateau: retention drops at first (every product loses the merely-curious) but then flattens onto a stable line. That flat tail is a stable base of users who found durable value. This is the signature of product-market fit.
Smiling curve: retention drops, flattens, then ticks up as dormant users return and the core deepens. It's rare enough that you should be suspicious of your instrumentation before you celebrate.

What matters isn't the height of any single point; it's whether the curve goes flat. As our benchmark tool frames it: look for curve flattening, not just absolute numbers. A 25% plateau that holds is healthier than 45% retention that's still sliding.
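The three shapes above can be told apart with a few lines of Python. This is a rough sketch, not how our benchmark tool works: the function name `classify_curve`, the three-period tail and the one-point tolerance are assumptions chosen for illustration, and the curves are made-up data.

```python
def classify_curve(retention, tail=3, tolerance=0.01):
    """Label a cohort retention curve by the shape of its tail.

    retention: fraction of the cohort still active per period, oldest first.
    tail: how many of the most recent period-to-period changes to inspect.
    tolerance: largest absolute change per period still counted as flat.
    Both defaults are illustrative, not benchmarks.
    """
    if len(retention) < tail + 1:
        raise ValueError("need at least tail + 1 data points")
    recent = retention[-(tail + 1):]
    deltas = [later - earlier for earlier, later in zip(recent, recent[1:])]
    if all(abs(d) <= tolerance for d in deltas):
        # A curve that is flat at zero is not a plateau worth having.
        return "plateau" if recent[-1] > 0 else "decayed to zero"
    if all(d >= -tolerance for d in deltas) and sum(deltas) > tolerance:
        return "smile"
    return "still sliding"


# Hypothetical cohorts, one value per period.
holding = [1.00, 0.45, 0.31, 0.26, 0.25, 0.25, 0.25, 0.25]
sliding = [1.00, 0.70, 0.60, 0.53, 0.49, 0.47, 0.45]
smiling = [1.00, 0.40, 0.25, 0.20, 0.20, 0.22, 0.24]

for name, curve in [("holding", holding), ("sliding", sliding), ("smiling", smiling)]:
    print(name, curve[-1], classify_curve(curve))
```

Note that the sliding cohort ends at 45% and the holding one at 25%, yet only the second is labelled a plateau. The classification depends on slope, not height.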

What "flat" looks like at each point is heavily context-dependent, which is the whole point, but here are sourced reference bands to read your own curve against.

Retention benchmarks: B2B SaaS vs Consumer

Retention point	B2B SaaS (avg)	Consumer apps (avg)
Day 1	50% to 70%	20% to 30%
Day 7	40% to 60%	8% to 15%
Day 14	35% to 55%	4% to 8%
Day 90	25% to 35%	1% to 4%

Sources: Pendo Product Benchmarks, Amplitude, Mixpanel (B2B); Adjust, AppsFlyer, UXCam (consumer).

Look at the gap. A consumer app with 3% Day-90 retention can have excellent product-market fit, while a B2B tool at the same 3% is dying. The two follow different growth mechanics, so read your curve against the right reference class: a B2B number judged against a consumer band (or vice versa) will flatter or terrify you for no reason.

The other tell, hiding in the table: by Day 14 the drop should be flattening. As our benchmark tool reads it, a steep decline that hasn't levelled off by then signals a weak core loop. The slope tells you more than any single point. (For the curve-by-curve breakdown, see retention-rate-benchmarks.md.)
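Reading a number against the right reference class is a simple lookup. The sketch below encodes the bands from the table above; the dictionary layout and the function name `read_against_band` are hypothetical, and the output is context for a conversation, not a verdict.

```python
# Reference bands from the table above, as (low, high) fractions.
BANDS = {
    "b2b": {1: (0.50, 0.70), 7: (0.40, 0.60), 14: (0.35, 0.55), 90: (0.25, 0.35)},
    "consumer": {1: (0.20, 0.30), 7: (0.08, 0.15), 14: (0.04, 0.08), 90: (0.01, 0.04)},
}


def read_against_band(segment, day, value):
    """Place one retention value relative to its segment's reference band."""
    low, high = BANDS[segment][day]
    if value < low:
        return "below band"
    if value > high:
        return "above band"
    return "within band"


# The same 3% Day-90 retention, read against each reference class.
print("consumer:", read_against_band("consumer", 90, 0.03))
print("b2b:     ", read_against_band("b2b", 90, 0.03))
```

The same value lands inside one band and well below the other, which is why the segment has to be chosen before the number is read.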

2. Cohort behaviour holds, or gets better

A flattening curve for one cohort is good. The stronger signal is comparing cohorts over time. Group users by the month they joined and plot each cohort's retention separately.

If newer cohorts retain as well as or better than older ones, your fit is real and improving: product changes and a sharper ideal customer profile (ICP) are landing.
If each new cohort retains worse than the last, you're scaling acquisition faster than fit. Often you nailed it for an early niche and are now buying users from adjacent segments who don't have the same job-to-be-done. The blended number can look fine while the trend rots underneath.

This is why a single aggregate retention figure is dangerous: it averages a loyal early cohort with a churning recent one and hides the divergence. Fit is a question you have to keep asking of each cohort, not answer once.
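A small worked example shows how a blended figure hides the divergence. The cohort sizes and counts below are invented, and `cohort_retention` and `cohort_trend` are hypothetical helper names; the two-point tolerance is likewise an assumption.

```python
# Hypothetical monthly cohorts: signups, and how many were active in month 3.
cohorts = {
    "2025-01": {"size": 200, "active_m3": 80},
    "2025-02": {"size": 400, "active_m3": 120},
    "2025-03": {"size": 800, "active_m3": 160},
}


def cohort_retention(cohorts):
    """Month-3 retention per cohort, ordered oldest to newest."""
    return {
        label: c["active_m3"] / c["size"]
        for label, c in sorted(cohorts.items())
    }


def cohort_trend(rates, tolerance=0.02):
    """Compare each cohort with the one before it."""
    values = list(rates.values())
    if len(values) < 2:
        raise ValueError("need at least two cohorts to compare")
    deltas = [newer - older for older, newer in zip(values, values[1:])]
    if all(d < -tolerance for d in deltas):
        return "declining"
    if all(d >= -tolerance for d in deltas):
        return "holding or improving"
    return "mixed"


rates = cohort_retention(cohorts)
blended = sum(c["active_m3"] for c in cohorts.values()) / sum(
    c["size"] for c in cohorts.values()
)

for label, rate in rates.items():
    print(label, f"{rate:.0%}")
print("blended:", f"{blended:.1%}")
print("trend:", cohort_trend(rates))
```

Here the blended figure sits above the newest cohort's retention because the older, more loyal cohorts are still propping it up. Only the per-cohort view shows that each month's signups retain worse than the last.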

3. The metrics interact: fit is a system, not a stat

Here's the interesting part: product-market fit rarely shows up in one metric. It shows up in how your metrics relate. A few of the interactions that matter most:

Retention is the foundation of unit economics. Lifetime value is mostly retention with a price tag (margin and expansion do the rest). A strong LTV:CAC ratio (the healthy band is 3:1 to 5:1 for B2B per Bessemer, OpenView and a16z; 2:1 to 4:1 for consumer per Adjust and AppsFlyer) is built on a flat retention curve. If LTV looks great but the curve is still sliding, your LTV is a forecast resting on churn that hasn't finished happening. Evaluate retention quality before you trust CAC efficiency. (More on the ratio and its traps in ltv-cac-ratio.md.)
A "great" ratio can be a warning sign. An LTV:CAC above 5:1 often means you're underinvesting in growth: you've found fit and you're being too timid to pour fuel on it. The metric only means something next to your CAC payback period (6 to 12 months for SMB/self-serve B2B; longer is fine only with high retention and expansion, per OpenView and KeyBanc).
Virality without retention is a leak, not a loop. K-factor (the viral coefficient, how many new users each user brings) is weak in B2B by nature (0.1 to 0.3, per Reforge and Andrew Chen) and stronger but rarely above 1 in consumer (0.3 to 0.7). But a K-factor only compounds if those referred users stay. As Andrew Chen has argued for years, the best way to drive viral growth is to improve retention first: a flat curve gives every cohort many more chances to spread the product before it churns out. Virality bolted onto a leaky bucket just fills the bucket faster.
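Two of these interactions are easy to sketch in Python. The first function prices the same observed retention curve under two assumptions about what happens next, which is the difference between LTV resting on real retention and LTV resting on a forecast. The second counts referrals under a deliberately simple model in which every active user sends invites at a constant rate. The function names, the models and every number here are illustrative assumptions, not the method behind any of the cited benchmarks.

```python
def forecast_ltv(observed, extra_months, monthly_margin, keep_decaying):
    """Price an observed retention curve extended by extra_months.

    keep_decaying=False assumes the last observed value holds (a plateau).
    keep_decaying=True assumes the last month-over-month ratio keeps applying.
    monthly_margin is gross margin per active user per month.
    """
    if len(observed) < 2:
        raise ValueError("need at least two observed months")
    ratio = observed[-1] / observed[-2] if keep_decaying else 1.0
    curve = list(observed)
    for _ in range(extra_months):
        curve.append(curve[-1] * ratio)
    return sum(curve) * monthly_margin


def referrals_per_signup(retention, invites_per_active_month, invite_conversion):
    """New signups each acquired user generates while still active.

    Simplified model: invites are sent at a constant rate per active month.
    """
    return sum(retention) * invites_per_active_month * invite_conversion


# Hypothetical inputs: five observed months, still losing 20% a month.
observed = [1.00, 0.70, 0.56, 0.45, 0.36]
cac = 150.0
monthly_margin = 80.0

optimistic = forecast_ltv(observed, 12, monthly_margin, keep_decaying=False)
pessimistic = forecast_ltv(observed, 12, monthly_margin, keep_decaying=True)
print(f"LTV:CAC if the curve flattens now: {optimistic / cac:.1f}")
print(f"LTV:CAC if the decay continues:    {pessimistic / cac:.1f}")

# Same invite behaviour, two hypothetical twelve-month curves.
flat = [1.00, 0.50, 0.40] + [0.38] * 9
leaky = [1.00, 0.50, 0.25, 0.12, 0.06, 0.03] + [0.0] * 6
print("referrals, flat curve: ", referrals_per_signup(flat, 0.5, 0.1))
print("referrals, leaky curve:", referrals_per_signup(leaky, 0.5, 0.1))
```

With these inputs the optimistic forecast lands inside the 3:1 to 5:1 band and the pessimistic one falls below it, even though both start from identical observed data. The referral comparison makes the same point for virality: identical invite behaviour yields fewer referred signups when users churn before they have had time to invite anyone.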

So which signal is "the" signal? None of them alone. Fit is the state where retention has plateaued, cohorts are holding, and the unit economics and any loop all point the same way. When they fight each other (great survey, leaking curve; great ratio, decaying cohorts), you don't have fit yet; you have a number that's flattering you.

Why benchmarks are context, not targets

You'll have noticed I keep giving ranges and then immediately undercutting them. That's deliberate, and it's the core thesis of how we think about this.

A benchmark is a reference class, not a goal line. "Good" Day-90 retention is 25% to 35% for B2B and 1% to 4% for consumer, but your "good" depends on your business model (freemium vs. self-serve vs. enterprise sales move the bar a lot), your contract length, your switching costs, and what job your product is hired to do. A frequency-of-use product should have a higher, flatter curve than a once-a-quarter tool, and both can have genuine fit.

The failure mode is treating a benchmark as a target: you optimise to hit 40% on the survey, or to drag Day-7 up to the middle of the band, and you start gaming the proxy instead of building the thing. The benchmark's job is to tell you which questions to ask (is my curve flattening at all? are my newer cohorts holding? do my economics rest on real retention or forecast retention?), not to hand you a finish line.

In my experience, the teams who win don't ask "did we hit 40%?" They ask "has the curve gone flat, and is it staying flat as we scale?" That's a harder question, and a truer one.

How to actually assess your product-market fit

A practical sequence, in order of how much each step tells you:

Plot your retention curve and look for the plateau. Does it flatten, or decay to zero? This is the first thing to look at, and the most honest.
Split it by cohort. Are newer cohorts holding up against older ones? Divergence here is an early warning the aggregate hides.
Compare against the right reference class. B2B to B2B, consumer to consumer. The slope (is it flattening by Day 14?) matters more than any single point.
Check whether your economics rest on that retention. Is your LTV:CAC built on the flat part of the curve, or on churn that hasn't finished? Is payback sane?
Then, run the Sean Ellis survey, for the open-text answers as much as the 40%. It tells you who loves you and why, which is how you sharpen your ICP. Treat it as colour, not verdict.

Do these in this order and the survey becomes what it should be: a way to understand your fans, not a gate you're trying to clear.

If the diagnosis comes back ugly, the levers are roughly in this order too: fix the curve before you fix the funnel. Sharpen the ICP so you stop acquiring users who were never going to stay, deepen the core loop that drives repeat use, and only then pour fuel on acquisition or virality, since a leaky bucket just loses more users the faster you fill it. The retention and unit-economics breakdowns (retention-rate-benchmarks.md and ltv-cac-ratio.md) go deeper on each lever.

See where your numbers land

The hard part is not knowing that retention should flatten. The hard part is reading your own curve against the right reference class without fooling yourself. That's exactly what our free benchmark tool does: paste in your retention by month and your unit economics, and it charts your curve against the B2B and consumer bands, flags whether you're plateauing or still sliding, and shows how your retention, K-factor and LTV:CAC relate, as context, not as targets. It takes a couple of minutes.

Check your product-market fit signals → benchmark.scilla.studio

For the wider picture, see our full B2B SaaS growth benchmarks for 2026.

FAQ

Is the Sean Ellis 40% test still valid? It's a useful proxy, not proof. The 40%-very-disappointed threshold came from observing a handful of high-growth products, so it's directional rather than a universal pass/fail line. Run it for the open-text insight into who loves your product and why, but verify fit with revealed behaviour (a flattening retention curve), since surveys only sample surviving users and capture stated, not actual, behaviour.

What's the clearest signal of product-market fit? A retention curve that flattens onto a stable plateau instead of decaying to zero. That flat tail is a base of users who found durable value and keep returning on their own. The height of the plateau depends on your market (a 25% B2B plateau and a 3% consumer plateau can both indicate fit), but in every case it has to flatten.

What's a good retention rate for product-market fit? For B2B SaaS, roughly 50% to 70% Day-1, 40% to 60% Day-7 and 25% to 35% Day-90 (Pendo, Amplitude). For consumer apps, 20% to 30% Day-1, 8% to 15% Day-7 and 1% to 4% Day-90 (Adjust, AppsFlyer). These are reference bands, not targets, and the slope flattening matters more than any single point.

Can you have product-market fit with low retention? Yes, if you're consumer and your curve still flattens. A consumer app with a 3% Day-90 plateau can have real fit, while a B2B tool at the same level is dying. Always read your numbers against the right reference class (B2B to B2B, consumer to consumer) because they follow different growth mechanics.

How is product-market fit different from traction? Traction is growth you can sometimes buy with spend or hype. Product-market fit is durable: it shows up as users who retain, cohorts that hold as you scale, and unit economics that rest on real retention rather than forecast retention. You can have traction without fit (a leaky bucket filling fast); fit is what makes the traction compound instead of leak.

Joni Lindgren

Founder & Growth Product Manager