Most failed Stripe payments aren't one problem — they're two completely different ones

I've been digging into Stripe failed payments lately, and I think a lot of SaaS founders are treating them like one problem when they're actually two very different ones.

Type 1 — Timing problems
Insufficient funds, temporary holds, short-lived issues.
Retries can work here. Just need the right moment.

Type 2 — Customer-action problems
Expired cards, changed card numbers, some issuer declines.
Retrying does nothing. The customer has to update something. Another retry won't change that.

The mistake is treating both the same way: turn on retries, send a generic payment-failed email, and wait.

Type 2 failures just sit there. Unrecovered. Permanently.

Looking at failures this way changed how I think about recovery entirely — the real gap isn't retry timing, it's knowing which failures need communication at all.
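For anyone who wants to try the split on their own data, here is a minimal sketch of the sorting step. The code strings are Stripe decline codes, but which bucket each one lands in is my assumption for illustration, not an official mapping, and `FailureType` and `classify_decline` are hypothetical names. Anything not listed falls into an "ambiguous" bucket rather than being forced into Type 1 or Type 2.

```python
from enum import Enum


class FailureType(Enum):
    TIMING = "timing"                    # Type 1: a later retry may succeed
    CUSTOMER_ACTION = "customer_action"  # Type 2: the customer must update something
    AMBIGUOUS = "ambiguous"              # signal too weak to pick a bucket


# Assumed bucket assignments, for illustration only. Tune against your own data.
TIMING_CODES = {"insufficient_funds", "processing_error", "try_again_later"}
CUSTOMER_ACTION_CODES = {"expired_card", "incorrect_number", "lost_card", "stolen_card"}


def classify_decline(decline_code):
    """Map a decline code string (or None) to a failure type."""
    if decline_code in TIMING_CODES:
        return FailureType.TIMING
    if decline_code in CUSTOMER_ACTION_CODES:
        return FailureType.CUSTOMER_ACTION
    # Unknown or missing codes (and codes like do_not_honor) are not guessed at.
    return FailureType.AMBIGUOUS
```

Calling `classify_decline("expired_card")` returns `FailureType.CUSTOMER_ACTION`, while an unlisted code such as `"do_not_honor"` returns `FailureType.AMBIGUOUS`.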

Curious how others handle this: do you actually look at decline codes, or mostly rely on Stripe's built-in retries?

(I've been deep in the weeds on this building DunnAI — happy to run this breakdown on your Stripe account if you're curious.)

Posted to Saas Makers on March 30, 2026
  1. 2

    Hey — random one, but I can help you grow this with AI UGC videos on TikTok/IG. We’ve been getting solid organic traction for apps. Worth a chat if you’re open.

  2. 1

    I'm still pretty new to the SaaS game, so I had no idea that kind of pricing friction was a thing. To be fair, no one has paid for my apps yet, so I haven't had the chance to find out! lol

    1. 1

      @Fluxo When you do get your first paying users, worth setting up decline code tracking early — it's much easier to build the habit before the failures pile up. Good luck with the apps!

  3. 1

    The Type 1 / Type 2 distinction is a useful mental model
    that's easy to overlook when you're first setting up
    payment recovery.

    The decline code angle is where it gets interesting.
    Codes like "do_not_honor" or "card_velocity_exceeded"
    sit in an ambiguous middle ground — sometimes retriable,
    sometimes not — and generic retry logic handles them poorly.

    One thing that made a difference in our setup was treating
    the customer communication differently based on the failure
    type. Type 2 failures get an immediate, specific prompt
    to update their card. Type 1 failures get a quieter retry
    flow first, with communication only if retries exhaust.

    The main challenge is that decline code reliability varies
    by issuer, so even a solid classification system needs
    a fallback for when the signal is ambiguous.
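The routing described in this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not the commenter's actual setup: `RecoveryPlan`, `plan_recovery`, the failure-type strings, and the template names are all hypothetical, and the behavior chosen for the ambiguous case (keep retrying while also sending a softer message) is just one possible fallback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryPlan:
    retry: bool               # schedule another charge attempt?
    email_now: bool           # contact the customer right away?
    template: Optional[str]   # hypothetical email template name


def plan_recovery(failure_type: str, retries_left: int) -> RecoveryPlan:
    """Pick a recovery action from the failure type and remaining retries."""
    if failure_type == "customer_action":
        # Type 2: retrying cannot help, so prompt for a card update immediately.
        return RecoveryPlan(retry=False, email_now=True, template="update_card")
    if failure_type == "timing":
        if retries_left > 0:
            # Type 1: retry quietly first.
            return RecoveryPlan(retry=True, email_now=False, template=None)
        # Retries exhausted: now it is time to communicate.
        return RecoveryPlan(retry=False, email_now=True, template="payment_failed")
    # Ambiguous signal (assumed fallback): retry if possible, plus a soft nudge.
    return RecoveryPlan(retry=retries_left > 0, email_now=True, template="soft_nudge")
```

Keeping the decision in one pure function makes it easy to replay historical failures through it and see how many would have been routed to each flow.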

    1. 1

      @EspritCode The ambiguous middle ground on do_not_honor is exactly right — it's one of the hardest codes to classify reliably. We handle it with a fallback category that uses a softer communication approach rather than a hard retry or a hard "update your card" message.
      The issuer reliability variance is a real problem. Are you running this on an active SaaS currently?

  4. 1

    nice — the free-until-it-works model is smart. removes all friction.

    im selling a $19 CSV of 1,000+ marketing agency contacts across 54 countries (with SEO scores for each), and a $9 chrome extension that audits any website for SEO issues in one click. both on gumroad. the free sample strategy has been the best move so far — people download the 50-row sample and see the quality before buying the full list.

    1. 1

      @vemtraclabs The free sample strategy is smart — same logic behind DunnAI's free-until-$49-recovered model. Good luck with the CSV and extension!

  5. 1

    This is a great breakdown — especially the distinction between timing vs customer-action issues.

    I’m seeing something similar in network traffic as well. From the outside everything looks like “normal traffic”, but in reality there are completely different categories — harmless vs malicious vs automated probing — and treating them the same is the real problem.

    Curious — did you discover this mainly from data analysis, or from customer support patterns?

    1. 1

      @firegate Great parallel — the same pattern shows up everywhere: surface label vs. underlying cause. For failed payments, it was mostly data analysis — looking at decline codes and realizing the recovery logic had to be completely different for each category. Support patterns confirmed it later. What are you building in the network traffic space?

  6. 1

    A lot of recovery systems fail because they treat one surface symptom as one underlying reality.

    But “failed payment” is not a diagnosis.
    It’s a label.

    Some failures want time.
    Some failures want action.

    Misread the layer,
    and you optimize the wrong system.

    1. 1

      @HeritageLab "Failed payment is not a diagnosis, it's a label" — that's exactly the framing. The two buckets need two different systems, not one retry cadence applied to both.

  7. 1

    This framing really clicks. We use Stripe for our SaaS and I'll admit we were guilty of the "turn on Smart Retries and forget about it" approach for way too long. The wake-up call was when we actually exported our failed payment data and realized nearly half our involuntary churn was Type 2 — expired cards and hard declines where no amount of retrying would ever recover the revenue.

    The thing that surprised us most was how much of Type 2 churn was preventable with proactive communication BEFORE the card even fails. Stripe sends a webhook event when a saved card is about to expire (customer.source.expiring), and we started emailing users 2 weeks before expiration with a direct link to update their card. That alone cut our Type 2 failures by about 30% before they even happened.
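A minimal sketch of the "two weeks before expiration" check, for anyone computing it from stored card data instead of relying on a webhook. It assumes you have the card's expiry month and year on hand and that a card stays valid through the last day of its expiry month. `card_expiry_date` and `needs_expiry_reminder` are hypothetical helper names, and the 14-day lead time mirrors the figure in the comment.

```python
import calendar
from datetime import date


def card_expiry_date(exp_month: int, exp_year: int) -> date:
    """Return the last day of the card's expiry month."""
    last_day = calendar.monthrange(exp_year, exp_month)[1]
    return date(exp_year, exp_month, last_day)


def needs_expiry_reminder(exp_month: int, exp_year: int, today: date,
                          lead_days: int = 14) -> bool:
    """True if the card expires within the next `lead_days` days (inclusive)."""
    days_left = (card_expiry_date(exp_month, exp_year) - today).days
    # Already-expired cards (negative days_left) belong to the failure flow instead.
    return 0 <= days_left <= lead_days
```

Run daily over active subscriptions, this flags each card once it enters the reminder window; deduplicating so a customer gets only one email is left out of the sketch.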

    For the Type 1 stuff, we found that retry timing matters more than retry count. A retry at 2am when banks are doing batch processing has a meaningfully higher success rate than retrying during business hours. Small detail but it compounds over time when you're running enough subscriptions.
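If you schedule Type 1 retries yourself, picking the next off-hours slot could look like the sketch below. The 2am hour is the commenter's observation, not a documented rule, and which timezone to use (the customer's, the issuer's, or your own) is an open assumption. `next_retry_at` is a hypothetical helper that works in whatever timezone the `now` value carries.

```python
from datetime import datetime, time, timedelta


def next_retry_at(now: datetime, hour: int = 2) -> datetime:
    """Return the next occurrence of `hour`:00 strictly after `now`.

    `now` should be timezone-aware; the result keeps the same tzinfo.
    """
    candidate = datetime.combine(now.date(), time(hour), tzinfo=now.tzinfo)
    if candidate <= now:
        # Today's slot has already passed, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate
```

A failure at 3pm is scheduled for 2am the next day, while a failure at 1am is scheduled for 2am the same day.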

    1. 1

      Really valuable — the proactive card_expiring webhook approach is something a lot of founders skip entirely. Catching Type 2 before it happens is obviously better than recovering after.
      That 30% reduction from pre-expiry emails is significant. DunnAI focuses on post-failure recovery, but this is a good reminder that the best recovery is prevention.
      What stack are you running for the outreach side?

  8. 1

    The decline code distinction is massively underused. Most founders run retries on everything and wonder why recovery rates stay low.

    The specific codes that matter for Type 2 sorting: do_not_honor, card_velocity_exceeded, generic declines from the issuer. Those are almost always Type 2. insufficient_funds and processing_error tend to be Type 1. The issuer-level ones are sneaky because they look temporary but often aren't.

    The thing I've found helps most is the outreach timing. For Type 2 a dunning email on day 1 converts way better than day 3-4. By day 3 the customer has already mentally cancelled. Day 1 they still care.

    Also worth layering in: a direct card update link vs just asking them to log in and find billing. Removing that friction cuts 20-30% of drop-off between email opens and actual card updates.

    Good framing on the two-problem split. Most SaaS dunning docs don't make this distinction clearly.

    1. 1

      The day 1 vs day 3 timing point is underappreciated. By day 3 the mental cancellation has already happened — that's exactly right.
      The direct card update link vs "log in and find billing" friction point is also something I built into DunnAI — it generates a direct link so customers don't have to hunt.
      Curious what you're building — are you still dealing with this actively?

  9. 1

    The timing vs customer-action split is exactly right. I ran into this with a small SaaS last year and our retries looked "reasonable" on paper, but most unrecovered failures were expired cards so no retry cadence was ever going to fix them. Once we split recovery into retryable declines vs update-needed declines and sent a direct card-update link for the second bucket, recovery got a lot better.

    1. 1

      @microbuilderco That's exactly the split that matters — and most dunning setups treat both buckets the same way, which is why recovery rates stay low. The card-update link for the second bucket is the right move.
      Curious what you're building now — are you still running into this on current projects?

  10. 1

    interesting approach. have you thought about giving away a free version to build trust first? i started offering a free sample of my data product and it changed the conversion conversation completely.

    1. 1

      Thanks! Actually DunnAI is already free until it recovers $49 for you — no charge before that. The free diagnostic report is also available right after connecting Stripe. Curious what your data product is, if you don't mind sharing.

  11. 1

    building in public is underrated as a growth channel. the posts about struggles get way more engagement than the polished ones. keep sharing the real numbers.
