I missed real downtime because my alerts were noisy

I built Pulsekeep after missing actual production downtime.

The frustrating part wasn’t lack of monitoring — it was too much of it.
One region would time out.
Another would recover.
Alerts would fire, then silence, then fire again.

I started ignoring them.

That’s when a real outage slipped through.

So I’m building Pulsekeep around a simple rule:
If something is down, you should know immediately.
If everything is fine, you shouldn’t hear anything.

Recently I shipped multi-region checks, and the hard part wasn’t infra — it was deciding when an alert is actually justified.
A single probe failing isn’t downtime.
Consensus is.
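
To make the consensus rule concrete, here is a minimal sketch, not Pulsekeep's actual implementation. The names `ProbeResult`, `is_down`, and the `quorum` parameter are hypothetical, and the strict-majority threshold is an assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    region: str  # hypothetical region label, e.g. "eu-west"
    ok: bool     # True if the check succeeded from that region


def is_down(results: list[ProbeResult], quorum: float = 0.5) -> bool:
    """Return True only when more than `quorum` of the regions report a failure."""
    if not results:
        # Assumption: no data is not treated as evidence of downtime.
        return False
    failures = sum(1 for r in results if not r.ok)
    return failures / len(results) > quorum
```

With three regions, one failing probe stays quiet and two failing probes count as downtime. Where to set the threshold is a judgment call: a higher quorum cuts false positives but can delay detection of a partial outage.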

Still early (MVP), still learning, but this constraint has shaped every decision so far.

If you’ve fought alert fatigue or false positives, I’d love to hear how you handle it.

on December 27, 2025

    Alert fatigue is such an underappreciated problem. The irony is that more monitoring often means less visibility because the signal-to-noise ratio tanks.

    Your "consensus" approach to multi-region checks is smart. One failure could be a network blip; multiple failures in different regions is a pattern worth waking someone up for.

    We've dealt with this by implementing severity tiers with different notification channels. Critical (consensus-confirmed outage) = phone call. Warning (single probe failing) = batched Slack message. Info = dashboard only. The key was being ruthless about what qualifies as "critical": if it doesn't require immediate human action, it's not critical.
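
    A rough sketch of that tiering, assuming a simple majority counts as "consensus". The channel names and the `classify` and `route` helpers are made up for illustration and are not tied to any particular tool:

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"
    INFO = "info"


# Hypothetical channel identifiers; swap in whatever the notifier expects.
CHANNELS = {
    Severity.CRITICAL: "phone_call",
    Severity.WARNING: "slack_batch",
    Severity.INFO: "dashboard",
}


def classify(failing_regions: int, total_regions: int) -> Severity:
    """Map probe failures to a tier; a strict majority is assumed to mean consensus."""
    if total_regions > 0 and failing_regions * 2 > total_regions:
        return Severity.CRITICAL
    if failing_regions >= 1:
        return Severity.WARNING
    return Severity.INFO


def route(failing_regions: int, total_regions: int) -> str:
    """Return the notification channel for the current probe state."""
    return CHANNELS[classify(failing_regions, total_regions)]
```

    For example, `route(1, 3)` lands in the batched Slack tier, while `route(2, 3)` escalates to a phone call.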

    Curious how you're handling the "flapping" case - where something goes down/up/down/up rapidly. That's where most alert systems get noisy.
