I was using an existing monitoring app to monitor ~50 websites.
It worked, but it had one big problem.
It sent a flood of emails for a few unstable websites that threw random 500 errors or timeouts.
I was tired of the constant alerts for the 2-3 flaky sites I had added; every other website was stable.
Most of the time nothing was actually “down.”
It was just temporary instability.
I was worried the noise would make me neglect a real down alert.
So I built my own approach. I created a new uptime monitoring tool with one focus:
Never send an alert based on a single check.
An alert only fires after 4 failed checks, and those checks come from multiple locations.
Now before we send an alert:
• We retry 3–4 times
• We verify the monitor is down from multiple locations
• We make sure it’s actually consistently failing
If it’s just a momentary blip, we ignore it.
If it’s really down, you’ll know.
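The retry-then-cross-check logic above could be sketched roughly like this (a minimal illustration, not StatusEagle's actual code; the `check_once` helper, location list, and retry count are my assumptions, and a real system would run each location check from a separate region rather than the same machine):

```python
import urllib.request
import urllib.error

# Hypothetical probe locations; in practice each would be a separate region.
LOCATIONS = ["us-east", "eu-west", "ap-south"]
RETRIES = 4  # consecutive failed checks required before alerting

def check_once(url, timeout=10):
    """Return True if the URL answers with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, TimeoutError, OSError):
        return False

def confirmed_down(url, check=check_once):
    """Alert only if every retry fails AND every location agrees."""
    # Step 1: retry a few times to filter out one-off 500s and timeouts.
    if any(check(url) for _ in range(RETRIES)):
        return False  # at least one check passed: just a blip
    # Step 2: cross-check from each location (simulated by repeat calls here).
    return all(not check(url) for _ in LOCATIONS)
```

The point of the two stages is that a random 500 passes stage 1 (one retry succeeds), while a genuine outage fails every retry and every location, so only consistent failures ever reach the alerting step.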
Way fewer false alarms.
Way more confidence when an alert does fire.
Curious — how are you handling noisy alerts in your stack?
If anyone would like to check it out, see https://statuseagle.com/
I’ve faced this too. Too many alerts for small issues can be really frustrating. Your approach is smart, especially checking multiple times and from different locations before sending an alert. It reduces false alarms and makes alerts more reliable. I also add a small delay or threshold so only real issues trigger notifications.
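The "small delay or threshold" mentioned here could look something like this debounce sketch (the class name, parameters, and cooldown behavior are all my own illustration, not the commenter's setup):

```python
import time

class AlertThreshold:
    """Fire an alert only after `threshold` consecutive failures,
    and suppress repeat alerts within `cooldown` seconds."""

    def __init__(self, threshold=3, cooldown=300):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.last_alert = float("-inf")  # so the first alert is never suppressed

    def record(self, ok, now=None):
        """Feed one check result; return True when an alert should fire."""
        now = time.monotonic() if now is None else now
        if ok:
            self.failures = 0  # any success resets the streak
            return False
        self.failures += 1
        if self.failures >= self.threshold and now - self.last_alert > self.cooldown:
            self.last_alert = now
            return True
        return False
```

A single failed check never notifies; only an unbroken streak does, and the cooldown stops a long outage from paging you every minute.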
Alert fatigue is real. I’ve seen teams reduce it a lot by grouping alerts and adding simple automation to filter the noisy ones. What monitoring stack are you using right now?