Built an AI DevOps Agent that explains outages like a human (launched today)

Hi AI builders 👋

We just launched AI Incident Investigator, an AI agent that helps DevOps teams debug cloud incidents by explaining them in plain language, so you don't have to piece the story together from dashboards and logs yourself.

It uses:

Claude as the base model
LangChain to structure prompts & memory
RAG to ground the LLM's output in real-world infrastructure debugging patterns from AWS (ECS, ALB, CloudWatch)
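
For anyone curious what the RAG step looks like in principle, here is a minimal sketch under loud assumptions. It is not the product's code: the pattern texts are made up for illustration, a simple word-overlap score stands in for a real embedding search, and the model call is left out entirely. The names `PATTERNS`, `retrieve`, and `build_prompt` are hypothetical.

```python
import re

# Hypothetical knowledge base of debugging patterns (illustrative text only).
PATTERNS = [
    "ECS task stopped with OutOfMemoryError: container exceeded its memory limit",
    "ALB returns 502 or 503 when target group has no healthy targets",
    "CloudWatch alarm on high CPU after a deployment changed worker count",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k docs sharing the most tokens with the query.

    Assumption: word overlap is a stand-in for vector similarity here.
    """
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, docs: list[str]) -> str:
    """Put the retrieved patterns in front of the user's question."""
    context = "\n".join(f"- {d}" for d in retrieve(question, docs))
    return (
        f"Known debugging patterns:\n{context}\n\n"
        f"Question: {question}\n"
        "Explain the likely root cause."
    )


if __name__ == "__main__":
    print(build_prompt("Why did the ECS container run out of memory?", PATTERNS))
```

The resulting string is what would be sent to the model; swapping the overlap score for an embedding lookup changes `retrieve` only.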

Example query:
“Why did our staging app return 5xx errors at 2PM yesterday?”

🧠 AI Output:

Deployment at 13:58 changed env var MAX_WORKERS

Instance ran out of memory

5xx errors started at 14:03

Suggested fix: roll back the deployment or increase container memory
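
The timeline above boils down to "what changed shortly before the errors started?". A rough sketch of that correlation step, assuming events have already been collected into a list: the `Event` type, the 15-minute window, the placeholder date, and the 09:30 deployment are all made up for illustration. Only the 13:58 and 14:03 timestamps come from the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    time: datetime
    description: str


def suspects(events, error_start, window=timedelta(minutes=15)):
    """Return events that occurred within `window` before the errors began,
    most recent first."""
    hits = [
        e for e in events
        if timedelta(0) <= error_start - e.time <= window
    ]
    return sorted(hits, key=lambda e: e.time, reverse=True)


day = datetime(2024, 1, 1)  # placeholder date, not from the post
events = [
    Event(day.replace(hour=9, minute=30), "Routine deployment"),  # hypothetical
    Event(day.replace(hour=13, minute=58), "Deployment changed env var MAX_WORKERS"),
]
error_start = day.replace(hour=14, minute=3)  # 5xx errors started at 14:03

if __name__ == "__main__":
    for e in suspects(events, error_start):
        print(f"{e.time:%H:%M} {e.description}")
```

Proximity in time is only a hint, not proof of cause, which is why the output reads as a suggestion rather than a verdict.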

We’re live on Product Hunt today!
🔗 https://www.producthunt.com/products/microtica-ai-agents-for-devops

Would love feedback from other AI devs here — especially if you’re working with LLMs for infra.