
I built an AI prompt engineering platform with a self-correcting generation loop — here's the full story

What it actually does
Four core systems:

  1. Prompt generation engine — takes user input, applies personality style, and returns a structured prompt optimized for AI image generation.
  2. Character Lock — a 41-field character sheet system that ensures visual consistency across unlimited generations. Gender, age, ethnicity, facial features, hair, clothing, accessories — stored and injected into every prompt automatically.
  3. Helios personality engine — 6 AI archetypes that blend different stylistic approaches to prompt output. Users pick a personality, the engine adjusts tone, structure, and emphasis accordingly.
  4. Vertex AI RAG pipeline — instead of relying purely on the model's base knowledge, every generation request queries a Vertex AI Search data store (Discovery Engine) backed by 34 curated prompt engineering documents before the model runs. Outputs reference established prompt engineering principles rather than hallucinated style advice.
    The RAG layer runs on Gemini 3.1 Pro. The data store lives on Google Cloud and gets queried on every /generate call.
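To make the grounding step concrete, here is a minimal sketch of how retrieved passages could be assembled into the final model prompt. The retrieval call itself (the Vertex AI Search / Discovery Engine query) is omitted; the function and field names below are illustrative assumptions, not the author's actual API.

```python
from dataclasses import dataclass

@dataclass
class RetrievedPassage:
    source: str   # which curated document the snippet came from
    text: str     # the snippet itself

def assemble_grounded_prompt(user_input: str, passages: list[RetrievedPassage]) -> str:
    """Build the model prompt from RAG context plus the user's request."""
    context = "\n".join(f"- ({p.source}) {p.text}" for p in passages)
    # User input sits in its own delimited section so retrieved text
    # and system guidance can't be confused with it.
    return (
        "Use the following prompt-engineering guidance:\n"
        f"{context}\n\n"
        "<user_input>\n"
        f"{user_input}\n"
        "</user_input>"
    )

passages = [RetrievedPassage("lighting.md", "Specify light direction and color temperature.")]
print(assemble_grounded_prompt("a portrait of a sailor", passages))
```

In the real pipeline the `passages` list would be populated by the Discovery Engine search on every /generate call.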

The self-correcting generation loop
Added a critic=true parameter to /generate. If the output scores below 80 on an internal quality rubric, the system silently regenerates — max 1 retry. The user never sees the failed attempt. Combined with RAG context, the first attempt is already better, and the retry almost always clears the threshold.
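The retry logic above fits in a few lines. This is a sketch only: `generate` and `score` stand in for the real model call and the internal rubric, and the names are assumptions rather than the author's code.

```python
from typing import Callable

QUALITY_THRESHOLD = 80  # rubric score below this triggers a regeneration
MAX_RETRIES = 1         # at most one silent retry, per the design above

def generate_with_critic(
    generate: Callable[[], str],
    score: Callable[[str], int],
) -> str:
    """Return the first output that clears the rubric, retrying at most once."""
    output = generate()
    for _ in range(MAX_RETRIES):
        if score(output) >= QUALITY_THRESHOLD:
            break
        output = generate()  # silent regeneration; the caller never sees the miss
    return output
```

Because the retry cap is 1, a second sub-threshold output is returned as-is rather than looping forever; the RAG context is what keeps that case rare.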

Stack
FastAPI, Next.js, PostgreSQL, SQLAlchemy, Zustand + TanStack Query. Vertex AI Search for RAG. Gemini 3.1 Pro as the generation model.

What the audit fixed today

Replaced python-jose (affected by CVE-2024-33663, an algorithm-confusion vulnerability) with PyJWT
Stateless refresh tokens with 7-day expiry
SECRET_KEY mandatory in production — app won't boot without it
LimitUploadSizeMiddleware — blocks requests over 2MB before they hit memory
Prompt injection sanitization at Pydantic layer, user input isolated in XML tags
api_usage table auto-cleanup — 90-day retention
File lock on Alembic migrations for multi-worker safety
Extracted 40+ endpoints from monolithic main.py into 8 specialized routers
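As an example of the injection-sanitization item above, here is one way the user-input isolation could look. In the real app this would run inside a Pydantic field validator; the regex, tag name, and length cap below are illustrative assumptions, not the author's implementation.

```python
import re

_TAG = "user_input"

def sanitize(raw: str, max_len: int = 4000) -> str:
    """Strip control characters and neutralize XML delimiters in user text."""
    # Remove ASCII control characters (keep newlines and tabs).
    cleaned = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", raw)
    # Escape < and > so user text can't close or fake the isolation tags.
    cleaned = cleaned.replace("<", "&lt;").replace(">", "&gt;")
    return cleaned[:max_len]

def isolate(raw: str) -> str:
    """Wrap sanitized user text in the XML tags the prompt template expects."""
    return f"<{_TAG}>\n{sanitize(raw)}\n</{_TAG}>"
```

The point of the XML wrapper is that, after escaping, the only `</user_input>` tag the model ever sees is the one the template itself emits.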

Before: 57 tests. After: 137 passing.

Testing approach
Cross-model QA — Claude wrote the code, Gemini generated the tests independently to avoid author bias. Gemini caught edge cases I wouldn't have written tests for myself.

Where it is now
Architecture is B2B-ready. Working on the monetization layer next.
Happy to answer questions about the RAG setup, the self-correction loop, or the Helios personality system.

Posted on March 23, 2026