
I built an AI governance layer and opened a developer preview

Most AI apps today directly call an LLM and return the response.

That is fine for demos, but production AI needs more control: policy, identity consistency, memory boundaries, traceability, and runtime governance.

So I built NEES Core Engine — a governance layer that sits between an AI app and the model provider.

Flow:

User → App → NEES Core Engine → Model Provider → Governed Response

I just opened a public developer preview repo with docs and quickstart examples:

https://github.com/NEES-Anna/nees-core-developer-preview

It includes Python, Node.js, cURL examples, API reference, governance flow docs, and templates for API key requests and developer feedback.
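
To give a feel for the request shape, here is a minimal Python sketch of a governed call. The endpoint, header, and field names below are placeholders, not the real API; the repo quickstart is the source of truth:

    import requests

    # Illustrative sketch only: endpoint, header, and field names are
    # placeholders; see the repo quickstart for the actual API shape.
    NEES_URL = "https://api.nees.example/v1/govern"  # hypothetical endpoint
    API_KEY = "YOUR_DEVELOPER_PREVIEW_KEY"

    payload = {
        "input": "Summarize this support ticket.",
        "mode": "assistant",   # runtime mode (hypothetical field)
        "policy": "default",   # policy profile (hypothetical field)
    }

    resp = requests.post(
        NEES_URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    data = resp.json()

    # A governed response carries the model reply plus governance metadata.
    print(data["reply"])
    print(data["trace_id"], data["policy_status"])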

I’m looking for honest feedback from AI builders:

Would this be useful in your AI app?
Is the API approach clear?
Would trace IDs and governance metadata help you trust/debug AI responses?
What would you expect before using something like this in production?

This is still early, but the core engine is live and I’m using the repo to collect real builder feedback.

Would love your thoughts.

posted to Product Launch on May 4, 2026
  1.

    Thanks for the transparency, Anna. I completely agree—at this stage, high-quality feedback from production-level builders is worth 100x more than generic traffic.

    What I have in mind is a system that allows you to partner with technical influencers or niche community leaders who already have the trust of your target audience (AI engineers and dev-tool users). Instead of broad visibility, you can offer them a structured incentive to bring their "inner circle" of builders into your developer preview. This way, you get the qualified feedback and SDK testing you need, while ensuring the distribution is aligned with your current technical goals.

    Would you be open to a quick chat or a DM to see how this could help you recruit the specific type of builders you’re looking for?

    1.

      That makes sense, and I agree with the direction.

      For this stage, I’m not looking for broad traffic as much as qualified builders who can test the API, challenge the assumptions, and give practical feedback on the developer experience.

      A trusted community-led approach could be useful if it brings in the right people: AI engineers, dev-tool builders, agent developers, and teams already dealing with production AI workflows.

      The main thing I’d want to preserve is signal quality. The developer preview should stay focused on feedback around:

      • API clarity
      • traceability
      • memory boundaries
      • runtime policy
      • SDK needs
      • real production use cases

      There is also a live sample app connected to the governed runtime here:
      https://naina.nees.cloud

      So the ideal flow for interested builders would be: review the developer preview repo, try the live sample app, request API access if the governance layer feels relevant, and then share practical feedback.

      I’m open to discussing it. You can DM me with how you’d structure it, what kind of communities/builders you have in mind, and what the incentive model would look like.

  2.

    This direction is interesting.

    I think “AI governance” is going to matter a lot more as products move from simple assistants to systems that influence real decisions.

    I’m working on a trading-related product, and the hardest part is not making AI generate outputs.

    It’s making the workflow observable:

    • what data was trusted
    • what confidence level existed
    • what risk limits applied
    • when the system should not act

    Would be curious how you explain governance to users without making it sound too abstract.

    1.

      This is exactly the kind of use case where governance becomes easier to explain.

      For users, I try not to explain governance as an abstract compliance layer. I explain it as control and visibility around AI behavior.

      In a trading-related workflow, the question is not only “can AI generate an answer?”

      The real questions are:

      • what data did it rely on?
      • what confidence or uncertainty existed?
      • what risk limits applied?
      • what was the system allowed to do?
      • when should it stop or escalate?
      • can the decision path be reviewed later?

      That is governance in practical terms.

      So instead of saying “AI governance,” I usually frame it as:

      “Before the AI responds or acts, the system checks what rules, context, risk boundaries, and traceability requirements apply.”

      For developers, governance means policy, memory scope, identity/mode control, and trace metadata.

      For users, it means the AI is not just guessing — it is operating inside visible boundaries.
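
      To make that concrete, a pre-action gate for a trading step could look roughly like this (a minimal sketch; the fields, sources, and thresholds are invented for the example, not part of NEES):

          from dataclasses import dataclass

          # Minimal sketch of a pre-action governance gate for a trading
          # step. Names and thresholds are illustrative.
          @dataclass
          class Proposal:
              action: str        # e.g. "place_order"
              confidence: float  # calibrated model confidence
              notional: float    # size of the proposed trade
              data_sources: set  # what data the model relied on

          TRUSTED_SOURCES = {"exchange_feed", "internal_risk_db"}
          MIN_CONFIDENCE = 0.8
          MAX_NOTIONAL = 10_000.0

          def governance_gate(p: Proposal) -> str:
              if not p.data_sources <= TRUSTED_SOURCES:
                  return "block: untrusted data source"
              if p.confidence < MIN_CONFIDENCE:
                  return "escalate: low confidence, needs human review"
              if p.notional > MAX_NOTIONAL:
                  return "block: exceeds risk limit"
              return "allow"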

      Your trading example is a strong one because observability and risk limits are not optional there. That kind of workflow needs governance before automation becomes trustworthy.

  3.

    The governance layer space is getting crowded fast, but the hard part is getting adoption before "AI governance" becomes standard boilerplate in every stack. What's your entry point — are you targeting teams that already have AI incidents, or trying to get in front of the problem before it happens?

    The "developer preview" framing is smart. It signals "we're serious enough to give you access, but we're listening" — which is exactly the right credibility posture for a governance tool where trust is the primary product.

    What does the happy path look like for a developer integrating this for the first time? The first 15 minutes of the experience will make or break early retention.

    1.

      That’s a very fair point. The space is definitely getting crowded, so the entry point matters a lot.

      Right now I’m not trying to sell NEES only as a “compliance/governance” product after something goes wrong. My preferred entry point is earlier: teams that are already building AI apps or agents and are starting to feel the gap between prototype behavior and production behavior.

      Usually the trigger is not a major AI incident yet. It is smaller friction:

      • inconsistent AI responses across users
      • unclear memory/context behavior
      • no trace of why a response happened
      • difficulty debugging model routing or fallback logic
      • different assistants behaving differently inside the same product
      • teams realizing prompts alone are not enough

      So the wedge is: “before you scale AI usage, add a governed runtime layer so behavior is easier to control and review.”

      On the first 15 minutes, I agree completely. The happy path needs to be extremely simple:

      1. Get developer API key
      2. Copy Python / Node / cURL quickstart
      3. Send first governed request
      4. Receive a normal AI reply plus governance metadata
      5. See trace ID, mode, policy status, and engine source
      6. Understand: “this is not just another model call — this gives me a control/audit layer”

      The ideal first win is not a complex enterprise policy setup. It is the developer seeing one request go through NEES and immediately understanding the value of traceability and runtime control.
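
      In code, that first win can be as small as this (a sketch only; "client.send" and the metadata fields are placeholders for whatever the SDK and API actually expose):

          # Sketch of the "first win": one governed call, then look at the
          # governance metadata instead of only the reply text.
          def first_governed_call(client):
              result = client.send("Hello, governed world")  # placeholder call
              print("reply:   ", result["reply"])
              print("trace_id:", result["trace_id"])       # audit handle
              print("mode:    ", result["mode"])           # runtime mode applied
              print("policy:  ", result["policy_status"])  # pass / flagged / blocked
              print("engine:  ", result["engine_source"])  # provider/model path used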

      I’m also thinking the next improvement should be a “15-minute integration guide” in the repo, with a very small sample app or GitHub Actions example so builders can test it without reading too much documentation.

      Your comment is useful because it points to the real adoption risk: governance has to feel lightweight at the start, not like enterprise overhead.

  4.

    Interesting approach. The governance-as-middleware pattern makes sense for traditional AI apps where you control the request pipeline.

    I'm solving a related problem but from the opposite direction. I built a blockchain where AI governance lives at the protocol level instead of the application level. Every AI entity on the chain has a capability bitfield and an autonomy mode enforced by the dispatcher before any transaction is routed. The chain itself answers "is this an AI, what can it do, what has it done" without needing an external governance layer.
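
    To make the dispatcher idea concrete, a capability-bitfield check reduces to something like this (simplified Python sketch, not the actual NOVAI code):

        # Simplified sketch of a capability-bitfield check.
        CAN_READ_STATE  = 1 << 0
        CAN_SIGNAL      = 1 << 1
        CAN_TRANSACT    = 1 << 2
        CAN_SPAWN_AGENT = 1 << 3

        def dispatch_allowed(entity_caps: int, required: int) -> bool:
            # Every required capability bit must be set on the entity.
            return (entity_caps & required) == required

        agent = CAN_READ_STATE | CAN_SIGNAL               # read + signal only
        assert dispatch_allowed(agent, CAN_SIGNAL)        # allowed
        assert not dispatch_allowed(agent, CAN_TRANSACT)  # refused below the agent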

    The trade-off is scope. Your approach works for any AI app talking to any model provider. Mine only works for AI agents operating on-chain. But the enforcement is stronger because the governance rules live below the agent, not beside it. The agent can't bypass them.

    The trace ID concept maps well to what we do with signal commitments. Every AI output on NOVAI is a typed signal indexed by issuer and block height. Native audit trail, no external logging needed.
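
    The record shape is roughly this (an illustrative sketch, not the actual schema):

        from dataclasses import dataclass

        # Illustrative shape of a typed, indexed signal record: issuer
        # plus block height give the native audit trail.
        @dataclass(frozen=True)
        class Signal:
            issuer: str        # which AI entity emitted it
            block_height: int  # where in the chain it was committed
            kind: str          # the signal's declared type
            payload_hash: str  # commitment to the output content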

    Different problem spaces but the same core insight: production AI needs governance that the AI itself can't override.

    1.

      This is a great framing — application-level governance vs protocol-level governance.

      I agree with your trade-off point. NEES is intentionally middleware/runtime oriented because I wanted it to work across normal AI apps, hosted backends, assistants, agents, and different model providers without requiring the whole system to be on a specific protocol.

      But your protocol-level enforcement point is strong: if governance lives below the agent, the bypass risk is much lower. That is a real advantage for on-chain agents where the execution environment itself can enforce capability boundaries.

      The way you describe capability bitfields and autonomy modes sounds close to what I think production AI needs conceptually:

      • what is this AI entity?
      • what is it allowed to do?
      • what mode is it operating in?
      • what has it already done?
      • can its actions be audited later?

      NEES approaches this from the app/runtime side with policy, identity rules, memory scope, runtime mode, and trace IDs. Your approach sounds like it pushes similar questions into the protocol layer.

      I like your line: production AI needs governance that the AI itself can’t override.

      That’s exactly the core insight.

      Different environments, but same direction: AI systems need enforceable control planes, not just prompts or logs.

      1.

        That's a good insight. "AI entities as first-class citizens" is landing better than "Layer 1 in Rust" even though it's the same thing underneath. The identity blog going out today is exactly that reframe. Thanks for pushing me on the copy angle.

      2.

        Your five questions are exactly the right framework. What is it, what can it do, what mode is it in, what has it done, can it be audited. That's the checklist regardless of whether you enforce it at the app layer or the protocol layer.

        The interesting space is probably where both layers work together. Protocol enforcement for on-chain agents, runtime governance like NEES for off-chain agents, and some bridge between them for agents that operate across both.

        Good luck with the developer preview. Will keep an eye on it.

  5.

    Interesting governance layer concept! We run a content platform with heavy AI integration mostly via GitHub Actions workflows. Currently wrestling with complex model routing/fallback logic and custom observability.

    The governance angle is compelling - we don't have policy enforcement for content consistency.

    Main question: how well does NEES work with batch/pipeline scenarios vs. interactive applications? Most of our AI calls are in automated workflows rather than user-facing.

    Would love to see docs on Actions integration patterns if you're considering that use case.

    1.

      Thanks — this is a very useful use case.

      NEES started from interactive governed AI flows, but batch/pipeline scenarios are definitely relevant, especially for content platforms, automation workflows, and GitHub Actions-style systems.

      In a batch workflow, I think the governance layer should behave less like a chat middleware and more like a policy + trace checkpoint around each AI step.

      For example:

      • route model/provider based on task type or risk level
      • enforce content policy before/after generation
      • attach trace IDs to each generated artifact
      • record which mode/policy/model path was used
      • apply fallback rules when a provider fails
      • flag outputs that need human review
      • keep governance metadata with the pipeline result

      So instead of only governing user-facing responses, NEES could govern automated AI tasks like:

      GitHub Action → NEES governed call → model provider → governed output + trace metadata → workflow continues or pauses for review

      You’re right that Actions integration docs would be useful. A practical next doc could be: “Using NEES Core Engine in GitHub Actions for governed AI workflows.”

      That would probably include examples for content generation, review gates, fallback routing, and trace logging.
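
      As a rough sketch, one governed step inside an Actions job could look like this (everything below is illustrative; the governed call and response fields are placeholders):

          import sys

          # Rough sketch of one governed step in an automated pipeline.
          # "client.send" and the response fields are placeholders.
          def governed_generate(client, task):
              result = client.send(task)
              print(f"trace_id={result['trace_id']} policy={result['policy_status']}")

              if result["policy_status"] == "needs_review":
                  # Exit non-zero so the workflow pauses for human review;
                  # the trace ID stays in the job log for the reviewer.
                  sys.exit(f"output flagged for review: {result['trace_id']}")

              return result["reply"]  # governed output continues downstream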

      Your comment is helpful because it expands the developer preview beyond chat apps into pipeline governance. I’ll add this use case to the roadmap.

  6.

    Hi Anna, your approach to adding a governance layer for production AI via NEES Core Engine is very timely, especially regarding memory boundaries and traceability. Since you've just opened the developer preview and are looking for feedback from builders, would you be interested in scaling the distribution of your engine to reach more AI developers and enterprises?

    1.

      Thanks, WenWang. Yes, I’m open to exploring distribution and partnership conversations, especially if the focus is reaching serious AI builders, dev-tool users, and teams working on production AI systems.

      Right now my priority with the developer preview is to validate a few things with real builders:

      • whether the governance layer is useful in actual AI app workflows
      • whether the API/docs are clear enough
      • what traceability fields developers expect
      • how teams think about memory boundaries and runtime policy
      • what kind of SDK or integration path would make adoption easier

      So I’m open to scaling distribution, but I’d like to keep it aligned with meaningful developer feedback rather than just broad visibility.

      Happy to connect and understand what kind of distribution channel or partnership you have in mind.

  7.

    Interesting approach; most apps skip this layer.
    Traceability and identity consistency sound especially useful.
    Curious how you handle memory boundaries.
    Overall, a promising idea.

    1.

      Thanks, Maria — memory boundaries are one of the main reasons I started thinking about NEES as a governance layer instead of just another AI app.

      The basic idea is that memory should not be treated as unlimited context that automatically flows into every response.

      In NEES, memory is meant to be governed by scope and intent. For example:

      • what belongs only to the current session
      • what can be reused across sessions
      • what should require explicit consent
      • what should never influence a response
      • what needs to be traceable when used

      So the goal is not just “give the AI more memory,” but to control when memory is used, why it is used, and how that usage can be reviewed.
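
      In rough code terms, those scopes could reduce to a policy table like this (a sketch of the idea, not the NEES schema):

          # Sketch of memory governed by scope and intent. Each scope
          # declares how stored items may be used (illustrative only).
          MEMORY_POLICY = {
              "session_only":  {"cross_session": False, "needs_consent": False},
              "cross_session": {"cross_session": True,  "needs_consent": False},
              "sensitive":     {"cross_session": True,  "needs_consent": True},
              "excluded":      {"cross_session": False, "needs_consent": False,
                                "may_influence_response": False},
          }

          def usable_in_response(scope: str, has_consent: bool) -> bool:
              rule = MEMORY_POLICY[scope]
              if not rule.get("may_influence_response", True):
                  return False  # must never shape a response
              if rule["needs_consent"] and not has_consent:
                  return False  # consent gate before reuse
              return True       # any use would also be trace-logged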

      Still early, but that boundary between helpful continuity and unsafe overreach is exactly the area I want developer feedback on.

  8.

    This is exactly the kind of infrastructure that becomes essential once you move past prototype phase. I'm running into the same pattern with ClipForge (AI video repurposing) — the first version just calls the model directly, but as soon as you have multiple users with different contexts, you need governance boundaries.

    The trace ID piece is what I'd find most valuable. When an AI response goes wrong in production (and they will), being able to trace back through the exact prompt, model version, and governance rules that applied at that moment is the difference between fixing in hours vs days. Without it you're debugging blind.

    One question: how are you handling the latency overhead of the governance layer? Adding a proxy between app and model inevitably adds some milliseconds — is it noticeable in chat-style use cases where users expect sub-second responses? I'd be interested in seeing benchmarks if you have them.

    1.

      Thanks — this is exactly the pattern I’m trying to validate with the developer preview.

      I agree with you on trace IDs. Once AI starts serving real users, debugging cannot stay at the “what prompt did we send?” level. Teams need to know what policy, mode, memory scope, identity rules, and provider/model path were active for that specific response. Otherwise production debugging becomes guesswork.

      On latency: yes, adding a governance layer does introduce overhead, so I’m treating latency as a core product constraint, not an afterthought.

      The current approach is to keep the runtime layer lightweight and separate the governance checks into practical stages:

      • fast request validation
      • mode / policy resolution
      • memory scope decision
      • model call
      • response metadata / trace generation

      For chat-style use cases, the model call is usually the dominant latency cost, but the governance layer still needs benchmarking across different request sizes and modes. That is one of the things I want early builders to test and push on.

      I haven’t published formal benchmarks yet, but I agree that they should be part of the developer preview roadmap. A useful benchmark set would probably include the cases below, with a rough timing harness sketched after the list:

      • direct model call vs NEES-routed call
      • simple chat request
      • request with memory/context
      • request with stricter policy checks
      • concurrent requests
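
      The harness itself can stay tiny (a sketch; call_direct and call_governed stand in for a raw provider call and a NEES-routed call):

          import statistics
          import time

          # Minimal timing harness for the benchmark set above.
          def bench(fn, runs=20):
              samples = []
              for _ in range(runs):
                  t0 = time.perf_counter()
                  fn()
                  samples.append((time.perf_counter() - t0) * 1000)  # ms
              return statistics.median(samples), max(samples)

          # median_direct, worst_direct = bench(call_direct)
          # median_gov, worst_gov = bench(call_governed)
          # print(f"governance overhead: {median_gov - median_direct:.1f} ms median")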

      There is also a live sample app connected to the governed runtime here:
      https://naina.nees.cloud

      It is useful for seeing the governed response flow in a real app interface, while the GitHub repo is mainly for API docs, quickstarts, and developer preview access.

      Your ClipForge example is a good use case because multi-user context and content workflows are exactly where direct model calls start becoming hard to manage. If you ever test a governed flow around video repurposing, I’d genuinely like to hear what kind of trace/governance fields would be most useful for your workflow.


  9.

    Most AI infra products stop at orchestration.

    The harder layer is making model behavior auditable once AI starts touching real users, real decisions, and real risk.

    That’s the right layer to build.

    The product feels heavier than the name, though.

    NEES Core Engine explains what it is, but not what category it owns.
    It reads more like an internal system name than the control layer teams build around.

    If this keeps moving toward policy enforcement, traceability, and runtime governance for production AI, the naming should probably carry more infrastructure weight than “Core Engine.”

    Vroth.com would fit that direction much better.
