I kept running into the same wall while talking to developers at healthtech and fintech companies: they wanted to use LLMs to automate workflows, but their data had names, emails, Aadhaar numbers, PAN cards, SSNs in it. Sending that to OpenAI or Anthropic felt wrong — legally and ethically.
Most teams were either skipping LLMs entirely or hand-rolling their own scrubbers. Neither felt like the right answer.
So I built Armos.
It wraps the OpenAI and Anthropic Python SDKs. Before your prompt goes out, PII is detected locally (nothing leaves your machine during detection) and replaced with reversible tokens. The LLM sees tokens and responds with tokens, then Armos swaps the real values back. Your app gets the original text. The model never does.
The entire integration is one line:
client = ArmosOpenAI(OpenAI())
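To make the round trip concrete, here is a minimal sketch of mask, call, unmask. It is not the Armos implementation: the single email regex stands in for real local detection, `fake_llm` stands in for the provider call, and the function names and token format are assumptions for illustration only.

```python
import hashlib
import re

# Hypothetical stand-in for the local detector: one email regex.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
TOKEN_RE = re.compile(r"\[PII:[A-Z]+:[0-9a-f]{8}\]")


def mask(text: str, vault: dict) -> str:
    """Replace each detected email with a token and record token -> value."""
    def swap(match):
        value = match.group(0)
        digest = hashlib.md5(value.encode("utf-8")).hexdigest()[:8]
        token = f"[PII:EMAIL:{digest}]"
        vault[token] = value
        return token
    return EMAIL_RE.sub(swap, text)


def unmask(text: str, vault: dict) -> str:
    """Swap known tokens back to real values; unknown tokens are left as-is."""
    return TOKEN_RE.sub(lambda m: vault.get(m.group(0), m.group(0)), text)


def fake_llm(prompt: str) -> str:
    """Stand-in for the provider call. It only ever receives masked text."""
    return "Summary: " + prompt


vault = {}
masked = mask("Refund request from jane@example.com", vault)
reply = unmask(fake_llm(masked), vault)
```

Here `masked` is what would cross the network, and `reply` is what the application sees after the tokens are swapped back.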
Where I am:
What I'm looking for:
Still early and rough around the edges. Would love to connect with anyone hitting this problem.
GitHub: https://github.com/armos-ai/armos-python
Docs: https://armos.dev
pip install armos
The technical wedge is right, but the actual buyer for this in healthtech and fintech is usually the compliance officer, not the developer. Lean your roadmap into the audit story: what got masked, when, who saw the tokens, and proof the model never saw the originals. Developers ship it. Compliance pays for it. Also worth talking to MSPs and consultancies serving regulated industries. They have the design partners and the deal velocity. Happy to introduce you to a couple if useful.
Really appreciate the framing — compliance officer as the buyer makes sense, and the audit trail is already on the roadmap for exactly that reason.
On the intros — would love that actually. Even if it's early for MSP partnerships, talking to people close to the problem in regulated industries would help shape the roadmap. If you're open to making a couple of intros, I'm at [email protected].
Privacy infra for AI is going to become massive as adoption scales. Open-source is a smart move here — builds trust much faster in security-related tooling.
Thanks! That's the moat. Would love for you to try it out and share your thoughts! And if you find it useful, starring the repo would mean a lot — it really helps with visibility.
The local-first part feels like a big deal here.
If a team is already nervous about sending sensitive data to an LLM provider, asking them to send it through another hosted tool would probably be a much harder sell...
Thanks! That is what I am trying to achieve with Armos! Let's see how it goes.
Would love for you to try it out and share your thoughts! And if you find it useful, starring the repo would mean a lot — it really helps with visibility.
Awesome idea brother
Thanks! Would love for you to try it out and share your thoughts! And if you find it useful, starring the repo would mean a lot — it really helps with visibility.
Would love to, but it's not necessary for me right now. If we expand, I'll come back to you.
I like this idea.
Would love for you to try it out and share your thoughts! And if you find it useful, starring the repo would mean a lot — it really helps with visibility.
Sure thing.
really like the local-detection-first design, that's the part most hand-rolled scrubbers get wrong.
one thing worth being clear with your health/fintech design partners on: masking reduces the exposure but it doesn't take openai/anthropic out of their subprocessor chain. the api call still happens, so in a security review they'll still get asked "is anthropic a subprocessor, and is it disclosed?" the honest framing is risk-reduction, not "you don't have to disclose the llm anymore" - if a partner assumes the latter you'll both get burned in an audit.
also curious: where does the reversible token <-> real value map live? if it's persisted anywhere, that store kind of becomes the new crown jewel (and the new audit target). in-memory only?
Both points are exactly right and worth saying clearly.
On the subprocessor chain — yes, masking doesn't remove OpenAI or Anthropic from the picture. The API call still happens, they're still processing data, they still need to be disclosed. What changes is what they process: tokens with no intrinsic meaning instead of raw PII. "John Peter" becomes [PII:NAME:c4587843]. The model reasons over that token fine — but if that data ever leaked from the provider's side, there's nothing recoverable. That's risk reduction, not compliance elimination, and I should be more explicit about that framing.
On the vault — by default it's in-memory only, ephemeral per process. No persistence, nothing to audit. For multi-turn conversations where you need tokens to survive across requests, there's an optional Redis backend — but it's always the customer's own Redis, never ours (there is no ours). You're right that the Redis store becomes the new crown jewel: it holds the token→value map, so it needs to be treated like any other sensitive datastore — private network, auth, TLS, TTL on entries. Happy to document this more explicitly as a security consideration.
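A rough sketch of what a customer-owned Redis vault with a TTL on every entry could look like. `RedisVault`, the key prefix, and the method names are hypothetical and not the Armos API. The class only relies on `setex` and `get`, which redis-py's client provides, so any object with those two methods works. The 24-hour default here is an assumption for the sketch.

```python
from typing import Optional, Protocol


class KeyValueClient(Protocol):
    """The two redis-py client methods this sketch relies on."""

    def setex(self, name, time, value): ...

    def get(self, name): ...


class RedisVault:
    """Hypothetical token -> value store backed by the customer's own Redis."""

    def __init__(self, client: KeyValueClient,
                 ttl_seconds: int = 24 * 60 * 60,
                 prefix: str = "armos:vault:") -> None:
        self._client = client
        self._ttl = ttl_seconds
        self._prefix = prefix

    def put(self, token: str, value: str) -> None:
        # SETEX stores the value and its expiry together, so no entry
        # can be written without a TTL.
        self._client.setex(self._prefix + token, self._ttl, value)

    def get(self, token: str) -> Optional[str]:
        raw = self._client.get(self._prefix + token)
        if raw is None:
            return None  # expired or never stored
        # redis-py returns bytes unless decode_responses=True is set.
        return raw.decode("utf-8") if isinstance(raw, bytes) else raw
```

With redis-py, the client passed in would be something like `redis.Redis(host=..., password=..., ssl=True)` pointed at an instance on a private network, which covers the auth and TLS points above.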
Good timing on this. We supply regulatory data (congressional bill tracking, vote alerts, hearing schedules) via goffer.ai webhook — and the compliance teams using it are exactly your target: fintech/legal teams that pipe bill summaries alongside client portfolios through LLMs. The PII bleed problem is real in that workflow — bill impact analysis often has client reference data in context. The reversible tokenization approach is the right call over regex scrubbers. For the data side: goffer.ai covers Congress.gov well; for state-level we layer in OpenStates. Anyone building LLM workflows specifically around FTC or SEC regulatory action feeds?
Good to know the problem is real in that workflow. The fintech/legal teams actually building those LLM pipelines are exactly who I'm looking for — if any of them are hitting this directly, I'd love an intro. [email protected]
Curious about two things on the reversible-token design. When the same name appears multiple times in a single prompt, do you assign one stable token per entity (so the model can correlate "TOKEN_PERSON_3 emailed TOKEN_PERSON_3 again") or fresh ones each call? And do you scan outbound text too, in case the model invents new entities in its reasoning? The round-trip semantics feels like what'll make or break day-2 usage on a real workflow.
The first is handled, the second only partially. The same entity always gets the same deterministic token across a prompt (it's an MD5 hash of the value, so "John Smith" → same token every time, regardless of how many times it appears). The model sees one consistent entity, not three strangers.
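A minimal sketch of the deterministic-token idea, assuming the token is built from a truncated MD5 hex digest of the raw value. The 8-character truncation is inferred from the example token shown earlier in the thread, and `make_token` is a hypothetical helper, not an Armos function.

```python
import hashlib


def make_token(entity_type: str, value: str, digest_chars: int = 8) -> str:
    """Derive a token from the value itself, so equal values share a token."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return f"[PII:{entity_type}:{digest[:digest_chars]}]"


prompt = "John Smith emailed John Smith again, then called Jane Doe."
for name in ("John Smith", "Jane Doe"):
    # Every occurrence of the same name is replaced by the same token.
    prompt = prompt.replace(name, make_token("NAME", name))
```

Because the token depends only on the value, no lookup is needed to keep repeated mentions consistent. The flip side is that "John Smith" and "john smith" would hash to different tokens unless values are normalized first.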
Outbound scanning is the gap — right now we demask tokens we recognise but don't run detection on the model's response for newly invented entities. That's a real edge case worth solving, especially in multi-turn workflows. On the roadmap.
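One possible shape for closing that gap, sketched under assumptions: run detection on the model's raw response before tokens are swapped back. At that point every known value still appears as a token, so anything the detector finds was introduced by the model. The email regex is a hypothetical stand-in for the real detector, and `find_new_entities` is an illustrative name.

```python
import re

# Hypothetical stand-in for the local detector: one email regex.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_new_entities(raw_response: str) -> list:
    """Return entities present in the still-tokenized model response.

    Known PII is still in token form here, so every hit is new.
    """
    return EMAIL_RE.findall(raw_response)


raw = "Reply to [PII:EMAIL:1a2b3c4d] and cc billing@example.org"
new_entities = find_new_entities(raw)
```

What to do with the hits (tokenize them for the next turn, log them, or just surface them) is a separate design decision.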
You're right that the round-trip semantics are what matter on day 2 — most masking tools don't think past the outbound pass.
PII handling in LLM pipelines is trickier than it looks — especially when users mix languages mid-sentence. What's your approach for named entities in non-English text? Curious if you've tested it with Spanish/multilingual inputs.
Not planned yet, but if you're hitting this with Armos I'd love to understand the use case — happy to explore it as we expand language support.
nice that the integration is one line, that's usually where these things die so good call.
honestly the part i'd stress test is recall on detection. tokenizing what you catch is the easy bit, the real risk is the span you miss, a name in a weird format or an id that doesn't match a known pattern, and one miss means real pii goes to the model anyway. is there a way to fail safe on low confidence stuff, or at least surface what it almost flagged?
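One way the fail-safe idea could look, as a sketch only: treat detections in a middle confidence band as "near misses" that are always surfaced to the caller and, in fail-safe mode, masked anyway. The `Detection` type, the thresholds, and `triage` are all hypothetical and assume a detector that reports a confidence score per span.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    text: str
    label: str
    score: float  # detector confidence in [0, 1]


def triage(detections, mask_at=0.85, review_at=0.40, fail_safe=True):
    """Split detections into (to_mask, surfaced).

    Spans scoring at or above mask_at are always masked. Spans between
    review_at and mask_at are always surfaced, and also masked when
    fail_safe is on, trading some over-masking for fewer leaks.
    """
    to_mask, surfaced = [], []
    for d in detections:
        if d.score >= mask_at:
            to_mask.append(d)
        elif d.score >= review_at:
            surfaced.append(d)
            if fail_safe:
                to_mask.append(d)
    return to_mask, surfaced
```

This only helps with spans the detector half-recognizes. A span it scores near zero still gets through, so recall testing on realistic data remains the real check.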
other thing, the token map is reversible so that mapping kind of becomes the new sensitive asset itself. where does it live and how long does it stick around? that's probably the first thing a security person at a healthtech would poke at.
Thanks for posting this!
On the fail-safe part, let me come back to you on this. It's a good point and something that needs to be built.
On the vault: in-memory by default — lives in process RAM, gone when the process ends, nothing persists. For multi-turn conversations there's an optional Redis backend, but it's always the customer's own Redis instance, never mine (there's no Armos server). Default TTL is 24 hours. Either way, the mapping never leaves your infrastructure.
This is a strong wedge because you are not selling “LLM security” in a vague way. You are solving a specific blocker that sensitive-data teams already feel: they want LLM automation, but they cannot casually send names, IDs, tax data, health data, or legal records into external models.
The local detection plus reversible token layer is the right trust angle. I would make that the center of the positioning: Armos is not just a wrapper, it is the privacy boundary between regulated workflows and LLM APIs.
One thing I’d pressure-test before the HN post and design partner conversations is the name. Armos is decent, but for healthtech, fintech, legal, and HR developers, the brand has to immediately feel secure, technical, and serious. This is infrastructure sitting between sensitive data and foundation models, so the name carries trust before the docs even do.
Vroth .com would fit that layer better if you want it to feel like hard security infrastructure for LLM workflows, not just an open-source SDK. The product direction is strong enough that naming is not cosmetic here. It affects whether security-conscious developers read it as a real privacy layer or another early wrapper.
Really appreciate this — the "privacy boundary between regulated workflows and LLM APIs" framing is sharper than how I've been positioning it. Stealing that.
On the name: I hear you, and I don't disagree that names carry trust in security infra. But I'd rather not sweat it at this stage. No paid users, no enterprise contracts, nothing that makes a rebrand painful. If the product earns trust with the right teams, Armos won't have been the thing that stopped them. I'll revisit naming seriously before any real scaling push.
What I'm more focused on right now is getting it in front of sensitive-data teams and letting them pressure-test the actual trust layer — the local detection, the reversible tokens, the zero PII to the model. That's where I want the feedback loop first.
Are you building in any of these spaces? Would love to hear where you'd see this fitting or breaking.
That makes sense. If there are no paid users or enterprise contracts yet, getting the trust layer pressure-tested matters more than renaming today.
I’m not building directly in healthtech/fintech/legal, but the strongest fit I see is sensitive-data workflows where teams already want LLM automation but cannot justify sending raw PII into external models.
Examples:
healthtech admin/support workflows
legal intake and contract review
fintech support/compliance notes
HR records and employee data
insurance claims
B2B SaaS tools handling customer records
Where I think it breaks is if Armos is framed as an SDK feature instead of a privacy boundary.
The sharper first-user angle is probably:
“Use LLMs in sensitive workflows without sending raw PII to the model.”
That is much easier for sensitive-data teams to understand than a generic “LLM security” pitch.
If useful, I can put together a quick GTM/outreach pack around this wedge: target profiles, 3 cold emails, 3 LinkedIn DMs, 3 follow-ups, and the cleanest positioning angle for getting design partners.
I’m doing a few quick ones at $49 to move fast. This one is a good fit because the pain is specific and the buyer profile is clear.
LinkedIn: https://www.linkedin.com/in/aryan-y-0163b0278/
Thanks for this — genuinely useful framing.
On the GTM pack — I appreciate the offer, but right now I'm not focused on outreach. The goal at this stage is developer adoption and finding 3–5 design partners who are actually hitting this problem, so I can let the roadmap be shaped by real use cases before I start selling anything. Happy to revisit when that changes.
If you're building something where this friction comes up, I'd love to hear what it looks like from the inside.
That makes sense. Design partners are the right step before selling this broadly.
I’d separate broad outreach from design-partner discovery.
For Armos, the useful motion is not “pitching buyers.” It is finding developers or teams already blocked by the exact workflow: they want LLM automation, but raw PII, compliance, or customer-record risk stops them from using external models safely.
So I would not target generic security teams first.
I’d target builders inside healthtech, legaltech, fintech, HR, insurance, and support-heavy B2B SaaS who have already tried to connect LLMs to sensitive internal workflows and hit the data-boundary problem.
That is where the conversation becomes product discovery, not sales.
The sharper ask is probably:
“Are you currently avoiding or limiting LLM automation because sensitive customer data would leave your environment?”
That gets you closer to the 3–5 design partners you actually need.
If you want to move faster on that, I can put together a small design-partner discovery pack: target profile, qualification angle, 3 discovery messages, 3 follow-ups, and the exact problem framing to use.
I’d keep it tight and practical, not broad GTM.