TL;DR: I shipped NoAIBills, a Chrome extension that runs Llama, DeepSeek, Qwen, and other LLMs entirely in your browser. No cloud, no Ollama, no subscriptions. It's free—I'm collecting emails and learning what people actually use local AI for.
Try it: noaibills.app
I wanted to use AI for everyday stuff—drafting emails, fixing grammar, summarizing docs, getting unstuck on code. But three things kept bugging me:
1. Privacy concerns. Every time I typed something even slightly sensitive into ChatGPT, I'd hesitate. Work stuff, personal notes, code I didn't want on someone else's server. I know I'm not the only one.
2. The "local AI" setup tax. Ollama is great, but try explaining it to a non-dev. "Open terminal, run this command, download a model..." Eyes glaze over. And if you're on a locked-down work laptop? Forget it.
3. Subscription fatigue. $20/month for something I use a few times a week felt like overkill. I didn't need GPT-4 for grammar checks.
So I thought: what if AI could run inside the browser? No installs, no servers, no monthly fees. Just a Chrome extension.
Turns out WebGPU makes this possible now. So I built it.
NoAIBills is a Chrome extension that runs open-source LLMs locally in your browser using WebGPU, Transformers.js, and Chrome's built-in Prompt API.
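For a rough sense of how the WebGPU path works, here's a minimal sketch (`pickDevice` is a hypothetical helper for illustration, not the extension's actual code): WebGPU-capable browsers expose `navigator.gpu`, and Transformers.js accepts a `device` option, so you can fall back to WASM (CPU) inference when the GPU path isn't available.

```javascript
// Hypothetical helper: pick an inference backend from browser capabilities.
// Browsers with WebGPU expose `navigator.gpu`; everything else falls back to
// WASM (CPU) inference, which Transformers.js also supports. A fuller check
// would also await navigator.gpu.requestAdapter(), which can return null.
function pickDevice(nav) {
  return nav && nav.gpu ? "webgpu" : "wasm";
}

// Sketch of use with Transformers.js (model ID is illustrative):
// const { pipeline } = await import("@huggingface/transformers");
// const generate = await pipeline(
//   "text-generation",
//   "onnx-community/Llama-3.2-1B-Instruct",
//   { device: pickDevice(navigator) }
// );
```

The nice part of this pattern is that the same call site serves both tiers of hardware; only the backend string changes.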
It's not trying to replace GPT-4 or Claude. It's for the 80% of tasks where you don't need a massive cloud model—drafts, summaries, quick coding help, grammar fixes.
I went back and forth on pricing. Considered $1, $5, $9. Here's where I landed:
I don't know enough yet.
I don't know who actually wants this, what they use it for, or whether they'd pay. Charging upfront would reduce installs and slow down learning.
So the deal is simple: it's free in exchange for your email and anonymous usage stats (via GA4). Your conversations stay completely private—I never see them.
The goal right now is to build an audience and figure out who actually wants this, what they use it for, and what (if anything) they'd pay for.
Maybe I'll monetize eventually. Maybe I won't. Right now I'm optimizing for learning, not revenue.
Local AI is good enough for most things. I'm not claiming it replaces GPT-4. But for the 80% of tasks—drafts, summaries, quick coding questions—a 3B parameter model running locally is plenty.
It isn't positioned as a cloud LLM replacement. It's for local inference on basic text tasks (writing, communication, drafts) with zero internet dependency, no API costs, and complete privacy.
The core fit: organizations with data restrictions that block cloud AI and lock down installs of desktop tools like Ollama or LM Studio. It handles quick drafts, grammar checks, and basic reasoning without budget or setup barriers.
Need real-time knowledge or complex reasoning? Use cloud models. This serves a different niche—not every problem needs a sledgehammer 😄.
Tech stack for the curious: WebGPU for GPU inference, Transformers.js for model loading, and Chrome's built-in Prompt API.
The tricky part was making it work as a Chrome extension—CSP restrictions, WASM loading, static export. Wrote a bunch of post-build scripts to handle the edge cases.
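On the CSP point: Manifest V3 blocks WebAssembly compilation on extension pages unless the manifest opts in. A sketch of the relevant `manifest.json` fragment (not NoAIBills' actual config) looks like this:

```json
{
  "manifest_version": 3,
  "content_security_policy": {
    "extension_pages": "script-src 'self' 'wasm-unsafe-eval'; object-src 'self'"
  }
}
```

The `'wasm-unsafe-eval'` source is what lets the extension compile and run the WASM inference backend locally; without it, WASM loading fails under MV3's default policy.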
One codebase, one build process. Dev mode runs locally, production becomes an extension.
I'm rolling this out gradually, one community at a time, spacing posts out to avoid spam flags and to actually engage with each community.
1. Distribution is the hard part. Building it was fun. Getting people to find it? That's the real challenge. Chrome Web Store organic discovery is basically zero.
2. "Free" removes friction but adds noise. More installs, but harder to tell who's actually engaged vs. who installed and forgot.
3. Local AI is good enough for most things. A 3B parameter model running on your GPU handles 80% of what I used ChatGPT for. The other 20%? I still use Claude.
4. Be honest about limitations. My landing page has a "When NOT to use NoAIBills" section. I think that builds more trust than pretending it does everything.
If any of this resonates, I'd love for you to give it a spin: noaibills.app
And if you have feedback, questions, or just want to chat about browser-based AI—I'm here. Hit me up in the comments.
What would you do differently? Charge from day one? Different positioning? I'm all ears.