
Here’s how to price AI so your costs don’t explode

by Aytekin Tank

If your product uses AI, every click increases your costs.

Here's how to turn LLM pricing into plans, credits, and limits that won’t come back to bite you.

1. Forget tokens — think in “AI actions”

If you start from “tokens”, your brain melts. Start from what your user actually does.

Write down the main things in your product that use AI. For example:

“Ask my assistant a question”
“Summarize this document”
“Rewrite this text in a new tone”
“Run an agent over these items” (e.g., emails, leads, notes)

Each of these is a single AI action. You’ll want to break every workflow or feature into single actions.
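One way to make this list concrete is to keep it in code as a small registry, so every later step (logging, costing, credits) refers to the same names. This is only a sketch: the `AIAction` class and the identifiers such as `summarize_document` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AIAction:
    name: str         # stable identifier you can reuse in logs and billing
    description: str  # what the user thinks they are doing


# Hypothetical registry built from the example list above.
ACTIONS = [
    AIAction("chat_reply", "Ask my assistant a question"),
    AIAction("summarize_document", "Summarize this document"),
    AIAction("rewrite_tone", "Rewrite this text in a new tone"),
    AIAction("agent_run", "Run an agent over these items"),
]

# Handy for validating that a logged action name is one you know about.
ACTION_NAMES = {action.name for action in ACTIONS}
```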

2. Separate cheap models from expensive models

In your product, you will usually have two kinds of models:

Cheap models – fast, good enough for simple tasks, low cost per call
Expensive models – better reasoning, handle more context, much higher cost per call

Give them simple labels in your system:

Standard AI → cheap model
Deep AI → expensive model

These labels are only for you and your team. Users don’t choose them.

From here on, we’ll talk about Standard vs Deep, not specific model names. That way this works no matter which AI provider you use.
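A minimal sketch of how the two labels could live in your code, assuming each action is pinned to one tier. The model identifiers and the action-to-tier assignments below are placeholders, not real provider names or a recommendation.

```python
from enum import Enum


class Tier(Enum):
    STANDARD = "standard"  # cheap model
    DEEP = "deep"          # expensive model


# Placeholder model identifiers; swap in whatever your provider calls them.
MODEL_FOR_TIER = {
    Tier.STANDARD: "provider-small-model",
    Tier.DEEP: "provider-large-model",
}

# Assumed assignment of actions to tiers, for illustration only.
TIER_FOR_ACTION = {
    "chat_reply": Tier.STANDARD,
    "short_summary": Tier.STANDARD,
    "deep_analysis": Tier.DEEP,
}


def pick_model(action: str) -> str:
    """Return the model for an action, defaulting to the cheap tier."""
    tier = TIER_FOR_ACTION.get(action, Tier.STANDARD)
    return MODEL_FOR_TIER[tier]
```

Because the rest of the product only sees Standard and Deep, switching providers means editing one dictionary.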

3. Work out what each action costs

Use the same action list from step 1.

Now you want to know: “How much does one of these actions cost me, roughly?”

If you don’t have users yet

You can’t get real prices yet, but you can estimate the size of each action.

For each action:

Input: short / medium / long?
Output: short / medium / long?
Calls: one call or many calls (agent)?

Use this:

Short = 1–2 sentences, or a short chat
Medium = a few paragraphs, about one page
Long = many pages, or a long chat history

Examples:

Chat reply → short in, short out, 1 call → very cheap
Summarize 3 pages → medium-to-long in, short out, 1 call → cheap
Summarize a long report → long in, short out, 1 call → medium
Agent over ~100 items (e.g. 100 emails, 100 leads, 100 notes) → many calls → expensive

Now you know which actions are light and which ones are heavy.
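If you want the light/heavy ranking written down, a rough scoring function is enough. The sketch below uses made-up relative units, not dollars: the size values and the extra weight on output are assumptions you should replace once you know your provider's pricing.

```python
# Assumed relative sizes; only the ordering matters at this stage.
SIZE_UNITS = {"short": 1, "medium": 5, "long": 25}

# Assumption: a unit of output costs more than a unit of input.
OUTPUT_WEIGHT = 4


def relative_weight(input_size: str, output_size: str, calls: int = 1) -> int:
    """Score one action in relative units (input + weighted output, times calls)."""
    per_call = SIZE_UNITS[input_size] + OUTPUT_WEIGHT * SIZE_UNITS[output_size]
    return per_call * calls


estimates = {
    "chat_reply": relative_weight("short", "short"),
    "summarize_long_report": relative_weight("long", "short"),
    "agent_over_100_items": relative_weight("short", "short", calls=100),
}

# Lightest action first, heaviest last.
ranked = sorted(estimates, key=estimates.get)
```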

If your product is live

Now you can get real numbers.

Track the action name. For each AI call, try to record which feature it came from (e.g. “chat_reply”, “short_summary”, “agent_run”).
Look at one action at a time. In your logs or analytics, filter by that action name and find:
- total cost for that action
- number of runs
Compute the average cost:

Cost per action ≈ total cost for that action ÷ number of runs

Do this for every action on your list.
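The calculation above can be sketched in a few lines. The log format here is an assumption: one record per AI call with an `action` name, a `cost_usd`, and a `run_id`, so that an agent run made of many calls still counts as one run.

```python
from collections import defaultdict


def average_cost_per_action(call_log):
    """Return {action: total cost / number of distinct runs}."""
    totals = defaultdict(float)
    runs = defaultdict(set)
    for record in call_log:
        totals[record["action"]] += record["cost_usd"]
        runs[record["action"]].add(record["run_id"])
    return {action: totals[action] / len(runs[action]) for action in totals}


# Made-up sample data: two chat replies, and one agent run that made two calls.
CALL_LOG = [
    {"action": "chat_reply", "run_id": "r1", "cost_usd": 0.002},
    {"action": "chat_reply", "run_id": "r2", "cost_usd": 0.004},
    {"action": "agent_run", "run_id": "r3", "cost_usd": 0.10},
    {"action": "agent_run", "run_id": "r3", "cost_usd": 0.20},
]

averages = average_cost_per_action(CALL_LOG)
```

With this sample data, a chat reply averages about $0.003 and an agent run about $0.30.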

At the end of this step you have:

A list of all actions
The average cost of one run of each action

You’ll use this in the next step to get cost per user per month.

4. Turn action costs into cost per user

You now know the cost of one run of each action.

Now ask: “For a normal paying user, how many times does each action run in a month?”

If you have users: use analytics to see how often each action is used per user.

If you don’t: make a simple and honest estimate for each action.

For each action: Cost per user for this action = (uses per month) × (cost per action)

Add all actions together to get your AI cost per active user per month.
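As a worked example of that sum, here is a small sketch. All numbers are invented; plug in your own cost per action from step 3 and your own usage counts.

```python
def ai_cost_per_user(cost_per_action, uses_per_month):
    """Sum (uses per month x cost per action) over every action."""
    return sum(
        uses * cost_per_action[action]
        for action, uses in uses_per_month.items()
    )


# Invented example values, in USD per run.
cost_per_action = {"chat_reply": 0.003, "short_summary": 0.01, "agent_run": 0.30}

# Invented usage of a "normal" paying user in one month.
uses_per_month = {"chat_reply": 200, "short_summary": 20, "agent_run": 2}

monthly_ai_cost = ai_cost_per_user(cost_per_action, uses_per_month)
# 200 x 0.003 + 20 x 0.01 + 2 x 0.30 = about $1.40 per user per month
```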

You will use this number to:

Choose your prices
Set your credits and limits

5. Add your other costs and choose a safe margin

AI is not your only cost.

You also pay for things like:

Servers
Storage
Support
Tools and fees

If you have users:

Take all these non-AI costs for one month
Divide by the number of active paying users

That gives other cost per user.

If you are pre-launch: Make a simple estimate, and update it later when you have real data.

Then: Total cost per user = AI cost per user + other cost per user

Later, you’ll set your prices so they are well above this total cost.

Remember: As you grow, this number can change (for example, servers can get cheaper per user).

So check it again from time to time.
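Here is a rough sketch of the total cost per user and the lowest price that still leaves a chosen margin. The margin formula, margin = (price − cost) ÷ price, and the 70% target are assumptions for illustration; the cost figures are invented.

```python
def other_cost_per_user(monthly_non_ai_costs, active_paying_users):
    """Spread one month of non-AI costs across active paying users."""
    if active_paying_users <= 0:
        raise ValueError("need at least one active paying user")
    return sum(monthly_non_ai_costs.values()) / active_paying_users


def price_floor(total_cost_per_user, target_margin):
    """Lowest price where (price - cost) / price equals the target margin."""
    if not 0 <= target_margin < 1:
        raise ValueError("target_margin must be in [0, 1)")
    return total_cost_per_user / (1 - target_margin)


# Invented monthly figures in USD.
non_ai_costs = {"servers": 400, "storage": 100, "support": 300, "tools_and_fees": 200}

other = other_cost_per_user(non_ai_costs, active_paying_users=500)  # 2.0
total = 1.40 + other       # AI cost per user + other cost per user
floor = price_floor(total, target_margin=0.70)  # about 11.33
```

Rerun this whenever your user count or bills change, since the per-user share of fixed costs moves with them.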

6. Turn your costs into credits and limits

You can charge in many ways:

Just a subscription
Subscription + included credits
Or pure usage (pay per use)

Here we’ll assume you use credits, either alone or on top of a subscription.

Give each action a credit cost

Use your cost per action from step 3.

Pick simple numbers like 1, 2, 5, 10 and assign them to actions:

1 credit → one chat reply
2 credits → one short document summary
5 credits → one deep analysis of a long document
10 credits → one agent run over ~100 items

More expensive actions should use more credits.
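One possible way to get from cost per action to those round numbers: pick a dollar value for one credit, divide, and round up to the next simple step. The value of a credit and the costs below are invented so that the result matches the example list above.

```python
import math

CREDIT_STEPS = [1, 2, 5, 10]


def credits_for(cost_usd, usd_per_credit):
    """Round an action's cost up to the next simple credit step."""
    raw = cost_usd / usd_per_credit
    for step in CREDIT_STEPS:
        if raw <= step:
            return step
    # Anything heavier than 10 credits is rounded up to a multiple of 10.
    return math.ceil(raw / 10) * 10


# Invented costs in USD per run, and an assumed value of one credit.
USD_PER_CREDIT = 0.004
action_costs = {
    "chat_reply": 0.003,
    "short_summary": 0.007,
    "deep_analysis": 0.018,
    "agent_run": 0.035,
}

credit_costs = {
    action: credits_for(cost, USD_PER_CREDIT)
    for action, cost in action_costs.items()
}
```

Rounding up rather than to the nearest step means the credit price never undercounts what the action costs you.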

Decide how many credits are in a plan

From steps 4 and 5, you know:

How much you spend on AI for a normal user
And your total cost per user

Decide how much AI cost you’re happy to include in this plan (for example, “about $5 of AI per user”).

Then:

Pick a number of credits so that a normal user can do their usual work
And your AI cost stays around that target

In the UI, you can show something simple like: “You’ve used 700 / 1,500 AI credits this month.”
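A small sketch of the plan sizing, under stated assumptions: you know roughly what one credit costs you on average (the $0.0033 below is invented), you round the allowance down to a tidy number, and you check it against what a normal user needs.

```python
def credits_in_plan(ai_budget_usd, avg_cost_per_credit_usd, round_to=100):
    """Credits that fit in the AI budget, rounded down to a tidy number."""
    raw = ai_budget_usd / avg_cost_per_credit_usd
    return int(raw // round_to) * round_to


def credits_needed(uses_per_month, credit_costs):
    """Credits a normal user burns in a month."""
    return sum(uses * credit_costs[a] for a, uses in uses_per_month.items())


def usage_label(used, included):
    """Text for the usage meter in the UI."""
    return f"You've used {used:,} / {included:,} AI credits this month."


# Invented inputs: $5 of AI per user, about $0.0033 average cost per credit.
included = credits_in_plan(5.0, 0.0033)

normal_use = credits_needed(
    {"chat_reply": 200, "short_summary": 20, "agent_run": 2},
    {"chat_reply": 1, "short_summary": 2, "agent_run": 10},
)

fits = normal_use <= included
label = usage_label(700, included)
```

If `fits` is false, either raise the budget for that plan or lower the credit cost of the actions people use most.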

7. Make the product waste less AI by design

You can also save a lot by changing how you call the model.

Here are a few simple tricks that don’t hurt UX:

Use the cheap model most of the time. Use it for normal chat, small edits, and short summaries. Most people won’t see the difference. They only care that it’s fast and clear.
Keep context small. Don’t send the whole chat every time. Make a short summary of old messages and send that instead. For documents, ask the user what they care about, and only send those parts.
Keep answers short. In your prompt, tell the AI something like: “Keep the answer under 200 words.” or “Give 3 bullet points only.”
Cache old answers. If people ask the same question many times, save the answer. Next time the question is the same, show the saved answer instead of calling the AI again.

These simple steps can cut your AI bill a lot.
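Two of these tricks, caching and small context, fit in a short sketch. It assumes an exact-match cache on a normalized question and a `call_model` function that you supply; both the class and the helper are hypothetical, and a real product would also need expiry and per-user scoping.

```python
import hashlib


class AnswerCache:
    """Exact-match cache: the same question is only sent to the model once."""

    def __init__(self, call_model):
        self._call_model = call_model  # your own function: question -> answer
        self._store = {}
        self.hits = 0

    @staticmethod
    def _key(question):
        # Ignore case and extra whitespace so trivial variations still match.
        normalized = " ".join(question.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ask(self, question):
        key = self._key(question)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        answer = self._call_model(question)
        self._store[key] = answer
        return answer


def build_context(summary_of_old, messages, keep_last=4):
    """Send a short summary of old messages plus only the most recent ones."""
    recent = messages[-keep_last:] if keep_last > 0 else []
    header = (
        [f"Summary of earlier conversation: {summary_of_old}"]
        if summary_of_old
        else []
    )
    return header + recent
```

The cache only helps when questions repeat, so it suits FAQ-style features better than open-ended chat.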

Aytekin Tank

on February 25, 2026



This is really practical. The "think in AI actions" framing is exactly right. I run an AI calorie tracking app (Healthien) where users snap photos of food and get nutritional breakdowns. Early on I was trying to estimate costs per token and it was impossible to plan around. Once I reframed it as "one photo analysis = one AI action" everything clicked. I ended up with a free tier (limited scans per day) and subscriptions for unlimited use. The hardest part was figuring out the free tier ceiling. Too generous and you bleed money, too stingy and nobody converts. Still tweaking it honestly.

miadevelops

3 months ago

Solid framework. The action-based thinking is spot on - I build small business finance tools and had to work through this exact problem when adding AI categorization to a CSV processing workflow.

One nuance worth adding: for tools where the AI gets smarter per user over time (like learning their custom categories or preferences), your cost per user actually decreases the longer they stick around. First month might cost you 3x what month six costs because the system has learned their patterns and needs fewer expensive calls. That changes the math on how aggressively you can price early tiers to drive adoption.

The credit system approach works well for B2B tools but I have found that for prosumer or solo founder tools, simplicity wins over precision. Most solo founders would rather pay a flat $15/mo with generous limits than think about credits at all. Credits add cognitive overhead that can hurt conversion even if the economics are better for you.

Also agree hard on the cheap vs expensive model split. For something like transaction categorization, a smaller model with good few-shot examples outperforms a larger model with generic prompting almost every time. The expensive model is only worth it for ambiguous edge cases.

TaxSort

3 months ago

The "think in AI actions, not tokens" reframe is probably the most useful thing here. I spent way too long trying to estimate token costs per feature before realizing users don't think in tokens at all. They think in "I clicked the button and it did the thing." Pricing should match that mental model.

One thing I'd add to the caching point — semantic caching is a game changer if you haven't tried it. Instead of exact-match caching, you embed the query and check if there's a cached response within a similarity threshold. For something like a support chatbot where 40% of questions are slight variations of the same 20 questions, this alone cut our API costs by about a third.

The cheap vs expensive model split is something I wish I'd done from the start. I was running everything through GPT-4 class models for months before I actually benchmarked and realized that for 70% of my use cases, a smaller model produced identical user satisfaction scores. The remaining 30% where it mattered were the complex reasoning tasks — and those are exactly where users expect to wait a beat longer anyway.

Also worth mentioning: prompt caching on Anthropic and the batch API on OpenAI. If you have non-real-time workloads (nightly report generation, background analysis), the batch API gives you 50% off. That's not a small optimization.

trinhcuong_ast

3 months ago

ai pricing is genuinely the thing that keeps me up at night. running two ai-powered apps (astrologica for personalized horoscope podcasts, speakeasy for article-to-audio conversion) and the cost per user varies wildly depending on usage

the trick ive found is capping the expensive operations. astrologica generates one podcast per day per user - thats predictable. speakeasy lets you convert a few articles per month on the free tier. that way i can actually model my unit economics without getting destroyed by one power user converting 50 articles a day

the biggest mistake i made early on was pricing based on what competitors charge instead of what it actually costs me to serve each user. speechify charges 140/yr but they have massive scale advantages. as a solo dev i had to price differently

subscription with usage limits > pure usage-based pricing for consumer apps imo. users hate surprise bills

nimesh

3 months ago