Or be a really good one. As the Replit founder suggested at a hackathon: "All software could be considered an AWS wrapper."
I built and launched an AI content-writing app at my previous company, which scaled to hundreds of millions of generations. I learnt some lessons the hard way and wanted to share them, FWIW.
Consider these a checklist for taking your AI app to production. For each one, I highlight the mistake I made and the solutions that worked.
🔴 Mistake #1: Didn't think about output validations
This can leave you with entire user cohorts who never like the output, because their inputs look nothing like your test users'.
✅ Solutions:
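For illustration (a minimal sketch of my own, with made-up thresholds, not a specific product's validation layer): check every generation before it reaches the user, and retry or fall back when it fails.

```python
# Sketch: validate an LLM generation before showing it to the user.
# Thresholds and checks here are placeholders; tune them per use case.

def validate_generation(text: str, min_words: int = 20, max_words: int = 800) -> list[str]:
    """Return a list of validation failures; an empty list means the output passed."""
    problems = []
    if not text.strip():
        problems.append("empty output")
    words = text.split()
    if len(words) < min_words:
        problems.append("too short")
    if len(words) > max_words:
        problems.append("too long")
    # Hypothetical check: the model echoed unresolved template variables back
    if "{{" in text or "}}" in text:
        problems.append("unresolved template variables")
    return problems
```

A caller can retry with a stricter prompt, or route to a fallback model, whenever the returned list is non-empty.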
🔴 Mistake #2: Thought "who would DDoS me?"
DDoS on a regular server means you might experience downtime; DDoS on an LLM system means downtime plus a huge inference bill.
✅ Solutions:
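One illustrative mitigation (my own sketch, with made-up numbers) is a daily spend cap, i.e. a cost circuit breaker: abusive traffic degrades to errors instead of an open-ended bill.

```python
# Sketch: a daily spend cap ("cost circuit breaker") for LLM calls.
# The limit and per-request cost accounting are placeholders.
import time

class SpendCap:
    def __init__(self, daily_limit_usd: float):
        self.daily_limit_usd = daily_limit_usd
        self.spent_today = 0.0
        self.day = time.strftime("%Y-%m-%d")

    def record(self, cost_usd: float) -> None:
        today = time.strftime("%Y-%m-%d")
        if today != self.day:  # reset the counter at the day boundary
            self.day, self.spent_today = today, 0.0
        self.spent_today += cost_usd

    def allow(self) -> bool:
        """Refuse new inference calls once today's budget is exhausted."""
        return self.spent_today < self.daily_limit_usd
```

In practice you'd back this with shared storage (e.g. Redis) so it holds across app instances, and pair it with normal network-level DDoS protection.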
🔴 Mistake #3: Didn't limit users
Similar to DDoS: some users simply consume far more than others, and with a shared API key a few heavy users meant a worse experience for everyone else.
✅ Solutions:
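As one common pattern (a sketch, not necessarily what we shipped): a per-user token bucket, so a single heavy user exhausts their own budget instead of starving the shared key.

```python
# Sketch: per-user token-bucket rate limiting in front of a shared LLM key.
# capacity = burst size; refill_per_sec = sustained requests per second.
import time
from collections import defaultdict

class UserRateLimiter:
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill = refill_per_sec
        # Each user starts with a full bucket.
        self.buckets = defaultdict(lambda: (float(capacity), time.monotonic()))

    def allow(self, user_id: str) -> bool:
        tokens, last = self.buckets[user_id]
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill)
        if tokens >= 1:
            self.buckets[user_id] = (tokens - 1, now)
            return True
        self.buckets[user_id] = (tokens, now)
        return False
```

Like the spend cap, a real deployment would keep the buckets in shared storage rather than process memory.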
🔴 Mistake #4: Didn't care about latency
Inference API latencies can easily break the snappy user experience an app is supposed to deliver.
✅ Solutions:
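One piece of this (an illustrative sketch, independent of any particular provider): stream tokens to the UI as they arrive instead of waiting for the full completion, and track time-to-first-token, since that's the latency users actually perceive.

```python
# Sketch: pass streamed tokens straight through to the UI while measuring
# time-to-first-token (TTFT). `token_iter` stands in for a provider's stream.
import time

def stream_to_user(token_iter):
    """Yield tokens immediately; record TTFT for your latency dashboards."""
    start = time.monotonic()
    ttft = None
    for token in token_iter:
        if ttft is None:
            ttft = time.monotonic() - start  # log this to your metrics pipeline
        yield token
```

Caching repeated prompts and routing short/simple requests to faster models are the other usual levers.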
🔴 Mistake #5: Retro-fitted Datadog for logs & metrics
Datadog and similar tools aren't built for large text logs or probabilistic API outputs. Learnt this the hard way.
✅ Solutions:
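The shape of the fix, as a rough sketch (field names are placeholders): send only small metadata events to the metrics tool, and ship the large prompt/completion text to cheap blob storage, joined by a request id.

```python
# Sketch: split an LLM log into a small metric event (for Datadog-style
# tools) and a large text blob (for object storage), linked by request_id.
import json
import uuid

def split_llm_log(prompt: str, completion: str, latency_ms: float):
    request_id = str(uuid.uuid4())
    metric_event = {  # small and cheap to index
        "request_id": request_id,
        "latency_ms": latency_ms,
        "prompt_chars": len(prompt),
        "completion_chars": len(completion),
    }
    text_blob = json.dumps({  # large; goes to blob storage, fetched on demand
        "request_id": request_id,
        "prompt": prompt,
        "completion": completion,
    })
    return metric_event, text_blob
```

This keeps dashboards fast and the bill sane, while still letting you pull the full text when debugging a bad generation.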
🔴 Mistake #6: Didn't bother about data privacy
When you're small, you may not worry about the fines, but customers care a LOT about privacy. Think of privacy gaps as a leaky bucket: you quietly lose customers over them.
✅ Solutions:
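One small illustrative piece (a naive regex sketch; a real deployment would use a proper PII detection library): redact obvious PII before the prompt ever leaves your servers for a third-party provider.

```python
# Sketch: strip obvious PII (emails, phone-like numbers) from text before
# sending it to an external LLM API. Regexes here are deliberately simple.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact_pii(text: str) -> str:
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text
```

Pair this with the provider's data-retention settings and a clear privacy policy so customers know where their text goes.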
Getting prod-ready is a marathon, not a sprint!
P.S. - I'm building a tool to help gen ai apps & features become prod-ready gaining from my experience (portkey.ai). Happy to give IH folks a demo.
the replit analogy is spot on. every SaaS is a wrapper around something — the value is in the workflow and the decisions you make on top of the raw API.
one thing I learned building on top of LLMs: the routing layer is where the real value lives. choosing which model handles which task, when to use the expensive one vs the cheap one, how to handle failover — that is genuine product logic, not just wrapping an API call.
the wrapper problem only applies when you are literally just proxying API calls with a UI on top. if you are making intelligent decisions about the infrastructure layer, that is a real product.
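The routing idea above can be sketched roughly like this (model names and the `call_model` callable are placeholders, not any real provider's API): pick a fallback chain per task, cheap-first for easy work, quality-first otherwise, and fail over down the chain.

```python
# Sketch of a routing layer: choose a model chain per request, then fail
# over through the chain on provider errors. All names are hypothetical.

def route(task: str, prompt_tokens: int) -> list[str]:
    """Return an ordered fallback chain of (hypothetical) model names."""
    if task == "classification" or prompt_tokens < 200:
        return ["cheap-small-model", "big-expensive-model"]  # cheap first
    return ["big-expensive-model", "cheap-small-model"]      # quality first

def complete(task: str, prompt: str, call_model) -> str:
    last_err = None
    for model in route(task, len(prompt.split())):
        try:
            return call_model(model, prompt)  # fail over on provider errors
        except Exception as e:
            last_err = e
    raise RuntimeError("all models failed") from last_err
```

The routing policy is exactly the "genuine product logic" the comment describes: it encodes your judgment about cost, quality, and reliability trade-offs.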
This is super interesting, and it seems especially relevant to enterprise customers. Have you thought about that?
Yes, I'm guessing this is useful to anyone running LLMs in production. Enterprises will need a lot more.
That's why I tried adding the "Basic", "Advanced" and "Expert" categories.
Great checklist! I also faced the same learnings from mistakes 4, 5, and 6. I'm curious, where did you get your TOS and privacy policy from, and did you have these terms in place before you launched?