
The tech stack behind NLP Cloud

Hi everyone,

I founded NLP Cloud (https://nlpcloud.com) around 4 years ago. It's an AI API that I position as an OpenAI alternative for users who want a stronger focus on privacy, ease of use, and quality of support.

I would like to tell you a bit more about my tech stack. This stack hasn't actually changed much since the beginning, except that I now handle hundreds of AI models instead of only a couple.

The key here is container orchestration. My whole stack is based on Docker containers. Each AI model is deployed within a Docker container on a specific GPU server. My initial orchestrator was Docker Swarm; now it's Kubernetes. I would not necessarily recommend Kubernetes for new startups, though, as it can be unnecessarily complex.
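To give you an idea of what "one model per container on a GPU server" looks like, here is a minimal, hypothetical Kubernetes Deployment sketch (the model name, image registry, and port are all illustrative, not my actual setup). The `nvidia.com/gpu` resource limit is the standard way to pin a pod to a GPU via the NVIDIA device plugin:

```yaml
# Hypothetical Deployment for one AI model container (names are illustrative).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-model
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-model
  template:
    metadata:
      labels:
        app: my-model
    spec:
      containers:
        - name: model
          image: registry.example.com/models/my-model:latest
          ports:
            - containerPort: 8080
          resources:
            limits:
              nvidia.com/gpu: 1  # reserve one GPU for this model
```

One Deployment per model keeps models isolated, so a crashed or upgraded model never takes down its neighbors.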

In front of all these containers, I use a load balancer based on Traefik to route the customer requests to the right AI models. Traefik's documentation is a bit cryptic, but apart from that I've never been disappointed by this tool!
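For those who haven't used Traefik, routing requests to the right model typically boils down to a router rule plus a service pointing at the container. A minimal sketch of a Traefik dynamic configuration (file provider) might look like this; the paths and service names are made up for illustration:

```yaml
# Hypothetical Traefik dynamic configuration (file provider).
http:
  routers:
    my-model-router:
      rule: "PathPrefix(`/v1/my-model`)"  # route API calls by URL prefix
      service: my-model-service
  services:
    my-model-service:
      loadBalancer:
        servers:
          - url: "http://my-model:8080"  # the model container behind the router
```

In a Kubernetes setup you would usually express the same thing with Ingress resources or Traefik's CRDs instead of a static file.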

In terms of billing, I propose both pre-paid and pay-as-you-go plans. Pay-as-you-go is quite a challenge as I need to carefully meter the users' consumption on my API (number of requests, model used, number of tokens per request...) without harming performance. For that I use a time-series database called TimescaleDB. This is basically a PostgreSQL DB optimized for time-series.
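The core idea of metering without hurting performance is to buffer usage events off the request hot path and flush them in batches. Here is a minimal in-memory Python sketch of that pattern (my actual implementation writes the events to TimescaleDB; the class and field names below are invented for illustration):

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class UsageEvent:
    """One metered API call: who called which model, and how many tokens."""
    user_id: str
    model: str
    tokens: int


class UsageMeter:
    """Buffers per-request usage events in memory so the API request path
    only pays for an append; a background job would periodically flush
    the buffer to a time-series store in one batch insert."""

    def __init__(self):
        self.buffer = []

    def record(self, user_id, model, tokens):
        # Cheap append on the hot path, no DB round-trip per request.
        self.buffer.append(UsageEvent(user_id, model, tokens))

    def aggregate(self):
        # Roll up buffered events per (user, model), mirroring the kind of
        # billing query you would later run against the time-series DB.
        totals = defaultdict(lambda: {"requests": 0, "tokens": 0})
        for e in self.buffer:
            t = totals[(e.user_id, e.model)]
            t["requests"] += 1
            t["tokens"] += e.tokens
        return dict(totals)
```

With TimescaleDB the flushed events would land in a hypertable partitioned by time, which keeps both the batch inserts and the per-period billing aggregations fast.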

As far as the user interface is concerned, I use Python/Django and Go microservices. I also use HTMX on the frontend side.

I must confess that design has never been my main focus for NLP Cloud, but it never prevented the business from growing, and today I have a lot of loyal customers who don't really seem to care about design. My customers are developers after all: what they want above all is a robust and well-documented API, that's it.

When I started NLP Cloud, 4 years ago, infrastructure costs were not really a concern. Then LLMs appeared, and today my infrastructure costs are huge! That's why container orchestration and microservices are the key here: it's important to be cloud-agnostic in order to get the cheapest GPU servers and remain competitive.

I will be more than happy to answer your questions about this tech stack if you have some!

on February 20, 2024