
Stream LLM Responses Like Magic

Hey Indie Hackers, I have a new product/service that makes it easy to stream LLM responses faster, and I'm looking to onboard a maximum of 5 people for a private beta.

It's fast because it keeps a persistent connection between your users (or browser clients) and my servers, and streams whatever data you send to it to the connected clients.
It's different because I've built the concept of a session into it. Instead of creating channels, rooms, or groups like some services give you, you just open a session and use it throughout the lifetime of a Gen AI or LLM session. You can have concurrent user connections to one session, or one session per user.
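To make the session idea concrete, here's a toy in-memory sketch of the model described above. The class and method names are hypothetical (the actual API isn't public yet); the point is just that several clients can attach to one session and every chunk published to that session fans out to all of them:

```python
from collections import defaultdict
from typing import Callable

class SessionHub:
    """Toy in-memory model of the session concept: one session,
    many concurrently connected clients."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, session_id: str, callback: Callable[[str], None]) -> None:
        # A client (e.g. a browser) attaches to a session.
        self._subscribers[session_id].append(callback)

    def publish(self, session_id: str, chunk: str) -> None:
        # Your backend sends a chunk; every connected client receives it.
        for cb in self._subscribers[session_id]:
            cb(chunk)

# Two browsers sharing one LLM session:
hub = SessionHub()
received_a: list[str] = []
received_b: list[str] = []
hub.subscribe("llm-session-42", received_a.append)
hub.subscribe("llm-session-42", received_b.append)
for token in ["Hello", ", ", "world"]:
    hub.publish("llm-session-42", token)
print("".join(received_a))  # -> Hello, world
```

In the real service the hub lives on my servers and the "callbacks" are the persistent connections to your clients; your backend only has to publish.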

It saves you the server load of maintaining persistent connections on your own servers, and it's useful in serverless environments where persistent connections (e.g. WebSockets) aren't supported.
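The serverless fit can be sketched like this: your handler iterates over the chunks coming back from your LLM provider and forwards each one with a short-lived HTTP call, so it never has to hold a socket open itself. Everything below is an illustrative assumption, not the real API; in practice `send` would be something like a `requests.post` to a session endpoint:

```python
from typing import Callable, Iterable

def relay_llm_stream(chunks: Iterable[str], send: Callable[[str], None]) -> int:
    """Forward each model chunk to the streaming session as it arrives.

    chunks -- an iterable of text chunks from your LLM provider
    send   -- a function that delivers one chunk to the (hypothetical)
              session endpoint, e.g. a plain HTTP POST per chunk; no
              WebSocket is required on the serverless side
    Returns the total number of characters relayed.
    """
    total = 0
    for chunk in chunks:
        send(chunk)  # fire-and-forget delivery of this chunk
        total += len(chunk)
    return total

# Simulated run with a stub transport instead of a network call:
sent: list[str] = []
n = relay_llm_stream(["The ", "answer ", "is 42."], sent.append)
print(n)  # -> 17
```

The handler stays stateless and short-lived, which is exactly what serverless platforms expect; the persistent connections to end users are held elsewhere.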

If you have a need for streaming responses or events in real time, I'd like to hear from you and get you to try it. Feel free to comment below or email me at peter @ flycd .dev

I don't have a landing page yet; for now I'm just talking to onboarded users and putting some finishing touches on things before the official launch for self-serve public use. But I have one in development: https://streaming-llm-app.vercel.app/

on September 16, 2024