Help understanding the process of putting a SaaS with an ML pipeline into production

by thedan

Hello,

I am working on creating a SaaS web app that deals with Natural Language Processing and a lot of text data, using a React front end, Firestore, and a Flask backend. I do not have any experience in DevOps.

The user will upload a CSV of text data on the front end, and I want to run some NLP on it in the Flask backend. The text processing could require up to 10 GB of RAM and take up to 5 minutes to complete. Initially, however, I could limit the size of the CSV file so that processing needs no more than 2 GB of memory.
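A minimal sketch of such an upload cap, using only the Python standard library. The names `MAX_UPLOAD_BYTES`, `UploadTooLarge`, and `read_limited` are hypothetical, and the 50 MB figure is a placeholder, not a measured value: the real limit would have to come from testing how much RAM the NLP step uses for a given CSV size.

```python
import io

# Hypothetical cap; tune it against measured memory use of the NLP step.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured byte limit."""


def read_limited(stream, max_bytes=MAX_UPLOAD_BYTES, chunk_size=64 * 1024):
    """Read a binary stream fully, aborting once more than max_bytes arrive."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop early instead of buffering an oversized file in memory.
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    small = io.BytesIO(b"text\nhello world\n")
    print(len(read_limited(small, max_bytes=1024)))
```

In Flask itself, setting `app.config["MAX_CONTENT_LENGTH"]` should give a similar effect at the framework level by rejecting oversized request bodies, so a hand-written check like this is mainly useful for seeing what the limit does.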

Questions:

If I use Google App Engine automatic scaling and multiple users make requests to the backend at the same time, causing memory use to exceed the instance memory limit, will additional instances be started automatically and temporarily to deal with the memory issues?

The max memory for an App Engine instance appears to be 2 GB. If I want to be able to deal with more data, it appears that I will have to use GKE, which allows for instances with much more memory. Is this correct?

My current plan (please correct me if you disagree):

Deploy the frontend using Firebase Hosting.

Deploy the Flask backend on App Engine, but limit the amount of data users can upload so that processing uses no more than 2 GB of RAM.

If the SaaS shows promise, learn Kubernetes, then deploy the backend on GKE instead, where I can use instances with more than 2 GB of RAM.

thedan posted to Developers on June 21, 2021


I'd recommend splitting up the webserver and the ML application, as they have different infrastructure requirements.

Have a smaller (cheaper) webserver that can handle all of your API calls and data uploads. Unless you have huge traffic this can take you pretty far without requiring scaling up.

Then you can have a second server that focuses on long-running, high-RAM operations. Since you're using Python, Celery tasks triggered from your Flask app will probably do. This server isn't exposed to users the way the API webserver is, but it talks to the same database.
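Roughly, the split looks like the sketch below. To keep it self-contained it uses Python's standard `queue` and `threading` modules in place of Celery and a real broker, so it only shows the shape of the pattern, not Celery's API. Every name here (`submit_job`, `get_job`, `worker`, `run_nlp`, the `jobs` dict standing in for the shared database) is made up for illustration.

```python
import queue
import threading
import uuid

jobs = {}                    # job_id -> status/result; stands in for the shared database
jobs_lock = threading.Lock()
pending = queue.Queue()      # stands in for the message broker


def run_nlp(text_rows):
    """Placeholder for the slow, memory-hungry NLP step (hypothetical)."""
    return {"rows": len(text_rows), "tokens": sum(len(r.split()) for r in text_rows)}


def submit_job(text_rows):
    """What the web handler would do: record the job, enqueue it, return at once."""
    job_id = uuid.uuid4().hex
    with jobs_lock:
        jobs[job_id] = {"status": "queued", "result": None}
    pending.put((job_id, text_rows))
    return job_id


def get_job(job_id):
    """What a status endpoint would do: return a copy of the job record."""
    with jobs_lock:
        return dict(jobs[job_id])


def worker():
    """What the separate high-RAM server would do: process one job at a time."""
    while True:
        item = pending.get()
        if item is None:      # sentinel: shut the worker down
            pending.task_done()
            break
        job_id, text_rows = item
        with jobs_lock:
            jobs[job_id]["status"] = "running"
        try:
            update = {"status": "done", "result": run_nlp(text_rows)}
        except Exception as exc:
            update = {"status": "failed", "result": str(exc)}
        with jobs_lock:
            jobs[job_id] = update
        pending.task_done()


if __name__ == "__main__":
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    job_id = submit_job(["hello world", "natural language processing"])
    pending.join()            # wait until the worker has finished the job
    print(get_job(job_id))
    pending.put(None)
    t.join()
```

The point is that the upload request returns a job id right away, and the front end polls for the result, so a 5-minute job never ties up a web request. With one worker pulling one job at a time, peak memory on that box is also bounded by a single job rather than by however many users hit the API at once.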

10 GB of RAM is a pretty big server to run all the time. If you're not getting much use out of it, some sort of serverless (Lambda-style) setup might work for you as well. I'm not familiar with what Google has in this regard; I mostly use AWS/Render.

foliofed · 3 years ago
1. Thanks. I'm going to try to get an MVP set up using Google Cloud Functions, which seems to be similar to AWS Lambda, while I figure out a better method.
  
  thedan · 3 years ago