15 million requests per month for a side project is difficult

by kf

Hello everybody,
I built a simple uptime monitoring tool, but things got messy. It was fun and I was learning new stuff; then friends and coworkers liked my tool, so I decided to continue and set up three servers around the world.
As I said, I didn't expect any revenue (I don't like ads or spam emails). I just wanted to learn and improve my tech skills. So far, people have added over 1,700 websites, and I check their websites and services every 5 or 10 minutes. There are 106 users, 20 Slack users, and the remaining 35 use web notifications or webhooks.

Every day, I send 600 emails, which will be roughly 20,000 emails a month (I am using a third-party service for email).
I have already hit 6.1 million log records in my database. Today I calculated how many requests I will send in November, and the numbers are pretty huge.

Every 5 minutes, I send 1 request to each website.

There are 44,640 minutes in a 31-day month.

(44640 / 5) * 1704 = 15,213,312 :((((
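
The arithmetic above can be double-checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and it assumes every site is checked once per interval (the post mentions some sites use a 10-minute interval, which would lower the total).

```java
import java.time.YearMonth;

// Back-of-the-envelope request volume for an uptime checker.
// Site count and interval are the figures quoted in the post.
class RequestVolume {

    // Total checks in one month when every site is checked once per interval.
    static long requestsPerMonth(int sites, int intervalMinutes, YearMonth month) {
        long minutesInMonth = month.lengthOfMonth() * 24L * 60L;
        return (minutesInMonth / intervalMinutes) * sites;
    }

    // Average outgoing request rate if checks are spread evenly over the interval.
    static double requestsPerSecond(int sites, int intervalMinutes) {
        return sites / (intervalMinutes * 60.0);
    }

    public static void main(String[] args) {
        // A 31-day month and a 30-day month, for comparison.
        System.out.println(requestsPerMonth(1704, 5, YearMonth.of(2022, 10)));
        System.out.println(requestsPerMonth(1704, 5, YearMonth.of(2022, 11)));
        System.out.printf("%.2f requests/second%n", requestsPerSecond(1704, 5));
    }
}
```

Seen per second rather than per month, 1704 sites on a 5-minute interval come to fewer than 6 requests per second.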

That is 15 million requests a month. I don't even know if my server can handle it.
Question:
What advice would you like to give? Don't say shut down your project because I don't want to do it.
Is there any way to continue my project?

My tech stack: Java and Vue.js

posted to

Solo Entrepreneurship

on November 2, 2022

2
5.6 requests per second is not a lot; a Raspberry Pi can do it.
- If archiving is the problem, use a time series database, keep only the last 30 days, and summarize the rest. Or just keep the error logs and delete the successful responses. Remove duplication as much as possible, then rebuild the data only when it needs to be presented.
- Continue to use a VPS, possibly moving up to a physical server. Avoid serverless because the expense will be stratospheric.
- Go will help you save a lot of RAM and is very mature for production. You can develop just the request loop in Go while keeping everything else in Java.
- Use proxies to rotate IPs.
I made a similar system in Python in the past, with less than 150MB of RAM and 1 vCPU I sent thousands of requests per second rotating IPs through proxies, and logging full responses. Just find ways to simplify duplications on the database.
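
To make the "summarize the rest" idea concrete, here is a rough sketch in Java (16+ for records) of folding raw check rows into one summary per day. The `Check` and `DaySummary` types and their fields are hypothetical; the real log schema is not shown in the thread.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DailyRollup {

    // One raw log row for a site: when it was checked, whether it was up, how long it took.
    record Check(Instant at, boolean up, long latencyMs) {}

    // Compact summary that can replace a whole day of raw rows for one site.
    record DaySummary(long checks, long failures, long totalLatencyMs) {
        double uptimePercent() { return checks == 0 ? 0.0 : 100.0 * (checks - failures) / checks; }
        double avgLatencyMs() { return checks == 0 ? 0.0 : (double) totalLatencyMs / checks; }
    }

    // Groups the raw checks of a single site by UTC day and folds each day into one summary.
    static Map<LocalDate, DaySummary> summarize(List<Check> checks) {
        Map<LocalDate, DaySummary> byDay = new TreeMap<>();
        for (Check c : checks) {
            LocalDate day = c.at().atZone(ZoneOffset.UTC).toLocalDate();
            byDay.merge(day,
                    new DaySummary(1, c.up() ? 0 : 1, c.latencyMs()),
                    (a, b) -> new DaySummary(a.checks() + b.checks(),
                            a.failures() + b.failures(),
                            a.totalLatencyMs() + b.totalLatencyMs()));
        }
        return byDay;
    }
}
```

With a 5-minute interval, a site produces 288 raw rows per day, so replacing each old day with one summary row shrinks that part of the table by a factor of 288.
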
davmuz

·
3 years ago
2

Scaling issues are good issues to have. For me the hard part is always getting users to pull out their credit card.

I solved very similar problems in the past with a lot more load than that (e.g. more than 100M req/day). Considering the nature of your use case, I would stay away from serverless infra; a VPS (if done correctly) is a lot cheaper and is not hard to manage. If you want any help or want to exchange some ideas, just let me know.

indiehacker@cypher-sys.com

andrembpontes

·
3 years ago
1. 1
  
  I am using 16 GB RAM, 6 vCPU cores, 400 GB NVMe, and 64 TB monthly bandwidth. I hate AWS and Google Cloud. I have to stay alive in the market, and I can't afford that if I choose AWS or Google Cloud.
  
  Some people contacted me after the article and we calculated how many requests my server can handle. The numbers say I can handle 7 million requests per day. The hardest part is storage.
  
  If I reach over 1,000 users, I will think about migrating from a VPS to cloud services.
  
  By the way, my yearly cost is only $72. That includes the VPS and 3 different servers.
  
  kf
  
  ·
  3 years ago
  1. 1
    
    Those specs look OP; I was handling more than 100M/day with less than that... Are those specs per node, or in total?
    
    Storage doesn't need to be hard; you just need to be creative and design it to be as stateless as possible. Maybe you don't need to store successful requests; you can just assume them. Maybe you don't even need to store the failed ones, just the start and end time of each failure period. Maybe not even that: you can simply notify the users (via email or anything else).
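    
    As a rough illustration of storing only the start and end of a failure period, here is a small Java sketch (Java 16+ for records). `OutageTracker` and its methods are hypothetical names, and it assumes results for one site arrive in time order.
    
    ```java
    import java.time.Instant;
    import java.util.ArrayList;
    import java.util.List;
    
    class OutageTracker {
    
        // A finished failure period for one site.
        record Outage(Instant start, Instant end) {}
    
        private final List<Outage> outages = new ArrayList<>();
        private Instant downSince = null; // null means the site is currently considered up
    
        // Feed every check result in time order; only state changes produce stored data.
        void observe(Instant at, boolean up) {
            if (!up && downSince == null) {
                downSince = at;                          // up -> down: an outage begins
            } else if (up && downSince != null) {
                outages.add(new Outage(downSince, at));  // down -> up: close the outage
                downSince = null;
            }
            // up -> up and down -> down carry no new information, so nothing is stored
        }
    
        List<Outage> closedOutages() { return List.copyOf(outages); }
    
        boolean isDown() { return downSince != null; }
    }
    ```
    
    With this shape, a site that never goes down produces no rows at all, and a site that is down for an hour produces a single row instead of a dozen failed-check rows.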
    
    I would design it using some kind of distributed log, detach the worker nodes, and scale them according to your needs.
    
    1000 users == 1000 sites? Thinking in terms of sites instead of users makes more sense to me, since that is the one metric that impacts the entire system.
    
    I saw a couple of folks recommending that you change your stack. IMHO it doesn't really matter at all... You have an I/O-intensive workload, so it's a lot more about how you design it than which tool you pick. I bet you could use Bash and resources would not be an issue if the design is correct.
    
    IP banning can be solved (at first) with documentation - just ask users to whitelist your testers. If you want to be fancy you can use proxies but that's an additional layer of complexity
    
    Can you share your hosting provider? And your product webpage?
    
    andrembpontes
    
    ·
    3 years ago
2

Why not use AWS SES for emails? It will be much cheaper.
Also, coming from the Java world: Java takes much more resources than something like Go.
Consider a rewrite.
Consider using a serverless architecture.
Question: where do you host your app now? How much does it cost you?

umen242

·
3 years ago
1. 1
  
  Amazon SES costs more than Zoho email. By the way, I am still using Zoho's email service, because my email costs are still $0.
  
  It's not easy to change my tech stack from Java to Go. I also have to be fast on development, and Go is still too new for me.
  
  My yearly costs are only $72. That includes 4 different VPS servers.
  
  A serverless architecture does not fit my business model. What if I run into unexpected bills at the end of the month?
  
  kf
  
  ·
  3 years ago
2

Charge your users or add premium features and charge for them :)

Also, you didn’t state which technical issues you have with sending 15M requests per month - only your feelings about these numbers, which is not helpful.

Such services are easily scaled horizontally. I would rewrite the worker nodes in Go and deploy them to cheap VPS/cloud instances in different regions. Go uses less memory and works fast on small nodes; it’s also way easier to deploy across nodes. Nodes should queue up all the data before writing it into the DB. I say this as someone who served 800k daily page views from two old servers.
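
The "queue up the data before writing it into the DB" part works in any language. A minimal sketch in Java, since that is the stack in the post, might look like this; `BatchBuffer` is a hypothetical helper and the sink is left abstract (for example, one multi-row INSERT per batch).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Buffers results in memory and hands them to a sink in batches.
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // assumption: the sink does one bulk write per batch
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Called by worker threads for every finished check.
    synchronized void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) flush();
    }

    // Writes out whatever is buffered; also call this on a timer and at shutdown.
    synchronized void flush() {
        if (pending.isEmpty()) return;
        List<T> batch = new ArrayList<>(pending);
        pending.clear();
        sink.accept(batch);
    }
}
```

The trade-off is that anything still in the buffer is lost if the process crashes, so the batch size and flush timer should stay small enough that losing one batch is acceptable.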

Stansm

·
3 years ago
1. 1
  
  VPS solutions are great. Yes, I didn't describe my problem exactly. The problem is that there are a lot of requests but I have limited resources. How can I continue this business as a free service? The second problem is that, after a while, my IP address gets banned by their services. I can't change VPS servers every time.
  
  15 million requests = 15 million records in the DB. I have to save all logs. What if, 6 months from now, I pass 1 billion rows in the DB? I don't have enough experience with that many rows in a DB.
  
  kf
  
  ·
  3 years ago
  1. 1
    
    Why do you have to store such a large volume of data? Limit it to, say, the last 3 months of details, and keep only summaries of older data, if at all.
    
    kkumarkg
    
    ·
    3 years ago
  2. 1
    
    Sorry, but it seems like you want to solve problems which don’t need to be solved atm. Your IP getting banned is not your problem - they should allow it. 15M records in a DB don’t say anything. Not sure what you are using for the backend, but it may be worth looking into TimescaleDB and learning how to set up replication. You still didn’t mention any technical issues you are having atm. Slow processing time? DB using too much CPU, etc.?
    
    Stansm
    
    ·
    3 years ago
2

You've got the perfect service to spread your load out evenly.

Don't count the number of requests per month. Count the number of requests needed per second.

1704 servers spread over 5 minutes: 1704/(5*60) ≈ 5.7 requests per second.
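
One way to get that even spread is to give each site its own offset inside the 5-minute window instead of firing all checks at once. The following Java sketch shows the idea; `StaggeredSchedule` and its methods are hypothetical, and it assumes one repeating task per site on a `ScheduledExecutorService`.

```java
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class StaggeredSchedule {

    // Offset (in milliseconds) into the interval at which site number `index` is checked.
    static long offsetMillis(int index, int siteCount, long intervalMillis) {
        return (index * intervalMillis) / siteCount;
    }

    // Schedules one repeating task per site, each shifted by its own offset,
    // so the checks are spaced evenly instead of arriving in one burst.
    static void scheduleAll(ScheduledExecutorService pool, List<Runnable> checks, long intervalMillis) {
        for (int i = 0; i < checks.size(); i++) {
            long delay = offsetMillis(i, checks.size(), intervalMillis);
            pool.scheduleAtFixedRate(checks.get(i), delay, intervalMillis, TimeUnit.MILLISECONDS);
        }
    }
}
```

For 1704 sites and a 300,000 ms interval, consecutive sites end up about 176 ms apart.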

If your server can't successfully perform five requests in a second, and your backend is written in Java, then you need to rewrite your code so that it can.

For reference, a Java server on a low-end VM should be able to perform somewhere between 200 and 2000 simple requests per second, if the server is written well. Presumably you're also logging each of those, so your database would need to support between 200 and 2000 writes per second as well, and you'd want to minimize any writes you don't really need, but that's really doable.

More scaling than that might mean that you shard your database (or choose one that's more write-optimized) and upgrade your VM to a higher tier. But you're at about 5 to 6 requests per second right now, which means you've got room to grow roughly 35x to 40x before you need to worry about it at all, assuming your server code is reasonably optimized.

One of the fastest Java HTTP libraries that I'm aware of is Vert.x. I'm sure there are others. What you probably don't want to do is run all of the queries synchronously, or run all of the queries by forking processes. Instead you want a small number of threads performing a lot of queries asynchronously (in parallel). That's how you get the best performance out of your server.
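
As a sketch of the asynchronous approach, the example below uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) rather than Vert.x, purely to keep it dependency-free. `AsyncChecker`, `Result`, the timeouts, and the "status below 400 counts as up" rule are all assumptions for illustration (records and `Stream.toList()` need Java 16+).

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AsyncChecker {
    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    // Result of one check; status is -1 when the request failed before a response arrived.
    record Result(URI uri, int status) {
        boolean up() { return status >= 200 && status < 400; }
    }

    // Starts the request and returns immediately; the caller's thread does not wait for the response.
    CompletableFuture<Result> check(URI uri) {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        return client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                .thenApply(response -> new Result(uri, response.statusCode()))
                .exceptionally(error -> new Result(uri, -1));
    }

    // Fires all checks concurrently and completes when every one has finished.
    CompletableFuture<List<Result>> checkAll(List<URI> uris) {
        List<CompletableFuture<Result>> futures = uris.stream().map(this::check).toList();
        return CompletableFuture.allOf(futures.toArray(CompletableFuture[]::new))
                .thenApply(done -> futures.stream().map(CompletableFuture::join).toList());
    }
}
```

Because the response body is discarded and failures are turned into a result instead of an exception, one slow or dead site does not hold up the checks for the others.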

TimMensch

·
3 years ago
1. 2
  
  Yes, you are right, my Java server can easily handle ~1,500 requests per second. Is there any solution for large tables? I calculated that 6 months from now I will reach 1 billion records in my table. What should I do?
  
  Currently I am using the Spring Boot framework. There must be some way to handle requests asynchronously. I will check the documentation.
  
  Thanks for the comment and the valuable advice.
  
  kf
  
  ·
  3 years ago
  1. 1
    
    I'd lose any data older than 30 days for anyone who isn't paying you money.
    
    Or at least lose things like "ping time" if you're collecting them; you could store a much smaller record of the downtime of their servers.
    
    Alternatively, if you really do want to keep older data around, then you probably want to partition by time, either using an automatic partitioning feature of your database (PostgreSQL has such a feature) or by simply deleting (with or without backup) old records.
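    
    For the PostgreSQL case, a hedged sketch: with declarative range partitioning on a timestamp column, each month gets its own partition, and old months can be dropped as a whole. The Java helper below only builds the SQL strings; `MonthlyPartitions` and the table name `check_log` used with it are hypothetical, and it assumes the parent table was created with `PARTITION BY RANGE` on its timestamp column.
    
    ```java
    import java.time.YearMonth;
    
    class MonthlyPartitions {
    
        // DDL for one monthly partition of a range-partitioned parent table.
        // The table name is concatenated into SQL, so it must be a trusted identifier.
        static String createPartitionSql(String table, YearMonth month) {
            String name = String.format("%s_%04d_%02d", table, month.getYear(), month.getMonthValue());
            return String.format(
                    "CREATE TABLE IF NOT EXISTS %s PARTITION OF %s FOR VALUES FROM ('%s') TO ('%s');",
                    name, table, month.atDay(1), month.plusMonths(1).atDay(1));
        }
    
        // Dropping a whole partition removes a month of rows without a row-by-row DELETE.
        static String dropPartitionSql(String table, YearMonth month) {
            return String.format("DROP TABLE IF EXISTS %s_%04d_%02d;",
                    table, month.getYear(), month.getMonthValue());
        }
    }
    ```
    
    A scheduled job could run the create statement for next month and the drop statement for the month that just fell out of the retention window.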
    
    Or maybe, if you're using PostgreSQL, add the TimescaleDB extension and use its "continuous aggregate" feature to keep all of the older statistics but delete the actual logging data. TimescaleDB is great for time series data for all kinds of reasons.
    
    Or some combination of the above.
    
    Good luck.
    
    TimMensch
    
    ·
    3 years ago
2

Charge them money. Make it a business.

joshdance

·
3 years ago
1. 2
  
  Yes I will do it. But it's too early for now.
  
  kf
  
  ·
  3 years ago
  1. 1
    
    Why too early? Honestly think about that. If you are scared about charging you should charge money.
    
    If your users find value in it, they will pay. Otherwise they won't and you can stop doing millions of server requests.
    
    joshdance
    
    ·
    3 years ago
1

AWS Lambda + Queue will save you. I think 😇

trungpv

·
3 years ago
1. 1
  
  Lambda will shoot the bill skywards, no?
  
  kkumarkg
  
  ·
  3 years ago
1

Be up front with your users, you should charge.

Based on this:
https://uptime.com/pricing

you have a lot of room to charge something and not lose customers to competition

LeroyK

·
3 years ago