
How We Improved Our MongoDB Write Performance with AWS SQS

This is more of a tech and engineering post, but as indie hackers and bootstrappers ourselves we run into scaling problems too, so we like to share how we overcome these hurdles.

There will come a time when your database is overloaded and simply making the server bigger is not going to cut it. We got around this by using SQS.

Why SQS?

At [PurpleAds](https://purpleads.io) we work mainly with AWS EC2. It was easier for us to start with, it keeps costs low compared to managed services, and it gives us more control over our infrastructure, albeit at the cost of rising complexity as we grow.
Since we like to keep things simple and use technologies we are familiar with, we run a 2x a1.xlarge MongoDB replica set for all our database needs.

Running an ad network requires storing a lot of analytics, and recently our MongoDB servers were having a hard time keeping up with the increasing number of write and update operations as our network grows.
After some research, and a lot of time staring at the output of mongostat and mongotop, it seems MongoDB is not so great when it comes to updating hundreds of fields in hundreds of documents per minute. Imagine a daily analytics document keeping data per asset, per device, per geo, per OS, etc.
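To make that concrete, here is roughly what a single per-event upsert looked like. This is a sketch with made-up field names (not our actual schema), assuming the official MongoDB Node.js driver and that assetId, day, device, geo and os are already in scope:

// one upsert per event, bumping a counter per device, per geo and per OS -
// hypothetical field names for illustration only
await db.collection('dailyStats').updateOne(
  { assetId, day },
  {
    $inc: {
      impressions: 1,
      [`byDevice.${device}.impressions`]: 1,
      [`byGeo.${geo}.impressions`]: 1,
      [`byOs.${os}.impressions`]: 1,
    },
  },
  { upsert: true }
);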

So we decided to queue these operations and do periodic bulk writes instead, which would cut down the number of upsert operations by at least 10x.
Several solutions came to mind; among others, we thought about queueing operations in our Node application's memory.
In the end we went with something we could implement quickly and easily, and that would give us peace of mind for a long time: AWS Simple Queue Service.

Having your main Node app write thousands of upsert operations to MongoDB not only overloads the database and creates locks, it also makes the app itself less responsive, since it keeps a lot of connections to the database open.

The main idea was simple: send payloads to SQS and process them later in bulk with a separate worker application (a microservice, if you will).

Sending Messages

The first and simplest cost and performance optimization on SQS is batching message sends.
We can send up to 10 messages per request, so we stack them up before sending them over to SQS.

// assuming the AWS SDK for JavaScript v2; region and credentials come from the environment
const AWS = require('aws-sdk');
const sqs = new AWS.SQS();
// SQS_QUEUE_URL is our own naming - point it at your queue
const QueueUrl = process.env.SQS_QUEUE_URL;

const messages = [];
// setup a simple scheduler that flushes the buffer every 10 seconds
setInterval(() => {
  _sendQueue();
}, 10 * 1000);
// use this when you want to send a JSON message
function queueSend(object) {
  messages.push(object);
  if (messages.length >= 10) {
    _sendQueue();
  }
}
// this should not be called directly
function _sendQueue() {
  if (!messages.length) {
    return;
  }
  // max batch size is 10 messages per request
  const toProcess = messages.splice(0, 10);
  // each entry needs an Id that is unique within the batch
  const entries = toProcess.map((mes, i) => (
    { Id: `${i + 1}`, MessageBody: JSON.stringify(mes) }
  ));
  return sqs.sendMessageBatch({ QueueUrl, Entries: entries }).promise();
}
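Sending an event then becomes a single call wherever it happens in your app; the payload fields below are made-up examples, shape them however your worker expects:

// hypothetical analytics payload
queueSend({ type: 'impression', assetId: 'banner-1', geo: 'US', device: 'mobile', ts: Date.now() });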

If your payloads are small, you can bring your costs down even further by “fooling” SQS and packing an array of payloads into a single message.
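A minimal sketch of that idea, assuming the combined payloads stay well under the 256 KB SQS message size limit (the packed/items wrapper is our own convention, not anything SQS-specific):

// pack several payloads into a single SQS message body
function queueSendPacked(objects) {
  queueSend({ packed: true, items: objects });
}
// the worker unpacks them before processing
function unpack(body) {
  const parsed = JSON.parse(body);
  return parsed.packed ? parsed.items : [parsed];
}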

Pulling Messages

Now that your messages are in the SQS queue, you need a worker, a micro-app that will process those messages and write them to MongoDB in bulk. SQS allows pulling up to 10 messages from the queue at a time, so you need to:

  1. concurrently pull 1000 or more messages in batches of 10
  2. process / transform / map those messages into MongoDB update/upsert operations
  3. bulk write these operations to MongoDB (a sketch follows after the code below)
  4. delete the messages from the SQS queue

function sqsPull() {
  return sqs.receiveMessage({
    QueueUrl,
    MaxNumberOfMessages: 10,
    WaitTimeSeconds: 20,
  }).promise();
}
// 100 concurrent pulls of up to 10 messages each (run this inside an async function)
const multiPulls = await Promise.all(
  Array(100).fill(0).map(() => sqsPull())
);
// receiveMessage may return no Messages at all, so guard against undefined
const messages = multiPulls.flatMap((res) => res.Messages || []);
// you need these to later delete the messages from the queue
const receiptIds = messages.map((m) => m.ReceiptHandle);
// parse the JSON bodies, dropping any message that fails to parse
const json = messages.map((m) => {
  try {
    return JSON.parse(m.Body);
  } catch (err) {
    return null;
  }
}).filter((payload) => payload !== null);
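Step 3, the bulk write itself, depends entirely on your schema. Here is a rough sketch of ours, assuming `db` is a connected MongoDB database handle and each parsed payload carries hypothetical assetId/day/geo fields:

// map each queued payload to an upsert and write them all in one round trip
const ops = json.map((payload) => ({
  updateOne: {
    filter: { assetId: payload.assetId, day: payload.day },
    update: { $inc: { [`byGeo.${payload.geo}.impressions`]: 1 } },
    upsert: true,
  },
}));
if (ops.length) {
  // ordered: false lets MongoDB apply the operations in any order, which is faster
  await db.collection('dailyStats').bulkWrite(ops, { ordered: false });
}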

Sometimes SQS will return fewer than 10 messages per request even if there are hundreds of messages waiting in the queue; these are probably still being processed by SQS and not ready to be pulled.
Make sure to set up a “cooldown” mechanism that lets the queue stack up more messages before you process them again:

let smallBatchCounter = 0;
do {
  const multiPulls = await Promise.all(Array(100).fill(0).map(() => sqsPull()));
  const messages = multiPulls.flatMap((res) => res.Messages || []);
  ...
  // 100 pulls x 10 messages is the most a single iteration can return
  if (messages.length < 100 * 10) {
    smallBatchCounter += 1;
  } else {
    smallBatchCounter = 0;
  }
  if (smallBatchCounter >= 10) {
    // cool down for 5 minutes so the queue can fill up again
    await new Promise((resolve) => setTimeout(resolve, 5 * 60 * 1000));
    smallBatchCounter = 0;
  }
} while (true);

Delete Processed Messages

You would not want your messages to be processed more than once and end up saving duplicate data, so after successfully processing messages you should delete them.
SQS limits you to deleting up to 10 messages in a single request, so we added a helper to delete larger batches:

function deleteBatch(receiptIds = []) {
  if(receiptIds.length > 10) {
    throw new Error('max 10 ids per delete batch');
  }
  let i = 0;
  const entries = receiptIds.map((rId) => {
    i += 1;
    return { Id: `${i}`, ReceiptHandle: rId };
  });
  return sqs.deleteMessageBatch({ QueueUrl, Entries: entries }).promise();
}
function deleteLargeBatch(receiptIds = []) {
  const promises = [];
  for(let i = 0; i < receiptIds.length; i += 10) {
    promises.push(deleteBatch(receiptIds.slice(i, i + 10)));
  }
  return Promise.all(promises);
}
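In the worker loop we only call this after the bulk write succeeds, so a failed write leaves the messages in the queue to be retried:

// drop everything we just processed, in batches of 10 under the hood
await deleteLargeBatch(receiptIds);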

FIFO queues

First In First Out (FIFO) SQS queues guarantee that messages are pulled in the order they were sent and are delivered only once to the consumer.
The caveat with FIFO queues is the MessageGroupId: every message has to be associated with a group, and ordering is only guaranteed relative to that group.
A single message group can have up to 20k messages in-flight or “waiting” to be pulled.
What if we have more than 20k? We would have to split them across multiple groups, but then what would the order of messages be relative to other groups? We could not find an answer to this.
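For completeness, here is roughly what sending to a FIFO queue looks like; FifoQueueUrl, payload and eventId are placeholders, and the group name is just an example:

// FIFO queue names must end in .fifo, and every message needs a MessageGroupId
sqs.sendMessage({
  QueueUrl: FifoQueueUrl,
  MessageBody: JSON.stringify(payload),
  MessageGroupId: 'analytics', // ordering is only guaranteed within this group
  MessageDeduplicationId: eventId, // or enable content-based deduplication on the queue
}).promise();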

Conclusion

All in all, AWS SQS is a cheap, simple, effective and scalable service. If you're having trouble with write throughput, you should consider using SQS to queue and batch these operations.
Hopefully this guide helps you get started, and later on helps you optimize your costs and efficiency.

We’ve also published our SQS helper library on npm with many of the operations described in this article for your convenience.
