Seeking advice: Will this backend architecture work?

I'm building an analytics service.

Every week, I'd like to retrieve data from the Google Analytics API, and then send that as an email to myself.

Since I don't have much experience setting up backend infrastructure beyond simple server APIs, I thought I'd run my plan by this lovely community before moving forward.


My thought was to create an AWS Lambda function that would do most of the work: retrieve the data and send the email.

This serverless function would get triggered weekly by a cron job.

And since there's a student discount for Mailgun, I think I'd just use that over Amazon SES or other options to send the final email to cap off this process.

Something like this:


Is this feasible, or am I completely missing something here?

And I'm wondering whether there would be any issues with scalability (in case, by some miracle, I get a lot of users signing up for this service).

And is there a resource that I can consult about these types of infra questions going forward?

Any help would be greatly appreciated!
I'm gonna begin hacking away tonight either way, but thought I'd throw this out there :)
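
A minimal sketch of the single-function plan described in the post, in Python with only the standard library. Several things here are assumptions for illustration: `fetch_weekly_report` is a hypothetical placeholder (a real Google Analytics call needs OAuth credentials and is not shown), the environment variable names are made up, and the Mailgun part assumes the `POST /v3/<domain>/messages` endpoint with HTTP basic auth.

```python
import base64
import os
import urllib.parse
import urllib.request


def fetch_weekly_report():
    """Hypothetical placeholder: a real version would call the Google Analytics API."""
    return {"sessions": 1234, "users": 567}


def format_report(report):
    """Turn the metrics dict into a plain-text email body."""
    lines = [f"{name}: {value}" for name, value in sorted(report.items())]
    return "Weekly analytics report\n" + "\n".join(lines)


def build_mailgun_request(domain, api_key, sender, recipient, subject, text):
    """Build (but do not send) a Mailgun 'send message' request."""
    url = f"https://api.mailgun.net/v3/{domain}/messages"
    form = urllib.parse.urlencode(
        {"from": sender, "to": recipient, "subject": subject, "text": text}
    ).encode()
    token = base64.b64encode(f"api:{api_key}".encode()).decode()
    request = urllib.request.Request(url, data=form, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    return request


def handler(event, context, send=urllib.request.urlopen):
    """Lambda-style entry point: fetch, format, email. `send` is injectable for tests."""
    body = format_report(fetch_weekly_report())
    request = build_mailgun_request(
        domain=os.environ.get("MAILGUN_DOMAIN", "example.com"),
        api_key=os.environ.get("MAILGUN_API_KEY", "key-placeholder"),
        sender=os.environ.get("REPORT_FROM", "reports@example.com"),
        recipient=os.environ.get("REPORT_TO", "me@example.com"),
        subject="Weekly analytics report",
        text=body,
    )
    send(request)  # performs the HTTP POST unless a stand-in is passed
    return {"status": "sent", "bytes": len(body)}
```

Keeping the request construction separate from the sending makes it possible to check the email logic locally without any network access.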

  1. 4

    If you wanted a quick and dirty MVP, you could do this very rapidly and simply using Google Apps Script: https://developers.google.com/apps-script

    It has self-contained cron job triggers, and you wouldn't need Mailgun either: there is a native GmailApp.sendEmail() method. For calling the GA API, you would use UrlFetchApp, which is similar to JavaScript's fetch. The main limitations are a 6-minute limit per execution and a 100k daily fetch request limit.

    Then eventually, if it grows and you need higher limits, better logging, etc., you can migrate to Node.js or something and use Lambda functions.

    1. 1

      Ah! This is new to me. Thanks for the suggestion, I will do some comparisons.

      1. 1

        No problem, give it a try. I am biased because I followed a very similar path a couple of years ago (started hacking stuff together with GAS, then eventually graduated to Google Cloud Functions / Node.js).

        If you're starting out as a new programmer, one thing you might find right away is just getting the f---ing environment set up so you can begin writing code can be incredibly difficult, even when you're following documentation step by step. You might be dealing with dependencies, incompatible library versions, command line stuff - it can be a mountain to climb if everything is new. This is ultimately a skill you will need to develop as a programmer (and is honestly in my experience a lot of programming) - but if you're brand new, you want to just be able to write code and see it do the thing as quickly as possible. Good luck.

  2. 2

    This is not a bad design. The only thing I might change is the platform: I prefer Firebase Functions (or Google Cloud Functions) over AWS Lambda, with Google Cloud cron jobs for scheduling. The Google Analytics API is a bit easier to access from Google Cloud than from AWS Lambda. The Mailgun API is super easy to set up and use.
    For new devs, AWS is a complex mess to deal with; Google Firebase and Azure are simple and easy to understand and use.

    1. 1

      Makes sense. Thanks for the input! Personally, I'm just trying out AWS Lambda and it's really simple so far
      (I've used Google App Engine and Cloud Build before, and they honestly felt a bit hacky at times)

  3. 2

    Looks fine to start.

    I wouldn't over-engineer things at the start to make them more scalable, as you don't yet know the traffic patterns.

    1. 1

      I wouldn't over-engineer things at the start

      Wiser words were never spoken

  4. 2

    Hey!

    I've recently built a serverless app for a client at work (rhymes with chapel, they sell shiny expensive laptops...)

    Each time a user needs this workflow to kick off, a Lambda is spun up (it's basically a tiny server that is spun up in milliseconds). You can have up to 1000 concurrent users kick off this workflow at any given time. Keep in mind that after each workflow completes, that execution slot becomes available for your next customer, so it's technically feasible in that context.

    For the scheduling, you can do this right in AWS. You can use a Step Function and schedule it to handle the entire workflow (you probably want to use this anyway). A Step Function will orchestrate the entire workflow, e.g. the Lambda is done, did it get the data? Okay, now pass the data to SES, etc.

    The lambda time limit (15 minutes) is a lot of time, although I'd hope your data doesn't take that long to fetch.

    In terms of scalability, Lambdas scale well: 1000 concurrent invocations for what I'd assume to be a 3-minute-max data-fetching task is a lot of available resource (keeping in mind each Lambda slot is available again once it's done). You can also raise the limit above 1000 by contacting AWS (you should be making some money at this point, I'd hope).

    In short, this will work, so go for it. If you struggle to scale, you have a great problem.

    I've answered this assuming you don't have other options, but AWS lambdas are great for early stage prototyping and market validation. You can build super fast and if you're doing good business you can always change architecture.
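
To put rough numbers on the concurrency point above, here is a back-of-the-envelope sketch. It assumes every slot stays busy and each run takes the full 3 minutes guessed in the comment; the function name is made up for illustration.

```python
def max_reports_per_hour(concurrency_limit, seconds_per_report):
    """Upper bound on completed runs per hour when every slot stays busy."""
    runs_per_slot = 3600 / seconds_per_report
    return int(concurrency_limit * runs_per_slot)


# Figures from the comment: 1000 concurrent invocations, 3 minutes per run.
print(max_reports_per_hour(1000, 180))  # 20000
```

Even at the full 15-minute time limit per run, the same arithmetic gives 4000 runs per hour, which is why scaling is unlikely to be the first problem.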

    1. 1

      P.S. Use one Lambda to fetch the data and dump it somewhere, e.g. S3,
      and use a second one to read from S3 and email it!

      Feel free to reach out if I can help!

      1. 3

        why 2 separate lambdas? Why not do it all in one?

        1. 1

          Hey!

          Answered that below :D

        2. 1

          Seconding this question ^

          1. 1

            If you're worried about scaling, tying the whole process to one Lambda means that Lambda slot will be unavailable for other invocations until the whole process finishes. It's usually best practice to give each Lambda a single responsibility; this is what they were designed for.

            With two lambdas you now have at your disposal 1000 concurrent invocations for each task rather than 1000 concurrent invocations for the whole process.

            Hope that makes sense?

            1. 1

              Yeah, makes sense, but why can't you just create another copy of the same Lambda, so now you have 2000 instances? (Is that not a thing? I'm not too familiar with serverless on AWS.) Doesn't putting it in S3 incur read/write costs? Is it still a good idea to split into two because of the execution time differences between fetching data and Mailgun? (Maybe if it is possible to start sending emails before the data fetching has even finished.)

              1. 1

                Hey!

                So ultimately it depends on how you want to architect the solution. S3 does incur read and write costs, but it's super cheap, and the main benefit is that you can archive this data in case of a failure in your workflow and reuse it. So it's not necessary if you don't want it. But yes, this helps control your Lambdas 100%, which is why it's always best to split your tasks between Lambdas.

                So you're 100% correct about:

                "Is it still a good idea to split into two because of the execution time differences between fetching data and mailgun?"

                You don't want two lambdas doing the same thing.

                So in summary the process you want is:

                Lambda 1 is triggered and fetches the data. When the data is received, it puts that data in S3 with a unique ID (this can happen up to 1000 times simultaneously). Remember, after it puts the data in S3 it is immediately available to be reused, because it has finished its job and has no other tasks to do. So 1000 is more than enough.

                When data arrives in S3, Lambda 2 finds the file by the unique ID and emails it. Lambda 2 is now finished and immediately available to do this again, up to 1000 times concurrently.

                The tricky part is how to let Lambda 2 know which unique ID to get. This is where the Step Function comes in handy, because it can tell Lambda 2 what that unique ID is.

                This is the right way to do it: it's scalable, efficient, and easy to debug. Ultimately this is "best practice", but you can do it any way you like.

                You can do it all in one lambda function and not worry about it, but as your solution grows and becomes more complex you will have to refactor it.

                Sorry for the long reply but I hope this was helpful! :D

                1. 1

                  Hmmm.. I've seen some people online advising to chain two lambda functions using Amazon SNS?

                  Might be another viable solution to keep the code modular n scalable.

                  Have you tried it?

                  But I think I'm going to just put the fetch and email send in the same function now just for simplicity/speed...

                2. 1

                  Very helpful. Thank you.
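
The two-function split discussed in this thread can be sketched roughly as below. This is only an illustration under stated assumptions: `InMemoryStore` stands in for an S3 bucket so the code runs without AWS, `run_workflow` plays the role of the orchestrator (a Step Function in the comments above) by passing step 1's output to step 2, and all names are hypothetical.

```python
import json
import uuid


class InMemoryStore:
    """Stand-in for an S3 bucket so the sketch runs without AWS."""

    def __init__(self):
        self.objects = {}

    def put(self, key, body):
        self.objects[key] = body

    def get(self, key):
        return self.objects[key]


def fetch_step(event, store, fetch=lambda account: {"account": account, "sessions": 42}):
    """Step 1: fetch the data, archive it under a unique key, return the key."""
    key = f"reports/{uuid.uuid4()}.json"
    store.put(key, json.dumps(fetch(event["account"])))
    return {"report_key": key, "email": event["email"]}


def email_step(event, store, send):
    """Step 2: load the archived report named in the event and email it."""
    report = json.loads(store.get(event["report_key"]))
    send(event["email"], json.dumps(report, indent=2))
    return {"status": "sent", "report_key": event["report_key"]}


def run_workflow(event, store, send):
    """Plays the orchestrator's role: step 1's output becomes step 2's input."""
    return email_step(fetch_step(event, store), store, send)
```

Because the archived report stays in the store, a failed email step could be retried with the same `report_key` without fetching the data again, which is the archiving benefit mentioned above.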

  5. 1

    As mentioned by others, AWS Lambda (or similar offerings from Google Cloud) would definitely work. Just be aware of the lambda limitations, e.g. 15min max duration, and also the features, e.g. scheduler built-in.

    ... to myself.

    Sure, just hard-code the Lambda inputs (i.e. GA account, email address to receive the report) and you're done.

    However, one thing where serverless often falls apart is when you try to scale functionality.

    users to sign up for this service

    Your lambda won't help with any of that. You'll need a web app for users to sign up, configure their accounts (GA account, email, settings etc). And now you need to get all that stuff inside your Lambda as parameters and, of course, make sure each user's report runs only once.

    At that point I recommend moving to a full-blown web framework that you feel comfortable with, e.g. Rails or Django. It will have everything you need to build both the web UI and the background jobs that process reports on a schedule.

    EDIT: regarding scale, don't worry about that. If you can charge users for it, you can just throw some money at servers. Also, most background job queuing frameworks let you simply queue thousands of jobs and plow through them.

    Long story short: Lambda is nice for validating your idea and building it for yourself. Once you have that, you might want to build a product around it (and might have to replace the actual report generation if you switch languages).

    1. 1

      Your lambda won't help with any of that. You'll need a web app for users to sign up, configure their accounts

      Yep I already have the marketing page up using next + react over at magic-ga.com. Will probably use the same stack when I begin the actual application interface, probably just app.magic-ga.com.

      Thanks for the input, and for suppressing my urge to think about scale immediately!
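
One way to handle the "each user's report runs only once" concern raised in this thread is an idempotency key per user per week. The sketch below is an assumption-laden illustration: the key format, the `users` shape, and the `already_sent` set (which would live in a database in practice) are all hypothetical.

```python
from datetime import date


def report_key(user_id, day):
    """Idempotency key: one report per user per ISO week."""
    year, week, _ = day.isocalendar()
    return f"{user_id}:{year}-W{week:02d}"


def due_reports(users, already_sent, today):
    """Return (user, key) pairs that have not been sent yet this week."""
    due = []
    for user in users:
        key = report_key(user["id"], today)
        if key not in already_sent:
            due.append((user, key))
    return due
```

After a report goes out, its key is recorded in `already_sent`, so a retried or duplicated job in the same week finds nothing left to send for that user.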

  6. 1

    FWIW, I run all of Alchemist Camp's transactional email through SES and it only costs a few cents a month. They're really hard to beat on price.

    1. 1

      Yeah, I have heard/seen it's really cheap.

  7. 1

    My knowledge of AWS is pretty limited, so please correct me if I am wrong.
    AWS Lambda supports scheduled expressions, so the HTTP request should not be necessary.
    You say that this function gets your analytics data, does something with it, and then sends an email to yourself. But later on you talk about users signing up for this service. Where do they sign up?
    If only you, or only a few people, should get the email, you can send it via SMTP, as Mailgun would be overkill.
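
For the plain-SMTP route suggested above, a minimal sketch using Python's standard library could look like this. The host, port, and credentials are placeholders to be supplied by whatever mail provider is used, and the subject line is made up.

```python
import smtplib
from email.message import EmailMessage


def build_report_email(sender, recipient, body):
    """Assemble the report as a plain-text email message."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Weekly analytics report"
    message.set_content(body)
    return message


def send_via_smtp(message, host, port, username, password):
    """Send over SMTP with STARTTLS; host and credentials are placeholders."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(message)
```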

    1. 1

      I'll definitely take a look at the scheduled expressions. Thanks for the heads up.

      I didn't talk about the frontend in this post, but users would just schedule email reports through the interface.

      1. 1

        Then you would also need some kind of user management as @axelthegerman already pointed out
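
On the scheduled expressions mentioned in this thread: AWS schedule expressions come in two forms, `rate(...)` (for example `rate(7 days)`) and a six-field `cron(...)`. The helper below is a hypothetical convenience for building the weekly cron form; it only produces the string and does not talk to AWS.

```python
DAYS = ("SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT")


def weekly_schedule(day, hour_utc, minute=0):
    """Build cron(minute hour day-of-month month day-of-week year) for a weekly run."""
    day = day.upper()
    if day not in DAYS:
        raise ValueError(f"unknown day: {day}")
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour or minute out of range")
    # Day-of-month is '?' because day-of-week is specified.
    return f"cron({minute} {hour_utc} ? * {day} *)"
```

For example, `weekly_schedule("mon", 9)` returns `cron(0 9 ? * MON *)`, which would fire every Monday at 09:00 UTC.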
