I will be referencing a few tools without much elaboration. If you have any questions about them, please ask!
This is more of a cautionary tale than anything else. The lessons are obvious in hindsight: “use what you know” and “be careful of new technology”. What I want to do is explain how I fell into making these mistakes so others can avoid them.
So, I wanted the infrastructure for my night-and-weekends project to be reasonably lean in terms of cost and complexity. For reference, I use a lot of AWS at my day job and it’s not cheap. I wanted to explore other cloud providers and had the following wish list:
DigitalOcean caught my attention for a few reasons. First, it is half the cost of AWS. Droplets are cheap and provide remarkably good compute value. Second, they fulfill all the technical requirements above and are actually the only small cloud provider that offers managed databases as a service. Third, as an indie developer it feels more indie-like to use a smaller provider.
The documentation was straightforward and I was able to get a full DigitalOcean environment running in a handful of evenings. Most of the time was actually spent setting up the image with Packer. It was not long after that I started having problems.
I ended up having significant issues with DigitalOcean’s database and load balancer services. Droplets had a 50% chance of not automatically being given network access to the database, which meant I needed to grant it manually each time. Worse yet, the load balancer gave me more and more 503s as time went on. My guess was that it was not properly removing droplets after they were deleted (I was constantly creating and deleting droplets as part of my releases). Plus, there were other bugs too! I didn’t want to spend my precious time debugging all of this with their support.
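For anyone who hits the same database access problem, the manual step can in principle be scripted. Below is a minimal sketch, not what I actually ran. It assumes the DigitalOcean API's database firewall endpoint (PUT /v2/databases/<database id>/firewall) accepts a "rules" list with entries of type "droplet", and that the PUT replaces the whole list, so existing rules have to be sent back along with the new one. Check both assumptions against the current API docs. The function name `with_droplet_rule` is made up for this example, and it only builds the request body; it makes no network call.

```python
from typing import Dict, List

Rule = Dict[str, str]


def with_droplet_rule(existing_rules: List[Rule], droplet_id: int) -> Dict[str, List[Rule]]:
    """Build a firewall payload that keeps existing rules and adds one droplet.

    Assumption: the request body has the shape {"rules": [{"type", "value"}, ...]}
    and the droplet ID is passed as a string.
    """
    new_rule = {"type": "droplet", "value": str(droplet_id)}
    # Keep only the fields we send back; drop any server-side metadata.
    rules = [{"type": r["type"], "value": r["value"]} for r in existing_rules]
    # Idempotent: do not add the same droplet twice.
    if new_rule not in rules:
        rules.append(new_rule)
    return {"rules": rules}


if __name__ == "__main__":
    current = [{"type": "ip_addr", "value": "203.0.113.7"}]  # illustrative rule
    print(with_droplet_rule(current, 123456))
```

Because the payload is built from the current rule list, running this on every release would be safe to repeat for a droplet that already has access.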
As much as I want to blame everything on DigitalOcean, it was my fault as well. I am skeptical of using a brand new library in my code, yet I did not show that same caution when it came to my infrastructure. Their database service became “generally available” less than a year ago! To their credit, DigitalOcean has been fairly aggressive in the past couple of years about adding and improving their non-droplet services. Maybe in a year DigitalOcean’s newer services will become more reliable, but I don’t have a year to wait.
Back to AWS’ well-monied arms I went.
Getting everything running again on AWS cost me a weekend. I was already familiar with it and nothing about my DigitalOcean infrastructure was proprietary. For example, all I needed to do in Packer was change the build target from DigitalOcean to AWS and make a few small script changes. Using non-proprietary technology (e.g. Terraform, Packer) gives you an exit strategy for free. Having an exit or alternative strategy whenever you are exploring something unknown is generally a good idea.
Using technology you already know (in this case, AWS) is a huge time saver, and even though DigitalOcean is cheaper, at my current stage of life my time is more expensive.
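To give a sense of how small the Packer change is, here is a rough sketch rather than my actual template. It assumes Packer's JSON template format with the `digitalocean` and `amazon-ebs` builder types. Every value in it (regions, sizes, image names, the AMI ID, the script path, the `do_token` variable) is a placeholder made up for this example. The point is that the provisioners stay the same and only the builder block is swapped.

```python
import json

# Provisioners do not depend on the provider, so both templates share them.
# The script path is a hypothetical placeholder.
SHARED_PROVISIONERS = [
    {"type": "shell", "script": "scripts/setup.sh"},
]

# One builder block per provider. All values below are illustrative.
BUILDERS = {
    "digitalocean": {
        "type": "digitalocean",
        "api_token": "{{user `do_token`}}",
        "image": "ubuntu-18-04-x64",
        "region": "nyc3",
        "size": "s-1vcpu-1gb",
        "ssh_username": "root",
    },
    "aws": {
        "type": "amazon-ebs",
        "region": "us-east-1",
        "source_ami": "ami-00000000",  # placeholder, not a real AMI
        "instance_type": "t3.micro",
        "ssh_username": "ubuntu",
        "ami_name": "app-{{timestamp}}",
    },
}


def build_template(provider: str) -> dict:
    """Return a Packer-style template dict for the given provider."""
    return {
        "variables": {"do_token": ""},  # only used by the DigitalOcean builder
        "builders": [BUILDERS[provider]],
        "provisioners": SHARED_PROVISIONERS,
    }


if __name__ == "__main__":
    # Switching providers is a one-word change here.
    print(json.dumps(build_template("aws"), indent=2))
```

The "few small script changes" live in the provisioning scripts themselves (for example, a different default SSH user), which this sketch does not try to model.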
Admittedly, I think what made me truly stray away from AWS was the thrill of learning something new. The dark truth was that I didn’t really have to use DigitalOcean. It is pretty easy to find yourself with a few thousand dollars of AWS credits for a couple of years if you know where to look. Plus, it’s still possible to learn new things with technology you already know! When I migrated to AWS I decided to try setting up an ALB for the first time and bam, I learned something new in a smaller, safer way!
In hindsight, I should have started on AWS and only transitioned to DigitalOcean once cost became an issue. It was a classic case of premature optimization.
I have an innocent question: how do you justify using these cloud-based services except that you know them?
Are you asking about all cloud-based services or just AWS/DigitalOcean in particular?
All of them. I wonder if it's really necessary to go in there from the beginning on, or scale to the cloud when you need to.
Just to be clear, the architecture described in this post is intense for a small project! Most projects would do perfectly fine with a single host if you're comfortable with SSHing into a box or you could use a PaaS like Heroku (Heroku is pretty damn expensive but it is really low effort to maintain).
I think starting with the cloud makes sense in most cases for the following reasons:
I hope that answered your question!
Yep it makes sense. Thanks for that!
The $7 Heroku option is considered expensive compared to a $5 droplet?
As soon as you leave the $7 tier, the pricing jumps up dramatically fast (1 GB of memory starts at $25). Other hosting services scale more gently.
You're right though! The $7 tier might be sufficient for some (or even most on Indie Hackers) projects, and if so, that's great!
This. Heroku is crazy expensive. We moved away from Heroku early, and based on our calculations our server costs would have been triple on Heroku for the same performance.
How server intense is your application?
Thanks, I hope to get to the higher tier someday, it will mean I am more profitable :)
Is there anything comparable to heroku on the AWS side (with reduced costs) at the higher tier if I do consider changing?
And how can I know when/if I need to "tier up" (at heroku)?
Usually it is very obvious when you need to "tier up". Things start breaking from too much load or you are working on a feature that requires more resources (e.g. batch processing of any kind).
Realistically, if you're comfortable with Heroku and don't have the time or desire to learn DevOps, just stay on Heroku as long as it is feasible. That is the moral of the article I wrote!
Also, you are allowed to use multiple providers for one-off things. For example, if you need to generate PDFs from a webpage (which is very memory intensive) that can be done using a Lambda function. No need to drop Heroku as your main API just for that.
My infrastructure for this project was optimized to avoid data loss (even from downtime) because I felt that was important for a journaling app. Honestly, I was being indulgent.
I have another small project I want to do that will likely just be a Netlify frontend with a handful of Lambda API endpoints.
I'm curious, from your experience, if there is anything comparable to Heroku at AWS?
I think AWS Elastic Beanstalk may be the nearest thing to Heroku in AWS.
Not that I know of. Something like https://github.com/apex/up which uses lambda would be the closest analog?
Hi Sasha 👋... I head up product management for developer experience at DigitalOcean. If you're up for it, I'd love to better understand some of the database and load balancer bugs you hit. If you drop me an email using my first initial, last name at digitalocean maybe we can get some improvements going.
Sure thing, I'll send you an email!
nice tips Sasha, you said "It is pretty to find yourself with a few thousand dollars of AWS credits for a couple of years if you know where to look."
where should I look?
A few names were dropped in this thread, but generally most orgs or products that are meant to help new tech companies start will tend to have some AWS credits available to them. E.g. https://www.producthunt.com/ship gives you $5000 in credits for signing up (it's not free per se, but the credits cost negative dollars).
Right now I'm using DO and AWS
I use S3 and cloudfront for obvious static assets.
I use ECR as the only reliable (and honestly easy) private container repo.
I use Route53 and CloudFlare to manage my traffic in a reliable way.
All I put on DO really is the IAM, LB, VPC, Firewall, and Kube. I'd happily run containers on AWS or GCP; I have the certs and many years of professional experience, so it would be faster, but I need close to 100 instances for my stack and, as you know, the cost boundary for AWS and GCP is (generously) 20% of this.
I don't have much inbound traffic; all that compute is the app doing its thing. Eventually I'll also need to scale customer activity, at which point I'll have cash flow and might then be in a position to go back to AWS or GCP for compute too.
We have totally trusted DigitalOcean for four years running. We don't use the DBaaS and LB services. We use Databaselabs.io inside DO NYC2.
If you're looking for hosting infrastructure for side projects, I'd recommend https://packetriot.com.
I built it, so of course I'd recommend it :) That aside, I built this for devs (including myself) who want or need hosting but whose project is not large enough for, or does not warrant, the price that public cloud providers charge.
DO is amazing value btw and I host Packetriot with them. Their fixed costs make it easy to figure out how much your project will cost you.
There's a free tier on Packetriot for evaluating. The idea is that once your project takes off and needs more reliable infrastructure, you move it to the cloud. Until then, you get more value. Paid plans, which cover pretty much most people, start at $5/mo.
I'm curious as to what type of application you're hosting here. I realize it's mostly irrelevant to this article, I'm only asking because I'm looking to do the same thing with one of my projects which is essentially a single server LAMP stack that needs to start scaling. Update: what caught my attention here is that I'm currently on DO and was about to configure a load balancer and switch to a managed database. Sounds a little sketchy and this could save me some time!
I think what broke DigitalOcean was me constantly creating/destroying parts of my infrastructure.
I still would recommend trying out the managed database service.
The load balancer I would be more skeptical about. A droplet running HAProxy may serve you better. That said, it is fast enough to get going that it is worth trying out to see if you have a better experience than I did.
Some other things I encountered with DO:
Did you check Amazon Lightsail? I am using it for my own projects and it is cheaper than DigitalOcean.
I avoided lightsail because I read some negative reviews of it a while ago. Things may have changed since then! If you are having success with it, that's cool to hear!
Where do you find AWS credits? I know Product Hunt Founders Club but the credit will only last a year.
I got $5000 for two years after buying this: https://appsumo.com/startups/
Are there any limits on those $5000 credits (for example, RDS is not included but ECS is), or are they really good for all AWS services?
Here's what AWS tells me:
Below are a list of services that can be used with this specific credit.
AWS Amplify
AWS AppSync
AWS Backup
AWS Budgets
AWS Certificate Manager
AWS Cloud Map
AWS CloudHSM
AWS CloudTrail
AWS CodeCommit
AWS CodeDeploy
AWS CodePipeline
AWS Config
AWS Cost Explorer
AWS Data Exchange
AWS Data Pipeline
AWS Data Transfer
AWS DataSync
AWS Database Migration Service
AWS Device Farm
AWS Direct Connect
AWS Directory Service
AWS Elemental MediaConnect
AWS Elemental MediaConvert
AWS Elemental MediaLive
AWS Elemental MediaPackage
AWS Elemental MediaStore
AWS Elemental MediaTailor
AWS Firewall Manager
AWS Global Accelerator
AWS Glue
AWS Greengrass
AWS Ground Station
AWS Import/Export
AWS Import/Export Snowball
AWS IoT
AWS IoT 1 Click
AWS IoT Analytics
AWS IoT Device Defender
AWS IoT Device Management
AWS IoT Events
AWS IoT SiteWise
AWS IoT Things Graph
AWS Key Management Service
AWS Lambda
AWS OpsWorks
AWS RoboMaker
AWS Secrets Manager
AWS Security Hub
AWS Service Catalog
AWS Shield
AWS Snowball Extra Days
AWS Step Functions
AWS Storage Gateway
AWS Storage Gateway Deep Archive
AWS Systems Manager
AWS Transfer for SFTP
AWS WAF
AWS X-Ray
Amazon API Gateway
Amazon AppStream
Amazon Athena
Amazon Chime
Amazon Chime Business Calling a service sold by AMCS LLC
Amazon Chime Call Me
Amazon Chime Dialin
Amazon Chime Voice Connector a service sold by AMCS LLC
Amazon Cloud Directory
Amazon CloudFront
Amazon CloudSearch
Amazon Cognito
Amazon Cognito Sync
Amazon Comprehend
Amazon Connect
Amazon Detective
Amazon DocumentDB (with MongoDB compatibility)
Amazon DynamoDB
Amazon EC2 Container Registry (ECR)
Amazon EC2 Container Service
Amazon ElastiCache
Amazon Elastic Compute Cloud
Amazon Elastic Container Service for Kubernetes
Amazon Elastic File System
Amazon Elastic Inference
Amazon Elastic MapReduce
Amazon Elastic Transcoder
Amazon Elasticsearch Service
Amazon FSx
Amazon Forecast
Amazon GameLift
Amazon GameOn
Amazon Glacier
Amazon GuardDuty
Amazon Inspector
Amazon Kendra
Amazon Kinesis
Amazon Kinesis Analytics
Amazon Kinesis Firehose
Amazon Kinesis Video Streams
Amazon Lex
Amazon Lightsail
Amazon MQ
Amazon Machine Learning
Amazon Macie
Amazon Managed Apache Cassandra Service
Amazon Managed Blockchain
Amazon Managed Streaming for Apache Kafka
Amazon Mobile Analytics
Amazon Neptune
Amazon Personalize
Amazon Pinpoint
Amazon Polly
Amazon Quantum Ledger Database
Amazon QuickSight
Amazon Redshift
Amazon Rekognition
Amazon Relational Database Service
Amazon Route 53
Amazon S3 Glacier Deep Archive
Amazon SageMaker
Amazon Simple Email Service
Amazon Simple Notification Service
Amazon Simple Queue Service
Amazon Simple Storage Service
Amazon Simple Workflow Service
Amazon SimpleDB
Amazon Sumerian
Amazon Textract
Amazon Transcribe
Amazon Translate
Amazon Virtual Private Cloud
Amazon WorkDocs
Amazon WorkLink
Amazon WorkSpaces
Amazon Zocalo
AmazonCloudWatch
AmazonWorkMail
CloudWatch Events
CodeBuild
CodeGuru
Comprehend Medical
Contact Center Telecommunications (service sold by AMCS, LLC)
DynamoDB Accelerator (DAX)
Can you believe Amazon even offers this number of services...crazy
Yeah, I was quite blown away sifting through all of it. I haven't used AWS for several years now, and even though it's obviously very powerful, it's incredibly complicated. They use so many acronyms and terms I'm not familiar with that it's hard to use.
Thank you!
this sounds like a good deal!
Yeah, I'm happy with it. I don't really use that many resources, so I probably won't use it all in these two years. But I think the only catch is that you can't apply if you are already established with AWS. Not sure exactly how it works, though.
Startup School by Y Combinator offers $3000 in AWS credits for participants.
I am using the Startup School AWS credit right now. It lasts for two years. Startup School is a pretty good bundle of deals tbh.
A lot of local institutions also provide credits. As an example, when I used to work out of a WeWork, they had a deal that was $2000 for 2 years. Not sure if that is still available given that WeWork imploded. Local incubators/accelerators/whateverators tend to have much better amounts.
They recently sent me a $300 credit o_o I didn't pay for anything yet, just started using AWS free tier.
If you participate in Pioneer App or start your company through Stripe Atlas, you get credits too.
I'm a fan of AWS and a serverless architecture. I use stackery.io for deploying to AWS. A serverless architecture (DynamoDB, Lambdas, API Gateway, etc.) costs nothing at rest with no usage.
Serverless is great, but it is not applicable for every project.
Tendship uses a lot of relational data and that means using a SQL database such as Postgres. You really don't want to use SQL with Lambda:
Interesting point about VPC and Lambda. I use serverless framework these days so I’m no longer “in the weeds” of networking so to speak. Although, I’ve always been curious how the network interfacing worked. I just read this which describes how AWS hyperplane might solve the issue you described? Anyways, I enjoyed your post, some good insights here.
The hyperplane looks really encouraging! The latency is still fairly high (~1 second), but it's better than 10 seconds I saw before. I think in a year or so AWS will have cracked this nut.
Thanks for sharing!
I was wondering, is it justifiable to replace an Nginx server with API Gateway and/or Lambda@Edge? The cost would be higher but I wouldn't have to maintain it.
I wouldn't know! You still may want a reverse proxy to deal with gzip and other matters. If you're using a language or framework that needs a web server in front of it, like Flask, you definitely want to keep it.