Developers March 29, 2020

Why and how I moved from DigitalOcean to AWS

Sasha Sirotkin @sirotkin

I will be referencing a few tools without much elaboration. If you have any questions about them, please ask!

This is more of a cautionary tale than anything else. The lessons are obvious in hindsight: “use what you know” and “be careful of new technology”. What I want to do is explain how I fell into making these mistakes so others can avoid them.

Objectives

So, I wanted the infrastructure for my night-and-weekends project to be reasonably lean in terms of cost and complexity. For reference, I use a lot of AWS at my day job and it’s not cheap. I wanted to explore other cloud providers and had the following wish list:

  1. It should have Terraform support. This enables disposable staging environments which is a great way to reduce costs. Terraform generally makes setting up infrastructure faster and safer as well. Plus, I have previous experience with Terraform.
  2. It should support creating images through Packer. Each release is going to be an image/AMI with production code baked into it. I use Docker at my day job, but it adds too much overhead due to orchestration, and images are portable enough. I have previous experience with Packer as well.
  3. It should offer a managed load balancer service that can be configured via Terraform. This makes it easy to add/remove instances without downtime which is useful since releases are Packer-created images (this is essentially blue-green deployments).
  4. It should offer a managed database service. If something terrible happens there should be a replica at the ready and at least a week’s worth of backups. I don’t want to worry about accidentally losing data.
  5. It should be significantly cheaper. This eliminates the other major AWS-like cloud providers.
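The release flow implied by items 2 and 3 (image-based releases swapped in behind a load balancer) can be sketched as follows. This is a toy in-memory model under stated assumptions, not any provider's API: `LoadBalancer`, `release`, the `healthy` callback, and the instance names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoadBalancer:
    """Toy stand-in for a managed load balancer (hypothetical, not a real API)."""

    backends: set[str] = field(default_factory=set)

    def add(self, instance: str) -> None:
        self.backends.add(instance)

    def remove(self, instance: str) -> None:
        self.backends.discard(instance)


def release(
    lb: LoadBalancer,
    old: list[str],
    new: list[str],
    healthy: Callable[[str], bool],
) -> bool:
    """Swap old instances for new ones without ever leaving the LB empty."""
    # Refuse to switch if any freshly booted instance fails its health check.
    if not all(healthy(instance) for instance in new):
        return False
    # Register the new (green) instances first...
    for instance in new:
        lb.add(instance)
    # ...then remove the old (blue) ones, so traffic always has a target.
    for instance in old:
        lb.remove(instance)
    return True
```

The ordering is the point: new instances are registered before old ones are removed, and a failed health check leaves the old release untouched.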

DigitalOcean caught my attention for a few reasons. First, it is half the cost of AWS. Droplets are cheap and provide remarkably good compute value. Second, they fulfill all the technical requirements above and are actually the only small cloud provider that offers managed databases as a service. Third, as an indie developer it feels more indie-like to use a smaller provider.

Stormy Waters

The documentation was straightforward and I was able to get a full DigitalOcean environment running in a handful of evenings. Most of the time was actually spent setting up the image with Packer. It was not long after that I started having problems.

I ended up having significant issues with DigitalOcean’s database and load balancer services. Droplets had a 50% chance of not automatically being given network access to the database, which meant I needed to do it manually each time. Worse yet, the load balancer started to give me more and more 503s as time went on. My guess was that it was not properly removing droplets after they were deleted (I was constantly creating and deleting droplets as part of my releases). Plus, there were other bugs too! I didn’t want to spend my precious time debugging all of this with their support.
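The guess about the load balancer can be made concrete with a toy model. It assumes, purely for illustration, that requests are spread evenly across registered backends, that every request routed to a deleted droplet returns a 503, and that each release leaves its old droplets registered. None of this was confirmed against DigitalOcean's actual behaviour, and the function names are hypothetical.

```python
def error_rate(live: int, stale: int) -> float:
    """Fraction of evenly distributed requests that hit a deleted backend."""
    total = live + stale
    return stale / total if total else 0.0


def rate_after_releases(releases: int, droplets_per_release: int = 1) -> float:
    """Expected 503 rate if every past release left its droplets registered."""
    # Hypothesis only: stale entries accumulate, live ones stay constant.
    return error_rate(droplets_per_release, releases * droplets_per_release)
```

Under these assumptions the error rate climbs with every release, which would match the "more and more 503s as time went on" symptom.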

As much as I want to blame everything on DigitalOcean, it was my fault as well. I am skeptical of using a brand new library in my code, yet I did not apply that same caution to my infrastructure. Their database service became “generally available” less than a year ago! To their credit, DigitalOcean has been fairly aggressive in the past couple of years about adding and improving their non-droplet services. Maybe in a year DigitalOcean’s newer services will become more reliable, but I don’t have a year to wait.

Back to AWS’ well-monied arms I went.

Sturdy Ships

Getting everything running again on AWS cost me a weekend. I was already familiar with it and nothing about my DigitalOcean infrastructure was proprietary. For example, all I needed to do in Packer was change the build target from DigitalOcean to AWS and make a few small script changes. Using non-proprietary technology (e.g. Terraform, Packer) gives you an exit strategy for free. Generally, having an exit or alternative strategy whenever you are exploring something unknown is a good idea.
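To illustrate why the switch was small, here is a minimal sketch that renders a Packer JSON template for either target. The builder types `digitalocean` and `amazon-ebs` are Packer's own names; everything else (the `setup.sh` script, regions, sizes, image IDs, the `packer_template` helper) is a placeholder assumption and not the author's actual configuration. Credentials are deliberately omitted.

```python
import json

# Provisioning is the provider-independent part of the image build.
# "setup.sh" is a hypothetical script name.
PROVISIONERS = [{"type": "shell", "script": "setup.sh"}]

# Only the builder block is provider-specific. All values are placeholders.
BUILDERS = {
    "digitalocean": {
        "type": "digitalocean",
        "image": "ubuntu-18-04-x64",
        "region": "nyc3",
        "size": "s-1vcpu-1gb",
        "ssh_username": "root",
    },
    "aws": {
        "type": "amazon-ebs",
        "region": "us-east-1",
        "source_ami": "ami-00000000",
        "instance_type": "t3.micro",
        "ssh_username": "ubuntu",
        "ami_name": "app-release",
    },
}


def packer_template(target: str) -> str:
    """Render a Packer JSON template for the given build target."""
    template = {"builders": [BUILDERS[target]], "provisioners": PROVISIONERS}
    return json.dumps(template, indent=2)
```

Both templates share the same provisioners, so moving providers only touches the builder block.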

Using technology you already know (in this case, AWS) is a huge time saver, and even though DigitalOcean is cheaper, at the current stage of my life my time is more expensive.

Admittedly, I think what made me truly stray away from AWS was the thrill of learning something new. The dark truth was that I didn’t really have to use DigitalOcean. It is pretty easy to find yourself with a few thousand dollars of AWS credits for a couple of years if you know where to look. Plus, it’s still possible to learn new things with technology you already know! When I migrated to AWS I decided to try setting up an ALB for the first time and bam, I learned something new in a smaller, safer way!

In hindsight, I should have started on AWS and only transitioned to DigitalOcean once cost became an issue. It was a classic case of premature optimization.

  1. 3

    I have an innocent question: how do you justify using these cloud-based services except that you know them?

    1. 1

      Are you asking about all cloud-based services or just AWS/DigitalOcean in particular?

      1. 3

        All of them. I wonder if it's really necessary to go in there from the beginning on, or scale to the cloud when you need to.

        1. 1

          Just to be clear, the architecture described in this post is intense for a small project! Most projects would do perfectly fine with a single host if you're comfortable with SSHing into a box or you could use a PaaS like Heroku (Heroku is pretty damn expensive but it is really low effort to maintain).

          I think starting with the cloud makes sense in most cases for the following reasons:

          1. It's cheaper than running a box at home. For example, an instance on Vultr starts at $2.50/month which is dirt cheap. Lambda can be even cheaper. (The time cost of maintaining something you built from scratch is non-trivial as well.)
          2. It reduces the amount of things you need to learn. For example, one of my requirements was to have a managed database because I don't want to learn how to be a proper DBA. I am essentially paying to have someone else do that work for me. That frees up time to do other stuff.
          3. It is a common and transferable skill. What I learn at my day job I transfer to my hobby work. What I learn during my hobby work I transfer to my day job.

          I hope that answered your question!

          1. 1

            Yep it makes sense. Thanks for that!

          2. 1

            The 7$ Heroku option is considered expensive compared to a 5$ droplet?

            1. 4

              As soon as you leave the $7 tier the pricing jumps up dramatically fast (1gb memory starts at $25). Other hosting services scale more gently.

              You're right though! The $7 tier might be sufficient for some (or even most on Indie Hackers) projects, and if so, that's great!

              1. 1

                This. Heroku is crazy expensive. We moved off Heroku early, and based on our calculations our server costs would have been triple on Heroku for the same performance.

                1. 1

                  How server intense is your application?

              2. 1

                Thanks, I hope to get to the higher tier someday, it will mean I am more profitable :)

                Is there anything comparable to heroku on the AWS side (with reduced costs) at the higher tier if I do consider changing?

                And how can I know when/if I need to "tier up" (at heroku)?

                1. 3

                  Usually it is very obvious when you need to "tier up". Things start breaking from too much load or you are working on a feature that requires more resources (e.g. batch processing of any kind).

                  Realistically, if you're comfortable with Heroku and don't have the time or desire to learn DevOps, just stay on Heroku as long as it is feasible. That is the moral of the article I wrote!

                  Also, you are allowed to use multiple providers for one-off things. For example, if you need to generate PDFs from a webpage (which is very memory intensive) that can be done using a Lambda function. No need to drop Heroku as your main API just for that.
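A one-off function of this kind might look like the sketch below. It assumes the function sits behind API Gateway with the proxy-style event and response shape; `render_pdf` is a hypothetical placeholder for the real, memory-hungry rendering step, and the `url` field in the request body is an assumption.

```python
import base64
import json


def render_pdf(url: str) -> bytes:
    """Placeholder for the memory-hungry rendering step (hypothetical)."""
    # A real implementation would drive a headless browser or a PDF library.
    return f"%PDF-1.4 placeholder for {url}".encode()


def handler(event, context):
    """Lambda entry point: expects an API Gateway-style event with a JSON body."""
    body = json.loads(event.get("body") or "{}")
    url = body.get("url")
    if not url:
        return {"statusCode": 400, "body": json.dumps({"error": "url is required"})}
    pdf = render_pdf(url)
    # Binary payloads go back through API Gateway base64-encoded.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/pdf"},
        "isBase64Encoded": True,
        "body": base64.b64encode(pdf).decode("ascii"),
    }
```

The main API stays where it is and only calls this endpoint when a PDF is needed.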

                  My infrastructure for this project was optimized to avoid data loss (even from downtime) because I felt that was important for a journaling app. Honestly, I was being indulgent.

                  I have another small project I want to do that will likely just be a Netlify frontend with a handful of Lambda API endpoints.

                  1. 1

                    I'm curious, from your experience, if there is anything comparable to heroku at aws?

                    1. 2

                      I think AWS Elastic Beanstalk may be the nearest thing to Heroku in AWS.

                    2. 1

                      Not that I know of. Something like https://github.com/apex/up which uses lambda would be the closest analog?

  2. 2

    Hi Sasha 👋... I head up product management for developer experience at DigitalOcean. If you're up for it, I'd love to better understand some of the database and load balancer bugs you hit. If you drop me an email using my first initial, last name at digitalocean maybe we can get some improvements going.

    1. 1

      Sure thing, I'll send you an email!

  3. 1

    nice tips Sasha, you said "It is pretty to find yourself with a few thousand dollars of AWS credits for a couple of years if you know where to look."

    where should I look?

    1. 1

      A few names were dropped in this thread, but generally most orgs or products that are meant to help new tech companies start will tend to have some AWS credits available to them. E.g. https://www.producthunt.com/ship gives you $5000 credits by signing up (it's not free per se, but the credits cost negative dollars).

  4. 1

    Right now I'm using DO and AWS
    I use S3 and cloudfront for obvious static assets.
    I use ECR for the only private reliable (easy honestly too) container repo.
    I use Route53 and CloudFlare to manage my traffic in a reliable way.
    All I put on DO really is the IAM, LB, VPC, Firewall, and Kube. I'd happily run containers on AWS or GCP, I have the certs and many years of professional experience so it would be faster, but I need close to 100 instances for my stack and as you know the cost boundary for AWS and GCP is (generously) 20% of this.
    I don't have much inbound traffic, all that compute is the app doing its thing, so eventually I'll also need to scale customer activity, at which point I'll have cash flow and might then be in a position to go back to AWS or GCP for compute too.

  5. 1

    We have totally trusted DigitalOcean for four years running. We don't use the DBaaS and LB services. We use Databaselabs.io inside DO NYC2.

  6. 1

    If you're looking for hosting infrastructure for side projects, I'd recommend https://packetriot.com.

    I built it so of course I'd recommend it :) That aside, I built this for devs (including myself) who want/need hosting but whose project is not large enough to warrant the price that public cloud providers charge.

    DO is amazing value btw and I host Packetriot with them. Their fixed costs make it easy to figure out how much your project will cost you.

    There's a free tier on Packetriot for evaluating. The idea is that once your project takes off and it needs more reliable infrastructure you move it to the cloud. Until then, you get more value. Paid plans which pretty much cover most people start at $5/mo.

  7. 1

    I'm curious as to what type of application you're hosting here. I realize it's mostly irrelevant to this article, I'm only asking because I'm looking to do the same thing with one of my projects which is essentially a single server LAMP stack that needs to start scaling. Update: what caught my attention here is that I'm currently on DO and was about to configure a load balancer and switch to a managed database. Sounds a little sketchy and this could save me some time!

    1. 1

      I think what broke DigitalOcean was me constantly creating/destroying parts of my infrastructure.

      I still would recommend trying out the managed database service.

      The load balancer I would be more skeptical about. A droplet running HAProxy may do you better. That said, it is fast enough to get going that it is worth trying out to see if you have a better experience than I did.

      Some other things I encountered with DO:

      • The LB in particular doesn't handle Terraform very well due to how SSL certificate resource IDs are managed. Once an SSL certificate rotates, it permanently breaks the state file when it comes to that resource. I believe there is a PR open to fix this.
      • I never confirmed this, but the DB might have race condition issues? I ran two queries in a row and it didn't seem like the second query waited until the first query resolved. This was an issue I didn't witness on my local dev environment or AWS. Again, this may be an issue with my DB driver with DO's particular version of Postgres.
  8. 1

    First, it is half the cost of AWS. Droplets are cheap and provide remarkably good compute value.

    Did you check Amazon Lightsail? I am using it for my own projects and it is cheaper than DigitalOcean.

    1. 1

      I avoided lightsail because I read some negative reviews of it a while ago. Things may have changed since then! If you are having success with it, that's cool to hear!

  9. 1

    Where do you find AWS credits? I know Product Hunt Founders Club but the credit will only last a year.

    1. 3

      I got $5000 for two years after buying this: https://appsumo.com/startups/

      1. 1

        Are there any limits for those $5000 credits (for example RDS is not included but ECS is) or is it really for all AWS services?

        1. 1

          Here's what AWS tells me:

          Below are a list of services that can be used with this specific credit.

          AWS Amplify
          AWS AppSync
          AWS Backup
          AWS Budgets
          AWS Certificate Manager
          AWS Cloud Map
          AWS CloudHSM
          AWS CloudTrail
          AWS CodeCommit
          AWS CodeDeploy
          AWS CodePipeline
          AWS Config
          AWS Cost Explorer
          AWS Data Exchange
          AWS Data Pipeline
          AWS Data Transfer
          AWS DataSync
          AWS Database Migration Service
          AWS Device Farm
          AWS Direct Connect
          AWS Directory Service
          AWS Elemental MediaConnect
          AWS Elemental MediaConvert
          AWS Elemental MediaLive
          AWS Elemental MediaPackage
          AWS Elemental MediaStore
          AWS Elemental MediaTailor
          AWS Firewall Manager
          AWS Global Accelerator
          AWS Glue
          AWS Greengrass
          AWS Ground Station
          AWS Import/Export
          AWS Import/Export Snowball
          AWS IoT
          AWS IoT 1 Click
          AWS IoT Analytics
          AWS IoT Device Defender
          AWS IoT Device Management
          AWS IoT Events
          AWS IoT SiteWise
          AWS IoT Things Graph
          AWS Key Management Service
          AWS Lambda
          AWS OpsWorks
          AWS RoboMaker
          AWS Secrets Manager
          AWS Security Hub
          AWS Service Catalog
          AWS Shield
          AWS Snowball Extra Days
          AWS Step Functions
          AWS Storage Gateway
          AWS Storage Gateway Deep Archive
          AWS Systems Manager
          AWS Transfer for SFTP
          AWS WAF
          AWS X-Ray
          Amazon API Gateway
          Amazon AppStream
          Amazon Athena
          Amazon Chime
          Amazon Chime Business Calling a service sold by AMCS LLC
          Amazon Chime Call Me
          Amazon Chime Dialin
          Amazon Chime Voice Connector a service sold by AMCS LLC
          Amazon Cloud Directory
          Amazon CloudFront
          Amazon CloudSearch
          Amazon Cognito
          Amazon Cognito Sync
          Amazon Comprehend
          Amazon Connect
          Amazon Detective
          Amazon DocumentDB (with MongoDB compatibility)
          Amazon DynamoDB
          Amazon EC2 Container Registry (ECR)
          Amazon EC2 Container Service
          Amazon ElastiCache
          Amazon Elastic Compute Cloud
          Amazon Elastic Container Service for Kubernetes
          Amazon Elastic File System
          Amazon Elastic Inference
          Amazon Elastic MapReduce
          Amazon Elastic Transcoder
          Amazon Elasticsearch Service
          Amazon FSx
          Amazon Forecast
          Amazon GameLift
          Amazon GameOn
          Amazon Glacier
          Amazon GuardDuty
          Amazon Inspector
          Amazon Kendra
          Amazon Kinesis
          Amazon Kinesis Analytics
          Amazon Kinesis Firehose
          Amazon Kinesis Video Streams
          Amazon Lex
          Amazon Lightsail
          Amazon MQ
          Amazon Machine Learning
          Amazon Macie
          Amazon Managed Apache Cassandra Service
          Amazon Managed Blockchain
          Amazon Managed Streaming for Apache Kafka
          Amazon Mobile Analytics
          Amazon Neptune
          Amazon Personalize
          Amazon Pinpoint
          Amazon Polly
          Amazon Quantum Ledger Database
          Amazon QuickSight
          Amazon Redshift
          Amazon Rekognition
          Amazon Relational Database Service
          Amazon Route 53
          Amazon S3 Glacier Deep Archive
          Amazon SageMaker
          Amazon Simple Email Service
          Amazon Simple Notification Service
          Amazon Simple Queue Service
          Amazon Simple Storage Service
          Amazon Simple Workflow Service
          Amazon SimpleDB
          Amazon Sumerian
          Amazon Textract
          Amazon Transcribe
          Amazon Translate
          Amazon Virtual Private Cloud
          Amazon WorkDocs
          Amazon WorkLink
          Amazon WorkSpaces
          Amazon Zocalo
          AmazonCloudWatch
          AmazonWorkMail
          CloudWatch Events
          CodeBuild
          CodeGuru
          Comprehend Medical
          Contact Center Telecommunications (service sold by AMCS, LLC)
          DynamoDB Accelerator (DAX)

          1. 1

            Can you believe Amazon even offers this number of services...crazy

            1. 1

              Yeah I was quite blown away sifting through all of it. I haven't used AWS for several years now and even though it's obviously very powerful, it's incredibly complicated to use. They use so many acronyms and terms I'm not familiar with, which makes it hard to use.

          2. 1

            Thank you!

      2. 1

        this sounds like a good deal!

        1. 1

          Yeah, I'm happy with it. I don't really use that many resources so I probably won't use it all in these two years. But I think the only catch is that you can't apply if you are already established with AWS. Not sure though how it works exactly.

    2. 1

      Startup School by Y Combinator offers $3000 in AWS credits for participants.

      1. 1

        I am using the Startup School AWS credit right now. It lasts for two years. Startup School is a pretty good bundle of deals tbh.

        A lot of local institutions also provide credits. As an example, when I used to work out of a WeWork, they had a deal that was $2000 for 2 years. Not sure if that is still available given that WeWork imploded. Local incubators/accelerators/whateverators tend to have much better amounts.

    3. 1

      They recently sent me a $300 credit o_o I didn't pay for anything yet, just started using AWS free tier.
      If you participate in Pioneer App or start your company through Stripe Atlas, you get credits too.

  10. 1

    I'm a fan of AWS & a serverless architecture. I use stackery.io for deploying to AWS. Serverless architecture costs nothing at rest with no usage. DynamoDB, Lambdas, API Gateway, Etc.

    1. 1

      Serverless is great, but it is not applicable for every project.

      Tendship uses a lot of relational data and that means using a SQL database such as Postgres. You really don't want to use SQL with Lambda:

      • If you care at all about security, you will put your database behind a VPC with security groups. This breaks lambda slightly because it means that every time an instance is created it needs to be allocated an internal ip within the VPC which adds seconds to its boot time.
      • SQL databases provide a limited number of connections. With Lambda every request is one connection. Aurora Serverless, at least when I checked in early 2019, is kind of garbage. It is 10x slower and does not play well with existing DB drivers. RDS Proxy might be the solution to this problem.
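The connection-limit point can be illustrated with a toy model. It assumes each concurrent Lambda invocation opens its own connection and that the database rejects anything past a fixed cap. `ToyDatabase`, `failed_requests`, and the numbers used with them are hypothetical and not measurements of RDS or Aurora.

```python
class ToyDatabase:
    """Stand-in for a SQL server with a hard connection cap (hypothetical)."""

    def __init__(self, max_connections: int) -> None:
        self.max_connections = max_connections
        self.open_connections = 0

    def connect(self) -> None:
        if self.open_connections >= self.max_connections:
            raise ConnectionError("too many connections")
        self.open_connections += 1


def failed_requests(db: ToyDatabase, concurrent_requests: int) -> int:
    """Count requests that cannot get a connection during one burst."""
    failed = 0
    for _ in range(concurrent_requests):
        try:
            # Each concurrent invocation opens its own connection.
            db.connect()
        except ConnectionError:
            failed += 1
    return failed
```

With a cap of 100 connections, a burst of 150 concurrent requests in this model leaves 50 of them without a connection, which is the gap a connection pooler or proxy is meant to close.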
      1. 2

        Interesting point about VPC and Lambda. I use serverless framework these days so I’m no longer “in the weeds” of networking so to speak. Although, I’ve always been curious how the network interfacing worked. I just read this which describes how AWS hyperplane might solve the issue you described? Anyways, I enjoyed your post, some good insights here.

        1. 1

          The hyperplane looks really encouraging! The latency is still fairly high (~1 second), but it's better than 10 seconds I saw before. I think in a year or so AWS will have cracked this nut.

          Thanks for sharing!

      2. 1

        I was wondering, is it justifiable to replace an Nginx server with API Gateway or/and Lambda Edge? The cost would be higher but I don't have to maintain it.

        1. 1

          I wouldn't know! You still may want a reverse proxy to deal with gzip and other matters. If you're using a language/framework that needs a web server, like Flask, you definitely want to keep it.