April 15, 2019

How important is A/B Testing?


This is more or less a “best practice” among tech giants, but I’ve found that most startups don’t actually implement it.

It’s seen as a “nice to have” given the seemingly endless pile of priorities and tasks.

Is this something you’ve used? Is it effective? How did you build it?

  1. 2

    I find it useful for growth, which comes later than problem/solution fit and product/market fit. It's mainly for optimizing your funnel, but you need a funnel to begin with. If you're early stage, focus on getting customers and save the A/B testing for once you've figured out the hard parts that make your business model work. <- From my experience. Curious if anyone has a differing opinion though.

    1. 1

      I couldn't agree more! A/B testing in that context won't really get you to product/market fit, but that's also not its only value.

      There are a few more advantages to A/B testing aside from optimizing conversion rates. The main one is "best practices" for feature deployments: not only can you assess whether your new features are having the effect you want, you also mitigate deployment risk.

      Let's say you roll out a new feature to 10% of your audience. You check the metrics, everything looks good, and you bump that percentage up to your full user base. This allows you to:

      1. Test new features in a production environment
      2. Quickly assess quality of new code
      3. Assess effectiveness of new feature
      4. Decrease the risk of deployments
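
      As a sketch of how that percentage rollout can work under the hood: hash the user ID so each user consistently lands in the same bucket across requests (the function and feature names here are illustrative, not from any specific tool):

      ```python
      import hashlib

      def in_rollout(user_id: str, feature: str, percent: int) -> bool:
          """Deterministically bucket a user 0-99; the same user + feature
          always gets the same bucket, so the rollout is stable."""
          digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
          return int(digest, 16) % 100 < percent

      # Start with 10% of users, then raise the percentage once metrics look good.
      if in_rollout("user-42", "new-checkout", 10):
          pass  # serve the new feature to this user
      ```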


      1. 1

        Yeah, agreed. I hadn't considered that progressive rollouts are essentially A/B tests in and of themselves. As for A/B testing copy and the like: if you have enough users for your tests to reach meaningful results, you're probably far enough along to benefit from them.

        1. 1

          Do you use progressive rollouts in your deployment process?

          1. 1

            Not in this sense, but I'd like to. Right now I'm using Kubernetes which does rolling updates on deployment, mitigating deployment risk.

            Hypothetically, I could push one pod at a time to roll out progressively (i.e., with 10 pods, replace just one of them for a 10% deploy), but I'm not currently doing that.

            1. 1

              That's definitely a solution, but it lacks the targeting and analytics you'd probably want. I've used something like that in the past and it gets the job done (to an extent).

              I'm curious what's holding you back from implementing a more robust solution for progressive rollouts?

              1. 1

                Yeah, totally! Think it just hasn't been a priority yet. Need a validated business model first! 😋

                1. 1

                  Got it. First things first! haha

                  How big is your team by the way?

                  There are a few solutions similar to what I'm talking about, but they're geared toward enterprise companies. I want to bring these best practices to the typical startup, but it's often not a high priority, like you mentioned (which is not good for my business model).

                  I'm wondering what would need to change in order for this to become a priority for you?

                  1. 1

                    Off the top of my head, some tipping-point ratio of paying customers to price point. For an expensive product, I'd need 5-10 paying customers before I started to think of A/B testing as a priority. For inexpensive products I'd need more, say 50-100 paying customers.

    2. 1

      Sounds like good advice to me. I dabbled in A/B testing but struggled and it just didn't feel right, in hindsight it was probably because we were doing it too early with too many other factors going on.

      Also, as a young company it was too much like chasing our tails with all the things that there are to do. Often we would do A/B tests, but then not have a proper chance to look at the results. Doh.

      1. 1

        I've definitely been here before. A/B tests can get tricky (and time consuming), especially for startups. I've actually implemented a solution to this for several companies and I'm considering productizing it for other startups to take advantage of. I'm not talking about A/B tests simply as a means of optimizing conversion rates, but as a way of implementing "best practices" for feature deployment. If you're interested, check out my description below!


        I’m building a tool that lets developers and PMs deploy new features the same way FB, Twitter, and AirBnB do.

        It will make it extremely easy to toggle or split test new features: barely one line of code to get started, with powerful controls from a dashboard. I’m talking about:

        • A/B Testing
        • Controlled Rollouts
        • Testing in production (forget expensive and inconsistent staging environments)
        • Beta releases
        • Location targeting
        • Feature kill switch (in case of bugs)
        • and more...

        I’ve built this solution for several companies already and know firsthand how valuable it can be in your feature release process. Not only can you push new features with less risk, you can test their effectiveness as well. All without adding complexity to the actual implementation.

        I believe it’s something all devs teams should implement whether they use my tool or not. The problem is that most startups simply don’t have time to add this to their workflow. I want to solve this.

  2. 1

    I've worked at small companies, and we always tested our software before deploying to production. It doesn't make sense to push changes to the live server only to have it crash. If you don't test your software your customers will, and if they can't use it they'll move on to something else.

    1. 1

      Woah! I hope everyone's testing their code haha. Definitely not advocating that we stop testing code, BUT is your local or staging environment always an exact replica of production?

      The usage, data, and general infrastructure often have inconsistencies, so a better practice than just "pushing changes to the live server" is to wrap your new feature in a feature flag first.

      You can then roll that feature out to, say, 10% of users to make sure it works in an actual production environment. If something goes wrong, simply turn it off. Otherwise, scale it up! It's a safe practice for launching new features and it mitigates risk.
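
      To make the "turn it off" part concrete, here's a minimal sketch of wrapping a feature in a flag with a kill switch. The in-memory dict stands in for whatever dashboard-backed flag service you'd actually use (all names are made up):

      ```python
      # Stand-in for a dashboard-backed flag store (illustrative only).
      FLAGS = {"new-checkout": {"enabled": True, "percent": 10}}

      def flag_on(name: str, user_bucket: int) -> bool:
          flag = FLAGS.get(name)
          if flag is None or not flag["enabled"]:  # kill switch: disabling the
              return False                         # flag turns it off for everyone
          return user_bucket < flag["percent"]     # percentage rollout gate

      def checkout(user_bucket: int) -> str:
          if flag_on("new-checkout", user_bucket):
              return "new checkout flow"
          return "old checkout flow"
      ```

      If the new flow misbehaves in production, flipping `enabled` to `False` (from a dashboard, in a real setup) reverts every user to the old flow with no redeploy.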

  3. 1

    Google's Optimize product is pretty great. Since starting to use it I find that I build new features with that in mind (being able to easily switch between variants with a css selector, for example).

    You have to know what you're measuring first, though. If you don't know what your funnel looks like you're not going to get much out of A/B testing.

    1. 1

      Google Optimize is pretty much limited to switching between variants with CSS. If you want to update full product features or workflows (not just your landing page), it falls short.

  4. 1

    For most SaaS businesses, it's pointless. You don't have the traffic to get meaningful or significant results in a reasonable amount of time. You're better off doing qualitative research most of the time.

    1. 1

      100%. A/B testing is small bananas compared to qualitative research for most SaaS businesses, BUT that's also not its only value.

      Aside from optimizing conversion rates (which isn't that useful for a lot of SaaS companies), the other main one is safer feature deployments. Not only can you assess whether your new features are having the effect you want, you also mitigate deployment risk.

      Let's say you roll out a new feature to 10% of your audience. You check the metrics, everything looks good, and you bump that percentage up to your full user base. This allows you to:

      1. Test new features in a production environment
      2. Quickly assess quality of new code
      3. Assess effectiveness of new feature
      4. Decrease the risk of deployments
      5. Speed up feature releases
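
      For point 3, "assess effectiveness" usually means comparing a metric between the 10% cohort and everyone else. A quick sketch using a standard two-proportion z-test (the numbers are made up):

      ```python
      from math import sqrt

      def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
          """z-score for the difference between two conversion rates."""
          p_a, p_b = conv_a / n_a, conv_b / n_b
          p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled conversion rate
          se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
          return (p_b - p_a) / se

      # Control converted 200/5000, the new-feature cohort 260/5000:
      z = two_proportion_z(200, 5000, 260, 5000)
      # |z| > 1.96 means the lift is significant at the 5% level
      ```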


      1. 2

        We do this a lot at EmailOctopus, though we would call this 'feature flags' rather than A/B testing.

        Would recommend the tool Unleash if you're looking to do this.

        1. 1

          I've seen Unleash! Unfortunately it wasn't a good fit for my team, so we ended up building our own solution. I'm sure it could be helpful for other teams though! Specifically if you:

          1. Don't need extensive targeting rules
          2. Are willing to take on maintenance/hosting costs
          3. Don't need a distributed CDN solution (self-hosting adds latency when determining whether a flag is on/off)

          I'd love to learn more about how long it took you to set up Unleash and how it's been working for you! Just sent you an email!

  5. 1

    It's useful, but you REALLY need the data throughput. I've made the mistake in the past of basing decisions on A/B tests with hundreds of samples, but you really need thousands for the results to mean anything.

    A coin will land heads 10 times in a row from time to time, after all.
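
    To put numbers on that: 10 heads in a row on a fair coin happens with probability 1/1024, so a lucky streak in a few hundred samples proves little. A rough sketch of the usual normal-approximation sample-size formula (5% significance, 80% power; illustrative values) shows why you need thousands:

    ```python
    from math import ceil

    def samples_per_variant(p_base: float, lift: float,
                            z_alpha: float = 1.96, z_power: float = 0.84) -> int:
        """Rough per-variant sample size to detect an absolute `lift`
        over a baseline conversion rate `p_base` (normal approximation)."""
        p_bar = p_base + lift / 2  # average of the two rates
        n = (z_alpha + z_power) ** 2 * 2 * p_bar * (1 - p_bar) / lift ** 2
        return ceil(n)

    print(0.5 ** 10)                        # 0.0009765625, i.e. 1/1024
    print(samples_per_variant(0.05, 0.01))  # a 5% -> 6% lift needs thousands per variant
    ```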

    1. 1

      Have you used A/B Testing (split testing) as a means of safer deployments? Often A/B tests aren't effective in the context you're talking about, BUT they are excellent tools in your deployment process.

      Let's say you have a new refactor or feature build. You want to push it to production but your staging environment isn't fully consistent. You want to deploy it, but it's a bit risky.

      Try out an A/B test! Roll it out to 10% of your users, your most loyal users, or your beta users. Test it (in production) on a subset of your user base, then roll it out to everyone once you see it working in action! It's a best practice used by the likes of FB, Twitter, and AirBnB, and I'm actually building a product that lets developers & PMs implement it with one line of code, completely controllable from a dashboard.

      Let me know if you want to try it out!

      1. 1

        I haven't, but truthfully I've never worked in an environment where we'd have enough users to break them into a meaningful cohort to try something like that.

        But I totally support that approach. It's pretty rad.

        1. 1

          Yeah definitely. Maybe I can run a different scenario by you. This one is one of my personal favorites :)

          Staging environments are often inconsistent with production but you want your new features to be tested in an environment as close to production as possible.

          Why not just test it in production? Target the feature to only yourself, your team, or maybe even your IP address so the whole office can test with you, while hiding it from your users entirely. Test IN PRODUCTION, then roll it out to everyone (from a dashboard).

          I love this approach since it drastically speeds up the release cycle. Don't worry about your staging environment being different. Deploy. Test. Release (from a dashboard).

          p.s. you can also kill your feature in real-time if your code has any bugs.

          I've used this solution before and I've never felt safer pushing to production.
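
          The targeting check described above can be sketched in a few lines. The team IDs and office IP below are placeholders, and the bucketing hash stands in for whatever a real flag service uses:

          ```python
          import hashlib

          TEAM_IDS = {"me", "teammate-1"}  # placeholder internal user IDs
          OFFICE_IPS = {"203.0.113.7"}     # placeholder office IP address

          def bucket(user_id: str) -> int:
              """Stable 0-99 bucket per user."""
              return int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100

          def feature_visible(user_id: str, ip: str, percent: int = 0) -> bool:
              """Team and office always see the feature; everyone else is gated
              by the rollout percentage (percent=0 hides it from all real users)."""
              if user_id in TEAM_IDS or ip in OFFICE_IPS:
                  return True
              return bucket(user_id) < percent
          ```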

  6. 1

    In the early days definitely focus on customer development and getting customers in the first place.

    But A/B testing is super important as well! Especially when you're running social media ads, or ads in general, A/B testing is the only way to go. You can't find a good, profitable ad without testing a bunch of different things.

    And even for landing pages and the like, it's the difference between a 2% conversion rate and an 8% one - that could be a massive change for your business.

    But as I said - only if you have some leads coming your way anyway.

    1. 1

      100%, couldn't agree more, at least where conversion optimization is concerned. I see additional value though, more in the feature release process than in the actual conversion optimization.

      I've built this solution a few times for previous companies. They needed feature flags, controlled releases, and split testing, so they could roll out new features to, say, 10% of users, or their beta users, or simply the engineering team to test new code in production, then scale out to the rest of the user base from a dashboard.

      Every time I've thought to myself "This isn't our core business", "This is just software plumbing", "We should just buy this". But all the solutions were way too expensive for startups.

      I'm trying to validate the market need for this internal tool. Coincidentally the same type of tool used by Facebook, Twitter, and Airbnb to launch new features.

      Hoping to get some thoughts from developers, product managers, and anyone that works in the startup space!
