
What do you use to protect against bots?

My use case is authentication.

Assume throttling etc. on the server, but I'm looking to limit hits on the email login form (aka magic link) in particular, as I'm concerned random emails could be triggered just for fun.

  • Do you use Cloudflare, Google captcha?
  • Is there a credible indie or open source solution?

Thanks

  1. 4

    There are a lot of things you can do, but realistically, for starters, I would suggest against just having an "enter email" form for a magic link. These can break, especially when it comes to real usage. There's a write-up from not that long ago about not using magic links.

    Additionally, captchas are a bit dumb, and they are a terrible user experience, so I generally avoid using those.

    Cloudflare, or really any CDN, is great for this; most offer something like a WAF to block attackers making requests. This is additionally fantastic because it protects your whole API, not just one URL, which hidden fields can't do. Hidden fields also don't get you anywhere: if I use web automation, I can just pull the value out of the web form before making the submission, so you aren't blocking anything.

    There are two additional strategies for this:

    • Use an IAM-as-a-service provider. We offer one, and have things like a WAF as well as other automated protections in place to prevent service abuse.
    • Just don't worry about this problem. Sure, it's the same as saying "Have you tried not having this problem?", but what's the problem with allowing bots to request things? If it really is a problem, I would look into preventing the "why". Why is this an issue for you?

    Most of the time the answer is actually "Well, it's a good thing to do, right?" And in truth, sure, but it's also really expensive to implement. So what value are you getting out of that? Most services don't actually benefit from this, and with any amount of traffic and scale, bot usage is going to be so much lower than real users' usage.

    1. 1

      #7 from your link captures why I asked. Other points not so much. Agree with your "is it really a problem worth addressing now" comment. Thanks.

  2. 2

    Cloudflare + a honeypot field such as:

    <input type="hidden" name="type_here_please" value="" aria-hidden="true">

    This should limit bot submissions etc.

    1. 3

      I have something similar as my honeypot. Worth noting it's important to put in the relevant accessibility attributes, or else anyone using a screen reader gets labelled as a bot.

      1. 2

        Good point Ryan - I've updated the snippet.

  3. 1

    I would recommend three things:

    Cloudflare
    Use of Google reCAPTCHA
    Use of HTTPS

  4. 1

    Reposting from a previous thread

    I had the same problem when I launched my previous SaaS: automated signups from what seemed like stolen emails originating from residential IP addresses (probably breached IoT devices and whatnot).

    I hate Google's captcha, so I wanted to try something different first.

    I ended up using a Ruby gem called invisible_captcha, which uses heuristics such as honeypot fields and time-sensitive submissions.

    Roughly speaking, if someone (1) fills an invisible form field (with a random name so that it won't be populated by password managers) OR (2) submits a form too quickly (let's say within 4 seconds of opening a page), they're probably a bot, and their input should be ignored. You can optionally inform folks to retry the request if they submitted it too fast.

    It was working great - not a single bogus signup after I implemented it. It won't fly if bots are using headless browsers, but most bots (and their operators) aren't sophisticated enough to pull that off.

    If your language doesn't have a similar library, it won't be that hard to write a middleware replicating this functionality.
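
    As a rough illustration of the two heuristics described above, here is a minimal Python sketch of such a check. It is not the gem's implementation; the function name looks_like_bot, the honeypot field name, and the rendered_at argument are all assumptions made for the example. The 4-second threshold is the figure from the comment.

```python
import time
from typing import Mapping, Optional

HONEYPOT_FIELD = "contact_ref_7f3a"  # hypothetical random-looking field name
MIN_SECONDS = 4.0  # threshold mentioned in the comment above


def looks_like_bot(form: Mapping[str, str], rendered_at: float,
                   now: Optional[float] = None) -> bool:
    """Return True if the submission trips either heuristic."""
    now = time.time() if now is None else now
    # (1) A human never sees the honeypot, so any value in it is suspicious.
    if form.get(HONEYPOT_FIELD, "").strip():
        return True
    # (2) Submitting sooner than MIN_SECONDS after the page was rendered
    #     is faster than a person would plausibly manage.
    return (now - rendered_at) < MIN_SECONDS
```

    In a real middleware the render timestamp would need to come from somewhere the client cannot forge, such as the server-side session, rather than from a plain form field.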

    1. 1

      Cool. Very useful to hear your positive experience and the solution seems straightforward. Thanks.

  5. 1

    We use hCaptcha. Free and works well.

    1. 1

      Looks the business + what a great idea! Thanks.

  6. 1

    Here's my story: https://www.indiehackers.com/post/whats-your-anti-spam-playbook-ff3b94468f

    Months later, I can say the problem is under control. All tools I've used for fighting spam are open source.

    1. 2

      Useful tactics. Thanks for sharing.

  7. 1

    Here's my very hacky solution:

    1. On the FE, I have an unused field <input ref="loginName" type="text" name="name" />. I hide it with CSS, not with the hidden tag, figuring this will catch more bots.
    2. Also on the FE I have a piece of JS set loginName to 214 after two seconds (or however fast a human could possibly fill out your form).
    3. On the BE, if name !== 214, return a 503.
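
    The back-end half of the steps above could look roughly like this in Python. The function name login_status and the dict-shaped form are assumptions for the sketch; the field name and the 214 sentinel come from the comment. Form values arrive as strings, so the comparison is against "214" rather than the number.

```python
EXPECTED_SENTINEL = "214"  # value the front-end script writes after ~2 seconds


def login_status(form: dict) -> int:
    """Return the HTTP status the handler would send for this form."""
    # Clients that post immediately, or that never run the page's JavaScript,
    # leave the field empty; clients that auto-fill every input put some
    # other value in it. Either way the sentinel does not match.
    if form.get("name") != EXPECTED_SENTINEL:
        return 503
    return 200
```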

    1. 1

      That's sneaky. Remind me never to play you at poker. Thanks!

  8. 1

    A honeypot field similar to those already mentioned, plus my rule of thumb: load forms only via JavaScript, because you don't need them for SEO. Best case is to show them as a modal on click, even as a full-page modal if you don't like the modal design.

    1. 1

      load forms just by javascript
      Best case is to show them as a modal

      Neat ideas! Thanks.

  9. 1

    I use Cloudflare and Google reCAPTCHA for my projects/websites; seems to do the trick for me.

    1. 1

      Solid choices. Thanks.

  10. 1

    Honeypot + throttling.

    For a magic link, you could also throttle the emails.
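
    One way to throttle the emails themselves is a per-address sliding window. The sketch below is a minimal in-memory Python version; the class name EmailThrottle and the default limit of 3 emails per 15 minutes are assumptions, not recommendations from this thread.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class EmailThrottle:
    """Allow at most `limit` magic-link emails per address per `window` seconds."""

    def __init__(self, limit: int = 3, window: float = 900.0) -> None:
        self.limit = limit
        self.window = window
        self._sent: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, email: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        key = email.strip().lower()  # treat differently-cased input as one address
        sent = self._sent[key]
        # Drop send times that have aged out of the window.
        while sent and now - sent[0] >= self.window:
            sent.popleft()
        if len(sent) >= self.limit:
            return False  # over the limit: skip sending, do not record
        sent.append(now)
        return True
```

    Calling allow() before sending each magic link and silently skipping the send when it returns False keeps a single inbox from being flooded. With more than one server process the counters would need to live in a shared store instead of a dict.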

    1. 1

      Solid enough for my use case I would think. Thanks.

  11. 1

    Using Firebase and 🙈 for now.
    On other projects, depending on the severity of the issues, I:

    • Honeypot
    • Block all major server networks (AWS was like 99% of bots)
    • CSRF, was it? A unique ID per request.
    • Block silly user agents like curl, python and anything with "bot" in it. Yeah, these legit happen...
    • Specific lookups and blocks

    1. 1

      Can you expand on each of these? How are you honeypotting? What technique are you using to block server networks?

      1. 1
        • Honeypotting - https://www.projecthoneypot.org/. On full platforms like WP you'd find ready-made plugins; otherwise it's just putting a few links around and importing the block list. There are instructions for pushing it to HTTP servers like nginx instead of doing it in the app.
        • Block all major server networks - just download some server lists. If you're after Google indexing, for example, mind not to exclude that. I don't recall exactly where I got good lists, but some random examples: https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html https://gist.github.com/n0531m/f3714f6ad6ef738a3b0a
          Again, better to push it to nginx or something.
          Also note this might block some VPN/proxy services.
        • CSRF - more involved if you need to build your own. If it's a full system you might find a ready-made solution or a lib, but basically it's generating a unique ID for the form and expecting it back on submit: just an HTML 'hidden' field that keeps changing.
        • Block silly user agents - https://www.cyberciti.biz/faq/unix-linux-appleosx-bsd-nginx-block-user-agent/
          You can try this: https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker
          Also take care not to block legit crawlers you might want, like search index bots. You might want to whitelist them: Google, MSN/Bing/whatever they call it today, Yandex maybe...
          You can scan your analytics for them...
        • Specific lookups and blocks - it possibly won't show in 'normal' analytics, but in nginx server logs and aggregators you can see who's pulling a disproportionate amount of calls by different slices, and over time get efficient at this with ready-made queries/alerts.
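
        For anyone doing the network and user-agent blocking in the app rather than in nginx, the two checks can be sketched in Python as below. The CIDR ranges are documentation-reserved placeholders, not real hosting ranges, and the function names and regex patterns are assumptions for the example.

```python
import ipaddress
import re
from typing import Iterable, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Placeholder ranges reserved for documentation; a real list would be
# loaded from a hosting provider's published ranges.
BLOCKED_CIDRS = ["192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32"]

# Crawlers to let through are checked first, then the "silly" agents.
ALLOWED_AGENTS = re.compile(r"googlebot|bingbot|yandex", re.IGNORECASE)
BLOCKED_AGENTS = re.compile(r"curl|python|bot", re.IGNORECASE)


def parse_networks(cidrs: Iterable[str]) -> List[Network]:
    """Parse CIDR strings once at startup."""
    return [ipaddress.ip_network(c) for c in cidrs]


def ip_is_blocked(ip: str, networks: List[Network]) -> bool:
    addr = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so a mixed list can be scanned directly.
    return any(addr in net for net in networks)


def agent_is_blocked(user_agent: str) -> bool:
    if ALLOWED_AGENTS.search(user_agent):
        return False
    return bool(BLOCKED_AGENTS.search(user_agent))
```

        The User-Agent header is trivially spoofed, so the allow pattern here only avoids blocking well-behaved crawlers by accident; it does not prove a request really comes from a search engine.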

    2. 1

      Ha. 🙈 works for me. CSRF probably. Can see it becomes involved. Going to look to hand it off as much as I can. Think I see a way for the email magic-link option. Thanks.

  12. 1

    I use a self-built honeypot, Google Invisible reCAPTCHA and email verification for sign-ups. Although this stops most bots, some clever bots will get past all of these.

    I have tried Cloudflare for a client whose site was getting hammered because he didn't moderate comments on his blog. It was useless and slowed the site to a crawl: something like 15 seconds to show a page. Cloudflare also does really weird stuff like sending massive HEAD responses of megabytes rather than a few bytes. I ended up finding a different solution.

    I had a load of bots from users at glitch.me signing up to Downtime Monkey to ping their sites and keep their free servers online all the time (they usually spin up only when in use). After battling this for a week I ended up blacklisting glitch users and auto-deleting accounts when someone tried to set up a monitor for glitch.

    What's quite amusing is that the bots that are clever enough to get past all the recaptchas aren't clever enough to stop trying when they hit air. Months later I still get a few attempts each day.

    1. 1

      Interesting. Accept it's impossible to stop all. Thanks for input.

  13. 1

    Following 🙋‍♀️

    Google captcha at this point, still in testing phase. Still a bit shocked about how captcha works in terms of user privacy. Haven't made my mind up yet about the ethical part of it. I've been hearing some great things about Cloudflare though.

    1. 2

      Hey. Yeah doesn't surprise me. I don't know internals of Captcha but all Google freebies are there to track, even those wonderful fonts we all love to use. Check FriendlyCaptcha in this post, although one reply seems to be challenging it.

  14. 1

    I suppose this is the time to plug the product I helped build: Friendly Captcha as the privacy-friendly alternative (no cookies, no tracking, it works a bit differently).

    It's open core (i.e. the SaaS around it is not open source, but the building blocks are open source, as is the widget/code you would put into your website).

    1. 1

      Absolutely the time to plug. Looks great! Really good. Will play with it to check but on first viewing looks like just the kind of thing I was looking for.

    2. 1

      Seems to be similar to geetest.com.

      When scraping websites with this protection, we just triggered a lambda function that solved the crypto puzzle 🤷

      1. 1

        I think that's fair :)

        We work hard to make sure it is as effective as possible while not compromising on privacy+accessibility, but no captcha will keep out all spammers/scrapers.

        You can buy thousands of solves for $1 for normal captchas, or spend time+resources solving crypto puzzles. There is no perfect captcha that will protect against everything reliably. Most people use captchas against untargeted abuse (e.g. scripts submitting to any internet form with some adult ad text/email), not targeted attacks from those who are willing to spend actual money/time (I would argue a reasonable amount of automated scraping is not an attack anyway).

        We have some more advanced stuff running in the backend too: we adapt the difficulty of the crypto puzzle based on some signals (a straightforward one being if you made many requests recently it gets more difficult).
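
        To make the idea concrete, here is a generic hashcash-style sketch in Python of a puzzle whose difficulty rises with recent request volume. It is not Friendly Captcha's actual algorithm; the function names, the SHA-256 leading-zero-bits rule, and the "one extra bit per 5 recent requests" schedule are all assumptions for illustration.

```python
import hashlib
from itertools import count


def difficulty_for(recent_requests: int, base_bits: int = 12, max_bits: int = 20) -> int:
    """More recent requests from a client means more leading zero bits required."""
    return min(max_bits, base_bits + recent_requests // 5)


def _leading_zero_bits(digest: bytes) -> int:
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def verify(challenge: bytes, nonce: int, bits: int) -> bool:
    """Cheap server-side check: a single hash."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return _leading_zero_bits(digest) >= bits


def solve(challenge: bytes, bits: int) -> int:
    """Brute-force a nonce; this is the work the client would have to do."""
    for nonce in count():
        if verify(challenge, nonce, bits):
            return nonce
    raise AssertionError("unreachable: count() never ends")
```

        Each extra bit roughly doubles the expected number of hashes a client must try, while verification stays a single hash, which is what makes raising the difficulty for noisy clients cheap for the server.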

        1. 1

          Yup, I get it, there is no solution that would prevent bots from scraping a website. Tho this is a really interesting field to work with.

          Another popular approach to detecting bots is analyzing browser fingerprints. AFAIR Distil Networks provides some decent bot detection solutions.

  15. 1

    I've used a captcha on a Rails site to protect a form submission. I guess that wouldn't work for a magic link though.

    1. 1

      Hi. Could use it on the form to avoid malicious triggering of the email send. So yeah, relevant. Have updated the post to be clearer. Was it the Google captcha you used? Thanks
