
I got tired of the puppeteer-extra + stealth + 2captcha setup, so I built a CLI with indexed elements, 3 browser modes, and built-in CAPTCHA solving.

Every AI agent I've built that touches the web ends up reinventing the same stack:

  • puppeteer-core or playwright for automation
  • puppeteer-extra-plugin-stealth for evading bot detection
  • 2captcha / anti-captcha for CAPTCHAs
  • Your own cookie jar for session persistence
  • A manual CSS/XPath selector layer, because the agent can't reliably generate selectors
  • A fallback to real Chrome via CDP for when the stealth plugin breaks

That's a day of plumbing before the agent logic even starts. The real problem isn't any single plugin; it's that every project rebuilds the same glue.

So I built BrowserAct, a single CLI that collapses all of it.

What's actually in it (everything below is shipped, no roadmap)

1. Indexed interactive elements — the feature I missed most

browser-act --session agent state
# Returns URL, title, and numbered interactive elements:
# 0: <a href=...>Sign in</a>
# 1: <input placeholder="email">
# 2: <button>Continue</button>
browser-act --session agent input 1 "user@example.com"
browser-act --session agent click 2

The agent doesn't generate CSS selectors. It reads an indexed list and picks a number. This alone killed most of the brittle-selector problems I had with raw Puppeteer.
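
To make that concrete, here is a minimal Python sketch of how an agent might turn the indexed list into something it can pick from. The line shape ("<index>: <html snippet>") is assumed from the sample output above, so the real `state` output may differ. `parse_indexed_elements` and `find_index` are hypothetical helper names, not part of BrowserAct.

```python
import re

# Assumed line shape, taken from the sample output: "<index>: <html snippet>",
# optionally prefixed with "#". Anything else (URL, title) is skipped.
ELEMENT_LINE = re.compile(r"^\s*#?\s*(\d+):\s*(<.+)$")


def parse_indexed_elements(state_output):
    """Map each element index to its HTML snippet; ignore non-matching lines."""
    elements = {}
    for line in state_output.splitlines():
        match = ELEMENT_LINE.match(line)
        if match:
            elements[int(match.group(1))] = match.group(2).strip()
    return elements


def find_index(elements, needle):
    """Return the lowest index whose snippet contains `needle` (case-insensitive), or None."""
    needle = needle.lower()
    for index, snippet in sorted(elements.items()):
        if needle in snippet.lower():
            return index
    return None


sample = """\
0: <a href=...>Sign in</a>
1: <input placeholder="email">
2: <button>Continue</button>
"""
elements = parse_indexed_elements(sample)
print(find_index(elements, "continue"))  # prints 2
```

In practice the model would pick the number itself. A lookup like `find_index` is only useful for scripted steps such as a fixed login flow.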

2. Three browser modes, one CLI

  • browser open <browser_id> — managed stealth browser with proxy built in (--dynamic-proxy with region selection, or --custom-proxy)
  • browser real open <url> — your logged-in Chrome via CDP auto-discovery
  • browser real open <url> --ba-kernel — bundled Chromium (no host-Chrome dependency)

Switch modes with a flag, not a different library.

3. Stealth and CAPTCHA built in

browser-act stealth-extract <url>              # one-shot anti-detection extract
browser-act solve-captcha                       # auto-solve on current page
browser-act human-assist-url --objective "..."  # zlink URL for human-in-the-loop

No separate 2captcha API key, no stealth-plugin version drift.

4. LLM-friendly page data

browser-act --session agent get markdown       # page as markdown — feed straight to the model
browser-act --session agent network requests --type xhr --status 200
browser-act --session agent network har start

get markdown is what I needed to stop shipping 500KB HTML blobs into prompts.

5. Session persistence across CLI calls

browser-act --session agent1 navigate https://site.com/login
# ... login flow ...
browser-act --session agent1 cookies export cookies.json
# Later, same session still logged in:
browser-act --session agent1 navigate https://site.com/dashboard

On the agent side

My AI agents are in Python, Go, and sometimes Rust. A CLI means the same five verbs (state / click / input / get markdown / eval) work from any language — no bindings, no version alignment.
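
As a rough illustration of the "no bindings" point, a Python agent could wrap the CLI in a few lines. This is a sketch under assumptions: it assumes the CLI prints its result to stdout and exits non-zero on failure, which nothing above confirms. `build_command` and `run` are hypothetical names, and the only subcommands used are the ones shown in this post.

```python
import shlex
import subprocess


def build_command(session, *verb_args):
    """Build argv for `browser-act --session <session> <verb> [args...]`."""
    return ["browser-act", "--session", session, *map(str, verb_args)]


def run(session, *verb_args, runner=subprocess.run):
    """Run one CLI call and return its stdout.

    check=True raises CalledProcessError on a non-zero exit (assumed failure signal).
    `runner` is injectable so the wrapper can be tested without the CLI installed.
    """
    result = runner(
        build_command(session, *verb_args),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show the command line a call would produce, without executing anything.
print(shlex.join(build_command("agent", "input", 1, "hello world")))
# prints: browser-act --session agent input 1 'hello world'
```

Passing argv as a list rather than a shell string means text typed into a field never goes through shell quoting. The same pattern maps onto `os/exec` in Go or `std::process::Command` in Rust.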

Where I've used it in production

  • Three agents running concurrently via --session a1/a2/a3
  • solve-captcha cleared Cloudflare Turnstile and hCaptcha on a site I was scraping daily
  • get markdown + network requests cut my LLM token bill roughly in half on research agents
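
For the concurrent case, one way to drive several named sessions from Python is a thread per session. This is a sketch, not how my agents are necessarily wired: `fetch_markdown` and `fetch_all` are hypothetical names, and it assumes each `--session` name is independent and that the CLI exits non-zero on failure.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def fetch_markdown(session, url, runner=subprocess.run):
    """Navigate one named session to `url` and return the page as markdown."""
    base = ["browser-act", "--session", session]
    # check=True raises CalledProcessError if the CLI exits non-zero (assumed behavior).
    runner(base + ["navigate", url], capture_output=True, text=True, check=True)
    page = runner(base + ["get", "markdown"], capture_output=True, text=True, check=True)
    return page.stdout


def fetch_all(jobs, runner=subprocess.run):
    """Run one worker thread per session; `jobs` maps session name -> URL."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = {
            session: pool.submit(fetch_markdown, session, url, runner)
            for session, url in jobs.items()
        }
        # result() re-raises any exception from the worker, so failures are not silent.
        return {session: future.result() for session, future in futures.items()}
```

A call would look like `fetch_all({"a1": url_one, "a2": url_two, "a3": url_three})`. Threads are enough here because each worker spends its time waiting on a subprocess.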

Looking for

  1. Sites where your current stealth setup keeps failing — I want to test against them
  2. Feedback from anyone else who's been stuck rebuilding the same Puppeteer + stealth + captcha triangle
  3. Contributors — source is open

GitHub: https://github.com/browser-act/skills/tree/main/browser-act

Curious how others have solved this. Still on puppeteer-extra? Switched to Playwright? Wrote your own wrapper? Would love to compare notes.

Posted to Show IH on May 6, 2026