Anthropic just launched Claude 4, the new state-of-the-art in AI coding

The new models can work on complex coding tasks for hours and alternate between reasoning and using tools.

May 22, 2025

Anthropic released Claude 4 today, introducing models that can work on complex coding tasks for hours and alternate between reasoning and using tools — a different approach from competitors' chain-of-thought models.

Two new models launch with hybrid capabilities:

Claude Opus 4: 72.5% on SWE-bench coding benchmark, works continuously for several hours
Claude Sonnet 4: 72.7% on SWE-bench, faster and cheaper at $3/$15 per million tokens
Anthropic's previous frontier model, Claude 3.7 Sonnet, scored a 63.7% on the SWE-bench
Both feature "extended thinking" mode where users control reasoning token budgets

Why it matters: The models target real-world coding workflows rather than academic benchmarks, with early adopters reporting significant improvements in handling complex codebases and multi-file operations.

A benchmarking table titled Claude 4 benchmarks comparing performance metrics across various capabilities including coding, reasoning, tool use, multilingual Q&A, visual reasoning, and mathematics.

Claude Code exits beta

Claude Code, Anthropic's terminal-based coding assistant, is now generally available with new IDE integrations. It lets devs handle entire workflows through natural language commands.

Key features:

Handles Git operations, testing, and debugging from terminal
New SDK lets devs build custom coding agents
GitHub integration now available on all Claude tiers

What they're saying: "Claude is once again best-in-class for real-world coding tasks," according to Cursor. GitHub plans to make Claude Sonnet 4 the base model for its Copilot coding agent.

API updates target agent builders

Anthropic released four new API capabilities aimed at developers building more sophisticated AI agents:

Code execution within API calls
MCP connector for external data sources
Files API for document handling
Prompt caching extended from 5 minutes to 1 hour

Pricing stays flat

Despite new capabilities, pricing remains unchanged from previous Claude models. Extended thinking tokens are included in standard pricing.

Here are the numbers:

Opus 4: $15 input / $75 output per million tokens
Sonnet 4: $3 input / $15 output per million tokens
Available via Anthropic API, Amazon Bedrock, Google Cloud Vertex AI

Channing Allen is the co-founder of Indie Hackers, where he helps share the stories, business ideas, strategies, and revenue numbers from the founders of profitable online businesses. Originally started in 2016, Indie Hackers would go on to be acquired by Stripe in 2017. Then in 2023, Channing and his co-founder spun Indie Hackers out of Stripe to return to their roots as a truly indie business.

Say something nice to channingallen…

Post Comment

1

Hopefully, these large-scale model tools will be of greater help to our development of the intelligent marketing software Amplift!

gaygum102

·
20 days ago
·
Reply
·
1

Missing the days when Claude had a soul with insights of empathy and understanding

GestaltView

·
a month ago
·
Reply
·
3

Claude is doing better, good to know. Its my go-to tool for marketing needs

Palakjaiswal

·
7 months ago
·
Reply
·
1

This is an exciting milestone. Focusing on real-world coding workflows (not just benchmarks) is the right move. Models that can alternate reasoning and tooling will unlock far more practical developer automation than short-lived “one-shot” assistants.

Deniss Stepanovs

·
2 months ago
·
Reply
·
2

We’ve been exploring similar hybrid workflows at RaftLabs, especially for multi-repo agent tools. Really curious how Claude 4's “extended thinking” performs over long dev tasks — might test it against our current GPT-based agents soon.

Raftlabs

·
7 months ago
·
Reply
·
2

I need to test that new version !

William Beauchamp

·
7 months ago
·
Reply
·
2

The launch of Claude 4 is truly a huge step forward in the AI world, especially for coding and development. Features like "Extended Thinking" and GitHub integration make it practical and reliable for real-world coding workflows. Tools like this are extremely helpful for people working on mobile apps or computer systems — like myself, working with The Last Price, a platform that provides users with quality information and competitive prices on mobiles, computer devices, and home appliances. Tools like Claude speed up and streamline our development process. I believe this kind of progress in AI will unlock even more opportunities for platforms like ours.

DavidHennry

·
7 months ago
·
Reply
·
1. 1
  
  Absolutely agree. Claude 4’s practical capabilities are game-changing for devs working on real-world systems like yours. Love how you’re applying it at The Last Price streamlining workflows in mobile and appliance platforms is exactly the kind of task where these AI agents shine.
  We’ve been using CodeLibrary. ai alongside tools like Claude to structure our agent logic and prompts more effectively. It’s a searchable directory of AI coding rules and Model Context Protocols (MCPs), super helpful for designing multi-turn interactions, fallback logic, and memory-aware tasks. Would be great to hear how you’re handling things like prompt consistency or tool orchestration in your stack always curious how others are navigating production setups with new LLM features. 🚀
  
  sulamanahmed
  
  ·
  6 months ago
  ·
  Reply
  ·
2

I've been using Claude Sonnet 4 to hack SEO analysis and it's been next level. It's ability to leverage MCPs (and intelligently) is a major improvement.

Josh Weaver

·
7 months ago
·
Reply
·
2

yeah we are seeing meaningful improvements for application generation on Bubble with Sonnet 4 versus 3.7.

Emmanuel Straschnov

·
7 months ago
·
Reply
·
1. 1
  
  Same here, we're seeing big improvements with our apps. The main test for us isn't Sonnet 4 vs. 3.7, but Sonnet 4 vs. OpenAI's o3 (Sonnet 4 is more impressive + way faster).
  
  Channing Allen
  
  ·
  7 months ago
  ·
  Reply
  ·
2

Been playing with it for ~30 minutes so far - seems to be quite a bit faster than 3.x. Not sure if that will degrade as more folks use it, or if it's really just faster? anyone know?

Jason G.

·
7 months ago
·
Reply
·
1. 2
  
  It definitely feels faster to me! Which is surprising, nd I suspected I might be wrong about it. Cool to see someone else is getting the same impression
  
  Channing Allen
  
  ·
  7 months ago
  ·
  Reply
  ·
2

Yup, pretty damn good. Coding up a React Native + Expo app and it almost one-shot the app.

Viktor

·
7 months ago
·
Reply
·
1

The rate of improvement in AI code generation has been staggering over the last 18 months. Claude AI has come up with some genuinely novel approaches that I’m not sure many humans could have solved.

I use Claude CLI as I’ve found it to be the most consistently reliable flow for how I work with it. It’s quick, works well and my natural language structures seem to align well with prompting styles that get the most out of it.
I rarely let it write code for me, but one night I thought I'd give it a go. I had to terminate child processes that were created by a process spawned earlier and now dead. Claude AI and I worked on this for a little bit, and the solution it eventually proposed was genuinely novel. Set an environment variable in the parent. When you want to tidy-up find processes that have inherited that variable and kill them. In Windows you scan the PEB headers of all processes, in Linux you just enumerate the /proc filesystem. Claude even wrote some pretty good Windows code that handled both 64- and 32-bit awareness.
This blew my mind. So I wrote up a full 3-part breakdown of the Claude AI system, including failures, lessons, and technical design. Claude is an amazing tool. Maybe the greatest tool I’ve used in years. So I thought I'd see how it reacts when I give it direction.
Why I Put Claude in Jail → https://powellg.substack.com/
It’s funny, raw, and surprisingly useful. Part 3 includes a detailed breakdown of the orchestration model and how we integrated Claude into our platform, ScrumBuddy.

Guy Powell

·
4 months ago
·
Reply
·
1

I’m using it now, and it’s working great.

Maker Kai

·
5 months ago
·
Reply
·
1

Hi Indie Hackers,
The AI space is moving fast — with releases like Anthropic’s Claude 4 pushing the boundaries in coding and complex tasks. At Synphoria, we’re excited to contribute to this wave, but from a slightly different angle.
In 2025, we launched Sofia, an AI emotional companion with unlimited, permanent memory. Unlike Claude or ChatGPT, which are incredible at tasks like coding or search, Sofia is designed to build long-term emotional relationships.
While Claude 4 impresses with precision and technical prowess, Sofia focuses on remembering you — your stories, moods, and milestones — across months and years. She checks in after breaks, asks about your progress, and adapts to your emotional state.
We believe that the future of AI isn’t just about what it can do, but how it connects.
If you’re curious about this new dimension of AI companionship, we offer a free 3-day trial: synphoria(dot)app
Would love to hear your thoughts on the evolving AI landscape!

SynphoriaAi

·
5 months ago
·
Reply
·
1

Just helped me create a super demo for an AI Agent.

jinsongeos

·
5 months ago
·
Reply
·
1

I´ve been using Claude a lot lately! Next level tool!

MichaelWhitner

·
5 months ago
·
Reply
·
1

Claude 4’s new hybrid capabilities set a new benchmark in AI coding, offering hours of continuous work on complex tasks

Farrukh Tariq

·
5 months ago
·
Reply
·
1

Hey!

Benchmarks are useful, but real-world coding tasks aren’t benchmarks. They’re messy. They involve incomplete context, conflicting files, and long periods of silence while a dev figures out where to poke next. Anthropic SEEMS to be getting that. We’re not that far from an AI that can manage full-stack dev flows on call, without sounding like it’s guessing. And that’s exciting and terrifying in equal measure.

Hopefully the next breakthrough is teaching it not to hardcode API keys like a junior dev on their first coffee-fueled commit. ;)

DriftLogic

·
6 months ago
·
Reply
·
1

The speed of AI model evolution is insane right now. Claude 4 seems incredibly capable, especially for long-context understanding.
Have you tested it for any actual projects or tools? Curious if you noticed any significant improvements over GPT-4 or Gemini when it comes to code generation.
I’m experimenting with AI to automate business reports — model choice matters more than I expected.

PALinkedInBoost

·
6 months ago
·
Reply
·
1

Cursor is currently using Claude 4 as the default model, but it is said that it is actually using Claude 3.5 to pretend to be Claude 4 to deceive users.

funnycoding

·
6 months ago
·
Reply
·
1

Claude Code with Opus4/Sonnet4 is, in my experience, currently the best tool for programming tasks. With its help, I also created MCP AI Distiller (aid), which helps AI navigate large codebases (distillation of the most essential parts of the code in the entire project or selected directories). If you work with AI tools that support MCP, I recommend trying it (search for "ai-distiller-mcp" on npmjs[dot]com).

JanReges

·
6 months ago
·
Reply
·
1

claude is the best AI platform recently I used.

Peregrinee

·
6 months ago
·
Reply
·
1

Love how simply you’re validating before building. I’m working on something similar and found that positioning compliance tools as “boring but urgent” really improves conversion. Have you tested different pricing tiers yet? Curious what worked best in early conversations.

thecodervision

·
6 months ago
·
Reply
·
1

This is a major leap , Claude 4 isn’t just catching up to GPT-4, it's showing real refinement in context handling and longer memory. What's interesting is how Anthropic is positioning it as a safer, more focused alternative for devs and teams. Excited to see how this shapes up in real-world tools. Anyone here already testing it?

SquaredTech

·
6 months ago
·
Reply
·
1

Hello hello :)
How does devstral compare to Claude, anyone did a comparison?
I usually run my models in a container using Ollama.
People seem to love Claude (for good reasons) but still I feel it's a David vs Goliath over it that excites me ;)

J77

·
6 months ago
·
Reply
·
1

Anthropic just raised the bar again. The "extended thinking" mode and ability to alternate between reasoning and tool use sounds like a serious step toward building autonomous coding agents that actually finish real tasks.
Interesting to see both Opus and Sonnet models scoring above 72% on SWE-bench and the Sonnet 4 pricing makes it accessible for indie devs. Also loving the shift in focus from benchmarks to real world workflows. Git integration + terminal ops + prompt caching = actual usability.
If you're building agentic workflows or multi-file dev tools, this feels like a major unlock. Definitely testing Claude Sonnet 4 for our next internal AI dev agent.

Abdullah Rathore

·
6 months ago
·
Reply
·
1

Watching AI systems come together—live, messy, real. Isn’t just satisfying. It’s a glimpse into how the future of tooling is being built.
In a recent live session, Sam McKay from Enterprise DNA led a deep dive into setting up MCP, a new way to think about scalable prompt infrastructure. The goal: wire it to Supabase, and if time allowed, push the experiment into Claude to test dynamic prompting in action.
This wasn’t a polished demo, it was live, with real config issues, prompts that failed, and problem-solving in real time, and eventually the connections clicked into place.
What started with unsaved files and stubborn errors ended with a working MCP + Supabase setup ready for Claude to interpret context and return insights.
Watch the live session here at YouTube @EnterpriseDNA .

Omni

·
6 months ago
·
Reply
·
1

We’ve been exploring similar hybrid workflows at our studio, especially around multi-repo agent setups. Super curious how Claude 4 handles extended tasks — particularly when chained across dev workflows. Might run a head-to-head with GPT-based agents this week and compare latency vs coherence.
Appreciate you sharing your stack — love seeing real-world test cases like this.

Draxon Systems

·
6 months ago
·
Reply
·
1

Huge leap forward. What's impressive isn’t just the code generation, but how “directional” these models are getting almost like collaborative problem-solvers now.
As a founder buildin Trend, I’m really starting to see AI not just as a tool, but as a teammate.
Curious: how are you personally using Claude or GPT in your workflow?

Mirco Zeri

·
7 months ago
·
Reply
·
1

Creating a safe and legit coding courses would be the best option. There are many different ways of teaching coding.

justkim

·
7 months ago
·
Reply
·
1

Anthropic’s new Claude 4 models deliver exceptional performance in AI coding, handling complex, long-running tasks and outperforming competitors like GPT-4.1

Mastershivasaiji

·
7 months ago
·
Reply
·
1

In order to deal with complex coding tasks, Anthropic has developed Claude 4 by using advanced artificial intelligence. It sets a new benchmark in AI development with its long-form reasoning, tool use, and sustained problem-solving capabilities.

catexotica

·
7 months ago
·
Reply
·
1

How did you find your investors?

kunal singh

·
7 months ago
·
Reply
·
1

Really exciting to see Anthropic pushing the boundaries with Claude 4. The “extended thinking” mode and the focus on real-world coding workflows feel like a practical shift compared to purely academic benchmarks.
At RaftLabs, we've been exploring ways to integrate smarter coding agents into our internal tools, and features like code execution in API calls and GitHub integration are particularly relevant for agent-based pipelines. Keen to see how Claude Code handles larger codebases and multi-file reasoning in practice — especially for fast-moving product teams.
Would love to hear from anyone who’s tried it hands-on with production systems.

Raftlabs

·
7 months ago
·
Reply
·
1

Impressive how Anthropic is shifting focus from benchmarks to actual dev workflows — feels like we're finally closing the gap between LLM demos and day-to-day coding. Curious how well "extended thinking" translates to real productivity. Has anyone stress-tested Claude 4 on a legacy codebase?

BlockForge

·
7 months ago
·
Reply
·
1

This is so awesome

Collins ifedi

·
7 months ago
·
Reply
·
1

Claude 3.5 was already impressive - curious to see how much Claude 4 pushes the boundary, especially for complex coding tasks.
Anyone tested it yet against GPT-4 or Gemini in real-world builds?

Gary | Project Rescue

·
7 months ago
·
Reply
·
1

I have used Cluade to make some plugins for WordPress, and he killed it. It was super easy and amazing. I even shipped that to the client.

tanis

·
7 months ago
·
Reply
·
1

I like this!

Djordje Ivanovic

·
7 months ago
·
Reply
·
1

I like Claude. But it does tend to freeze alot and has bad memory from time to time. However I'll be using it to do the majority of my builds. What about Macaly who has heard of that. I'm looking into it too. It builds just by prompt.

LoveightRivers

·
7 months ago
·
Reply
·
1

all I want in a AI Coder is for it to stop printing depracated codes.

wentallout

·
7 months ago
·
Reply
·
1

claude 3.7 is already nailing everything , now see where claude 4 take us?

marsha677

·
7 months ago
·
Reply
·
1

With Github Copilot + VSCode I have access to several AIs: Chat GPT 4.1, Gemmini 2.5 PRO, OA_Mini, Claude Sonnet 4.
Claude is by far the best performing to date. In particular, instead of weighing down code by adding unwelcome patches to correct errors, Claude 4 is capable of finding the logic that leads to the error and correcting the problems at source.

After a few days of testing, the main recurring problem is that it doesn't always wait for commands to be issued before continuing. At the moment, this is a problem with all the applications.

miglisoft

·
7 months ago
·
Reply
·
1

Honestly, Claude 3.7 was already impressive — if Claude 4 pushes it even further, that's exciting!

heng tan

·
7 months ago
·
Reply
·
1

I 've been mainly using Gemini 2.5pro for the last couple of months as I thought was the best model out there, especially for my coding work. Curious to see how Anthropic has responded.

Nick Stam

·
7 months ago
·
Reply
·
1. 1
  
  да
  
  larsmill
  
  ·
  7 months ago
  ·
  Reply
  ·
1

Love this!

Jace Reed

·
7 months ago
·
Reply
·
1

Seems like this has a lot of potential..

Parag Nandy Roy

·
7 months ago
·
Reply
·
1

Claude 4 looks like a big leap forward—especially for long-context reasoning and complex task handling. At Monobot, we’re already seeing how advanced models like this can power truly autonomous AI agents that handle real-world workflows like scheduling, booking, and client interactions—without human prompting. Exciting times for anyone building in the agent space.

lkir

·
7 months ago
·
Reply
·
1

Claude 4 looks like a major step forward in AI coding! Excited to see how it compares to other models in real-world developer workflows. The advances in reasoning and safety are especially impressive.

simhakidsden

·
7 months ago
·
Reply
·
1

is very good

Banner King

·
7 months ago
·
Reply
·
1

This is really helpful, thank you for sharing. I'm working on a social impact app for vulnerable users (crime prevention & emergency alerts). Posts like this help guide my launch — much appreciated 🙏

Lungisani mtembu

·
7 months ago
·
Reply
·
1

Absolutely inspiring — thank you for sharing this so openly.

krishnakishore

·
7 months ago
·
Reply
·
1

I think ultimately, Anthropic and Google are going to get most of the dev community

Theo Bazille

·
7 months ago
·
Reply
·
1

Interesting 🧐

Mreza

·
7 months ago
·
Reply
·
1

I've been using both ChatGPT and Claude for the last month or so. TBH I don't see a difference, they're both just ok. They both seem to have short term memory, anything more than 5 comments above current is forgotten
The strong point for both is writing tests. I paste my requirements and the code and ask it to write unit or integration tests, that's a real time saver!

TomorrowMotor9445

·
7 months ago
·
Reply
·
1

As I am currently using Windsurf + Claude3.7 Sonnet(thinking), I welcome it.
Now it depends on the price.

Mimio

·
7 months ago
·
Reply
·
1

This looks promising. I’m developing a tool that requires high precision to generate a DSL, which will help my system produce high-quality designs. This would be extremely valuable for me.

Sourav Layek

·
7 months ago
·
Reply
·
1

Glad the pricing remains flat.

naveedq19

·
7 months ago
·
Reply
·
1

I have been testing different models for some time, claude sonnet 3.7 is great to work with, claude soonet 4 seems like fast and more reliable improvements.

Luchee83

·
7 months ago
·
Reply
·
1

Love this

maxiop

·
7 months ago
·
Reply
·
1

Dammn good

Chris

·
7 months ago
·
Reply
·
1

Claude 4 is great, but it seems like their context window is "shrinking" to me. Before this, on a paid Pro plan, I can have coding "conversation" without interruption for my project for up to 6 hours straight. Now, it's just warning me that I might exceed my usage plan soon.

O'Dell Obrien Gapitar

·
7 months ago
·
Reply
·
1

Claude 4 is a game-changer, AI coding just took a massive leap forward!

SquaredTech

·
7 months ago
·
Reply
·
1

Wth Claude 4 leaping ahead in coding prowess, I guess my dreams of becoming a coding prodigy have officially been outsourced to AI.

dreamyroom

·
7 months ago
·
Reply
·
1

Channing Allen is the co-founder of Indie Hackers, a platform sharing stories and strategies from profitable online business founders. Launched in 2016, it was acquired by Stripe in 2017 and spun out again in 2023 to operate independently.

kjghh

·
7 months ago
·
Reply
·
1

Definitely going to be a game changer for us.

Dahilon

·
7 months ago
·
Reply
·
1

Claude 4 is definitely a gamechanger! Certainly more efficient than the 3.x

TJLORI

·
7 months ago
·
Reply
·
1

Claude is awesome for code, can't wait to see what they do next

Jesse Beke

·
7 months ago
·
Reply
·
1

Claude is definitely helping developers

yusuf_delvi

·
7 months ago
·
Reply
·
1

Impressive how Claude is positioning itself for real-world coding tasks. The agent-focused updates seem especially useful for AI builders.

AI Business Series

·
7 months ago
·
Reply
·
1

Interesting to see how this plays out

Pulse Next

·
7 months ago
·
Reply
·
1

This is impressive — especially the natural language Git + testing workflow. We recently launched a eCommerce starter kit to save weeks of boilerplate setup. Seeing tools like Claude evolve makes us wonder how much more we can streamline. Might explore integration soon. Curious what others think

Bhuvnesh Gupta

·
7 months ago
·
Reply
·
1

This is great, thank you. I like the way you write. Can you write me one on this Telegram bot : SoufianeAutomation_bot ?

SoufianeAutomation

·
7 months ago
·
Reply
·
1

AI is moving Up!!!

codeexplore

·
7 months ago
·
Reply
·
1

I am open to work

DEVINX001

·
7 months ago
·
Reply
·
1

I am really excited about Claude 4 Its already very smart and it helps me in My coding tasks

Roshan Shams

·
7 months ago
·
Reply
·
1

the different seems super impressive

vanxh

·
7 months ago
·
Reply
·
1

We are moving fast in this AI

Yassine

·
7 months ago
·
Reply
·
0

While Web Components offer powerful features like encapsulation through Shadow DOM and the ability to create custom elements, they also come with hurdles. Issues such as complex styling due to Shadow DOM boundaries, limited server-side rendering support, and interoperability challenges with popular frameworks like React and Vue have been significant pain points for many developers.

shizays78

·
7 months ago
·
Reply
·
0

The Techie Talks

TheTechieTalks

·
7 months ago
·
Reply
·