I tested three AI models on the same data analysis task in VS Code with GitHub Copilot. Same prompt, same dataset, same questions.
Gemini was fastest but hallucinated conclusions: it claimed rush hours were the most dangerous when the data showed late-night crashes were worse.
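That kind of claim is cheap to verify yourself in pandas. A minimal sketch, assuming a hypothetical crash dataset with a `timestamp` column (the column name and the toy data here are illustrative, not from my actual dataset):

```python
import pandas as pd

# Hypothetical mini-dataset: crash timestamps (illustrative only)
crashes = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-01-01 02:15", "2024-01-01 02:40", "2024-01-01 03:05",
        "2024-01-01 08:30", "2024-01-01 17:45", "2024-01-02 01:50",
    ])
})

# Count crashes per hour of day; the peak hour tells you when risk is highest
by_hour = crashes["timestamp"].dt.hour.value_counts().sort_index()
peak_hour = by_hour.idxmax()
print(peak_hour)  # → 2 (a late-night hour in this toy sample)
```

A one-liner like this is exactly the sanity check that separates a hallucinated conclusion from a verified one.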
Codex was accurate but slow, and its analysis stayed surface-level.
Opus struggled with VS Code notebook tools at first, but produced the best output by far. Thorough cleaning, real insights, conclusions I could verify.
For Python engineers doing ad hoc analysis, this workflow is worth exploring.
What setups are you using for AI assisted data analysis? Curious what's worked and what hasn't. I work with a lot of data at work, so I'm quite interested in your perspective.
Full comparison with methodology on my blog.
https://blog.jackcreates.com/gpt-5-2-codex-vs-claude-opus-4-5-vs-gemini-pro-which-model-actually-codes-best/
Opus 4.5 looks like a real step forward. Tighter Jupyter support feels like finally unlocking smoother workflows instead of wrestling with clunky hacks, and I'm curious how it handles larger pipelines and versioning without slowing things down.