February 14, 2025·8 min read

I Tested GPT-5 vs GPT-4o After OpenAI's Disastrous Rollout

Marcus Rodriguez

OpenAI really stepped in it last month.

They rolled out GPT-5, which—to be fair—is impressive. Then they immediately removed GPT-4o from everyone's model picker. No warning. No explanation. Just... gone.

Reddit lost its mind. And honestly? I get it.

What happened

On August 7th, 2025, OpenAI released GPT-5. Cool. Big moment. Then they decided to "clean up" the interface by removing GPT-4o from the dropdown entirely.

The backlash was instant and brutal. Within hours, Reddit was flooded with complaints. People missed GPT-4o's conversational style. They found GPT-5 "stiff" and "robotic." Some claimed their entire workflow was broken.

OpenAI backpedaled fast. GPT-4o was back in the model picker within a day or two. But the damage was done—trust was broken, and people started paying closer attention to what they actually preferred about each model.

So I ran my own tests

I've been going back and forth between GPT-4o and GPT-5 for about three weeks now. Here's what I actually found—not vibes, not Reddit complaints, actual usage.

Coding: GPT-5 wins (by a lot)

This isn't even close. GPT-5 scores 74.9% on SWE-bench Verified. GPT-4o? 30.8%. That's not a marginal improvement—that's a generational leap.

In my own testing, GPT-5 catches bugs that GPT-4o misses. It suggests better architecture. It understands complex codebases faster. If you're a developer, GPT-5 is the obvious choice.
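To put a number on "generational leap," here's a quick back-of-the-envelope sketch using only the two SWE-bench Verified scores quoted above. The variable names are mine, and this is plain arithmetic on the published figures, not a rerun of the benchmark.

```python
# SWE-bench Verified scores as quoted in this post (percent of tasks solved).
swe_bench = {"GPT-5": 74.9, "GPT-4o": 30.8}

# Absolute gap in percentage points, rounded to avoid float noise.
gap_points = round(swe_bench["GPT-5"] - swe_bench["GPT-4o"], 1)

# Relative gap: how many times more tasks GPT-5 solves than GPT-4o.
ratio = round(swe_bench["GPT-5"] / swe_bench["GPT-4o"], 2)

print(f"Gap: {gap_points} percentage points")  # Gap: 44.1 percentage points
print(f"Ratio: {ratio}x")                      # Ratio: 2.43x
```

So on this benchmark GPT-5 solves roughly two and a half times as many tasks, which matches how it feels in day-to-day use.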

Math and reasoning: GPT-5, again

GPT-5 scored 94.6% on the AIME 2025 math exam. GPT-4o got 71%. For anything involving logic, calculations, or multi-step reasoning, the new model is significantly better.

Conversation and personality: GPT-4o (surprisingly)

Here's where it gets interesting. When Surge AI ran preference tests between the two models, evaluators slightly preferred GPT-4o overall. The breakdown was 48% preferred GPT-4o, 43% preferred GPT-5, 9% said it was a tie.

Their characterization is perfect: "GPT-4o is a sycophantic friend; GPT-5 is a polite professional."

And honestly? That tracks with my experience. GPT-4o feels warmer. It picks up on tone better. When I'm just chatting or brainstorming, I actually prefer it.

GPT-5 is more capable, but it's also more... clinical? It reminds me of a really smart colleague who's great at their job but not someone you'd grab a beer with.
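For a sense of how slight "slightly preferred" is, here's a small sketch that works from the 48/43/9 split quoted above. The helper name `decisive_share` is my own, and I'm treating the percentages as if they were vote counts, which is an assumption since I don't have the raw numbers.

```python
def decisive_share(a_votes: float, b_votes: float) -> float:
    """Share of non-tie votes that went to option A."""
    return a_votes / (a_votes + b_votes)

# Preference split as quoted in this post (percent of evaluations).
prefs = {"GPT-4o": 48, "GPT-5": 43, "tie": 9}

# Head-to-head margin in percentage points.
margin = prefs["GPT-4o"] - prefs["GPT-5"]

# GPT-4o's share once ties are set aside.
share_4o = decisive_share(prefs["GPT-4o"], prefs["GPT-5"])

print(f"Margin: {margin} points")                         # Margin: 5 points
print(f"GPT-4o share of decisive votes: {share_4o:.1%}")  # 52.7%
```

A 5-point margin, or about 53% of the evaluations that picked a side, is a real lean toward GPT-4o but nowhere near a blowout.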

Long-form writing: It depends

For business writing, reports, documentation? GPT-5 is more thorough and accurate.

For creative writing, casual emails, anything with personality? I still reach for GPT-4o.

The real lesson here

The backlash to GPT-4o's removal wasn't just people being resistant to change. It was users correctly identifying that capability isn't everything.

We've spent years assuming newer = better = use it for everything. But that's not how this works anymore. These models have genuine personality differences. Different strengths. Different vibes.

I use GPT-5 for coding and analysis. I use GPT-4o for brainstorming and drafts. Neither is "the best"—they're different tools for different jobs.

What about GPT-5 Pro?

Quick note on the $200/month tier. Is it worth it?

If you're doing heavy coding, running complex analyses, or need the absolute best performance on reasoning tasks—yeah, probably. The context window is huge and there's no rate limiting.

For most people? Probably overkill. The standard GPT-5 handles 90% of use cases just fine.

My setup now

After all this testing, here's where I landed:

  • Coding/debugging: GPT-5
  • Math/analysis: GPT-5
  • Writing first drafts: GPT-4o
  • Quick questions/chat: GPT-4o
  • Everything else: Honestly, I switch between them based on mood

That might sound like a lot of model-switching. It is. Which is exactly why I use LazySusan—one interface, all models, no subscription juggling.
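If you'd rather script the split than click through a picker, the list above boils down to a lookup table. This is a minimal sketch under my own assumptions: the task keys and `pick_model` are hypothetical names, the model strings are plain labels rather than verified API identifiers, and the GPT-5 fallback stands in for "based on mood."

```python
# Hypothetical routing table mirroring the setup list in this post.
# Values are display labels, not confirmed API model identifiers.
ROUTES = {
    "coding": "GPT-5",
    "debugging": "GPT-5",
    "math": "GPT-5",
    "analysis": "GPT-5",
    "first_draft": "GPT-4o",
    "quick_question": "GPT-4o",
    "chat": "GPT-4o",
}

def pick_model(task: str, fallback: str = "GPT-5") -> str:
    """Return the model label for a task, or the fallback for anything unlisted."""
    return ROUTES.get(task, fallback)

print(pick_model("debugging"))    # GPT-5
print(pick_model("first_draft"))  # GPT-4o
```

Anything not in the table falls through to the fallback, so you can flip the default to GPT-4o on days when you want the friendlier one.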

The bottom line

GPT-5 beats GPT-4o on the coding and math benchmarks I cited, and it isn't close. But benchmarks aren't everything. Sometimes you want an AI that feels like a thinking partner, not just a high-powered answer machine.

Don't let anyone tell you GPT-4o is "obsolete" or that you're wrong for preferring it. Use what works for you. That's the whole point.

Have you switched to GPT-5 full-time, or are you still bouncing between models? I'm curious what other people's setups look like.
