GPT-5: Hype, Reality, and the Gaps in Between

The arrival of GPT-5 on August 7, 2025, came with enormous expectations. OpenAI framed it as a “significant step along our path to AGI” and claimed it had “PhD-level” expertise. The promise: a smarter, more reliable, and more useful AI than anything before it.

Now that the initial excitement has passed, a more nuanced picture is emerging — one that mixes genuinely impressive capabilities with avoidable mistakes, user frustration, and performance gaps that the launch event didn’t exactly highlight.

The “Chart Crime” Moment

One of the most talked-about moments from the launch happened before anyone had even tried the model.

During the livestream, OpenAI showed benchmark charts meant to prove GPT-5’s superiority. The problem? They were riddled with obvious visual errors.

A bar labeled 69.1% was drawn at the same height as one labeled 30.8%. Another, on a slide that was ironically about deception, showed a bar for 50.0% drawn shorter than one for 47.4%.

The mistakes were so glaring they became an instant meme. CEO Sam Altman later admitted it was a “mega chart screwup,” and a staff member issued a public apology. In a final bit of irony, when users uploaded the chart to GPT-5 itself, the supposedly “PhD-level” model didn’t catch the mismatch between numbers and bar heights.
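This kind of mismatch is easy to catch mechanically. Here is a minimal sketch of a sanity check that compares the ordering of labeled values against the ordering of drawn heights. The labeled values are the ones quoted above; the pixel heights, bar names, and the function itself are invented for illustration, since the real chart geometry isn't available.

```python
from itertools import combinations


def find_chart_inconsistencies(bars, tolerance=1e-9):
    """Return pairs of bars whose drawn heights contradict their labels.

    bars: list of (name, labeled_value, drawn_height) tuples.
    """
    problems = []
    for (name_a, val_a, h_a), (name_b, val_b, h_b) in combinations(bars, 2):
        # -1, 0, or 1 depending on how the labeled values compare.
        value_order = (val_a > val_b) - (val_a < val_b)
        # Same comparison for drawn heights, treating near-equal as equal.
        if abs(h_a - h_b) <= tolerance:
            height_order = 0
        else:
            height_order = (h_a > h_b) - (h_a < h_b)
        if value_order != height_order:
            problems.append((name_a, name_b))
    return problems


# Labeled values come from the article; heights are hypothetical pixels.
first_slide = [("bar_69_1", 69.1, 200), ("bar_30_8", 30.8, 200)]
second_slide = [("bar_50_0", 50.0, 120), ("bar_47_4", 47.4, 150)]

print(find_chart_inconsistencies(first_slide))
print(find_chart_inconsistencies(second_slide))
```

Both example slides get flagged: the first because different values share one height, the second because the larger value has the shorter bar.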

Not a Bigger Brain — A Smarter Router

In past GPT releases, the leap in performance often came from scaling up — more parameters, more training data, more compute. GPT-5 is different.

It’s not a single, massive model that dwarfs GPT-4 in size. Instead, it’s a unified architecture that routes your request to the right specialized system — whether that’s a fast reasoning engine, a code-focused model, or something else — without you having to pick.

From a business perspective, this is clever. It improves efficiency and lowers costs. From a technical perspective, it means GPT-5’s improvements aren’t about sheer model size, but about coordination between models.
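To make the routing idea concrete, here is a toy sketch of a request router. OpenAI has not published its routing rules, so everything here is an assumption: the backend names, the keyword hints, and the 200-word threshold are hypothetical, and a production router would likely rely on a learned classifier rather than string matching.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str    # which backend handles the request
    reason: str  # why the router picked it


# Hypothetical backend names, not real OpenAI model identifiers.
FAST = "fast-general"
REASONING = "deep-reasoning"
CODE = "code-focused"

# Illustrative keyword hints only.
CODE_HINTS = ("def ", "class ", "traceback", "stack trace", "syntaxerror")
REASONING_HINTS = ("prove", "step by step", "think hard", "derive")


def route(prompt: str) -> Route:
    """Pick a backend for the prompt without asking the user to choose."""
    text = prompt.lower()
    if any(hint in text for hint in CODE_HINTS):
        return Route(CODE, "prompt looks like a programming task")
    if any(hint in text for hint in REASONING_HINTS) or len(text.split()) > 200:
        return Route(REASONING, "prompt appears to need multi-step reasoning")
    return Route(FAST, "default for short, simple requests")


for example in (
    "What is the capital of France?",
    "Here is my Traceback, why does this crash?",
    "Prove that the square root of 2 is irrational.",
):
    print(route(example))
```

The point of the sketch is the shape of the system: the gains come from sending each request to a suitable backend, not from making any single backend larger.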

Benchmarks: The Story Behind the Scores

OpenAI’s slides showed GPT-5 excelling in academic tests like AIME 2025 (math) and SWE-bench (coding). But the broader data paints a more mixed picture:

Mixed Results: On “SimpleBench,” GPT-5 placed 5th.
Strong Competition: xAI’s Grok 4 still beats GPT-5 on the complex ARC-AGI-2 reasoning benchmark. Grok costs more to run, but it proves GPT-5 doesn’t dominate every measure of capability.
Market Skepticism: Prediction markets noticed. During the launch event, the odds of OpenAI having the best-performing model by the end of the month plunged from ~80% to under 20%, while Google’s odds shot up above 77%.

The User Backlash

Some of the loudest criticism hasn’t been about benchmarks at all — it’s been about the user experience.

ChatGPT Plus subscribers, who pay $20/month, initially faced a stricter cap of 80 messages every 3 hours, which was later raised to 160. For occasional users, that’s fine. For those who rely on it heavily for research, content, or workflows, it feels like a downgrade. Many see it as a cost-saving measure designed to push power users toward the $200/month Pro tier or the pay-as-you-go API.

A Reddit thread titled “GPT-5 is horrible” quickly gained thousands of upvotes, full of longtime subscribers saying they feel shortchanged.

Real-World Flaws and Bugs

While OpenAI calls GPT-5 its “strongest coding model to date,” early use shows it’s still far from perfect:

Reasoning Slips: On launch day, GPT-5 made a basic decimal subtraction error — a small but telling reminder it’s not infallible.
Bugs and Latency: Users have reported intermittent API errors, truncated outputs, latency spikes, and even corrupted responses, all tracked on a live bug list.
Coding Inconsistencies: Even in common programming tasks, it can return outdated or incorrect syntax, requiring manual review.

My Take: Evolution, Not Job Elimination

GPT-5 is clearly a capable model. It’s more versatile, generally more accurate, and cheaper to run thanks to its unified router architecture. Its lower API pricing ($10 per million output tokens vs. $75 for Claude’s top-tier Opus model) is a major market move.
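To see what that price gap means in practice, here is a quick back-of-the-envelope calculation using the per-million-token prices quoted above. The monthly token volume is a hypothetical workload chosen for illustration, and the sketch ignores any other charges a real bill might include.

```python
def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD for a given token count at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


GPT5_PRICE = 10.0    # USD per million tokens, as quoted in the article
CLAUDE_PRICE = 75.0  # USD per million tokens, as quoted in the article

monthly_tokens = 50_000_000  # hypothetical workload, not a measured figure

gpt5_cost = cost_usd(monthly_tokens, GPT5_PRICE)
claude_cost = cost_usd(monthly_tokens, CLAUDE_PRICE)
ratio = CLAUDE_PRICE / GPT5_PRICE

print(f"GPT-5:  ${gpt5_cost:,.2f}")
print(f"Claude: ${claude_cost:,.2f}")
print(f"Price ratio: {ratio:.1f}x")
```

At these list prices the same workload costs 7.5 times as much on the pricier model, which is why the pricing matters more to heavy API users than any single benchmark result.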

But it’s not a giant leap in intelligence — and it’s not going to take everyone’s jobs overnight. It’s a tool that works best in the hands of someone who already knows their craft.

The road to AGI is still a long one. GPT-5 is another milestone on that journey, and I don’t think it will take my job, at least not until GPT-6 obliterates humanity.

Kunal Kumar

Founder & CEO, TripleHash