Machine‑Readable Internet

Grok 4, ChatGPT Agent, and the browser wars signaling that humans are no longer the web’s main audience

Jul 23, 2025

The rate of AI progress has reached a tempo that feels less like a product cycle and more like a performance benchmark leaderboard getting re-sorted hourly. The last two weeks' announcements & releases once again blurred the line between chatbot and agent, redefined what a browser is, and marched us towards the end of the human-first web.

M&A Musical Chairs

Before we dive deeper into the models and new releases, can we pause for the absolute circus that was last week's acquisition drama?

In case you missed it, Windsurf's deal with OpenAI? Dead. ☠️ Stone cold dead. Instead, founders and key researchers have been poached by Google (for a reported $2.4B) and the company – whatever's left of it – is being acquired by Cognition, the makers of Devin.

(And remember, a lot of this traces back to the Microsoft <> OpenAI relationship heading to the gutter: there had been a lot of drama with Microsoft over perceived competition with Copilot, which has been raising tensions between the two companies!!)

The timeline here is bonkers: from what I could follow, OpenAI's exclusivity expires on a Friday. Google swoops in with a reverse acqui-hire that same day. By Monday morning, Cognition has signed papers to buy the remaining company. That's one weekend, folks. One. Weekend.

There were a lot of feelz over this. Some people were very upset with the founders when they heard about the Google reverse acqui-hire deal and thought they had essentially left 250 people holding a bag of smelly 💩💩. Garry Tan wrote a very graceful defense of the founders and received some pushback. Thankfully, Cognition came to save the day for the rest of the company. To hear directly from the source, Russell Kaplan (Cognition) wrote an overview on the “why of the purchase.” We also learned in this “why” post that Windsurf was at $82M ARR (as a reminder, Cursor is at $500M 💵). And for a play-by-play account, here is the full overview from Jeff Wang (Windsurf).

Oh, and ICYMI, Meta sneakily acquired PlayAI while everyone was watching the Windsurf drama. Because why not? It's acquisition season, apparently! 🛒

Apple Is "Seriously Considering" Things

In other "things that make you go hmm" news, Apple is reportedly "seriously considering" buying Mistral.

Look, I love my iPhone as much as the next person, but at this point, Apple's AI strategy is giving "kid who forgot about the group project until the night before" energy. Everyone else is shipping models left and right, and Apple's over here "seriously considering" things.

Tim, buddy, the train has left the station. It's doing 200mph. Maybe stop considering and start... you know... doing? Just a thought!

Meanwhile, Grok Goes Nuclear ☢️

While everyone was distracted by the Windsurf drama, xAI released Grok 4 and basically said "hold my beer" to the entire AI industry. (Look at that intelligence index!!!)

The numbers are staggering:

ARC-AGI-2: 15.9% (nearly double the nearest competitor)
GPQA Diamond: 88% (new record)
AIME 2024: 94%
Artificial Analysis Intelligence Index: 73 (beating o3 and Gemini 2.5 Pro's 70)

But here's what's really interesting – Grok 4 Heavy runs multiple agents in parallel like a "study group" checking each other's work. It's like having a bunch of smart kids in a room arguing about the answer before giving it to you. Democracy in action, AI style!
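For the curious, here's one way to picture the "study group" idea in code. This is a minimal sketch, not xAI's method: how Grok 4 Heavy actually combines its agents' answers isn't public, so the majority vote below is an assumption, and `study_group_answer` and the stand-in agents are hypothetical names made up for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# An "agent" here is just any function that maps a question to an answer.
Agent = Callable[[str], str]


def study_group_answer(question: str, agents: List[Agent]) -> str:
    """Ask every agent in parallel and return the most common answer."""
    if not agents:
        raise ValueError("need at least one agent")
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        # pool.map keeps the answers in the same order as the agents
        answers = list(pool.map(lambda agent: agent(question), agents))
    # Majority vote (an assumption); ties go to the answer seen first
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


# Hypothetical stand-ins for real model calls
def careful_agent(question: str) -> str:
    return "42"


def hasty_agent(question: str) -> str:
    return "41"


if __name__ == "__main__":
    group = [careful_agent, hasty_agent, careful_agent]
    print(study_group_answer("What is 6 * 7?", group))  # prints 42
```

Two of the three stand-in agents agree, so the group answer is theirs. A real system would likely do something richer than counting votes, such as having agents critique each other's reasoning.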

And yes, before you ask, there's a $300/month "SuperGrok Heavy" tier. Because of course there is. Welcome to 2025, where your AI subscription costs more than your gym membership, Netflix, Spotify, and therapy combined. 💸

The Great AI Coding Agent Price Wars of 2025

Speaking of pricing going bonkers, can we talk about what's happening in the coding agent space? John Hwang from NextWord wrote a great article about the simultaneous pricing changes across ALL major AI coding platforms.

Remember when Cursor users woke up to surprise $1,000+ bills? Yeah, that was fun. (They refunded them, but still – yikes!)

Here's the new playbook everyone's running:

Ditch request-based pricing (too expensive when agents run forever)
Move to token-based billing
Shrink free tiers to basically nothing
Launch an eye-wateringly expensive "Ultra/Max" tier
Use private-label models so you can't tell what you're actually using
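To see why request-based pricing breaks down once agents run for a long time, here's a back-of-the-envelope sketch. Every number in it is hypothetical: the per-request price, the per-million-token rates, and the token counts are illustrative assumptions, not any vendor's actual pricing.

```python
def request_based_cost(num_requests: int, price_per_request: float) -> float:
    """Flat pricing: the user pays the same for every request."""
    return num_requests * price_per_request


def token_based_cost(
    input_tokens: int,
    output_tokens: int,
    price_in_per_million: float,
    price_out_per_million: float,
) -> float:
    """Usage pricing: cost scales with the tokens actually consumed."""
    input_cost = input_tokens * price_in_per_million / 1_000_000
    output_cost = output_tokens * price_out_per_million / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical: one long agent run billed as a single flat request...
    flat = request_based_cost(num_requests=1, price_per_request=0.04)
    # ...that actually consumed 2M input tokens and 100K output tokens
    # at assumed rates of $3 and $15 per million tokens.
    metered = token_based_cost(2_000_000, 100_000, 3.0, 15.0)
    print(f"flat: ${flat:.2f}, token-based: ${metered:.2f}")
    # prints: flat: $0.04, token-based: $7.50
```

Under these made-up numbers, a single "request" costs the platform far more than it charges, which is the gap token-based billing closes.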

Translation: The hobbyist era is over. These companies are going full enterprise, and your wallet better be ready. My partner Dan has started a series called Tokenomics that is a must-read for anyone interested in AI-era pricing!

Perplexity Enters the Browser Wars

AI, AI everywhere! Perplexity launched Comet – a whole browser built around AI. Not an extension, not a plugin, a full Chromium-based browser where AI is the main character.

The kicker? You get it through a Perplexity Max subscription, which runs $200/month. Because apparently, we're just printing money in 2025?

The timing is interesting. OpenAI's reportedly building their own browser, and now everyone's racing to own the entire browsing experience. Can't wait to pay separate subscriptions for my browser, my coding assistant, my search engine, and probably my keyboard at this rate.

Olivia Moore wrote a great side-by-side comparison between Comet and Dia (another AI-based browser). I recommend the whole thread!

And speaking of AI and the Internet!

We at Decibel are super excited to announce our investment in Scrunch! As we noted in our thesis:

“Traditional web search is on the decline as internet use shifts from traditional browsers to AI-driven interfaces, creating a huge challenge for websites not built for today’s LLM-driven world. AI agents don’t "browse"— they want content they can easily query and chunk in order to generate responses in products like ChatGPT and Perplexity. Scrunch makes it easy to deliver compressed, structured content specifically for AI agents and crawlers in real-time, giving every website or content creator the opportunity to measure and improve their content in AI-driven searches.”
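As a rough illustration of what "content they can easily query and chunk" could look like, here is a minimal sketch that splits a page into paragraph-aligned chunks and emits them as JSON. This is not Scrunch's implementation; the function name `chunk_page`, the `max_words` limit, and the output fields are all assumptions made up for this example.

```python
import json
from typing import Dict, List


def chunk_page(title: str, text: str, max_words: int = 50) -> List[Dict[str, object]]:
    """Group paragraphs into chunks of roughly max_words words each.

    Paragraphs are never split, so a single paragraph longer than
    max_words becomes its own oversized chunk.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        if not words:
            continue  # skip blank paragraphs
        # Start a new chunk if adding this paragraph would exceed the limit
        if current and len(current) + len(words) > max_words:
            groups.append(current)
            current = []
        current.extend(words)
    if current:
        groups.append(current)
    return [
        {"title": title, "index": i, "text": " ".join(group)}
        for i, group in enumerate(groups)
    ]


if __name__ == "__main__":
    page = "First paragraph here.\n\nSecond paragraph here.\n\nThird one."
    print(json.dumps(chunk_page("Demo page", page, max_words=5), indent=2))
```

With `max_words=5`, the demo page comes out as two chunks: the first paragraph alone, then the second and third together.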

Read more below from my partner Jon ⬇️

ChatGPT Goes Full Agent Mode

OpenAI’s big release last week: ChatGPT Agent. This isn't your chatbot's chatbot anymore – we're talking about a unified system that can actually operate a computer.

Sam Altman called it a real "feel the AGI" moment. Greg Brockman framed it as a big step toward their 10-year goal of creating an agent that can use a computer like a human.

What can it actually do? Browse the web, use a terminal, write code, analyze data, create reports, spreadsheets, and slides. It's marrying Deep Research's analytical chops with Operator's hands-on capabilities. The hybrid approach lets it switch between visual browsing (for clicking buttons) and text-based scanning (for bulk data processing).

The benchmarks are solid – 41.6% on Humanity's Last Exam, a rigorous expert-level exam (roughly 2x prior models), and 27.4% on the FrontierMath problems when given terminal access. Not bad for a system that's basically learning to use a computer from scratch.

But here's the kicker: latency can stretch tasks to 15-30 minutes. They're literally encouraging users to go make coffee while their AI assistant books dinner reservations. We've gone from "instant" gratification to "eventually" gratification. Progress! (I guess? Is waiting 30 minutes for AI to book dinner progress? I digress.) 😅

Aaaand what OpenAI didn’t release

No luck on OpenAI releasing an open-weights model, at least for now.

From Sam:

And from Aidan Clark, OpenAI’s VP of Research:

Closed AI! Closed AI! Closed AI!

I get it, safety and all, but sigh….

Kimi K2: China's Open-Source Challenger

While the West debates open weights, Moonshot AI in China released a trillion-parameter open-source model that beats GPT-4.1 on key coding and math benchmarks. Kimi K2 is optimized for agentic workflows, built on a mixture-of-experts architecture, and features a custom optimizer (MuonClip) that helps stabilize training at scale – no small feat in a world where an unstable training run can cost millions.

The stats:

SWE-bench Verified: 65.8%
LiveCodeBench: 53.7%
MATH-500: 97.4% (!!!)

And they're undercutting everyone on API pricing. This is giving serious DeepSeek vibes – remember when that shook everyone up? Yeah, buckle up for round two.

China’s not playing around! Read the recently released tech report here. And for more technical reading, Sebastian Raschka wrote a great post on modern LLM architecture design, from DeepSeek-V3 to Kimi K2.

The Pentagon Enters the Chat

Oh, and just when you thought things couldn't get more serious: the U.S. Department of Defense awarded contracts of up to $200 million each to four leading AI firms: Anthropic, Google, OpenAI, and xAI.

AI Supremacy had a great download on the awards and, more broadly, on how AI will disrupt the Pentagon. Read it here. From the overview:

This pushes the AI race into the arena of military automation: we've gone from "AI will help you write emails" to "AI will help with warfighting" in record time. The National Defense community is seeking to push agentic AI from lab to battlefield, only weeks after giving OpenAI the same mandate.

Final Thoughts

The lines are hardening: open vs closed, agents vs chatbots, human UX vs machine-readable content. Underneath it all, one shift feels most permanent: AI is no longer simply using the web. It's becoming the primary audience for it.

We're watching the internet get rewritten in real time, not for human eyes but for silicon ones. The acquisition frenzy shows how high the stakes have become. The DoD contracts show where this might all be heading. And the pricing changes show that we are firmly in serious enterprise territory now.

P.S. – Next time someone tells you AI progress is plateauing, just show them the last two weeks’ news. Between the model releases, acquisitions, and military contracts, I need a spreadsheet just to keep track. And yes, I'm having ChatGPT Agent build it for me. Meta. Meta? (Meta!). ;)

