AI in 15 — June 24, 2026

June 24, 2026 · 12m 38s
Kate

Picture a model that reads the code behind cURL and the Linux kernel, finds the way in, writes the patch, and tests it for you. Now picture that you only get to use it if you pass a background check. That's OpenAI's bet this week — and it lands less than two weeks after Washington switched off a rival for almost the same capability.

Kate

Welcome to AI in 15 for Wednesday, June twenty-fourth, 2026. I'm Kate, your host.

Marcus

And I'm Marcus, your co-host.

Kate

We touched on the OpenAI cyber story yesterday, Marcus, but the full release dropped a partner roster and a benchmark number that change the picture. That's our lead. Then the Anthropic ban: Trump just said the quiet part out loud after the G7.

Kate

SpaceX quietly becomes a six-billion-dollar landlord, renting out chips by the month.

Kate

Humanoid robots stop demoing and start clocking in.

Kate

And a free Chinese model nudges past a US flagship — not on coding this time, but on real work.

Kate

Lead story, Marcus. Yesterday we covered the Linux kernel findings. What's genuinely new today?

Marcus

The whole apparatus around it, Kate. On June twenty-second OpenAI shipped the full release of GPT-5.5-Cyber and called it, their words, its "strongest model yet for finding and helping patch software vulnerabilities." Here's the number that matters: on CyberGym, a vulnerability benchmark, it scores eighty-five-point-six — up from eighty-one-point-eight for the general model. And per the newsletter feeds, that's ahead of Anthropic's now-banned Mythos at eighty-three-point-eight. So the model the US government left running just topped the model it switched off, on the exact task that got the other one switched off.

Kate

That's a remarkable sentence. What else came with it?

Marcus

Two things, Kate. A Codex Security plugin that pulls vulnerability detection and patch validation straight into a developer's coding workflow. And a Daybreak Cyber Partner Program — the launch partners are a who's-who of enterprise security: Accenture, Cisco, CrowdStrike, IBM, Okta, Palo Alto Networks, Sophos, and Wiz. This isn't a research demo anymore. It's a go-to-market.

Kate

And we mentioned Patch the Planet yesterday — the open-source push. Any fresh findings?

Marcus

More of them, Kate, and they're sobering. Beyond the Linux kernel bugs we covered, Daybreak has now surfaced thirty-four vulnerabilities in FreeBSD, five exploitable bugs in Chrome's V8 engine, and ten in Apple's Safari. Thirty-plus open-source projects have signed on — cURL, Go, Python, Sigstore, the cryptography libraries. This is the plumbing of the internet getting audited by a machine.

Kate

So why isn't this terrifying in the same way Mythos was?

Marcus

Because OpenAI gated it, Kate — and that's the whole strategy. Access stays locked to what they call "trusted defenders," through a verification program. You don't download GPT-5.5-Cyber; you apply for it and you get vetted. Sam Altman's framing was, "AI is already good and about to get super good at cybersecurity… we'd like to start working with as many companies as possible now." The bet is that if you arm every verified defender fast enough, defense outruns offense.

Kate

But the capability is the same as the banned model. We said that yesterday too.

Marcus

It is, Kate, and that's the open question hanging over the whole thing. Having forced Anthropic to switch off the weaker Mythos, does Washington let OpenAI keep shipping the stronger one? The difference so far isn't capability — it's posture. One lab pointed it at offense and got embargoed. The other wrapped it in a partner program and a background check and called it defense. The technology doesn't know the difference. The regulators might decide they don't either.

Kate

Quick hits. And the first one is the other half of that story, Marcus. The Anthropic ban — there's movement, and it came from the President himself.

Marcus

There is, Kate, and it's worth being precise about what changed and what didn't. Quick recap for anyone just joining: on June twelfth the administration ordered Anthropic to block every foreign national from using Fable 5 and Mythos 5, and rather than wall off its own non-citizen staff, Anthropic disabled both models worldwide. Now — Trump told Axios this week he no longer views Anthropic or Dario Amodei as a national-security threat. His exact line was, "Well, not now, but a week ago, maybe." That softening came after he met Amodei at the G7 in France on June seventeenth.

Kate

So the models are back?

Marcus

No — and that's the lesson, Kate. The softening is purely rhetorical. As of late June the Commerce directive is still legally in force, and both models remain suspended worldwide. Warm words from the President didn't restore a single account. What might actually flip the switch is paperwork: Anthropic's own identity-verification system, reportedly due July eighth, could let it confirm US citizenship and quietly restore Fable to domestic users — without the ban ever being formally lifted.

Kate

Why pull the weaker model and leave the stronger one running? That's the part I can't square.

Marcus

Because this was never purely about capability, Kate. If it were, GPT-5.5-Cyber — which now scores higher — would be the bigger worry. The honest read is that leverage and politics did at least as much work as the benchmark. Anthropic got caught in an export fight with a disputed jailbreak as the trigger. OpenAI volunteered itself as the defensive good guy with a partner program attached. Same dangerous capability, opposite government treatment — and the difference is mostly framing and timing.

Kate

Next, Marcus, the picks-and-shovels story keeps getting more concrete. SpaceX just signed a tenant for its data center, and the numbers are eye-watering.

Marcus

They really are, Kate. An open-weights startup called Reflection — founded by two ex-DeepMind researchers, Nvidia-backed, valued around twenty-five billion dollars — has agreed to pay SpaceX roughly a hundred and fifty million dollars a month for access to Nvidia's GB300 chips at the Colossus 2 data center in Memphis.

Kate

Wait — a hundred and fifty million a month?

Marcus

A month, Kate. The deal runs from July first this year through the end of 2029, worth up to six-point-three billion dollars, with a ninety-day exit clause after the first three months. And it's the third big tenant. SpaceX has already leased Colossus capacity to Anthropic — around one-and-a-quarter billion a month for over two hundred twenty thousand GPUs — and to Google, about nine hundred twenty million a month for a hundred and ten thousand. Plus Cursor.

Kate

So SpaceX is basically a compute landlord now.

Marcus

An "Oracle of compute," Kate — a neocloud renting GPUs by the month, and it's becoming one of the most reliable recurring-revenue businesses in the whole AI cycle. The detail I'd flag: Reflection is an open-weights lab. People assume open weights means cheap. It doesn't. Training the thing still costs you one of the largest infrastructure commitments on record. Giving the model away and spending six billion to build it are not in tension — they're the same strategy.

Kate

Let's get physical, Marcus. Automate 2026 is on in Chicago this week, and humanoid robots got their own spotlight for the first time.

Marcus

They did, Kate — an NVIDIA-sponsored pavilion, twenty-plus humanoid organizations on the floor. But the real shift is that these robots have stopped auditioning. Boston Dynamics has begun commercial shipments of its electric Atlas — fifty-six degrees of freedom, lifts a hundred and ten pounds, swaps its own battery. And its entire 2026 production run is already committed, to Hyundai and Google DeepMind.

Kate

Committed before it ships. So this is real demand, not a wish list.

Marcus

Real demand, Kate. And it's not just Boston Dynamics. Agility Robotics' Digit is on paid shifts — Toyota's plant in Woodstock, Ontario signed a robots-as-a-service contract for seven Digits on the RAV4 line, and Digit has already moved over a hundred thousand totes at a GXO warehouse.

Kate

What changed to make the bodies finally useful?

Marcus

The brains caught up, Kate. Vision-language-action models — NVIDIA's Isaac GR00T is the headline one — let a single robot follow plain-language instructions and chain multi-step tasks. So instead of hand-coding one body for one job, you retrain the same body for many jobs. That's the unlock. Mass-produced bodies, plus robot foundation models, plus paying customers. "Physical AI" is quietly turning into an actual industry.

Kate

Last hit, Marcus, and it's a quick update on a model we covered Monday. GLM-5.2 — there's a new benchmark result.

Marcus

Right, Kate. Monday it was beating GPT on coding at a sixth of the cost. What's new is that it now ranks near the top on general real-world work, not just code. On a leaderboard called GDPval, which scores models on actual job-shaped tasks, GLM-5.2 lands third, edging GPT-5.5 and trailing only Claude's Fable 5 and Opus 4.8. A freely downloadable model now beats a US flagship on general work, not just narrow coding benchmarks.

Kate

That's a bigger deal than the coding win, isn't it?

Marcus

It is, Kate, with one honest caveat I want on the record. These top scores lean on maximum-reasoning settings that cost more time and money than anyone uses day to day, and an Elo rating on a fixed set of tasks isn't the same as economic value in your actual workflow. So — genuinely impressive, and read the fine print. The thing about open weights, though, is anyone can independently verify the claim. With a closed model, you're taking the lab's word for it.

Kate

One to watch, Marcus.

Marcus

July eighth, Kate — Anthropic's identity-verification system. If it ships, Fable could quietly come back online for US users without Washington ever formally lifting the ban. That becomes the template for how every lab lives under export controls.

Kate

Agree, or counter?

Marcus

Agree it's the one to watch — but my counter-watch is whether the government turns its attention to OpenAI's GPT-5.5-Cyber next. It now out-benchmarks the model that got banned. If the rule is about capability, OpenAI is the bigger target. If they leave it alone, then we've learned the rule was never really about capability at all.

Kate

That's your AI in 15 for today. See you tomorrow.