Google Developer Trust and the Cost of Gemini 3.5 Flash

Google has all the parts.

Models. Infrastructure. Talent. Distribution. Research. A cloud business. A developer ecosystem big enough that even its half-finished ideas get instant attention.

And yet the developer story keeps coming out wrong.

Today, Gemini 3.5 Flash launched at Google I/O 2026 with the kind of benchmark sheet that looks engineered to end arguments. Fast. Agent-focused. Pushed across Google’s developer and consumer AI surfaces, including the Gemini API and Antigravity. The generous read is simple: Google finally has a Flash model built for the agent era.

Fine.

Now price the thing.

Cost comparison: sticker price tripled, real task cost got worse.

Real task cost (AA Intelligence Index), relative to Gemini 3 Flash:

Gemini 3 Flash      1x       (baseline)
Gemini 3.1 Pro      3.1x     (about 3.1x the baseline)
Gemini 3.5 Flash    5.5x     ($1,552 to run the AA Intelligence Index)

Sticker price per 1M tokens:

Model               Input    Output
Gemini 3 Flash      $0.50    $3.00
Gemini 3.5 Flash    $1.50    $9.00

Gemini 3.5 Flash carries a 3x higher sticker price than Gemini 3 Flash, before extra task cost shows up.

Gemini 3.5 Flash is listed at $1.50 per million input tokens and $9 per million output tokens. That is already a strange sentence for a Flash model. Then the effective cost gets worse, because the model burns enough tokens that Artificial Analysis found running its Intelligence Index cost more than 5x what it cost on Gemini 3 Flash, and roughly 75% more than on Gemini 3.1 Pro.

That is the trick.

A model can be cheaper per token and still more expensive per task. A 2026 paper calls this the price reversal phenomenon: lower listed prices can lose to higher reasoning-token consumption. Developers do not buy tokens in the abstract. They buy completed work. If the cheap-looking model chews through more output to reach the answer, the sticker price is bait.
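A rough sketch of that arithmetic, under stated assumptions. The prices and the 5.5x and 3.1x task-cost figures are the ones quoted above. Everything else is hypothetical: the `Pricing` and `task_cost` names, the 10,000-input / 2,000-output token mix, and the assumption that the extra tokens grow input and output by the same factor. It is an illustration of how a 3x sticker price becomes a 5.5x bill, not a reconstruction of how Artificial Analysis measures cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


def task_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one task: tokens used times the listed per-token price."""
    return (input_tokens * p.input_per_m + output_tokens * p.output_per_m) / 1_000_000


# Listed prices quoted in the article.
gemini_3_flash = Pricing(input_per_m=0.50, output_per_m=3.00)
gemini_35_flash = Pricing(input_per_m=1.50, output_per_m=9.00)

# Sticker ratio is the same for input and output, so the token mix does not matter here.
sticker_ratio = gemini_35_flash.output_per_m / gemini_3_flash.output_per_m

# Reported real-task cost ratios relative to Gemini 3 Flash.
reported_task_ratio = 5.5
reported_pro_ratio = 3.1

# If price explains 3x of a 5.5x bill, token consumption has to explain the rest.
implied_token_ratio = reported_task_ratio / sticker_ratio

# Cross-check against the "roughly 75% costlier than 3.1 Pro" claim.
vs_pro_ratio = reported_task_ratio / reported_pro_ratio

# Hypothetical task: the token counts are made up purely for illustration.
base_in, base_out = 10_000, 2_000
old_cost = task_cost(gemini_3_flash, base_in, base_out)
new_cost = task_cost(
    gemini_35_flash,
    round(base_in * implied_token_ratio),
    round(base_out * implied_token_ratio),
)

print(f"sticker ratio:       {sticker_ratio:.2f}x")
print(f"implied token ratio: {implied_token_ratio:.2f}x")
print(f"cost vs 3.1 Pro:     {vs_pro_ratio:.2f}x")
print(f"per-task cost:       ${old_cost:.4f} -> ${new_cost:.4f}")
```

Under these assumptions the model needs a bit under twice the tokens per task for the numbers to line up. That multiplier never appears on a pricing page.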

Speed does not save that story.

If a model spits tokens fast while spilling too many of them, you did not get efficiency. You got a faster leak.

The CLI Betrayal

The Gemini CLI story is uglier because it is not about one model price.

It is about trust.

Google’s own transition post says Gemini CLI had over 100,000 GitHub stars, 6,000 merged PRs, and hundreds of contributors. That is not a toy. That is community investment.

Tool migration: from community gravity to product gravity

Gemini CLI: open tool earns developer routines. Stars, pull requests, extensions, scripts, and muscle memory pile up around the old surface.

May 2026: Antigravity becomes the destination. Google frames the move as one agent-first platform with a new terminal experience.

June 18: consumer access gets cut over. Pro, Ultra, and free individual users lose the old request path and get pushed toward the new one.

Then Google announced the transition to Antigravity CLI. On June 18, 2026, Gemini CLI and Gemini Code Assist IDE extensions stop serving requests for Google AI Pro, Ultra, and free individual users.

You can dress that up as unification. Developers will read it as a rug pull.

The old tool was open, familiar, and already wired into people’s workflows. The new direction pushes Pro, Ultra, and free individual users toward Antigravity, a broader product surface with different limits, different assumptions, and less community control.

That is how large companies burn goodwill. Not in one dramatic act. In a memo that says the replacement is better for you.

Maybe Antigravity becomes great. Maybe Gemini 3.5 Flash becomes the right model for some agent workloads. The positive case exists, and Google is not wrong to aim at agents instead of chatbots.

But developer trust is not measured in launch demos. It is measured in what happens after people build on your thing.

Google keeps asking developers to trust the new surface while retiring the surface they already trusted.

That math does not work.

Railway Is The Part You Cannot Hand-Wave Away

The Railway outage is where the argument stops being about taste.

Outage chain: how one account state became a platform outage

May 19, 22:20 UTC: Google Cloud suspends the production account. Railway says the account was incorrectly placed into suspended status.

Cache expiry: the blast radius leaves Google Cloud. Route caches expired, so workloads on Railway Metal and AWS became unreachable too.

May 20, 06:14 UTC: incident moves to monitoring. Roughly eight hours later, the platform was no longer in active outage recovery.

On May 19, 2026, Google Cloud incorrectly placed Railway’s production account into suspended status. Railway says the outage lasted from 22:20 UTC on May 19 to about 06:14 UTC on May 20. That is roughly eight hours of platform-wide damage from an account state that should never have happened.

Railway is not blameless.

Its own incident report makes the architecture clear: a Google-hosted control plane supported the dashboard, API, and parts of the network infrastructure. Once route caches expired, workloads outside Google Cloud could become unreachable too.
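The failure shape is easier to see in miniature. What follows is a toy model, not Railway's implementation: `RouteCache`, `fetch_route`, the 300-second TTL, and the hostnames are all invented for illustration. The only thing taken from the incident report as described above is the pattern itself, where routing data comes from a control plane in one cloud and is cached with an expiry by whatever fronts workloads running elsewhere.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


class ControlPlaneDown(Exception):
    """Raised when the (hypothetical) control plane cannot be reached."""


@dataclass
class RouteCache:
    fetch: Callable[[str], str]   # asks the control plane where a host lives
    ttl_seconds: float            # how long a cached route stays valid
    entries: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def resolve(self, host: str, now: float) -> Optional[str]:
        """Return the backend for host, or None if it cannot be routed."""
        cached = self.entries.get(host)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh: no control plane call needed
        try:
            backend = self.fetch(host)
        except ControlPlaneDown:
            # The workload may be perfectly healthy; we just no longer know where it is.
            return None
        self.entries[host] = (backend, now + self.ttl_seconds)
        return backend


# Shared flag standing in for the control plane's availability.
state = {"control_plane_up": True}


def fetch_route(host: str) -> str:
    if not state["control_plane_up"]:
        raise ControlPlaneDown(host)
    return "aws-backend-1"  # the workload itself runs outside the suspended cloud


cache = RouteCache(fetch=fetch_route, ttl_seconds=300)

before = cache.resolve("app.example", now=0)        # control plane up, route cached
state["control_plane_up"] = False                   # the account gets suspended
during_ttl = cache.resolve("app.example", now=200)  # cache still fresh
after_ttl = cache.resolve("app.example", now=301)   # cache expired, lookup fails

print(before, during_ttl, after_ttl)
```

In this sketch the backend never goes down. The route to it does, and only after the TTL runs out, which is why an outage of this kind can appear to spread with a delay.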

That matters. If you sell a platform, customers do not care which dependency betrayed you first. They care that your platform went down.

But the dependency still matters.

Google Cloud incorrectly suspended the account. That is the blasting cap. Railway’s architecture spread the fire. Both facts can be true, and the second one does not excuse the first.

A cloud provider should not be the thing that randomly turns off a company spending serious money and operating real infrastructure. Once that happens, every “build on us” pitch gets quieter.

This Was Not A One-Off

Google Cloud has history here.

In 2024, Google Cloud deleted UniSuper’s private cloud subscription after an inadvertent provisioning misconfiguration. UniSuper was not a hobby project. It was a major Australian pension fund, and the failure affected hundreds of thousands of members.

That is the kind of cloud incident that should haunt every architecture review.

The uncomfortable lesson is not “never use Google Cloud.” Plenty of teams will use it and survive just fine.

The lesson is harsher: Google can be too big, too automated, and too internally fragmented to behave like a dependable partner when the failure mode gets weird.

That is exactly the fear developers have around Google’s AI tools too.

The model can look great. The CLI can look promising. The cloud primitive can look perfect for your architecture. Then the product surface shifts, the account system misfires, the team changes, the community path closes, or the tool you trusted gets folded into a different strategy.

Trust fracture

Model: fast benchmark story, messy task economics. Gemini 3.5 Flash looks quick until the task bill arrives.

Tool: community tool, product migration. Gemini CLI trust gets routed into Antigravity.

Cloud: infrastructure promise, account-level failure. Railway and UniSuper make the fear concrete.

Good parts. Bad machine.

Benchmarks Cannot Buy Back Trust

Google keeps showing numbers.

Some of the numbers are good. Gemini 3.5 Flash is fast. It is agent-focused. It belongs in the conversation.

But numbers do not repair a trust problem.

If the model looks cheap until task cost shows up, developers notice. If the open tool gets replaced by a closed funnel, developers notice. If a cloud account can be suspended incorrectly and take a platform with it, developers notice.

This is the bill for treating developers as distribution instead of partners.

Open-source contributors are not a launch asset. They are not a migration funnel. They are people who gave you code, bug reports, workflows, and credibility. When you retire their path into a closed product, you do not get to act surprised when the room gets colder.

Cloud customers are not test cases for account automation. They are companies with customers of their own. When your system suspends the wrong account, the damage does not stay inside your dashboard.

AI users are not impressed forever by benchmark charts. They eventually run the thing, pay the bill, and decide whether the workflow got better.

That is where Google keeps losing the plot.

The Exit Door Is Getting Easier To See

The case against Google’s developer AI stack is not that every product is bad.

That would be too easy, and it would be false.

The case is that Google has become hard to trust as the owner of the whole experience. The model story, the CLI story, and the cloud story all point in the same direction: impressive components trapped inside a machine that keeps making developer-hostile choices.

The competition does not need to be perfect to benefit from that. OpenAI does not need to be saintly. Anthropic does not need to be cheap. Cursor does not need to own the whole stack.

They just need to feel less likely to waste your trust.

Google can still fix this. It has the people. It has the infrastructure. It has the research bench. It has more raw capability than almost anyone.

But that has never been the problem.

The problem is what happens after the good people build something worth trusting.

Too often, Google gets its hands on it.