16 June 2026 · 1 min read
When a $12B Clinical AI Ties Google's Free Answer Box
A Nature Medicine study found OpenEvidence, a clinical AI now valued at $12 billion, scored no better than free Google Search on blinded real-world physician queries, while general frontier models led. The sharper lesson for anyone buying clinical AI: which benchmark you trust decides the winner you see.
A medical AI now valued at $12 billion, free for verified US clinicians, scored no better than the free AI box at the top of a Google search on a real-world clinician-query test.
That is OpenEvidence, in a Nature Medicine study published Friday. NYU Langone researchers tested two specialist clinical tools, OpenEvidence and UpToDate Expert AI, against three general frontier models: GPT-5.2, Gemini 3.1 Pro Preview and Claude Opus 4.6.
The frontier models won across all three stages of the study. The specialist tools, built for doctors, finished in the bottom tier alongside Google's free Overview on the real-world clinical query test.
Now it gets interesting: is "better" actually better?
The authors did something with the benchmark hierarchy that vendors rarely do. They flagged that HealthBench, where the gap looked widest, was built by OpenAI, and treated it as supplementary. They elevated their hardest test instead: 100 de-identified physician queries pulled from a live clinical environment, scored blind by 12 clinicians.
On that test the lead is real but narrow: Gemini scored 3.62, Google's free Overview 3.27 and OpenEvidence 3.24 on a 4-point scale. Safety and hallucination flags showed no significant difference between models.
OpenEvidence is contesting the study publicly, alleging methodological flaws and an undisclosed conflict of interest. Worth watching how that resolves.
If you buy clinical AI, ask which benchmark the vendor quotes, who built it, and whether anyone outside the company has run the tool on real queries from your own clinic.
And the fight between foundation models and dedicated players has only just started.