19 February 2026 · 1 min read
Clinical LLM benchmarks: why SNOMED CT mapping is a real-world test
The AI-in-healthcare debate keeps swinging between “AI is going to take over everything” and “AI is useless in a medical setting.” Neither is useful. What the field actually needs is external, neutral benchmarks on specific clinical tasks. Congrats to Rory Davidson and team for bringing one to market.
Filed under Clinical AI
TL;DR
We constantly read either “AI is going to take over everything” or “AI is useless in a medical setting”. That makes external, neutral benchmarks of LLM performance in specific clinical contexts crucial, for example structuring data from unstructured sources into SNOMED CT, the leading international clinical data standard. Congrats to Rory Davidson and team for bringing this to market.
In a world where we constantly read either “AI is going to take over everything” or “AI is useless in a medical setting”, it is crucial to have external neutral benchmarks evaluating the performance of LLMs for a specific context: helping structure data from unstructured sources to SNOMED CT, the leading international clinical data standard. Congrats to Rory Davidson and the team for bringing this to the market!
Key takeaways
- The public debate on AI in healthcare swings between two unhelpful extremes: total takeover or total uselessness. Neither helps practitioners decide anything.
- The corrective is external, neutral benchmarks on specific clinical tasks. Without them, evaluation stays at the level of vibes.
- Mapping unstructured clinical data to SNOMED CT is a concrete, high-value use case where LLM performance can and should be measured rigorously.
- SNOMED CT remains the leading international standard for clinical data, which makes it a meaningful benchmark target.
- Recognising teams that build this kind of benchmark infrastructure matters. It signals that rigorous evaluation is valued by the field.
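To make "measured rigorously" concrete, here is a minimal sketch of one way a mapping task could be scored: exact-match accuracy of predicted SNOMED CT concept IDs against a reference set. This is not the benchmark's actual method, which the post does not describe. The `Example` class, the `exact_match_accuracy` function, the sample phrases, and the model predictions are all hypothetical, and the concept IDs are illustrative and should be checked against a current SNOMED CT release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    text: str       # unstructured clinical phrase (made up for illustration)
    gold_code: str  # reference SNOMED CT concept ID (illustrative, verify before use)


def exact_match_accuracy(examples, predictions):
    """Return the fraction of examples whose predicted code equals the gold code."""
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")
    if not examples:
        return 0.0
    correct = sum(
        1 for ex, pred in zip(examples, predictions) if ex.gold_code == pred
    )
    return correct / len(examples)


# Hypothetical reference set: free-text phrase -> expected concept ID.
examples = [
    Example("pt had a heart attack in 2019", "22298006"),
    Example("known diabetic", "73211009"),
    Example("high blood pressure", "38341003"),
]

# Hypothetical model output; the last prediction is deliberately wrong.
predictions = ["22298006", "73211009", "195967001"]

accuracy = exact_match_accuracy(examples, predictions)
print(f"exact-match accuracy: {accuracy:.2f}")
```

Exact match is the strictest possible choice. A real benchmark would likely also need to decide how to treat near misses, such as a parent or child concept in the SNOMED CT hierarchy, which this sketch ignores.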