# Clinical LLM benchmarks: why SNOMED CT mapping is a real-world test

> The AI-in-healthcare debate keeps swinging between “AI is going to take over everything” and “AI is useless in a medical setting.” Neither is useful. What the field actually needs are external neutral benchmarks on specific clinical tasks. Congrats to Rory Davidson and team for bringing one to market.

URL: https://www.ch-healthtech.com/insights/world-where-we-constantly-either-read-ai-going-take-over-everything-ai-useles
Markdown: https://www.ch-healthtech.com/insights/world-where-we-constantly-either-read-ai-going-take-over-everything-ai-useles.md
Published: 2026-02-19
Updated: 2026-05-06
Author: Christian Hein
Tags: technology/artificial-intelligence, technology/digital-health, technology/foundation-models, function/innovation-management, function/regulatory-compliance, geography/europe

---


## TL;DR

We constantly read either “AI is going to take over everything” or “AI is useless in a medical setting”. That makes external, neutral benchmarks crucial: ones that evaluate LLM performance in a specific clinical context, such as structuring data from unstructured sources into SNOMED CT, the leading international clinical data standard. Congrats to Rory Davidson and team for bringing this to market.

## Key takeaways

- The public debate on AI in healthcare swings between two unhelpful extremes: total takeover or total uselessness. Neither helps practitioners decide anything.
- The corrective is external, neutral benchmarks on specific clinical tasks. Without them, evaluation stays at the level of vibes.
- Mapping unstructured clinical data to SNOMED CT is a concrete, high-value use case where LLM performance can and should be measured rigorously.
- SNOMED CT remains the leading international standard for clinical data, which makes it a meaningful benchmark target.
- Recognising teams that build this kind of benchmark infrastructure matters. It signals that rigorous evaluation is valued by the field.
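
To make "measured rigorously" concrete, here is a minimal sketch of how a mapping task could be scored with exact-match accuracy. It is an illustration only, not the method used by the benchmark mentioned above, whose methodology this post does not describe. The names `Example`, `exact_match_accuracy`, and `keyword_baseline` are hypothetical, the keyword lookup merely stands in for a model, and the codes are placeholders rather than real SNOMED CT identifiers.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Example:
    text: str       # unstructured clinical snippet
    gold_code: str  # reference concept code (placeholder values in this sketch)


def exact_match_accuracy(
    examples: Sequence[Example], predict: Callable[[str], str]
) -> float:
    """Share of examples where the predicted code equals the reference code."""
    if not examples:
        raise ValueError("benchmark set is empty")
    hits = sum(1 for ex in examples if predict(ex.text) == ex.gold_code)
    return hits / len(examples)


# Hypothetical toy data: placeholder codes, NOT real SNOMED CT identifiers.
TOY_SET = [
    Example("pt reports chest pain on exertion", "C-0001"),
    Example("hx of type 2 diabetes", "C-0002"),
    Example("no known drug allergies", "C-0003"),
    Example("fractured left wrist after fall", "C-0004"),
]


def keyword_baseline(text: str) -> str:
    """Stand-in for a model: a trivial keyword lookup used to exercise the scorer."""
    rules = {"chest pain": "C-0001", "diabetes": "C-0002", "wrist": "C-0004"}
    for keyword, code in rules.items():
        if keyword in text:
            return code
    return "UNMAPPED"


score = exact_match_accuracy(TOY_SET, keyword_baseline)
print(f"exact-match accuracy: {score:.2f}")
```

Any system under evaluation, LLM or otherwise, would be passed in as `predict`, so different systems are scored against the same fixed reference set. A real benchmark would likely need more than exact match, for example partial credit for closely related concepts, but that is beyond what this post covers.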