21 July 2025 · 1 min read
Is bigger always better when it comes to data in AI?
I explore why bigger isn't always better when it comes to data in AI — and why, in healthcare especially, the real problem is data quality, not quantity. Curated, structured, interoperable data is what will make trustworthy clinical AI possible.
Last updated
6 May 2026
Is bigger always better when it comes to data in AI?
Not according to Scott Wu (CEO, @Cognition). In his recent conversation with Harry Stebbings on the 20VC podcast, he predicted that the future isn't ultra-large datasets, but "a small set of highly curated data for exactly the use case that you care about."
This really hit home. In my work at SNOMED International, I see firsthand how the lack of curated data holds healthcare back.
For years, we've known that healthcare data has a quality problem, not a quantity problem. To build trustworthy clinical AI, we need to move from a sea of unusable information to clean, structured, and interoperable data. That's where standards like SNOMED CT come in.