The Invisible Infrastructure: Why Ontologies Determine Whether Your Healthcare AI Actually Works
FHIR gives you the shape of healthcare data. Ontologies give it meaning. Without both, your AI is guessing.
ThetaRho Team · May 2026 · 6 min read
Everyone in healthcare AI talks about FHIR. The HL7 standard has become the assumed foundation — the universal language that finally makes clinical data portable and queryable. And it is foundational. But FHIR alone is not enough. Not even close.
Here is the problem nobody warns you about: FHIR gives you the container. It does not give you the meaning inside it. A patient record can be perfectly FHIR-compliant and still be completely unusable for AI — wrong code systems, missing structured codes, locally invented identifiers, deprecated ICD-9 codes in old records, free text where there should be structure. FHIR says the data should be there. It does not guarantee the data means anything consistent.
That is what ontologies are for. And if you are building healthcare AI without them, you are building on sand.
Let's Start With Why Healthcare Data Is So Messy
Before we talk about ontologies, it helps to understand what AI systems are dealing with when they encounter real patient data.
A single patient’s clinical picture might live across Epic, Cerner, an athenahealth practice, a CommonWell HIE feed, claims data in ICD-10, labs from a reference lab using their own internal codes, and a stack of PDFs from a specialist who still faxes. Each of these sources uses different coding schemes, if they code at all. Each has different conformance levels. And when they all get normalized into FHIR resources, you end up with something that looks structured but contains:
- The same concept coded differently across sources (one system uses SNOMED, another uses a local mnemonic)
- Free text in fields that should be coded — a condition recorded as a note rather than a structured diagnosis
- Outdated code systems — ICD-9 codes that were never migrated
- Display text that contradicts the underlying code — often from copy-paste errors
- Codes that cannot be resolved against any canonical vocabulary
- The most clinically important information still buried in narrative notes
This is not an edge case. This is the norm. Every production healthcare AI system that works with real EHR data deals with this every day.
FHIR normalizes the shape of healthcare data. Ontologies normalize its meaning. You need both.
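To make the shape-versus-meaning gap concrete, here is a small sketch of two FHIR-shaped Condition fragments, written as Python dicts, that record the same clinical fact. The local code system and note text are invented for illustration; both fragments would pass FHIR validation, and nothing in FHIR itself links them.

```python
# Two valid FHIR-shaped Condition fragments for the same clinical fact.
# The local system and codes are illustrative, not from a real EHR.
cond_a = {
    "resourceType": "Condition",
    "code": {"coding": [{"system": "http://snomed.info/sct",
                         "code": "45007003",
                         "display": "Low blood pressure"}]},
}
cond_b = {
    "resourceType": "Condition",
    "code": {"coding": [{"system": "http://hospital.example/local-codes",
                         "code": "HYPOTN"}],
             "text": "pt hypotensive this am"},
}

# Structurally both are fine; nothing here says they mean the same thing.
assert cond_a["code"]["coding"][0]["code"] != cond_b["code"]["coding"][0]["code"]
```

Only a semantic layer that resolves both codings to one canonical concept can tell an AI system these are the same condition.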
So, What Exactly Is an Ontology?
In healthcare informatics, an ontology is a structured vocabulary that defines clinical concepts and — critically — the relationships between them. The major ones you need to know:
SNOMED CT: The master clinical vocabulary. Covers diagnoses, findings, procedures, body sites. When you need to know that ’low blood pressure’ and ’hypotension’ are the same concept, or that diabetic retinopathy is a subtype of both diabetic complication and retinal disease — that’s SNOMED.
RxNorm: The standard for medications. Not just drug names — it encodes ingredients, dose forms, strengths, and links to drug classes and mechanisms of action through MED-RT.
LOINC: The standard for laboratory tests and clinical observations. Identifies not just what was measured, but the analyte, specimen type, measurement method, and scale.
MED-RT: Drug relationship vocabulary. Tells you that valsartan belongs to the angiotensin receptor blocker class, that it may treat hypertension and heart failure, and what its mechanism of action is.
OMOP: A cross-vocabulary mapping layer. Tells you that ICD-10 code I95.9 (hypotension, unspecified) maps to SNOMED concept 45007003 (low blood pressure). The translation layer that connects everything else.
Together, these vocabularies form the semantic layer that makes clinical data queryable, aggregatable, and safe to reason over.
What Happens When You Skip Ontologies
Here is the scenario that plays out constantly in healthcare AI: a clinician or care manager asks a reasonable question — ’show me everything related to this patient's diabetes.’ The AI system goes to work. And it fails in ways that are not obvious.
The system might miss the HbA1c result because it does not know that LOINC code 4548-4 is diabetes-related. It might pull in a medication a family member was taking, mentioned in a note. It might invent a relationship between a drug and a condition based on its training data rather than the patient’s actual record. With a large patient bundle, the context window fills up and the summarization gets lossy in unpredictable ways.
And here is the dangerous part: every output looks plausible. There is no error message. The AI produces confident prose. There is no ground truth to validate against.
Without ontologies, your AI is not retrieving clinical facts. It is making educated guesses that sound like facts.
This is the hallucination problem that gets discussed abstractly in AI safety conversations. In healthcare, it is concrete: a medication missed, a complication not flagged, a clinical summary that sounds authoritative and is wrong.
What Annotation Actually Does — Four Examples
The right approach is to annotate every FHIR resource with the appropriate ontology vocabulary before the data ever reaches an LLM. Here is what that looks like in practice.
Medications
A raw FHIR MedicationRequest for valsartan 160mg gives you an RxNorm code, a dose, and a date. You can group by that specific RxNorm code. But you cannot ask ’show all ARBs this patient has been on’ or ’which patients on antihypertensives have a heart-failure indication.’
With annotation: the RxNorm code gets enriched through MED-RT to add drug class (Angiotensin Receptor Blocker), mechanism of action, and SNOMED-coded indications (heart failure, hypertension, diabetic nephropathy). The specific condition the medication was ordered for is usually already on the request itself. With the medication anchored to the disease landscape, the clinical question becomes answerable.
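The enrichment step can be sketched as a lookup against a relationship table. The table below is a toy stand-in for MED-RT, and the `annotate_medication` helper is hypothetical; only the RxNorm ingredient code for valsartan and the SNOMED codes for heart failure and hypertension are real.

```python
# Toy stand-in for MED-RT relationships, keyed by RxNorm ingredient code.
MEDRT_REL = {
    "69749": {  # RxNorm: valsartan
        "drug_class": "Angiotensin Receptor Blocker",
        "mechanism": "Angiotensin 2 Receptor Antagonists",
        "may_treat_snomed": ["84114007", "38341003"],  # heart failure, hypertension
    },
}

def annotate_medication(request: dict) -> dict:
    """Attach class/mechanism/indication annotations to a FHIR-shaped request."""
    rxnorm = request["medicationCodeableConcept"]["coding"][0]["code"]
    enriched = dict(request)
    enriched["annotation"] = MEDRT_REL.get(rxnorm, {})
    return enriched

req = {"medicationCodeableConcept": {
    "coding": [{"system": "http://www.nlm.nih.gov/research/umls/rxnorm",
                "code": "69749"}]}}
out = annotate_medication(req)
assert out["annotation"]["drug_class"] == "Angiotensin Receptor Blocker"
```

With the class and indications attached, "show all ARBs" becomes a filter on `drug_class` rather than a hand-maintained list of drug names.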
Figure: RxNorm string → drug class, mechanism, indications (ThetaRho annotation layer)
Conditions
A condition encoded only in ICD-10 — say, hypotension unspecified (I95.9) — is useful for billing. It is limited for clinical reasoning. ICD-10 was designed to describe encounters for reimbursement, not to represent disease semantics.
Annotation maps it to SNOMED (45007003, low blood pressure) via the OMOP Maps-to relationship, then walks the SNOMED hierarchy to assign it to the cardiovascular body system. Now you can ask ‘show me every cardiovascular condition across this patient’s history’. That is the question clinicians ask, and they should not care about the coding system used in the data. Without ontologies, this is a custom project per query.
Observations and Labs
LOINC codes for labs are often cryptic — the short form for an LDL calculation is ’LDLc SerPl Calc-mCnc,’ which is unreadable to anyone outside the lab. More importantly, without decomposing the LOINC parts (analyte, specimen, property, scale), you cannot reliably group related lab values or trend them across encounters.
Annotation resolves the code to canonical display and extracts the structured components. Now ’show me this patient’s LDL trend across all sources and all naming variants’ becomes a reliable query rather than a hope.
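Decomposition into LOINC parts is what makes trending reliable across naming variants. A sketch, using the two real LOINC codes for LDL cholesterol (calculated vs. direct assay); the parts table is hand-written for illustration rather than pulled from the LOINC distribution.

```python
# Hand-written LOINC part decomposition for two real LDL codes.
LOINC_PARTS = {
    "13457-7": {"analyte": "Cholesterol.in LDL", "specimen": "Ser/Plas",
                "property": "MCnc", "scale": "Qn", "method": "Calculated"},
    "18262-6": {"analyte": "Cholesterol.in LDL", "specimen": "Ser/Plas",
                "property": "MCnc", "scale": "Qn", "method": "Direct assay"},
}

def same_analyte(code_a: str, code_b: str) -> bool:
    """Two observations trend together if they measure the same analyte."""
    a, b = LOINC_PARTS.get(code_a), LOINC_PARTS.get(code_b)
    return bool(a and b and a["analyte"] == b["analyte"])

# Calculated and directly assayed LDL belong on the same trend line:
assert same_analyte("13457-7", "18262-6")
```

Grouping on the analyte part, rather than on the raw code or display string, is what turns "LDL trend across all sources" into a deterministic query.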
Clinical Notes
Notes are where the richest clinical information lives — and where structured coding almost never exists. A cardiology consult note is a blob of HTML containing diagnoses, medications, history, and clinical reasoning, none of it queryable.
Clinical NER (named entity recognition) extracts mentions of conditions and drugs from the narrative text, runs them through the same ontology resolver, and anchors the note to SNOMED disease codes. A note mentioning ’chest pain,’ ’hypertension,’ and ’metoprolol’ becomes searchable by cardiovascular system, just like a structured Condition resource would be.
Critically, assertion detection (negation, family history, uncertainty) filters out ’denies chest pain’ and ’family history of MI’ — keeping the signal clean.
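The extract-resolve-filter pipeline can be sketched with a toy lexicon matcher. A production system would use a clinical NER model and a proper assertion classifier; here a phrase lexicon and a crude cue check stand in for both. The SNOMED codes for chest pain and hypertension are real; everything else is illustrative.

```python
import re

# Toy lexicon mapping surface phrases to SNOMED codes (real codes).
LEXICON = {"chest pain": "29857009", "hypertension": "38341003"}
# Crude assertion cues standing in for a real negation/family-history model.
CUES = ("denies", "no evidence of", "negative for", "family history of")

def extract_conditions(note: str) -> list[str]:
    """Return SNOMED ids for positively asserted patient conditions."""
    found = []
    for sentence in re.split(r"[.;\n]", note.lower()):
        for phrase, snomed in LEXICON.items():
            if phrase in sentence:
                if any(cue in sentence for cue in CUES):
                    continue  # dropped by assertion detection
                found.append(snomed)
    return found

note = "Patient denies chest pain. Longstanding hypertension, on metoprolol."
assert extract_conditions(note) == ["38341003"]  # chest pain filtered out
```

The point of the sketch is the shape of the pipeline: mention detection, ontology resolution, then assertion filtering, so that only positively asserted patient conditions reach the annotation layer.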
The SNOMED Pivot: How Everything Connects
The deeper purpose of all this annotation is to converge everything — conditions, medications, labs, procedures, notes — onto a single semantic axis: SNOMED. SNOMED’s hierarchical structure is what allows clinical questions to be answered by traversing a disease graph rather than matching raw codes.
When a clinician asks about a patient’s diabetes, the system resolves ’diabetes’ to SNOMED node 44054006, then retrieves all resources tagged with that node or its descendants. Conditions that are SNOMED-native get there directly. ICD-10 codes get there via OMOP mapping. Medications get there via order indications — the reasonCode on a MedicationRequest links to the condition being treated, which then maps to SNOMED. Labs get there via ServiceRequest reasoning. Notes get there via NER extraction.
The result: the retrieval agent can cover the entire patient record, structured and unstructured, with much greater confidence, regardless of which code system was used to enter the original data.
One disease query. Every resource type. That is what ontologies enable.
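The pivot itself can be sketched in a few lines: resolve the query term to a SNOMED node, expand to its descendants, then collect every annotated resource tagged inside that subtree. The hierarchy edge and the resources below are illustrative; only the SNOMED codes for type 2 diabetes and hypertension are real.

```python
# Illustrative subtree edge (real codes, simplified relationship).
CHILDREN = {"44054006": ["child-dm-complication"], "child-dm-complication": []}

# Annotated resources of mixed types, each tagged with a SNOMED node.
RESOURCES = [
    {"id": "cond-1", "type": "Condition",   "snomed": "44054006"},
    {"id": "obs-1",  "type": "Observation", "snomed": "child-dm-complication"},
    {"id": "cond-2", "type": "Condition",   "snomed": "38341003"},  # hypertension
]

def descendants(root: str) -> set[str]:
    """Collect the root node and everything below it in the hierarchy."""
    out, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in out:
            out.add(node)
            stack.extend(CHILDREN.get(node, []))
    return out

def retrieve(root: str) -> list[str]:
    """One disease query across every resource type tagged into the subtree."""
    subtree = descendants(root)
    return [r["id"] for r in RESOURCES if r["snomed"] in subtree]

assert retrieve("44054006") == ["cond-1", "obs-1"]  # hypertension excluded
```

The same traversal serves conditions, labs, medications, and notes, because annotation has already forced them all onto the one axis.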
Why This Matters Specifically for LLM-Based AI
The pattern above — annotate, pivot through SNOMED, retrieve the relevant slice, then summarize — is what makes LLM-based clinical AI trustworthy rather than merely impressive.
When the LLM receives a scoped, ontology-grounded slice of the patient record relevant to the specific clinical question, three things become possible that are otherwise not:
Bounded context. The model only sees what is clinically related to the question. A high-utilizer patient with thousands of FHIR resources does not blow the context window because the retrieval is precise.
Groundable output. Every sentence in the summary can be linked back to a specific annotated FHIR resource. Citations come for free. The model is paraphrasing known facts, not constructing plausible narratives.
Testable behavior. You can write regression tests with real assertions: ’if metformin is in the diabetes slice, the summary must reference metformin.’ When the output is wrong, you can diagnose whether the failure is in annotation (data/annotation gap) or query (retrieval gap) or generation (model hallucination). All are debuggable. There is no mystery.
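The regression assertion described above can be written directly. The `summarize` function here is a trivial stand-in for the actual LLM call; the test shape is the point, not the implementation.

```python
def summarize(slice_resources: list[dict]) -> str:
    """Placeholder for the LLM summarization step."""
    names = [r["display"] for r in slice_resources]
    return "Active diabetes management includes " + ", ".join(names) + "."

def test_metformin_is_referenced():
    # If metformin is in the retrieved diabetes slice, the summary must say so.
    diabetes_slice = [{"type": "MedicationRequest", "display": "metformin"}]
    summary = summarize(diabetes_slice)
    assert "metformin" in summary.lower(), "retrieval gap or generation gap"

test_metformin_is_referenced()
```

When an assertion like this fails, the grounding links tell you which layer broke: the annotation, the retrieval, or the generation.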
This is the difference between AI that ships to production and AI that looks good in a demo.
The Honest Takeaway
Healthcare AI is not hard because LLMs are not capable. It is hard because clinical data is semantically incomplete by default, and most AI architectures treat data cleaning as an afterthought rather than a prerequisite.
Ontologies — SNOMED, RxNorm, LOINC, MED-RT, OMOP — are not academic standards maintained by committees. They are the reason a question like ’what cardiovascular medications is this patient on’ can be answered reliably across a record that was built by a dozen different systems over a decade.
FHIR is the substrate. Ontologies are the semantics. LLMs are the interface. All three must work together. Skipping the middle layer does not simplify the architecture — it just pushes the failure mode downstream, where it is harder to see and harder to fix.
The AI is not the hard part. Getting the data ready for the AI is.
This post is part of The Clarity Protocol, ThetaRho’s ongoing series on AI, clinical workflow, and healthcare data. Next in this series we will look into “Four Things AI Can Actually Do in Healthcare Right Now”.
The Clarity Protocol
ThetaRho AI / thetarho.ai / Honest conversations about healthcare data infrastructure, zero hype.
