I will be attending a ‘Fresh Look’ meeting in Washington next week. The idea is to make some progress on the topic of ‘detailed clinical models’ (DCMs). Some of the goals include setting up a repository of DCMs, establishing governance, and defining a roadmap for tooling. Underlying all this is a huge list of formalisms and models, including OWL, UML, ADL, HL7 MIF, XSD, LRA, RMIMs, CDA templates, greenCDA and so on.
Are models like Chinese food?
Some people in health informatics appear to believe that ‘models’ and ‘formalisms’ are just a detail to be worked out later by software developers, and that any combination of models can be interchanged, combined and converted, rather like the dishes on the menu of a cheap Chinese restaurant. Nothing could be further from the truth.
Models are essentially formalised structural expressions – they are mathematical in nature. There are not only different models, but different kinds of models. Indeed, models cannot be sensibly discussed without talking about things like:
- the overall modelling framework, in which different types of models serve distinct functions;
- the underlying formalisms, each of which has its own philosophy and mathematical properties;
- the quality of the individual models;
- whether the models in a framework are mutually coherent.
For these reasons, we cannot compare apples with oranges, e.g. XSD (a data exchange schema definition language) with OWL (a description-logic based ontology language), nor can we just assume that ‘EHR models’ such as openEHR/13606, CDA and CCR are interchangeable.
We need at least three things to make models work:
- a framework – i.e. a theoretical system of formalisms that will satisfy overall needs
- an architecture – i.e. base models and patterns which provide the basis for modelling in a certain domain
- models – the specific models for the problem at hand.
No conversation about ‘models’ can be sensible unless people are talking within the same framework (or understand the differences between frameworks), and within the same architecture (or they understand the differences across architectures).
A modelling framework for DCMs
What kind of thing can be used as a modelling framework for building content models in e-health? As discussed in a previous post, we have found in openEHR over the last few years that such a framework consists of three irreducible levels of modelling, plus two further key elements:
- modelling level #1: the reference model – this defines information as persisted, shared etc, and is expressed as an object-oriented model;
- modelling level #2: archetypes – data point / group definitions – e.g. define the possible items in a ‘systemic arterial BP recording’;
- modelling level #3: templates – data sets defined by aggregating and refining archetypes – e.g. to define a clinical document, pathology report, or any other use-case specific data set;
- terminology interface: a way of formally connecting elements in the three levels of information modelling to terminologies
- query formalism: AQL – a query language based not on physical database schemas, but on the archetypes, enabling queries to be defined alongside clinical models.
These elements are shown below.
You can’t get rid of the reference model, because it defines the concrete form of the data (like Quantity, CodedText, Observation etc). You can’t get rid of archetypes because they enable you to define a library of clinical content elements like ‘blood pressure – systolic’ and groups like ‘diagnosis – occurrences’ once, and you can’t get rid of templates because they are where you put the archetype bits together to make real-world data sets, define messages, forms etc. If you get rid of archetype-based querying, you are stuck with SQL statements defined against some concrete schema. And without binding to terminology, you can’t state the relationship of terms and ref sets to information structures.
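As a rough illustration of why the three levels are distinct, here is a minimal Python sketch. It is not the openEHR reference model or ADL: the class names `QuantityRange`, `Archetype` and `Template`, the simplified archetype identifier and the numeric ranges are all assumptions made up for the example. The point it tries to show is that the reference model carries no clinical content, the archetype constrains it once, and the template selects and narrows archetype items for one use case.

```python
from dataclasses import dataclass, field


# Level 1: reference model. Generic data types with no clinical content.
@dataclass(frozen=True)
class Quantity:
    magnitude: float
    units: str


# Level 2: archetype. Constraints on reference-model instances, defined once.
@dataclass(frozen=True)
class QuantityRange:  # hypothetical constraint type, for illustration only
    units: str
    low: float
    high: float

    def allows(self, value: Quantity) -> bool:
        return value.units == self.units and self.low <= value.magnitude <= self.high

    def contains(self, other: "QuantityRange") -> bool:
        return (other.units == self.units
                and self.low <= other.low and other.high <= self.high)


@dataclass
class Archetype:
    archetype_id: str  # simplified, hypothetical identifier
    items: dict        # item name -> QuantityRange


# Level 3: template. Picks archetype items for one use case and may only
# narrow their constraints, never widen them.
@dataclass
class Template:
    name: str
    archetype: Archetype
    included: list
    narrowed: dict = field(default_factory=dict)

    def constraint_for(self, item: str) -> QuantityRange:
        base = self.archetype.items[item]
        override = self.narrowed.get(item, base)
        if not base.contains(override):
            raise ValueError(f"template widens archetype constraint on {item!r}")
        return override

    def validate(self, data: dict) -> list:
        """Return a list of problems; an empty list means the data is valid."""
        problems = []
        for item in self.included:
            if item not in data:
                problems.append(f"missing {item}")
            elif not self.constraint_for(item).allows(data[item]):
                problems.append(f"{item} out of range")
        return problems


# Illustrative ranges only; not taken from any published archetype.
bp = Archetype(
    "blood_pressure.v1",
    {"systolic": QuantityRange("mm[Hg]", 0, 1000),
     "diastolic": QuantityRange("mm[Hg]", 0, 1000)},
)
triage = Template("triage_form", bp, ["systolic"],
                  {"systolic": QuantityRange("mm[Hg]", 0, 400)})

ok = triage.validate({"systolic": Quantity(120, "mm[Hg]")})
bad = triage.validate({"systolic": Quantity(500, "mm[Hg]")})
print(ok, bad)
```

Note that the same `Quantity` class would serve any archetype, and the same `bp` archetype would serve any number of templates; that reuse is the reason none of the levels can be folded into another.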
A realistic modelling architecture
To make the framework really work for us, we need to define the various levels of model authoring and sharing, and add some tooling so that we can get useful software artefacts. The following picture shows a real-world architecture as used in openEHR.
The key to this architecture is the computable use of templates to create an ‘operational template’ (as if you had built a single custom model on its own), from which software APIs, message definitions, XML schemas and so on can be generated. These downstream artefacts are what software developers work with. This presentation tells the story in a bit more detail.
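The idea of compiling a template into a standalone operational template, and then generating a downstream artefact from it, can be sketched as follows. This is a toy under stated assumptions, not the openEHR operational template format: the dictionary layouts, the simplified archetype identifiers, and the functions `compile_operational_template` and `generate_schema` are all hypothetical, and the generated artefact is a JSON-Schema-like dictionary rather than a real XML schema.

```python
import json

# Hypothetical archetype library: archetype id -> {item path: constraint}.
ARCHETYPES = {
    "blood_pressure.v1": {
        "systolic": {"type": "Quantity", "units": "mm[Hg]"},
        "diastolic": {"type": "Quantity", "units": "mm[Hg]"},
        "position": {"type": "CodedText"},
    },
    "heart_rate.v1": {
        "rate": {"type": "Quantity", "units": "/min"},
    },
}

# Hypothetical template: the archetype items a "vital signs" data set uses.
TEMPLATE = {
    "name": "vital_signs",
    "uses": {
        "blood_pressure.v1": ["systolic", "diastolic"],
        "heart_rate.v1": ["rate"],
    },
}


def compile_operational_template(template, archetypes):
    """Flatten a template and its archetypes into one standalone model."""
    fields = {}
    for archetype_id, items in template["uses"].items():
        for item in items:
            # Copy the constraint so the result no longer depends on the library.
            fields[f"{archetype_id}/{item}"] = dict(archetypes[archetype_id][item])
    return {"name": template["name"], "fields": fields}


def generate_schema(opt):
    """Generate one downstream artefact: a JSON-Schema-like dictionary."""
    properties = {}
    for path, constraint in opt["fields"].items():
        if constraint["type"] == "Quantity":
            properties[path] = {
                "type": "object",
                "properties": {
                    "magnitude": {"type": "number"},
                    "units": {"const": constraint["units"]},
                },
                "required": ["magnitude", "units"],
            }
        else:
            properties[path] = {"type": "string"}
    return {"title": opt["name"], "type": "object", "properties": properties}


opt = compile_operational_template(TEMPLATE, ARCHETYPES)
schema = generate_schema(opt)
print(json.dumps(schema, indent=2))
```

The design choice worth noticing is that `generate_schema` only ever sees the flattened result. A second generator (for an API stub or a message definition, say) could be written against the same structure without knowing how archetypes and templates were combined.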
Do this framework and the openEHR architecture actually work? We can say today that they do, since every element has been put into production in real systems, and it works as intended. Tools like the Archetype Workbench show how the internals of the model types relate to each other. Some of the outcomes:
- all data in all openEHR systems anywhere in the world are instances of the one reference model – that means we can build and deploy an openEHR back-end system without knowing anything about the archetypes or templates it might one day handle; it also means data can be shared with impunity;
- software developers have been able to work with downstream XML schemas and APIs fully generated from operational templates;
- converters in and out of exchange formats like CDA (and soon ‘green CDA’) have been demonstrated, also based on operational templates;
- the querying capability (Archetype Query Language) has turned out to be of central importance, underlying most screens and all reporting.
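The first and last outcomes above can be illustrated together with a small sketch: a back end that stores only generic entries, and a query expressed against an archetype and an item path rather than a physical schema. This is not AQL syntax and not an openEHR API; `Entry`, `GenericStore` and `select` are hypothetical names, and the archetype identifiers and values are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """A generic stored item, tagged with the archetype it conforms to."""
    patient_id: str
    archetype_id: str  # simplified, hypothetical identifier
    values: dict       # item path -> value


class GenericStore:
    """A back end that knows nothing about any particular archetype."""

    def __init__(self):
        self._entries = []

    def commit(self, entry):
        self._entries.append(entry)

    def select(self, archetype_id, path, where=lambda value: True):
        """Yield (patient_id, value) pairs addressed by archetype and path."""
        for entry in self._entries:
            if entry.archetype_id == archetype_id and path in entry.values:
                value = entry.values[path]
                if where(value):
                    yield entry.patient_id, value


store = GenericStore()
store.commit(Entry("p1", "blood_pressure.v1", {"systolic": 150, "diastolic": 95}))
store.commit(Entry("p2", "blood_pressure.v1", {"systolic": 118, "diastolic": 76}))
store.commit(Entry("p1", "heart_rate.v1", {"rate": 72}))

# The query names clinical content, not tables or columns.
high_systolic = list(
    store.select("blood_pressure.v1", "systolic", lambda value: value >= 140)
)
print(high_systolic)
```

Adding a new archetype here means committing entries with a new `archetype_id`; neither `GenericStore` nor `select` has to change, which is the property the first outcome describes.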
Is it all plain sailing? Certainly not. For one thing, the downstream tool generators need to be more powerful and more rule-driven, for better flexibility. AQL needs to be better connected with terminologies – which requires standards for terminology services to mature. Many lessons learned on the way have created the need for ADL 1.5, the latest version of the archetype formalism, now well into testing.
But the overall evidence is of significant savings in development effort, and a quantum leap in flexibility, as well as in the ability to compute with health information.
In any discussion such as the ‘Fresh Look’ initiative, framework and architecture have to be discussed and understood. It may well be the case that other participants don’t agree with the above architecture. However, they will need to think about a framework and architecture in order to make any comparisons between models meaningful.
Next: what is the meaning of the ‘reference model’ (aka information model) in this framework?