FHIR fixes: why a type hierarchy would help

One of the principal reasons why I and others are proposing (some) type hierarchy in the FHIR Admin resources is as follows (see my earlier post on this). Working Groups (i.e. committees) building Resources are currently in the situation of defining Elements in a Resource, i.e. defining name, type, cardinality etc. The Resources are a typed system. Now, in places where Reference() is used, typing is being subverted: the groups are no longer stating a necessary type; they are trying to think of all possible use cases, and stating a corresponding list of types of instances in those use cases.

The first problem here is that the way typing normally works is that for the type of any attribute in a class/module, you state the minimum (= most abstract) type that is needed for the rest of the class to function. What this means is that as long as the attributes defined in that abstract type are present in the instance at run-time (guaranteed in instances of all concrete sub-types), then we are happy. At modelling time, this type choice is the effective meaning of the attribute (along with any class invariants).

In any model system using ‘choice’ (in FHIR, Reference() or choice[x]), this is not happening. Instead, no-one really knows what the attribute means; they just start creating a type restriction list. As more meetings are held, this list grows, shrinks, changes. No-one can ever say formally whether any of the types in the list is correct, because there is no stated list of minimal characteristics, as there would be in a model system using abstract types – they just know that some informal agreement was reached in some meeting on 22 May 2018. The result is brittle models, data and software – because making constant changes to those lists breaks things downstream.

A simple example

Specimen.collector is typed as Reference(Practitioner | PractitionerRole). What data are required about a ‘specimen collector’, really? I suspect, probably just some identity information. Apparently discussions have been had where the group thought that only Practitioners (with or without organisational responsibilities attached) do this in real life. Firstly, that’s not correct (most urine and stool samples are collected by the patient at home); secondly, even if we now modified it to Reference(Practitioner | PractitionerRole | Patient | RelatedPerson) (which allows for a parent to collect a child’s sample), we still don’t know a) whether we got all the possibilities, or b) what minimal data items are required here. In other words, what general category of entity do we want here? The most obvious answer is ‘party’, i.e. an accountable entity with agency, and some minimal (at least potentially) identity, contact info etc. So the type should just be Party, if such a type existed.
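To make the idea concrete, here is a minimal sketch in Python of what typing the collector as an abstract Party could look like. Nothing here is FHIR's actual model: the Party class, its two attributes (identifier, name), the kind() method and the describe_collector() helper are all assumptions made up for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Party(ABC):
    """Hypothetical abstract type: an accountable entity with agency.

    It states only the minimal data assumed to be needed downstream.
    """
    identifier: str
    name: str

    @abstractmethod
    def kind(self) -> str:
        """Concrete sub-types report what sort of party they are."""


@dataclass
class Practitioner(Party):
    qualification: str = ""  # extra data, not needed by Specimen

    def kind(self) -> str:
        return "Practitioner"


@dataclass
class Patient(Party):
    birth_date: str = ""  # extra data, not needed by Specimen

    def kind(self) -> str:
        return "Patient"


@dataclass
class Specimen:
    specimen_id: str
    collector: Party  # the minimal type, not a list of concrete types


def describe_collector(specimen: Specimen) -> str:
    """Uses only what Party guarantees, so any sub-type works here."""
    c = specimen.collector
    return f"{c.name} ({c.kind()}, id {c.identifier})"
```

Code written against Party keeps working when a new kind of collector (say, a RelatedPerson sub-type) is added later, because it never depended on the list of concrete types in the first place.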

Now, in specific local use for an in-house hospital lab, patient collection is out of the question, and you might profile this Specimen resource with a constraint to say:

collector matches {Practitioner | PractitionerRole}

I’m using pseudo-archetype notation here to make it clearer that this is now a constraint on the original type-space. Which is what Profiles should be used for (among other kinds of constraining).
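The same constraint can be sketched in Python as a check applied on top of an open base type. This is only an illustration under assumptions: the Profile class, its violations() method and the IN_HOUSE_LAB instance are hypothetical names, not FHIR profiling machinery, and all four concrete types are assumed to be sub-types of a Party type that FHIR does not currently define.

```python
from dataclasses import dataclass


class Party:
    """Hypothetical abstract base type for the open 'possibility space'."""


class Practitioner(Party):
    pass


class PractitionerRole(Party):
    pass


class Patient(Party):
    pass


@dataclass
class Specimen:
    collector: Party  # base definition: any Party is acceptable


@dataclass
class Profile:
    """Hypothetical profile: narrows the types allowed for named elements."""
    name: str
    allowed: dict  # element name -> tuple of permitted concrete types

    def violations(self, resource) -> list:
        problems = []
        for element, types in self.allowed.items():
            value = getattr(resource, element)
            if not isinstance(value, types):
                problems.append(
                    f"{element}: {type(value).__name__} "
                    f"not allowed by profile {self.name}"
                )
        return problems


# collector matches {Practitioner | PractitionerRole}
IN_HOUSE_LAB = Profile(
    "InHouseLabSpecimen",
    {"collector": (Practitioner, PractitionerRole)},
)
```

A Specimen collected by a Patient is still a valid Specimen in the base model; it simply fails the in-house lab profile, which is where the use-case-specific restriction belongs.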

A messier example

Communication.sender[0..1]: Reference(Device | Organization | Patient | Practitioner | PractitionerRole | RelatedPerson | HealthcareService | Endpoint)

We can more or less guess the sense in which most of these types could send some communication (although I will always forget that ‘Practitioner’ could be a paid driver, who is very unlikely to be sending any clinical communication). But in what sense is Endpoint a ‘sender’? Clearly anything sent electronically is likely to come from a WS end-point, but this is just the mechanical act of sending, not the intentional act.

And then why not Group? Or following the logic of Specimen.subject, why not Location? And following many other ‘subject’ definitions, why not Device? And now let’s think about a Practitioner as sender; most likely his/her HealthcareService is also the sender, in an organisational sense. And why not the EMR system Endpoint as well? Which one do we want? As we can see, this kind of thinking can go on forever, but we never answer the basic question: what minimal data items are necessary for Communication.sender to make sense? Role is probably good enough, where Role =def Actor in a role of responsibility under which such communications are sent.

The same argument now applies here: define the base Resource with something like Communication.sender[0..1]: Role (or Reference<Role> to keep the notion of ‘reference’) and then profile in different situations just the specific types you want to allow. Generic receivers know it will always be a Role, and receivers that have been developed with some specific agreement with the sender system’s developers may be able to assume a shared profile, not just the base Resource.

Taking this approach would vastly simplify the endless discussions on what goes in the parentheses of every Reference() expression in a Resource definition. The discussion just becomes: ok, what is the minimal sensible thing here? A Party? A Role? Maybe it is just a Thing. It also means people building software to consume specific Resources can now make some meaningful assumptions about the type of these Reference() typed elements.

Another thing it solves is that in the current way of doing things, the list of types is the superposition of a whole lot of use cases; it doesn’t (generally) correspond to any particular use case. So there is not even any sense in building software, queries or anything else that assumes the whole list as stated in the Reference() definition, because that list will never eventuate in any particular use context.
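A small Python sketch shows the superposition effect. The eight sender types are the ones listed for Communication.sender above; the four use cases and the types assigned to each are invented purely for illustration and do not come from any FHIR discussion.

```python
# Types listed in Communication.sender's Reference(...) definition.
SENDER_TYPES = {
    "Device", "Organization", "Patient", "Practitioner",
    "PractitionerRole", "RelatedPerson", "HealthcareService", "Endpoint",
}

# Hypothetical use cases (assumed for this sketch only), each with the
# sender types that could plausibly occur in it.
USE_CASES = {
    "patient_portal_message": {"Patient", "RelatedPerson", "Practitioner"},
    "device_alert": {"Device"},
    "referral_letter": {
        "Practitioner", "PractitionerRole",
        "Organization", "HealthcareService",
    },
    "system_notification": {"Endpoint"},
}


def superposition(use_cases: dict) -> set:
    """Union of every type that occurs in at least one use case."""
    return set().union(*use_cases.values())


def use_cases_matching(listed: set, use_cases: dict) -> list:
    """Names of use cases whose own type set equals the full listed set."""
    return [name for name, types in use_cases.items() if types == listed]
```

With these assumed use cases, the union reproduces the full published list, yet no single use case ever needs all of it, so software written against the whole list targets a situation that never occurs.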

The above arguments apply also to the use of choice[x], which is widespread, as shown by this page: http://hl7.org/fhir/R4/choice-elements.json

None of the above is specific to the Admin Resources; it applies across all of FHIR.

In summary, my general recommendation is:

treat Resources as a possibility space, defined by open types that state the minimal requirements of any Element in a Resource; doing so requires the insertion of some abstract type Resources (e.g. Party etc., as mentioned above);
perform all type constraining in Profiles, specific to the types that really can occur in the use case(s) the Profile is built for. In this sense, Profiles are a constraint space.

For formal methods masochists, here is some explanation of why the ‘choice’ construct is problematic.

[The above is from an ongoing discussion in the FHIR Zulip chat space, stream:methodology topic:Participation+Heirarchy]

1 Response to FHIR fixes: why a type hierarchy would help

Anders Thurin says:

19/11/2019 at 03:27

This reminds me of the GALEN project’s GRAIL notation, which described a few levels of sanctioning for attributes: sensible for the possibility space and necessary for the constraint space…
Some examples:

([GeneralisedStructure
GeneralisedSubstance
Process] whichG isConsequenceOf BodyProcess)
sensibly hasPathologicalStatus PathologicalOrPhysiologicalStatus;
sensibly hasIntrinsicPathologicalStatus PathologicalOrPhysiologicalStatus;
sensibly hasAbnormalityStatus NormalOrNonNormalStatus;
sensibly hasIntrinsicAbnormalityStatus NormalOrNonNormalStatus;
sensibly hasBeneficialStatus BeneficialStatus.

(Substance whichG isConsequenceOf
(BodyProcess which hasIntrinsicAbnormalityStatus nonNormal))
topicNecessarily hasAbnormalityStatus nonNormal.

See also OpenGalen.org… “Making the impossible very difficult”.