Towards a Classification of Epistemic Status for Health Data

This page contains notes concerning a possible classification of epistemic status for healthcare data. The problems it tries to address are the following:

  • the need to distinguish between ‘raw’ statements entered into the EHR and statements that have been verified in some way
  • the need for a way to verify items in the EHR so that they can be dependably asserted to be the same or distinct within a referent tracking framework
  • the need to mark data so that different categories of information (knowledge) in the EHR are distinguished, even when their structural form is the same, e.g. a target BP versus an actual BP measurement. This is a major requirement for clinically safe querying.
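
The third problem can be illustrated with a small sketch. The `BPRecord` type, the `Category` enum and the sample values are hypothetical, not taken from openEHR or any EHR product. The point is only that two records with the same structure need a distinguishing marker before a query over them is safe.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """Hypothetical marker for the kind of information a record carries."""
    OBSERVATION = "observation"  # something actually measured
    TARGET = "target"            # a clinically set goal


@dataclass(frozen=True)
class BPRecord:
    systolic: int   # mmHg
    diastolic: int  # mmHg
    category: Category


def measured_systolic(records):
    """Return systolic values of actual measurements only, skipping targets."""
    return [r.systolic for r in records if r.category is Category.OBSERVATION]


# Made-up values for illustration: the second record has the same shape
# as the others but means something quite different.
records = [
    BPRecord(152, 95, Category.OBSERVATION),
    BPRecord(130, 80, Category.TARGET),
    BPRecord(148, 92, Category.OBSERVATION),
]

print(measured_systolic(records))
```

Without the `category` field, a query for 'all systolic values' would silently mix the target of 130 in with the real measurements.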

A Conceptual Model

One conceptual organisational approach to managing health data is as two pools of information: one ‘epistemological’, one ‘ontological’. The latter contains items that actually do reflect the patient state, as best as it is known in reality. The former contains information items that are understood as claims by various parties, of varying degrees of veracity and overlap. The need is to be able to migrate such statements, by a process of curation, to the ontological pool, which we can regard as safely computable and an appropriate basis from which to generate actionable conclusions. (There may be debates about whether these are just categories of explanatory convenience with respect to health records, since the usual assumption of the EHR is that all it contains is more or less fallible statements. However, the convenience appears worthwhile.)

A key question here is what ‘curation’ involves. Abstractly, it means verification of the statements to the extent possible by professional actors in the health system (since we accept that nothing can be shown to be an absolutely true description of reality – anyone who has seen ‘House’ will know that even the most evidence-based diagnosis can be later overturned!). However, establishing reliability / veracity isn’t all that is needed. Another important function is determining the identity of mentioned referents, that is to say, which statements refer to the same specific things in reality. This function should be understood as part of a referent tracking approach to health data. The aim would be for the curation process to establish things like:

  • whether two mentions of a diagnosis of cancer refer to the same episode / occurrence of cancer
  • whether two mentions of a physical injury to the same location refer to the same injury or to different ones.

The idea is that as initially disparate mentions of some ‘bone injury’ or some ‘Dx of throat cancer’ are curated, they come to form part of a ‘crystallised’ ontological view of the patient’s reality, rather than remaining a series of assertions that have to be further processed by human or other agents to determine their relationship with reality.
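
A minimal sketch of this identity-resolving side of curation follows. It assumes each mention carries an optional referent identifier that stays empty until a curator assigns one. The names `Mention`, `curate_same` and `same_referent`, and the use of plain integers as identifiers, are hypothetical stand-ins rather than part of any particular referent tracking implementation.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Mention:
    text: str
    referent_id: Optional[int] = None  # None until the mention is curated


_next_id = count(1)  # source of fresh (hypothetical) referent identifiers


def curate_same(*mentions):
    """Record a curator's judgement that all given mentions denote one thing."""
    existing = {m.referent_id for m in mentions if m.referent_id is not None}
    if len(existing) > 1:
        raise ValueError("mentions are already assigned to different referents")
    # Reuse an identifier if one of the mentions has it, else mint a new one.
    rid = existing.pop() if existing else next(_next_id)
    for m in mentions:
        m.referent_id = rid
    return rid


def same_referent(a, b):
    """True only if both mentions are curated and share a referent."""
    return a.referent_id is not None and a.referent_id == b.referent_id


m1 = Mention("Dx of throat cancer (first mention)")
m2 = Mention("Dx of throat cancer (second mention)")
m3 = Mention("bone injury")

curate_same(m1, m2)  # judged to be the same occurrence of cancer
curate_same(m3)      # a separate referent
```

Until `curate_same` has been called, `same_referent` returns `False` for any pair: uncurated mentions are never assumed to be identical, which is the conservative default the two-pool model implies.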

When is a Statement ‘Reliable’?

The discussion below assumes that ‘reliable’ has an objective definition – i.e. the implication is that once a statement is marked as ‘curated’ there is no longer any doubt. This is probably true very often, but is certainly not guaranteed to always be the case. Nevertheless, we have to be practical. The EHR is not a philosophical debating arena; it must function as a place of evidence gathering and opinion forming in support of decisions on interventions for patients. Therefore, for our purposes here, it seems reasonable to define a ‘reliable’ statement in the health record as one on which health care professionals are actually prepared to act, i.e. commit to interventions.

Curation & Reliability

Something like ‘curation’ already occurs in many EHR / EMR situations today. It typically takes the form of review and adjustments made to the EHR to populate ‘managed lists’ such as medications list, problem list and so on. The activity of ‘medications reconciliation’ is one well known example. It appears that curation of the EHR more generally is needed to increase the amount of information we can consider as reliable ‘ontological’ items, rather than unreliable items.

To get a handle on curation, we need to look at a) how it might apply across basic categories of information and b) how authorship affects it. The set of categories we used in openEHR is shown below.


In openEHR we treated all ‘opinions’ (diagnoses, prognoses, plans, assessments, goals) as if they had been written by the healthcare professional. But clearly there are any number of these types of things that patients say (In Weed’s POMR, they are treated as the ‘subjective’ part of the encounter note). Our previous way of modelling this was to say that these are all ‘observations’ (i.e. fall into the first category above), since in openEHR we treated everything that the physician ‘saw, heard, measured’ as an observation. We can improve on this, since it’s more useful if ‘observations’ are things we can rely on as being some reasonable approximation of an element of reality (a BP or whatever).

A starting point is to assume that objective observations (typically repeatably measurable) are reliable. The evidence is that this is generally true, even more so in the case of patient-supplied observations. These statements fall into the ‘ontological’ pool, since they (+/- occasional errors in machines or human measuring) do represent the state of some observable aspect of the patient at a point in time.

What are often known as ‘subjective’ observations in medicine are statements by the patient about symptoms and experience, e.g. reported levels of pain, experience of headaches and so on. These statements are not usually treated as being adjustable in terms of reliability, i.e. while physicians try to help patients be more precise (e.g. please state your pain on a level of 0 – 10, where 10 is the worst you have ever known), they essentially have to trust the patient’s claims as they are.

In theory we would like to assume that orders are reliable, since they are formally stated, and represent a committed stance by the HCP to a course of action – one which he or she is prepared to stand by medico-legally. The well-known problem here is that recording of orders in different locations (GP, GP while on holiday, hospitals, other clinics) leads to competing medication lists in each place. Each list tends to consist of orders prescribed locally, augmented by drugs that the physician adds after questioning the patient. The answers to these questions are often wrong (categorically, in dose, or in some other detail), but are nevertheless added to medication lists. The result is numerous medication lists per patient, none of which can be completely trusted. This problem is widely recognised, and is the reason for the official ‘medications reconciliation’ processes promoted by various health authorities, e.g. NHS England & Scotland.

In a similar way, it initially seems reasonable to assume that administrative information is reliable, other than the effect of clerical / data entry errors. However, it is recognised that mis-identification of patients when they present at provider locations (e.g. mixing two ‘Annette Smiths’ up; tribal names and inability to provide accurate birth dates are also common problems) is a universal problem. Therefore, a more realistic approach may be to say that identification information (at least) is unreliable until accepted by the patient or some other reviewer agent, or verified by sighting of certain kinds of proof etc.

A major grey area is that of ‘clinical opinion’. We can think of this category of information as conclusions supported by evidence that are intended to be ‘ontological’ statements, e.g. ‘diagnosis of diabetes mellitus’ is an assertion that a) there is a real situation of ongoing insulin insufficiency, and b) this is part of an ongoing disease process within the patient that conforms to the normal medical description of diabetes mellitus. Typical statements in this category include:

  • clinician diagnosis – initial / working / final
  • patient statement of previous diagnosis, e.g. ‘I have been diabetic since 2002’
  • patient statement of allergy, e.g. ‘I am allergic to dairy products’

Normally we like to think of clinical opinions provided by health professionals as reliable. However, the evidence is that there is a continuum of reliability even within the clinical profession with respect to diagnoses, prognoses etc. If a clinical opinion is supplied by the patient, this would normally be treated as ‘raw’, until curated by the HCP into a reliable statement. For example, a patient claim of being diabetic can be easily checked, and converted to a formal statement of diagnosis made at some earlier date. A patient claim of being allergic to some substance, unsupported by evidence, may be treated as less reliable, and not promoted to curated status.
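
The promotion rule just described can be sketched as follows. Everything here is an assumption made for illustration: the `Opinion` type, the two-valued `Status`, and the rule that any non-empty evidence suffices for promotion. Treating HCP opinions as curated on entry is also a deliberate simplification, given the continuum of reliability noted above.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Status(Enum):
    RAW = "raw"
    CURATED = "curated"


class Author(Enum):
    PATIENT = "patient"
    HCP = "hcp"


@dataclass(frozen=True)
class Opinion:
    text: str
    author: Author
    status: Status
    evidence: tuple = ()


def record(text, author):
    """Enter an opinion; patient-supplied opinions start out raw.

    Simplification: HCP-authored opinions are marked curated immediately.
    """
    status = Status.CURATED if author is Author.HCP else Status.RAW
    return Opinion(text, author, status)


def curate(opinion, evidence):
    """Promote an opinion only if supporting evidence is supplied."""
    if not evidence:
        return opinion  # stays as it was, e.g. an unsupported allergy claim
    return replace(opinion, status=Status.CURATED, evidence=tuple(evidence))


diabetes = record("I have been diabetic since 2002", Author.PATIENT)
allergy = record("I am allergic to dairy products", Author.PATIENT)

# The evidence string is a placeholder for whatever check the HCP performs.
diabetes = curate(diabetes, ["earlier formal diagnosis located"])
allergy = curate(allergy, [])
```

After these calls the diabetes claim is curated and carries its evidence, while the allergy claim remains raw but is still recorded.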

An Initial Classification

The following informal classification proceeds from the original openEHR Clinical Investigator Ontology, which distinguished basic categories. However, that ontology did not take account of the raw / curated distinction, and lacked some categories shown below. The classification here is not yet sufficiently formalised to constitute a computable ontology.

In the following, real types of clinical information are shown in bold, with candidate ontology categories shown in italics.

  • HIS_entry: any information committed to a health information system
    • patient identification entry – statements establishing the identity of the patient, i.e. of the form patient_characteristic is_about patient having EHR with EHR_id x
      • demographics – age, sex, occupation
      • external identification – e.g. NHS number, etc
    • governing statement – statements defining rules/constraints about patient / health system interaction
      • patient information consent – informational governance (IG) consent by patient for use of health data by specific parties
      • patient preferences – e.g. do not resuscitate (end of life); no blood transfusions;
    • financial entry – statement about financial transaction(s)
      • billing event – record of billing of other party by provider
        • patient invoice – billing of patient for service
        • re-imbursement request – request for payment by payor organisation for services provided to patient(s)
      • payment event – record of payment that has been made by billed party to service provider
        • patient payment – xxx
        • re-imbursement – record of payment by payor organisation for services provided to patient
      • cost assessment – description of costs associated with potential intervention options
    • management_entry – statements about entities and events in the clinical system (cf the patient)
      • patient matching – results in association of validated patient identification with a particular EHR
      • episode event – record of event relating to episode of care
        • admission – commencement of episode
        • discharge – completion of episode
      • allocation of provider – results in HCP(s) being added to ‘legitimate relationship’ list
      • transfer of responsibility – record of transfer of patient away from one responsible care provider to another, or to self-care
        • referral
      • scheduling event – commitment by the health system to provide a service to the patient at a specific time in the future
        • booking
        • rebooking
        • cancellation
      • reportable event – events required to be formally reported to an external authority [WARNING this is most likely a pseudo-category, but I’m parking it here until I can come up with a better analysis of these kinds of things]
        • reportable patient event – events required to be formally reported to an external authority in which the patient is the responsible party
          • birth
          • death
          • suicide attempt
          • patient criminal act
        • reportable provider event – events required to be formally reported to an external authority in which the provider is the responsible party
          • MHA sectioning
          • adverse care event – e.g. drug dose error causing patient injury
          • surveillance event – e.g. notification of SARS, MRSA etc detected in new patient
          • patient privacy breach (relevant to EHR?)
    • clinical_care_entry – statements about clinical state and treatment of subject of care relating to specific health care concerns
      • patient_statement – statement about patient needs, circumstances, etc
        • concern
          • issue – statement of specific health problem that is an issue for the patient, e.g. being overweight, fatigue, back pain
          • anxiety – e.g. patient expressed anxiety or fear of getting/having cancer
        • goal – patient expressed goal e.g. to ‘be fitter’, ‘be able to move back to house with stairs’
        • background information – peripheral information about external factors provided by patient pertinent to current situation, e.g. loss of job, marriage break-up in treatment of depression; general social situation
        • consent – consent for specific procedure to take place
      • clinical_assertion – an assertion typically from patient, family, or other non professional actor that makes some clinical claim; treated as ‘uncurated’
        • symptom description – reported current experience; corresponds to ‘subjective’ category in Weed’s POMR
        • report of previous diagnosis – claim of existing diagnosis
        • report of existing allergy or interaction – report of allergy to a substance or other allergen
        • family history – statement about patient family member used as a surrogate / predictor of risk to patient
      • clinical_statement – statements assumed to be reliable and usable in the execution of the professionally managed clinical process
        • historical (T < NOW) – record of anything seen in or done to patient
          • observation_result – description of occurrent or continuant at point / points / interval in time, e.g. BP, heartrate, etc; corresponds to ‘objective’ category in Weed’s POMR
            • (numerous clinical statements relating to specific phenomena seen in patient)
          • performed_action – description of action performed, e.g. drug administration
            • (numerous statement types relating to actions performed on / for patient)
        • clinical_professional_assertion – an opinion / thought of a health care professional at time ‘now’ about X that itself may relate to a past, present or future time
          • assessment – of state of patient; corresponds to ‘assessment’ category in Weed’s POMR
            • existing risk – record of relevant fact that constitutes risk to patient, typically specific family history events
            • risk assessment – statement of possible future occurrence of X with probability P based on risk factors, or guideline-based assessment
            • diagnosis – statement that process P exists in patient, made at time T (e.g. could be past diagnosis) based on signs and symptoms S
          • prediction – about future patient state(s) in time
            • prognosis – statement of likely forward trajectory of diagnosed process P in patient
          • proposal – statements relating to future state of patient and means of getting there; corresponds to ‘plan’ category in Weed’s POMR
            • target – clinically set goal for patient
            • plan element – clinically set statement of treatment
          • report – statements intended as a communication to another professional party, also acting as medico-legal record
            • progress report – report on care delivered during episode or other defined period of time
            • patient summary – summary of patient current status, including key problems, medications etc
        • clinical_instruction (T > NOW) – record of formal order for something to be done to / for subject of care
          • investigation_order
            • (numerous types of investigation order)
          • intervention_order
            • (numerous types of order for imaging, medication, etc)
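
Although the classification is informal, even a plain parent table makes the raw / curated distinction queryable. The sketch below encodes a fragment of the `clinical_care_entry` branch exactly as nested above. The `PARENT` table and the helper names `ancestors` and `is_a` are hypothetical, and the fragment is deliberately incomplete.

```python
# Child -> parent links for a fragment of the classification above.
PARENT = {
    "clinical_care_entry": "HIS_entry",
    "patient_statement": "clinical_care_entry",
    "clinical_assertion": "clinical_care_entry",
    "clinical_statement": "clinical_care_entry",
    "symptom description": "clinical_assertion",
    "report of previous diagnosis": "clinical_assertion",
    "historical": "clinical_statement",
    "observation_result": "historical",
    "clinical_professional_assertion": "clinical_statement",
    "assessment": "clinical_professional_assertion",
    "diagnosis": "assessment",
    "proposal": "clinical_professional_assertion",
    "target": "proposal",
    "clinical_instruction": "clinical_statement",
}


def ancestors(category):
    """Yield the ancestors of a category, nearest first."""
    while category in PARENT:
        category = PARENT[category]
        yield category


def is_a(category, ancestor):
    """True if category is the given ancestor or descends from it."""
    return category == ancestor or ancestor in ancestors(category)


# A clinician's diagnosis sits under clinical_statement (treated as reliable);
# a patient's report of a previous diagnosis sits under clinical_assertion
# (treated as uncurated).
print(is_a("diagnosis", "clinical_statement"))
print(is_a("report of previous diagnosis", "clinical_statement"))
```

The same test separates a `target` from an `observation_result`: both descend from `clinical_statement`, but only the latter is `historical`.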

Second order EHR structures:

  • care plan – structured plan consisting of goals, targets, plan elements, orders, actions, observations (by reference)
  • medication list – managed single-source-of-truth medication list
  • problem list(s) – managed list of current problems of patient, including all diagnoses
  • allergies and interactions list – managed list of known allergies, interactions, contra-indications
  • family history – structured document of relevant health events in family
  • vaccination record – list of vaccination orders and administrations
  • social situation – where relevant, description of patient social circumstances relevant to condition and/or care
