Following on from various earlier posts, including my 2014 post ‘What is an open platform?’, I thought it was time to post as succinct a definition as possible of the platform idea for e-health.
As stated in that post, the key thing to understand about a platform is that it represents progress away from being locked in to a monolith of fixed commitments, toward an open ecosystem. This is true both technologically and economically.
The Technical Platform
The word ‘platform’ indicates the notion of a common base on which higher-level components can be built, obviating the need for those components (typically applications) to privately solve needs that are provided for by the platform. These are typically functions such as reliable persistence, versioning, identification, communication, various administration functions, certain types of reference data and so on. In order for applications and other components to be able to use a platform, they need to know:
- what resources it provides and
- how to connect to it.
To be usable, it must therefore have a published description, which includes the formal specification of its interface. To actually be used, it must also of course have implementations.
If we make the assumption that an information systems platform for major domains such as healthcare provision and research is likely to provide non-trivial functionality – in order to be a more attractive option than private implementation – and that its architecture will follow the basic principle of separation of concerns, we can assume that it consists of multiple components, each providing a well-defined function within the overall functionality.
Based on this minimalist analysis, we propose the following definition:
Platform: a published architecture whose component interfaces form the basis for independent development of conformant applications.
Here a component is any coarse-grained element of an architecture that corresponds to an identifiable business function or kind of information, for example patient record, terminology, medications list, etc. If we think of a component as a black box that performs certain functions, say persisting patient health records, then the interface of the component can be expressed as a formal interface definition, often known as an Application Programming Interface (API). This is the set of logical function and procedure calls that can be made by a calling application or other component. Such calls include typed arguments and, in the case of functions, return types. Pre- and post-conditions for each call are commonly specified, as well as exceptions that may be raised under certain calling conditions. The types belong to one or more information models.
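To make the idea of a component interface concrete, here is a minimal sketch in Python. All names (PatientRecordService, VersionedRecord, RecordNotFound, InMemoryRecordService) are hypothetical and invented for illustration; they are not taken from any real platform specification. The sketch shows typed arguments, a return type drawn from an information model, documented pre- and post-conditions, and an exception.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class VersionedRecord:
    """Hypothetical information-model type returned by the interface."""
    patient_id: str
    version: int
    content: str


class RecordNotFound(Exception):
    """Raised when no record exists for the requested patient."""


class PatientRecordService(ABC):
    """Hypothetical published interface of a 'patient record' component."""

    @abstractmethod
    def commit(self, patient_id: str, content: str) -> VersionedRecord:
        """Pre: patient_id and content are non-empty.
        Post: the result's version is one greater than the previous version."""

    @abstractmethod
    def latest(self, patient_id: str) -> VersionedRecord:
        """Raises RecordNotFound if the patient has no committed record."""


class InMemoryRecordService(PatientRecordService):
    """One possible implementation; applications depend only on the interface."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[VersionedRecord]] = {}

    def commit(self, patient_id: str, content: str) -> VersionedRecord:
        if not patient_id or not content:
            raise ValueError("precondition violated: empty argument")
        history = self._versions.setdefault(patient_id, [])
        record = VersionedRecord(patient_id, len(history) + 1, content)
        history.append(record)
        assert record.version == len(history)  # post-condition check
        return record

    def latest(self, patient_id: str) -> VersionedRecord:
        history = self._versions.get(patient_id)
        if not history:
            raise RecordNotFound(patient_id)
        return history[-1]
```

An application written against PatientRecordService would work unchanged with any conformant implementation, which is the sense of independent development used in the definition above.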
Following these considerations, we can now say that the published description of a platform includes at least the following elements:
- Component architecture: arrangement of components defining functional elements of the platform;
- Interface descriptions: interfaces of components enabling various modes of interaction by applications, defined by APIs (including usage protocols);
- Information model(s), i.e. models from which data are instantiated in real systems; these models encode the most basic universal concepts relevant to the domain of the platform, e.g. ‘clinical statement’, ‘order’, etc.
The word independent above refers to the ability of application developers to work independently of platform development companies, i.e. solely via the published interfaces.
Component-centrism and API publication are what distinguish a platform from a non-platform technology.
The Model-driven Platform
A central challenge for a platform in most domains is how the semantics of the domain will be represented in the platform. For example, in health IT, where are entities such as ‘blood pressure measurement’, ‘Apgar result’, and ‘breast cancer treatment plan’ represented? One approach is to encode them directly into the interface and information models (or database schemas). For complex and information-rich domains such as healthcare, this entails numbers of information model entities that are not only very large (O(10k) – O(100k)), but constantly growing and changing.
The question then is whether such large numbers of entities should be directly modelled in the main Information Model(s) of the platform. To try to do so has been the classical approach of IT for decades, but invariably becomes unsustainable, meaning that deployed products gradually stop reflecting new requirements and domain specifics, as their vendors can no longer keep up with the numbers. Experience shows that classical information models, whether expressed in UML, RDBMS schemas or another form, are manageable up to around 150 classes or tables.
Size of information model isn’t the only problem, and might not even be the main one. The fact of constant change in domain content and process models means that a single information model or schema that tries to directly incorporate everything from the domain will never become stable. This is the last thing anyone from software engineering or product deployment wants to hear.
The alternative is to express domain semantics, i.e. the domain knowledge relevant to the purpose of the platform, as independent formal entities, which may be used to configure technical information entities, or may be directly consumed by the platform. For example, if a platform information model contains the formal type Document, definitions of the contents of the thousands of particular kinds of documents used in healthcare could be stated as formal models that defined particular headings and possible contents of each kind of document. Similarly, a workflow definition or guideline rule-set might be directly consumable by the platform. This approach enables the definition of a highly stable technical platform, on which back-end software and databases can rely, while providing separate means of expressing the ever-growing and changing domain semantics.
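The Document example can be sketched as follows. This is an illustration under assumptions, not a real modelling formalism: Document stands for the stable information-model type, DocumentModel for a separately authored domain model, and the discharge summary headings are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Document:
    """Stable information-model type: knows nothing about specific document kinds."""
    kind: str
    sections: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class DocumentModel:
    """Hypothetical domain model: the headings one kind of document must or may have."""
    kind: str
    required_headings: Tuple[str, ...]
    optional_headings: Tuple[str, ...] = ()


def validate(doc: Document, model: DocumentModel) -> List[str]:
    """Return a list of problems; an empty list means the document conforms."""
    errors: List[str] = []
    if doc.kind != model.kind:
        errors.append(f"expected kind {model.kind!r}, got {doc.kind!r}")
    for heading in model.required_headings:
        if heading not in doc.sections:
            errors.append(f"missing required heading: {heading}")
    allowed = set(model.required_headings) | set(model.optional_headings)
    for heading in doc.sections:
        if heading not in allowed:
            errors.append(f"unexpected heading: {heading}")
    return errors


# Illustrative model only; real discharge summaries are far richer.
DISCHARGE_SUMMARY = DocumentModel(
    kind="discharge_summary",
    required_headings=("Diagnosis", "Medications on discharge"),
    optional_headings=("Follow-up",),
)
```

Adding a new kind of document means authoring another DocumentModel value; neither the Document class nor validate changes.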
Not every platform is defined in this way. A model-driven platform is one whose domain semantics are explicitly defined by knowledge artefacts rather than being implicitly expressed in software and database schemas. Such a platform for healthcare includes the following additional elements:
- Domain content models: descriptions of data-sets and their elements used in the domain;
- Computable guidelines and care pathways: expressions of diagnostic and therapeutic plans, and associated decision logic;
- Terminology value-sets: allowable values of coded elements within other models;
- Portable queries: used to interrogate data, based on knowledge and information models, and terminology;
- Terminology sub-sets: the choice of available terminologies and/or limited subsets for use in the platform.
Underlying this are pure domain descriptions in the form of terminologies and ontologies, for formally describing domain entities (universals) and their natural relationships, i.e. domain ‘truths’.
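As a small illustration of one of these artefact kinds, the sketch below shows a terminology value-set constraining a coded element. The types CodedValue and ValueSet, the terminology name "local-demo" and its codes are all made up for the example and do not correspond to any real terminology.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CodedValue:
    """A coded data element: a code drawn from a named terminology."""
    terminology: str
    code: str


@dataclass(frozen=True)
class ValueSet:
    """Hypothetical value-set artefact: the codes allowed for one element."""
    name: str
    terminology: str
    codes: FrozenSet[str]

    def allows(self, value: CodedValue) -> bool:
        # A value conforms only if both terminology and code match.
        return value.terminology == self.terminology and value.code in self.codes


# Invented codes from an invented terminology, for illustration only.
PATIENT_POSITION = ValueSet(
    name="patient_position",
    terminology="local-demo",
    codes=frozenset({"sitting", "standing", "lying"}),
)
```

For instance, PATIENT_POSITION.allows(CodedValue("local-demo", "sitting")) is true, while the same code from a different terminology is rejected.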
The knowledge artefacts mentioned above are created as an ongoing activity when the platform architecture is put to use. These models provide the semantics to implementations of the platform services and other components. A major part of defining the platform for an organisation or geography is consequently the establishment of the knowledge modelling activity, and engaging domain specialists to work within it. We can think of this as establishing a knowledge factory run by domain experts, rather than IT staff.
The use of formal knowledge artefacts, separate from the software and database, allows domain knowledge to be consumed at runtime by implementations. Back-end services and databases can then be constructed with no direct dependency on domain-level models, enabling them to be deployed and maintained with no modification required as new knowledge models are added. This is the same principle on which today’s rule and workflow engines operate, generalised to the totality of an information systems environment.
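A minimal sketch of runtime consumption, under stated assumptions: models arrive as JSON text with hypothetical "id" and "required_fields" keys, and ModelRegistry and GenericStore are invented names. The point is only that the back-end code contains no reference to any particular domain model.

```python
import json
from typing import Any, Dict, List


class ModelRegistry:
    """Holds knowledge models loaded at runtime (hypothetical JSON format)."""

    def __init__(self) -> None:
        self._models: Dict[str, Dict[str, Any]] = {}

    def load(self, model_json: str) -> str:
        model = json.loads(model_json)
        self._models[model["id"]] = model
        return model["id"]

    def get(self, model_id: str) -> Dict[str, Any]:
        return self._models[model_id]


class GenericStore:
    """Back-end with no compile-time dependency on any domain model."""

    def __init__(self, registry: ModelRegistry) -> None:
        self._registry = registry
        self._rows: List[Dict[str, Any]] = []

    def save(self, model_id: str, data: Dict[str, Any]) -> None:
        # Validation is driven entirely by the model looked up at runtime.
        model = self._registry.get(model_id)
        missing = [f for f in model["required_fields"] if f not in data]
        if missing:
            raise ValueError(f"missing fields for {model_id}: {missing}")
        self._rows.append({"model_id": model_id, "data": data})

    def count(self) -> int:
        return len(self._rows)


registry = ModelRegistry()
store = GenericStore(registry)

# A model available when the system is first deployed.
registry.load('{"id": "blood_pressure", "required_fields": ["systolic", "diastolic"]}')
store.save("blood_pressure", {"systolic": 120, "diastolic": 80})

# A new model added later: no change to GenericStore is needed.
registry.load('{"id": "apgar", "required_fields": ["total"]}')
store.save("apgar", {"total": 9})
```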
A foundational aspect of knowledge representation is the set of formalisms used to express domain knowledge artefacts. Well-designed formalisms are crucial to any platform’s success since they determine the sophistication and coherence of what may be expressed in the models. In particular, the formalism for each kind of artefact must be as close as possible to the cognitive models of the designers and users of that artefact. Cognitive distance forces model-builders to mentally translate their familiar concepts into unfamiliar primitives, much as developers must when working in low-level programming languages such as Assembler and C.
Formalisms of course need to be implemented in tools to be of any use, and the tool outputs need to be usable either to generate software or else directly in execution environments.
Thus, in order to create a model-driven platform, the following infrastructure is needed:
- Knowledge engineering environment: the large-scale use of knowledge-models requires the use of domain-oriented authoring tools, intelligent repositories and governance procedures;
- Knowledge artefact translation: tools used to translate knowledge models into developer-usable artefacts such as data schemas, JSON, APIs, source code, etc., for use in application development.
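The translation step can be illustrated with a small sketch that turns a knowledge model into a JSON Schema document that developers could use for validation. The input format (a dict with "id" and "elements") and the type names in TYPE_MAP are assumptions made for this example; real translation tools handle far richer models.

```python
import json
from typing import Any, Dict, List

# Assumed mapping from hypothetical model data types to JSON Schema types.
TYPE_MAP = {"text": "string", "quantity": "number", "count": "integer", "flag": "boolean"}


def to_json_schema(model: Dict[str, Any]) -> Dict[str, Any]:
    """Translate a simple knowledge model into a JSON Schema object."""
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for element in model["elements"]:
        properties[element["name"]] = {"type": TYPE_MAP[element["type"]]}
        if element.get("required", False):
            required.append(element["name"])
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": model["id"],
        "type": "object",
        "properties": properties,
        "required": required,
        "additionalProperties": False,
    }


# Illustrative model only, not a real clinical definition.
blood_pressure_model = {
    "id": "blood_pressure",
    "elements": [
        {"name": "systolic", "type": "quantity", "required": True},
        {"name": "diastolic", "type": "quantity", "required": True},
        {"name": "comment", "type": "text"},
    ],
}

schema = to_json_schema(blood_pressure_model)
schema_text = json.dumps(schema, indent=2)  # artefact handed to developers
```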
We can visualise the model-driven platform and related elements as follows:
One crucial aspect of this platform concept may easily be missed: there is one library of knowledge models for the whole platform, that is, for all components, such as EHR, demographics, scheduling, orders and so on. This means that components used in the platform not only obey the basic type system (core information models), API conventions and design, but are also derived from a common set of domain definitions. Domain modelling thus becomes an independent activity for platform procurer and implementer alike.
What About Standards?
Many standards in e-health are essentially message-oriented rather than model-oriented, and they do not represent a good basis for a platform architecture, other than for the specific data access interfaces required to implement each standard. Such standards come and go, and will always be replaced by new versions that define various specific information retrievals in some new way.
A well-designed platform is its own standard, and creates interoperability as an automatic outcome, not as an afterthought. Indeed, in a platform-based ecosystem, today’s message-oriented standards would be obsolete.
Some longer discussions about this:
Here I have defined the notion of a technical platform, and then a model-driven platform, where the models encode the semantics of the domain. Adopting the latter is a necessity in health IT, because the domain is too large and volatile to encode into the software and databases.
The model-driven platform thus presents the possibility of a major re-arrangement of industry effort, enabling far greater efficiency than today, where all products independently define their own version of needed semantics, inevitably in inconsistent ways.