Data Validation for openEHR Conformance Verification

Posted by Pablo Pazos Gutierrez on February 17, 2023, 2:30 am

In openEHR Conformance Verification, when a system that receives data from external sources claims compliance with the specs, one key area to test is Data Validation.

In openEHR everything is constrained by archetypes and templates, and those constraints work as several layers of data validation:

1. Type constraints

2. Structural constraints (cardinality, occurrences, existence)

3. Terminology constraints

4. Data constraints

 

Any system that receives or processes data from external sources should implement all of these layers. Most will validate against templates only, though some might also rely on archetypes. Some systems might not validate coded data against external terminology servers, since that implies a performance penalty, but doing so guarantees that coded data is correct, for instance against a SNOMED CT subset.
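
As a rough illustration of how those layers can stack, here is a minimal Python sketch. It is not how our SDK does it: `ElementConstraint`, `validate` and the dictionary keys are hypothetical names made up for this example, and a real template constraint is far richer. The `terminology_lookup` parameter stands in for an optional call to an external terminology server, so the costly check only runs when a caller provides it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ElementConstraint:
    """Hypothetical, heavily simplified stand-in for one template constraint."""
    rm_type: str                                   # type layer
    occurrences: Tuple[int, int]                   # structural layer: (lower, upper)
    terminology_id: Optional[str] = None           # terminology layer
    magnitude_range: Optional[Tuple[float, float]] = None  # data layer


def validate(values: List[Dict], constraint: ElementConstraint,
             terminology_lookup: Optional[Callable[[str, str], bool]] = None) -> List[str]:
    """Run every layer and collect human-readable error messages."""
    errors = []

    # Structural layer: how many times the element occurs.
    lower, upper = constraint.occurrences
    if not lower <= len(values) <= upper:
        errors.append(f"occurs {len(values)} times, "
                      f"violates occurrences constraint {lower}..{upper}")

    for i, value in enumerate(values):
        # Type layer: a wrong type makes the remaining checks meaningless.
        actual_type = value.get("type")
        if actual_type != constraint.rm_type:
            errors.append(f"item {i}: type '{actual_type}' is not {constraint.rm_type}")
            continue

        # Terminology layer: local check first, optional external check second.
        if constraint.terminology_id is not None:
            terminology = value.get("terminology_id")
            code = value.get("code_string")
            if terminology != constraint.terminology_id:
                errors.append(f"item {i}: terminology '{terminology}' "
                              f"doesn't match {constraint.terminology_id}")
            elif terminology_lookup is not None and not terminology_lookup(terminology, code):
                errors.append(f"item {i}: code '{code}' not found in {terminology}")

        # Data layer: here, a simple range on a numeric magnitude.
        if constraint.magnitude_range is not None:
            low, high = constraint.magnitude_range
            magnitude = value.get("magnitude")
            if magnitude is None or not low <= magnitude <= high:
                errors.append(f"item {i}: magnitude {magnitude} is outside {low}..{high}")

    return errors
```

Leaving `terminology_lookup` as `None` skips the external check entirely, which is the trade-off between performance and correctness described above.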

 

Another key aspect for systems being tested is the way they report errors. Errors should be:

 

1. Locatable: state exactly where the error is in a given data structure

2. Understandable: make clear which type of error occurred and why

 

Beyond that, you can include any other metadata your system implements, but those are the basic requirements.
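
Those two requirements can be captured in a very small error record. The Python sketch below is only an assumption about how such a record might look: the `ValidationError` class and its field names are hypothetical, not taken from any openEHR specification or from our SDK.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ValidationError:
    """Hypothetical minimal error record for a data validation report."""
    path: str     # Locatable: where in the data instance the error is
    message: str  # Understandable: which error occurred and why
    metadata: Dict[str, Any] = field(default_factory=dict)  # optional extras

    def __str__(self) -> str:
        # Render in a "[path:..., error:...]" style.
        return f"[path:{self.path}, error:{self.message}]"
```

Anything system specific, such as a severity level or an internal error code, would go into `metadata` without affecting the two required fields.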

Some good Data Validation error examples, which are implemented in our openEHR SDK (https://github.com/ppazos/openEHR-OPT) are:

1. [path:/content(0)/data/items(1)/value/defining_code, error:code_string 'at9999' is not in the code list [at0004, at0005, at0006]] - a code in the data is not in the allowed list of codes defined in the template

2. [path:/content(0)/data/items(1)/value/value, error: Value 'xxxxx' doesn't match value from template 'aaa'] - the value of a coded text doesn't correspond to the value given in the template (even if the code is the same)

3. [path:/content(0)/data/items, error:Children with archetype_node_id=at0002 occurs 0 times, violates occurrences constraint 1..1] - an item that should appear in the data structure is not present

4. [path:/content(0)/data/items(2)/value/defining_code, error:terminology 'SNOMED-XXXXX' doesn't match the external terminology SNOMED-CT] - the template includes a reference to an external terminology but the data has a different terminology ID which is not allowed
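
To show how messages like examples 1 and 4 can be produced, here is a small Python sketch. The function names and signatures are hypothetical and this is not the code from our SDK. It only illustrates pairing the data path with a message that names both the offending value and the violated constraint.

```python
from typing import List, Optional


def check_code_list(path: str, code_string: str, allowed: List[str]) -> Optional[str]:
    """Return an error message if the code is not in the template's code list."""
    if code_string in allowed:
        return None
    return (f"[path:{path}, error:code_string '{code_string}' "
            f"is not in the code list [{', '.join(allowed)}]]")


def check_terminology(path: str, terminology_id: str, expected: str) -> Optional[str]:
    """Return an error message if the data uses a terminology the template does not allow."""
    if terminology_id == expected:
        return None
    return (f"[path:{path}, error:terminology '{terminology_id}' "
            f"doesn't match the external terminology {expected}]")


# Usage: a None result means the value passed the check.
print(check_code_list("/content(0)/data/items(1)/value/defining_code",
                      "at9999", ["at0004", "at0005", "at0006"]))
print(check_terminology("/content(0)/data/items(2)/value/defining_code",
                        "SNOMED-XXXXX", "SNOMED-CT"))
```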

 

Another good practice is to indicate where in the archetype/template the violated constraint is defined. In this case, that is the 'optPath':

- dataPath: /content(0)/data/items

- optPath: /content[archetype_id=openEHR-EHR-ADMIN_ENTRY.data_validation.v1]/data[at0001]/items[at0002]

- error: Children with archetype_node_id=at0002 occurs 0 times, violates occurrences constraint 1..3
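
A report with both paths can be assembled at the point where the occurrences check runs, since the validator knows the data path and the template path at that moment. The Python sketch below assumes a hypothetical `check_occurrences` function and represents children as plain dictionaries. It is an illustration, not the SDK implementation.

```python
from typing import Dict, List, Optional


def check_occurrences(data_path: str, opt_path: str, children: List[Dict],
                      node_id: str, lower: int,
                      upper: Optional[int]) -> Optional[Dict[str, str]]:
    """Count children with the given node id and report an occurrences violation.

    upper=None means the upper bound is unbounded.
    """
    count = sum(1 for child in children if child.get("archetype_node_id") == node_id)
    if count >= lower and (upper is None or count <= upper):
        return None

    upper_text = "*" if upper is None else str(upper)
    return {
        "dataPath": data_path,  # where the problem is in the data instance
        "optPath": opt_path,    # where the violated constraint is in the template
        "error": (f"Children with archetype_node_id={node_id} occurs {count} times, "
                  f"violates occurrences constraint {lower}..{upper_text}"),
    }


report = check_occurrences(
    "/content(0)/data/items",
    "/content[archetype_id=openEHR-EHR-ADMIN_ENTRY.data_validation.v1]/data[at0001]/items[at0002]",
    children=[],  # no item with at0002 is present in the data
    node_id="at0002", lower=1, upper=3,
)
```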

Soon we will add a Data Validation service to the openEHR Toolkit (https://toolkit.cabolabs.com/) that you can use to verify data instances. Note we already have a data generator service based on templates.

If you have questions or need help with Conformance Verification of your openEHR tools, contact us at www.CaboLabs.com