MEDINFO 2013 C.U. Lehmann et al. (Eds.) © 2013 IMIA and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License. doi:10.3233/978-1-61499-289-9-1092

Searching for Document Contents in an IHE-XDS EHR Architecture via Archetype-Based Indexing of Document Types

Christoph Rinner (a), Michael Kohler (a), Samrend Saboor (b), Gudrun Huebner-Bloder (b), Elske Ammenwerth (b), Georg Duftschmid (a)

(a) Section for Medical Information Management and Imaging, Medical University of Vienna, Austria
(b) Institute of Health Informatics, UMIT - University for Health Science, Medical Informatics and Technology, Hall in Tirol, Austria

Abstract and Objective The shared electronic health record (EHR) system architecture IHE XDS is widely adopted internationally. It ensures a high level of data privacy through distributed storage of EHR documents. Its standard search capabilities, however, are limited: it only supports retrieval of complete documents by querying a restricted set of document metadata. Existing approaches that extend XDS queries to document contents typically employ a central index of document contents and thereby undermine XDS’ basic characteristic of distributed data storage. To avoid these data privacy concerns, we propose querying EHR contents in XDS by indexing document types, rather than the documents themselves, based on Archetypes. We successfully tested our approach with the ISO/EN 13606 standard.

Keywords: Medical Records; Medical Records Systems, Computerized; Reference Standards; Models, Theoretical

Introduction To improve on the standard metadata-based capabilities for searching a patient’s EHR in the shared EHR system architecture IHE XDS, several solutions for content-based search have been proposed. These solutions typically create a central index of the documents’ contents [1, 2]. While this allows fine-grained search, it also raises data privacy concerns, because it undermines XDS’ basic concept of storing EHR data in a distributed manner and keeping only a limited set of metadata centrally. Current EHR standards such as ISO/EN 13606 and HL7 CDA are based on the dual-model approach, which combines a static reference model with knowledge artifacts that constrain the reference model. Archetypes, a computer-processable form of these constraints, are a promising starting point for an alternative approach to content-based search in XDS.

Methods In our approach we assume that the contents of each document type that may occur in the IHE XDS domain are described by Archetypes. An Archetype repository may thus be seen as an index of the contents of the document types rather than of the documents themselves. The Archetypes are used to create a thesaurus, from whose terms content-based queries are built. A query can be constructed from an arbitrary number of these terms connected with Boolean operators. The content-based query is converted into a standard metadata-based query using the Archetype-based document type index, and all relevant documents are retrieved from the XDS repositories. Next, the structural information in the Archetypes is used to identify the absolute paths of all queried terms. The content-based query is transformed into disjunctive normal form and converted into an XQuery expression containing the absolute paths of the Archetype nodes. The XQuery is applied to all documents retrieved in the previous step. The resulting document contents are shown to the user, together with links to the complete documents from which the contents were extracted.
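The query conversion steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The index contents, document type names, element paths, and the function names `to_dnf`, `relevant_document_types`, and `build_xquery` are all hypothetical. The sketch further assumes that queries use only AND and OR, that each conjunction is evaluated against a single document, and that the generated string follows XQuery 1.0 syntax.

```python
from itertools import product

# Hypothetical index derived from Archetypes (all names and paths are invented):
# thesaurus term -> {document type: absolute path of the Archetype node}.
ARCHETYPE_INDEX = {
    "HbA1c": {
        "LabReport": "/composition/entry[@name='lab']/element[@name='HbA1c']",
        "DischargeLetter": "/composition/section[@name='findings']/element[@name='HbA1c']",
    },
    "Insulin": {
        "MedicationList": "/composition/entry[@name='medication']/element[@name='Insulin']",
        "DischargeLetter": "/composition/section[@name='therapy']/element[@name='Insulin']",
    },
    "Retinopathy": {
        "DischargeLetter": "/composition/section[@name='diagnoses']/element[@name='Retinopathy']",
    },
}


def to_dnf(query):
    """Convert a nested ("and"|"or", ...) query into a list of conjunctions.

    A query is either a term (str) or a tuple (operator, operand, ...).
    Each conjunction is returned as a frozenset of terms.
    """
    if isinstance(query, str):
        return [frozenset([query])]
    op, *args = query
    parts = [to_dnf(arg) for arg in args]
    if op == "or":
        # OR: concatenate the conjunctions of all operands.
        return [conj for part in parts for conj in part]
    if op == "and":
        # AND: distribute over the operands' conjunctions.
        return [frozenset().union(*combo) for combo in product(*parts)]
    raise ValueError(f"unsupported operator: {op}")


def relevant_document_types(dnf):
    """Document types whose Archetypes contain all terms of some conjunction.

    These types would be used as the metadata filter of the standard XDS query.
    """
    types = set()
    for conj in dnf:
        per_term = [set(ARCHETYPE_INDEX.get(term, {})) for term in conj]
        types |= set.intersection(*per_term)
    return types


def build_xquery(doc_type, dnf):
    """Build an XQuery string for one document type, or None if no conjunction applies."""
    parts = []
    for conj in dnf:
        if all(doc_type in ARCHETYPE_INDEX.get(term, {}) for term in conj):
            paths = [ARCHETYPE_INDEX[term][doc_type] for term in sorted(conj)]
            condition = " and ".join(f"exists({p})" for p in paths)
            nodes = " | ".join(paths)
            parts.append(f"(if ({condition}) then ({nodes}) else ())")
    return " | ".join(parts) if parts else None


if __name__ == "__main__":
    dnf = to_dnf(("or", "HbA1c", ("and", "Insulin", "Retinopathy")))
    for doc_type in sorted(relevant_document_types(dnf)):
        print(doc_type, "->", build_xquery(doc_type, dnf))
```

In this sketch, the set returned by `relevant_document_types` plays the role of the metadata-based query, and the string from `build_xquery` would be applied to each retrieved document of the matching type.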

Results The approach was tested in the context of diabetes-specific EHR information retrieval. ISO/EN 13606 test documents for 12 different document types, described by a set of 128 Archetypes, were created in an IHE XDS environment. Seven clinicians evaluated our approach and assessed our content-based search to be superior to a standard IHE XDS metadata-based search. Converting a query with 80 terms into an XQuery takes 0.5 seconds on an Intel® Core™2 Quad processor. Executing this XQuery takes 0.15 seconds per document on average. Most of the time is spent downloading the documents from the XDS repository.
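As a rough reading of the reported timings, the processing time excluding download can be estimated with the short sketch below. It assumes that the conversion cost is paid once per query and that execution time scales linearly with the number of retrieved documents. Neither assumption is stated explicitly in the results, and the function name and document count are hypothetical.

```python
CONVERSION_SECONDS = 0.5           # reported: converting an 80-term query into an XQuery
EXECUTION_SECONDS_PER_DOC = 0.15   # reported: average XQuery execution time per document


def estimated_processing_seconds(num_documents: int) -> float:
    """Estimate query processing time in seconds, excluding document download.

    Assumes one conversion per query and linear scaling with document count.
    """
    if num_documents < 0:
        raise ValueError("num_documents must be non-negative")
    return CONVERSION_SECONDS + EXECUTION_SECONDS_PER_DOC * num_documents


if __name__ == "__main__":
    # Hypothetical EHR with 100 potentially relevant documents.
    print(estimated_processing_seconds(100))
```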

Conclusion The Archetype repository allows the relevant documents for a content-based query to be identified and contains the structural information needed to locate the searched contents within the documents. In contrast to web search, the EHR of a single patient is comparatively small, so it is possible to download the potentially relevant distributed EHR documents and direct the query to the documents themselves. This allows us to implement a content-based query in IHE XDS without compromising its data privacy concept with a central index of document contents. We plan to adapt this approach to CDA documents for use in Austria’s national EHR system ELGA.

This work was supported in part by the Austrian Science Fund (FWF), project number P21396.

References
[1] Pruski C, et al. Efficient medical information retrieval in encrypted electronic health records. Stud Health Technol Inform. 2012;180:225-9.
[2] Liu S, et al. Beyond regional health information exchange in China: a practical and industrial-strength approach. AMIA Annu Symp Proc. 2011;2011:824-33.

Address for correspondence: [email protected]