A Ferret-Based Gastrointestinal Image Retrieval System

Steven Bedrick, BA¹, Jayashree Kalpathy-Cramer, Ph.D.¹
¹Oregon Health and Science University, Portland, OR

Abstract
We developed a web-based interface for image retrieval and a cluster analysis system. The system handles search queries using Ferret, a port to the Ruby language of the Apache Lucene indexing and searching system. The system uses de-identified endoscopic images from the Clinical Outcomes Research Initiative data repository, and is designed for use by students and researchers.

Introduction
As more and more clinical data take the form of digital images, the need for effective and user-friendly retrieval systems has become increasingly pressing. Proposed uses for such systems include education, decision support, and data mining.¹ Our ongoing research is in the field of automated image classification using both content- and context-based methodologies. The system under discussion serves as an experimental test bed that allows us to store and interact with our collection of gastroenterological images, and provides an efficient way to examine and compare the results of different image classification techniques.

Background
Our system is written in the Ruby programming language using the Ruby on Rails web application framework, and is backed by a PostgreSQL relational database. Users of the system can browse the stored images hierarchically, and can also use natural-language queries to search for images with particular anatomical or pathological features. The natural-language query engine uses Ferret, a Ruby port of the popular Lucene indexing system. As such, it offers a highly flexible query language and a variety of options for parsing input queries.

Researchers in this area face a general paucity of well-annotated medical image collections.² We are fortunate to have access to a subset of the image data found in the Clinical Outcomes Research Initiative (CORI, http://www.cori.org) data repository. CORI is a gastrointestinal outcomes research organization that collects data from several hundred physicians' practices. Our subset contains data regarding diagnoses and findings for each image, and groups images into several categories according to the primary diagnosis.

Another notable feature of our system is the cluster visualization component. In the course of developing image classification algorithms, we routinely generate very lengthy files containing image identifiers and putative cluster assignments. Our system lets a user upload the output of their classifier and immediately see which images were assigned to which clusters.
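The cluster visualization step can be illustrated with a minimal plain-Ruby sketch that groups image identifiers by their assigned cluster. This is not the authors' implementation. The input format (one `<image_id> <cluster_label>` pair per line), the method name `group_by_cluster`, and the sample identifiers and labels are all assumptions made for illustration, since the paper does not specify the format of the classifier output.

```ruby
# Hypothetical sketch: group classifier output by cluster.
# Assumed input format (not specified in the paper): one assignment per line,
# "<image_id> <cluster_label>", separated by whitespace or a comma.
def group_by_cluster(text)
  # Each unseen cluster label starts with its own empty list of image ids.
  clusters = Hash.new { |hash, key| hash[key] = [] }

  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("#") # skip blanks and comments

    image_id, cluster = line.split(/[\s,]+/, 2)
    next if cluster.nil? # ignore lines without a cluster label

    clusters[cluster] << image_id
  end

  clusters
end

# Made-up sample data, for illustration only.
sample = <<~DATA
  img_0001 polyp
  img_0002 ulcer
  img_0003 polyp
DATA

group_by_cluster(sample).each do |cluster, image_ids|
  puts "#{cluster}: #{image_ids.join(', ')}"
end
```

In a Rails application, a grouping of this kind could be computed from the uploaded file and handed to a view that renders one row of thumbnails per cluster. That wiring is likewise an assumption and is not described in the paper.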

Analyses and Results
The system currently contains 1,440 annotated images from the CORI data repository. Average query times are quite usable, though there is room for improvement; on one author's development computer, the query "esophag*" takes just under 1.5 seconds to retrieve just over 500 documents (mean 1.484 seconds, SD 0.1 second).

Conclusion
Our image retrieval system combines useful functionality with an extensible architecture. Future plans involve expanding the image database to include more data from CORI, experimenting with more sophisticated natural-language query parsing, and streamlining the cluster visualization and retrieval interfaces.

Acknowledgements
The authors gratefully acknowledge the database management team at the Clinical Outcomes Research Initiative for their help in acquiring the images. This work was supported in part by NLM Training Grant 1T15LM009461.

References
1. Müller H, Clough P, Hersh W, Deselaers T, Lehmann T, Geissbuhler A. Evaluation axes for medical image retrieval systems - the ImageCLEF experience. International Conference ACM Multimedia (ACM MM 2005), November 2005.
2. Hersh WR, Müller H, et al. Advancing biomedical image retrieval: development and analysis of a test collection. Journal of the American Medical Informatics Association 2006;13(5):488-96.

AMIA 2007 Symposium Proceedings Page - 868