
THE TECHNICAL UNIVERSITY OF ŁÓDŹ
Faculty of Electrical and Electronic Engineering

Master of Engineering Thesis

Medical software for three-dimensional analysis of magnetic resonance (MR) and computed axial tomography (CAT) images with embedded reporting engine for description of patient's health

Bartłomiej Wilkowski
Student's number: 115006

Supervisor: dr inż. Marcin Janicki

Auxiliary supervisors: Óscar Pereira, Paulo Miguel de Jesus Dias

Łódź, 2007

ABSTRACT

This thesis presents medical software designed to assist radiologists and doctors. The whole functionality of the software is enclosed in a package called MIAWARE, which stands for Medical Image Analysis With Automated Reporting Engine. A complete description of the functionality of this software and its application can be found here.

MIAWARE integrates two important aspects of image-based medicine. It makes it possible to analyze radiological images in order to find pathological changes and, at the same time, to report on the patient's health state in a highly automated manner. MIAWARE's report generation engine requires from the user (radiologist) a detailed specification of the pathological changes found in the patient's body and their locations. Unlike in current practice, the radiologist cannot describe those findings in his own words, but can use only the specific medical vocabulary provided by the application. Consequently, MIAWARE is able to create normalized medical reports from the information about pathologies entered earlier by the user.

Finally, an intelligent search engine for medical reports is implemented, based on the relations between real-world objects. An ontology for lungs was developed in order to use the relations between the parts of the lungs in the search algorithm. Consequently, a deductive report search was obtained, which can improve the disease recognition process. Any patient case can be compared to archived reports (cases) that contain similar symptoms, which should lead to better diagnoses and faster decisions about the suitable treatment for the patient.

STRESZCZENIE (SUMMARY)

The presented thesis concerns medical software intended for the analysis of computed tomography images. The whole software is contained in a package named MIAWARE, which stands for Medical Image Analysis With Automated Reporting Engine. The following chapters of this thesis give a detailed description of the application and functionality of the software.

The main task of the MIAWARE software is the integration of two important aspects of image-based medicine. MIAWARE enables the analysis and visualization of radiological medical images in order to recognize pathological changes in the examined area of the patient's body. Meanwhile, all observations and conclusions about the disease changes, pathologies, etc. that were found can be saved and linked to the critical places on the images in order to later obtain a report on the patient's health state. At present, MIAWARE offers the possibility of creating reports only for the area of the human lungs.

The pathology reporting technique implemented in MIAWARE aims at ultimately obtaining medical reports that are normalized with respect to the terminology used in them. This means that a radiologist working with MIAWARE and describing the pathologies found cannot do so in an arbitrary way using his own words. MIAWARE offers a base of medical terms necessary to characterize pathologies in the human lungs, which are selected step by step by the radiologist during the report creation process. Finally, the reporting module implemented in MIAWARE orders the information obtained and automatically generates a suitably formatted, normalized medical report. Thanks to this method, reports prepared on the basis of one patient's images should almost always be identical, regardless of the radiologist. Any differences between reports may be caused by inaccurate analysis or human error.

The last important part of the MIAWARE package is the intelligent search engine for medical reports. Using it, a doctor or radiologist can filter archived medical reports in order to find pathologies in specified locations of the lungs. Importantly, reports are searched not only on the basis of the words entered into the search criterion, but also on the basis of terms logically related to the entered keywords. This function was implemented using an ontology in which all anatomical elements of the lungs were arranged in a logical structure, thanks to which the computer is able to deduce all sub-elements of a given part of the lungs. Normalization of the medical reports was a necessary condition for creating an efficient search engine.

The aspects of the MIAWARE software mentioned above, used appropriately, can make it easier for the doctor to make diagnoses and can speed up the disease recognition process.

CONTENTS

Abstract
Streszczenie
Contents
List of Figures
List of Tables

1. Introduction
   1.1 Motivations
       1.1.1 Present radiological reporting schema
       1.1.2 Shortcomings and limitations of the present reporting schema
       1.1.3 Automated reporting with MIAWARE
   1.2 Contributions of this thesis
       1.2.1 Integration of visualization, reporting and searching
       1.2.2 Normalized report generation
       1.2.3 User-friendly software
   1.3 Applications
       1.3.1 Analysis of CAT images
       1.3.2 Reporting over lung pathologies
       1.3.3 Medical reports filtering
   1.4 Roadmap

2. Radiological examinations overview
   2.1 MRI characteristics
       2.1.1 MRI technology
       2.1.2 MRI advantages and disadvantages
   2.2 CAT characteristics
       2.2.1 CAT technology
       2.2.2 CAT advantages and disadvantages
   2.3 MRI and CAT comparison

3. MIAWARE Architecture
   3.1 Decisions
       3.1.1 Tools for visualization - review
       3.1.2 Tools for reporting and ontology-based search engine - review
       3.1.3 Tools for GUI applications - review
       3.1.4 Final environment choice for the MIAWARE software
   3.2 Image visualization and GUI development
       3.2.1 Integrating VTK with Java
       3.2.2 Creating 3D model
       3.2.3 3D model cross-sections generation
       3.2.4 Creating GUI
   3.3 Medical report generation
       3.3.1 Medical vocabulary selection and representation
       3.3.2 XML-based reporting form creation
       3.3.3 Resource Description Framework
       3.3.4 Normalized medical report generation
   3.4 Ontology-based search engine development
       3.4.1 Ontology definition and development
       3.4.2 Medical ontology for lungs
       3.4.3 Search algorithm for RDF files

4. Medical analysis and reporting with MIAWARE
   4.1 Installation notes
   4.2 Analysis of CAT images
       4.2.1 Specifying CAT stack location
       4.2.2 3D model manipulation
       4.2.3 2D images manipulation
   4.3 Reporting over lung pathologies
       4.3.1 Defining pathologies
       4.3.2 Viewing and editing pathology descriptions
       4.3.3 Generating medical reports
   4.4 Searching for the medical reports

5. Conclusions

Bibliography

Appendix

A. How to build VTK on Windows with Java support
   A.1 Required downloads and software installation
       A.1.1 VTK source download
       A.1.2 CMake download and install
       A.1.3 C++ compiler installation
       A.1.4 Java SDK download and installation
       A.1.5 Eclipse download and installation
   A.2 Compiling the VTK source with CMake
   A.3 Building the configuration in C++ compiler (Microsoft Visual Studio 2005)
   A.4 Configuration of Java environment in Eclipse
   A.5 Summary

B. Article about MIAWARE software

LIST OF FIGURES

1.1 CAT stack images analysis
2.1 Siemens Symphony MRI scanner
2.2 Functional configuration of MRI scanner
2.3 MRI knee reconstruction
2.4 Toshiba Aquilion CAT scanner
2.5 CAT scan work principle
2.6 Human thorax CAT slice
3.1 Steps of CAT stack processing in MIAWARE
3.2 3D model representation
3.3 3D model manipulation with widgets
3.4 2D cross-section planes
3.5 Pathology reporting form
3.6 Ontology for cars visualization
3.7 Ontology inverse properties
3.8 Ontology class relationship
3.9 Inferred hierarchy of ontology
3.10 Ontology reasoning classification results
3.11 Human lungs structure
3.12 Classes taxonomy in lungs ontology - part 1
3.13 Classes taxonomy in lungs ontology - part 2
3.14 Properties in lungs ontology
3.15 Restrictions in lungs ontology
3.16 Ontology-based search engine
3.17 Ontology-based search algorithm flowchart - part 1
3.18 Ontology-based search algorithm flowchart - part 2
3.19 Ontology-based search algorithm flowchart - part 3
3.20 Ontology-based search algorithm flowchart - part 4
3.21 Ontology-based search algorithm flowchart - part 5
4.1 MIAWARE software files
4.2 MIAWARE graphical user interface
4.3 Medical images loading - progress bar
4.4 3D model rendering window
4.5 Actions and manipulation toolbar for the 3D model view
4.6 2D cross-sections panel
4.7 Add pathology - confirmation dialog
4.8 Pathology reporting frame
4.9 List of the pathologies
4.10 Pathology description view frame
4.11 Medical report disk location and name definition
4.12 Successful report generation dialog
4.13 Ontology-based search engine graphical user interface
4.14 TXT medical report view frame
A.1 CMake window before configuration
A.2 CMake window before customization
A.3 Visual Studio 2005 screenshot

LIST OF TABLES

2.1 MRI vs CAT
3.1 Pathology definition steps - example
3.2 ControlPoint members association - example
3.3 Sample ControlPointInfo members
3.4 Segments in lung lobes

1. INTRODUCTION

The first objective of this thesis is to present the MIAWARE software (Medical Image Analysis With Automated Reporting Engine), which enables doctors and radiologists to carry out a detailed analysis of the state of a patient's lungs by examining medical tomography images and, in parallel, to report on the patient's health state. Secondly, an intelligent search engine for medical reports is presented, together with its advantages over ordinary search schemas.

The entire MIAWARE software and its modules presented here were prepared by the author of this thesis. Any external sources by other authors used during the development of the MIAWARE software are properly documented and accompanied by suitable references. The complete source code of the MIAWARE software can be found on the CD enclosed with this thesis. A description of the MIAWARE software can also be found on the web page www.miaware.org. Finally, Appendix B contains an article describing the MIAWARE software, which was submitted to BIOSTEC 2008 - International Joint Conference on Biomedical Engineering Systems and Technologies.

1.1 Motivations

Medical image analysis performed with software that makes it possible to localize and mark pathological changes while observing them, and to add comments on the spot, can increase the precision of pathology reporting and speed up that process. Moreover, automated medical report generation can be considered a potential improvement to the report analysis process.

Furthermore, creating a three-dimensional model from two-dimensional medical images should add more realism to the analysis. It is easier and more natural to observe disease changes with a visual reference to the 3D model, so the radiologist's deductions may be more accurate. Integrating the detailed three-dimensional analysis of radiological images with an 'on the fly' reporting system, able to normalize the radiologist's observations and automatically generate the final report, can possibly improve the disease recognition process in radiology. Last but not least, an intelligent search of archived medical reports can improve disease recognition, since earlier patients' cases can easily be compared with recent ones, which may lead to faster and more efficient diagnosis.

1.1.1 Present radiological reporting schema

Preparing medical reports is a very common and frequent activity for radiologists. Radiologists are responsible for preparing a report after a detailed analysis of MR or CAT images. Usually, they view a set of 2D images (slices) made in one or more plane directions during the MR/CAT scan. A radiologist looks carefully for any worrisome alterations in the patient's body that may indicate a disease, defines them and describes them in a report, which is later analyzed by the doctor.

1.1.2 Shortcomings and limitations of the present reporting schema

The accuracy of the present reporting process is not sufficiently high to be sure that the diagnosis made by the radiologist and the doctor is accurate. It has some serious shortcomings. The main problem is that reports differ in structure from radiologist to radiologist. Every person has a different way of thinking and a different way of expressing remarks and observations. This means that when the same medical data is given to many radiologists for analysis, it will almost surely produce many different reports with various observations on the patient's health. Moreover, the doctor sometimes interprets the report differently from the radiologist, which lowers the efficiency of the diagnosis.

1.1.3 Automated reporting with MIAWARE

In order to improve the present radiological reporting schema, a new reporting structure is needed. The MIAWARE package proposes an automated reporting engine with its own, unique way of expressing data. Such a solution should enhance the following aspects of the reporting process:

• 3D visualization - a three-dimensional model created from two-dimensional images provides users with a different, additional view of the given data.

• Disease location - the radiologist can mark critical, pathological points on the image (cut) and immediately see their location on the 3D model. Moreover, searching for disease is easier and faster, as special tools are introduced which allow cutting the model in one of three plane directions and showing the output for closer analysis.

• Disease definition - while doing the analysis, the radiologist uses a set of specific medical vocabulary defined earlier by qualified doctors. The radiologist does not have to think about how to describe the pathology, but only which of the given options defines it best and most accurately.

• Normalized reporting - the generated reports are always written with the same syntax and layout, independently of the person who does the analysis and enters the data. Moreover, additional reports in a computer-understandable language are created to enable further processing.

• Intelligent report filtering - normalized, computer-readable reports allow deduction-based searching of the reports according to the given search criteria. Thanks to ontologies, the search follows the logical connections (relations) between real-world concepts (in this case, human body parts), so it is not a purely lexical search based only on the words entered in the search query.

The MIAWARE software package addresses the aforementioned areas. Three-dimensional visualization may improve the definition and recognition of critical changes in the human body. The interactive reporting schema should speed up and ease the analysis and make it more accurate. Finally, normalization makes the reports easier to understand for doctors, patients and also computers, which makes room for the implementation of an efficient search engine.
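The difference between a purely lexical search and the deduction-based filtering described above can be illustrated with a short Python sketch. It is an illustration only and not the MIAWARE implementation: the `PART_OF` map, the sample reports and all function names are hypothetical, and a plain dictionary stands in for a real ontology.

```python
# Hypothetical part-of hierarchy: child -> parent. A real ontology would
# hold this as classes and properties; a dict is enough to show the idea.
PART_OF = {
    "right upper lobe": "right lung",
    "right middle lobe": "right lung",
    "right lower lobe": "right lung",
    "apical segment": "right upper lobe",
    "left upper lobe": "left lung",
}

# Hypothetical normalized reports: one pathology and one location each.
REPORTS = [
    {"id": "r1", "pathology": "nodule", "location": "apical segment"},
    {"id": "r2", "pathology": "nodule", "location": "left upper lobe"},
    {"id": "r3", "pathology": "cavity", "location": "right lower lobe"},
]


def sub_parts(location):
    """Return the location plus every part deduced to lie inside it."""
    found = {location}
    changed = True
    while changed:  # repeat until no new sub-part is discovered
        changed = False
        for child, parent in PART_OF.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def lexical_search(pathology, location):
    """Match only reports whose location equals the query text."""
    return [r["id"] for r in REPORTS
            if r["pathology"] == pathology and r["location"] == location]


def deductive_search(pathology, location):
    """Match reports located anywhere inside the queried body part."""
    locations = sub_parts(location)
    return [r["id"] for r in REPORTS
            if r["pathology"] == pathology and r["location"] in locations]
```

With this sample data, a lexical query for a nodule in the "right lung" matches nothing, whereas the deductive query returns report `r1`, because the apical segment is deduced to be part of the right lung.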

1.2 Contributions of this thesis

The MIAWARE software package contributes to the radiological reporting area in the following ways.

1.2.1 Integration of visualization, reporting and searching

The MIAWARE software package makes it possible to open CAT images, visualize and process them, and finally generate the final report. Moreover, the set of previously generated reports can be filtered according to given criteria. This software shows how these aspects work when joined together and why it is important to introduce such software into real-life practice. The presented version of MIAWARE is prepared for and open to future development towards a final, market-ready product. It is a properly working prototype with a well-built structure, but its capabilities are still too limited for use in real-life situations.

An advantage of this application over others used by radiologists so far may be that they do not need to separate the steps of viewing the radiological images and writing a report. Furthermore, a radiologist can investigate a three-dimensional model as well as its slices (cross-sections), obtained from three cutting widgets (a CAT scan usually offers, as output, images in only one plane direction, so images in the others are generated directly by the software). While observing medical data, the radiologist can interactively change the view, analyze the data and add report information. When all the remarks are made, the report can be generated.

Finally, the ontology-based search engine is provided. The user can choose a pathology to be searched for and its location in the lungs. Afterwards, a previously specified set of RDF medical reports (generated by the MIAWARE software) is checked against the given criteria. Thanks to that, the doctor can consult the database of old medical reports in order to find similar patient cases. This may speed up the diagnosis process and improve disease recognition.

1.2.2 Normalized report generation

The MIAWARE software package includes an implementation of the reporting engine. It requires the radiologist to enter only basic data, such as the type of the disease, its location, etc. Afterwards, a normalized report is generated, based on a previously set layout. So far, only reports on the medical state of the patient's lungs can be generated. The software has its own database of all possible pathological changes of the lungs and their characteristics. As a result, the radiologist does not describe the finding in his own words, but chooses among the options available in the database. Future software releases may add an option for short extra comments or for defining the size of the pathological change. In the current software version normalization is kept in full and there is no option to insert any extra information.

All the data defined by the radiologist is sent to the engine, which creates a report, normalized according to a set of rules, describing the patient's health state. The style, context and layout of the report are less dependent on the radiologist. The advantage is that generated reports which concern the same type of disease or similar patient cases will not differ, or will differ only very slightly.
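The idea of normalization, restricting input to a fixed vocabulary and rendering it with a fixed layout, can be sketched as follows. This is a minimal illustration under stated assumptions, not the MIAWARE reporting engine: the `VOCABULARY` terms, the report layout and the function names are all hypothetical.

```python
# Hypothetical controlled vocabulary; the real MIAWARE term database is
# not reproduced here.
VOCABULARY = {
    "pathology": {"nodule", "cavity", "consolidation"},
    "location": {"right upper lobe", "right lower lobe", "left upper lobe"},
}


def make_finding(pathology, location):
    """Accept a finding only if both terms come from the vocabulary."""
    for field, term in (("pathology", pathology), ("location", location)):
        if term not in VOCABULARY[field]:
            raise ValueError(f"{term!r} is not an allowed {field} term")
    return {"pathology": pathology, "location": location}


def generate_report(patient_id, findings):
    """Render findings with a fixed layout and a fixed ordering."""
    lines = [f"Patient: {patient_id}", "Findings:"]
    ordered = sorted(findings, key=lambda f: (f["location"], f["pathology"]))
    for number, finding in enumerate(ordered, start=1):
        lines.append(
            f"{number}. {finding['pathology']} in the {finding['location']}"
        )
    return "\n".join(lines)
```

Because free text is rejected and findings are sorted before rendering, two users who enter the same findings in a different order obtain the same report text, which is the property the section attributes to normalized reporting.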

1.2.3 User-friendly software

The MIAWARE software package was designed to offer users an intuitive framework with a friendly graphical user interface and to be flexible for further development and extensions. The user interface is written entirely in the Java programming language using Java Swing components. Data processing is performed with the Visualization Toolkit (VTK) [22] and the ImageJ environment [31] (a public domain Java image processing program).

1.3 Applications

This section describes the applications of the MIAWARE software.

1.3.1 Analysis of CAT images

The MIAWARE software provides users with an intuitive graphical user interface together with functionality for detailed analysis of computed axial tomography images. Figure 1.1 presents a screenshot of the MIAWARE application. Tomography examinations usually provide a set of images (slices) in only one plane direction. The MIAWARE software supplies views of the slices in the three main plane directions (x, y, z), integrated with the 3D model. Thanks to that, every point of the body (scanned by the CAT apparatus) can be viewed in three different plane cuts. This improves the preventive examination of the human body for alarming pathological changes. A full description of how the analysis can be performed with the MIAWARE software is presented in Chapter 4, section 4.2.

1.3.2 Reporting over lung pathologies

This application is closely connected with the previous one. After the analysis of CAT images of the human thorax, the reporting engine can be started in order to report the pathologies found in the lungs. This is done simply by marking the pathological change on one of the 2D slices and then associating the exact information with it. During the pathology reporting process, groups of options are presented to the user in order to define precisely the pathology type and its location in the lungs. Finally, a text report is written as the output. The steps that can be performed when reporting with MIAWARE are described in detail in Chapter 4, section 4.3.

Fig. 1.1: MIAWARE graphical user interface (CAT stack images analysis)

1.3.3 Medical reports filtering

Finally, the MIAWARE software package can be used by doctors during the medical investigation of a patient's case and can assist in deciding on a suitable treatment. The doctor can consult the database of old medical reports in order to find reports of similar cases by applying filter criteria in a query to the MIAWARE search engine. Afterwards, the disease can be recognized taking all previous cases into account, which can significantly increase efficiency. The specifics of the MIAWARE search engine for medical reports are described in Chapter 3, section 3.4.1. The user's guide for report filtering with MIAWARE can be found in Chapter 4, section 4.4.

1.4 Roadmap

After this short introduction, the following four chapters describe the details of the MIAWARE software together with the theoretical aspects closely connected to this thesis.

Chapter 2: Radiological examinations overview (MRI and CAT) - brief description of two radiological examination types, magnetic resonance imaging (MRI) and computed axial tomography (CAT), and their features, applications and apparatus.

Chapter 3: MIAWARE Architecture - description of related work, development and design of the MIAWARE software architecture, decisions, specifications, theoretical references, creation of the 3D model from 2D slices, normalized reporting schema, ontology-based search engine for medical reports and graphical user interface.

Chapter 4: Medical analysis and reporting with MIAWARE - user's guide, software functionality, illustrations of the user interface, analysis of CAT images, reporting of pathological changes in the lungs, and filtering sets of medical reports according to given criteria with the MIAWARE software.

Chapter 5: Conclusions

2. RADIOLOGICAL EXAMINATIONS OVERVIEW

This chapter covers two major radiological examinations, magnetic resonance imaging and computed axial tomography. Both are non-invasive and painless methods for examining the internal structures of the body. They produce sets of images showing the internal parts of the object (the human body) in order to find physiological alterations.

2.1 MRI characteristics

MRI, sometimes known as NMRI (Nuclear Magnetic Resonance Imaging), is a relatively new method that has been used in medicine since the beginning of the 1980s. The following subsections briefly describe what the MRI apparatus looks like and how it works, the main features of this examination method and when such examinations are performed.

Fig. 2.1: Siemens Symphony MRI scanner [14]

Fig. 2.2: Functional configuration of MRI scanner [36]

2.1.1 MRI technology

Figure 2.1 presents a photo of an MRI scanner (Siemens Symphony). This specific scanner has a tunnel 1.5 m long and 60 cm in diameter [14]; the visible cylinder is in fact the 1.5 Tesla superconducting magnet. The general functional configuration of an MRI scanner is presented in Figure 2.2. The following description of how an MRI scanner works is based on the content of the HowStuffWorks web page.

Every MRI scanner uses both magnetic fields and radio-frequency (RF) waves to create the output image. During the MRI examination, the patient lies on a special table, which slides slowly into the horizontal tube, called the bore of the magnet. The scan begins when the body part to be examined is exactly in the isocenter of the magnetic field.

Since the human body consists mainly of water (billions of hydrogen atoms), the working principle of the MRI scanner relies on the specific behaviour of such atoms in a strong magnetic field in the presence of RF waves. Normally, the hydrogen nuclei spin in random directions. In the magnetic field of the MRI scanner, the hydrogen atoms line up with the direction of the field, pointing one of two ways along it. As a result, the majority of the atoms cancel each other out. Then a hydrogen-specific RF pulse is applied by the MRI scanner using RF coils and, consequently, the protons absorb this energy and change their spin direction (resonance appears).

There is also another group of magnets in the MRI machine (gradient magnets), which are responsible for sudden changes of the magnetic field in a specific, small area of the patient's body. A single MRI picture (slice) is taken exactly from that area. Thanks to the gradient magnets, the scanner is able to take pictures in any direction without changing the position of the patient. It is possible because when the RF pulse is turned off, the hydrogen protons begin to slowly (relatively speaking) return to their natural alignment within the magnetic field and release their excess stored energy. When they do this, they give off a signal that the (RF) coil now picks up and sends to the computer system. What the system receives is mathematical data that is converted, through the use of a Fourier transform, into a picture that we can put on film. That is the 'imaging' part of MRI [13].

It should also be mentioned that alterations of the local magnetic field act as contrast in MRI, which is used to obtain better-quality and more detailed images. An example MRI knee reconstruction is presented in Figure 2.3.

Fig. 2.3: MRI knee reconstruction [24] - colors inverted
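The conversion of received data into a picture by a Fourier transform can be illustrated on a toy scale. The Python sketch below is a rough illustration under simplifying assumptions, not a description of real scanner software: it assumes the received data is a small grid of complex frequency-domain samples, simulates that data with a forward transform of a made-up 3x3 "slice", and recovers the pixels with a naive inverse discrete Fourier transform. The function name `dft2` and the sample values are hypothetical; practical systems use fast transform algorithms on far larger grids.

```python
import cmath


def dft2(image, inverse=False):
    """Naive 2D discrete Fourier transform of a rectangular grid.

    With inverse=True the sign of the exponent flips and the result is
    divided by the number of samples, so dft2(dft2(x), inverse=True)
    equals x up to rounding error.
    """
    rows, cols = len(image), len(image[0])
    sign = 1 if inverse else -1
    scale = rows * cols if inverse else 1
    result = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = 2 * cmath.pi * (u * r / rows + v * c / cols)
                    total += image[r][c] * cmath.exp(sign * 1j * angle)
            out_row.append(total / scale)
        result.append(out_row)
    return result


# A tiny made-up "slice". The scanner is assumed to deliver its
# frequency-domain samples, simulated here with the forward transform.
slice_pixels = [[0, 1, 0], [1, 4, 1], [0, 1, 0]]
received = dft2(slice_pixels)

# Reconstruction: the inverse transform turns the samples back into pixels.
reconstructed = [[round(value.real, 6) for value in row]
                 for row in dft2(received, inverse=True)]
```

After the inverse transform, `reconstructed` holds the original pixel values again, which is the sense in which the transform turns the measured signal into an image.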

2.1.2 MRI advantages and disadvantages

An MRI examination provides clear, highly detailed, tissue-level images of slices of the patient's body in any direction. The major advantage of the MRI scan is the great number of situations in which it can be applied. The Netdoctor [30] web page provides a broad overview of MRI applications, so it is cited here:

"Because the MRI scan gives very detailed pictures it is the best technique when it comes to finding tumours (benign or malignant abnormal growths) in the brain. If a tumour is present the scan can also be used to find out if it has spread into nearby brain tissue. The technique also allows us to focus on other details in the brain. For example, it makes it possible to see the strands of abnormal tissue that occur if someone has multiple sclerosis and it is possible to see changes occurring when there is bleeding in the brain, or find out if the brain tissue has suffered lack of oxygen after a stroke. The MRI scan is also able to show both the heart and the large blood vessels in the surrounding tissue. This makes it possible to detect heart defects that have been building up since birth, as well as changes in the thickness of the muscles around the heart following a heart attack. The method can also be used to examine the joints, spine and sometimes the soft parts of your body such as the liver, kidneys and spleen."

Moreover, MRI does not use ionizing radiation and does not have any serious side effects.

Unfortunately, it also has some disadvantages. Firstly, MRI machines are very loud, which creates an unpleasant atmosphere in the examination room. Another problem is that not everyone can undergo an MRI examination, because of claustrophobia or simply because they are too big for the scanner. The next drawback is the price of an MRI scan: since the whole system is very expensive, examination prices are also relatively high. Finally, MRI images are frequently distorted. The reason lies in the relatively long duration of the examination (from 20 up to 90 minutes). During this time the patient should not move, since movement can produce distortions and the examination would have to be repeated. Distortion can also be caused by hardware in the examination area, since it slightly changes the MRI magnetic field, and high-quality MRI images can be obtained only in the presence of a uniform magnetic field.

This concludes the section on magnetic resonance imaging. The next part gives an overview of computed tomography.

2.2 CAT characteristics

Computed axial tomography (CAT), sometimes shortened to computed tomography (CT), is a type of examination which produces as its output sets of images (slices) through the examined body part. According to Wikipedia, the word "tomography" derives from two Greek terms: tomo (slicing, cutting) and graphos (image) or graphein (to write). Simply speaking, tomography is the process of representing a three-dimensional body in the form of its consecutive 2D slices. It is worth noting that the aforementioned MRI examination is also a tomographic technique, since it produces 2D images of the examined body.

2.2.1 CAT technology

A CAT scanner produces narrow slices of the examined body in only one (axial) cutting plane direction. Computed tomography is, in fact, a modern and more advanced form of X-ray imaging which enables easy three-dimensional computer model reconstruction, since the output slices always lie in the same plane direction and are equally spaced. With the help of the VTK toolkit, the MIAWARE software uses this advantage to generate from axial CAT slices not only the 3D model, but also 2D slices in the sagittal and transaxial planes. From the outside, a typical CAT scanner resembles an MRI scanner (see Figure 2.4).


Fig. 2.4: Toshiba Aquilion CAT scanner [34]

The fundamental concepts of CAT scanning are discussed next. A CAT scanner is a donut-shaped X-ray machine. The patient is placed inside this chamber, where an x-ray source and an array of detectors arranged in an arc of the circle [36] are placed on opposite sides of the scanning circle (Figure 2.5). Both the x-ray tube and the detectors perform 360° rotations and, for every tube position, a cross-sectional view is created. The x-ray tube generates a beam of x-ray photons directed at a part of the patient's body, and the result is caught by the detectors on the other side of the patient. A cross-section image representing a sweep of the signals in the detector array [36] is produced. Following the NASA Remote Sensing tutorial [36], the x-ray tube and the detectors move as a unit through a complete 360° rotation around the patient, thus providing a succession of images, each consisting of a view of the body at some angle. This multiple viewing provides additional information that improves the image contrasts among the organ(s) being examined and hence better defines them. Upon completion of the scan for that slice, the unit can move forwards or backwards parallel to the length of the body (or part(s) thereof), hence the designation of `axial', when the person is placed on a horizontal table. An example CAT scan slice of the human thorax is presented in Figure 2.6. Tomography relies on physical and mathematical operations as well as signal processing methods in order to generate the images. Some of them are: Fast Fourier Transform (FFT), wave transformation, image formation, interferometry, etc. [36].

Fig. 2.5: CAT scan work principle [15]

2.2.2 CAT advantages and disadvantages

The CAT scan produces a detailed view of the internal body structures. Viewing the same slice from multiple angles results in a more detailed, high-quality output image. There are many situations in which computed tomography is applied. CAT scans are performed to analyze the internal structures of various parts of the body. This includes the head, where traumatic injuries, (such as blood clots or skull fractures), tumors, and infections can be identified. In the spine, the bony structure of the vertebrae can be accurately defined, as can the anatomy of the intervertebral discs and spinal cord. In fact, CAT scan methods can be used to accurately measure the density of bone in evaluating osteoporosis [25]. It is also used for the detection of tumours, cysts or any pathologic alterations in the chest (the current MIAWARE reporting schema refers to that area).


Fig. 2.6: Human thorax CAT slice - colors inverted

The significant advantage of the CT scan is the duration of the examination. It is relatively short; for example, lung imaging can be performed in less than one minute. This also means a shorter radiation exposure for the patient. Computed axial tomography is painless, noninvasive and does not present significant side effects. The most common problem is an adverse reaction to intravenous contrast material. Intravenous contrast is usually an iodine-based liquid given in the vein, which makes many organs and structures, such as the kidneys and blood vessels much more visible on the CAT scan. There may be resulting itching, a rash, hives, or a feeling of warmth throughout the body. These are usually self-limiting reactions and go away rather quickly [25]. Moreover, according to the Medindia.com web page, contrast media are needed for enhanced soft tissue contrast, and highlighting particular tissues is still sometimes a problem. The last drawback of the CAT scan is, similarly to the MRI scan, the relatively high cost of the examination. The last section of this chapter discusses and compares MRI and CAT scans.


2.3 MRI and CAT comparison

The following section compares the two previously described radiological examinations. Additionally, Table 2.1 summarizes the comparison in a more systematic way. To begin with, it should be mentioned that the most popular format for images from CAT or MRI examinations is DICOM (Digital Imaging and Communications in Medicine). It keeps not only the image data, but also important properties of the examination, such as the patient's name, the date of the examination, the slice format and many more. The basic difference between the MRI and CAT scan is the method of obtaining image data. MRI uses a magnetic field together with radio frequency (RF) waves (non-ionizing radiation). CAT, on the other hand, uses X-rays (ionizing radiation).

                                            MRI                         CAT
Methods                                     magnetic fields, RF waves   X-rays
Radiation                                   non-ionizing                ionizing
Output                                      slices in any plane         slices in axial plane
Reconstruction of slices in various planes  fair                        very good
Pathology examination                       very good                   good
Scan duration                               very long                   short
Price                                       very high                   high

Tab. 2.1: MRI vs CAT

Following Wikipedia, CAT is a good tool for examining tissue composed of elements of a relatively higher atomic number than the tissue surrounding them, such as bone and calcifications (calcium based) within the body (carbon based flesh), or of structures (vessels, bowel). MRI is excellent for examinations of non-calcified tissue. Both give cross-sectional images of the examined body as their output. However, they use different methods for obtaining image contrast. In CAT the contrast is obtained by attenuation of X-rays. MRI is more flexible in this respect: by changing the scanning properties, many different features can be shown. For different purposes, different materials with paramagnetic properties are used as contrast agents.

Both technologies produce slice (cross-sectional) images but, in contrast to MRI, which can acquire images in any plane, CAT acquires images only in the axial plane. On the other hand, MRI does not give as much flexibility as CAT for reconstructing the image data in other planes afterwards. Multi-detector CAT scanners with near-isotropic resolution make it possible to generate images in any plane from images acquired only in the axial plane. This feature is used by the MIAWARE software, where images in different planes are generated together with the 3D model. Although MRI is better than CAT for finding pathologies and tumours, CAT is used more frequently since it is cheaper and the scan is much shorter; consequently it is more comfortable for patients, who, for example, do not need to be sedated or anesthetized before the examination.

This concludes the chapter. Chapter 3 gives a full description of the MIAWARE software package architecture. In line with the contributions of this thesis, the MIAWARE software processes CAT images and provides the tools necessary to facilitate their analysis.


3. MIAWARE ARCHITECTURE

In this chapter the full architecture of the MIAWARE software is presented. The description of the MIAWARE development process is divided into three main parts. The first part refers to the visualization, carried out using the VTK toolkit [22], Java Swing [28] and the ImageJ [31] software. The second part deals with the generation of normalized medical reports on the state of the patient's lungs. Finally, the functionality of the ontology-based search engine is presented together with the user interface. All these parts are accompanied by the necessary theoretical background.

3.1 Decisions

After the motivations, advantages and general reasons for creating the MIAWARE software had been set out in Chapter 1, the selection of the environment and of adequate tools for its development was a crucial step for the success of the further work.

3.1.1 Tools for visualization - review

One of the objectives of the MIAWARE project is to generate a three-dimensional model from two-dimensional CAT images. The first step was to find appropriate tools that could facilitate the work in this area. The toolkit was required to be freely available (in the public domain) and open source. Because the Visualization Toolkit (VTK) [22] meets almost all of these requirements, it was chosen as the main tool for developing the visualization part of the project. The Visualization Toolkit is a system for 3D computer graphics, advanced image processing and visualization. Moreover, it is a toolkit open for development, implemented in the C++ programming language. The advantage of VTK compared to other similar tools is its flexibility, since it contains wrappers for various programming languages. Thanks to that, VTK applications can be written not only in C++, but also in Tcl/Tk, Java or Python. VTK can be easily integrated with the GUI classes of any of the mentioned environments. Other tools that could be used for 3D computer graphics, like OpenGL or PEX, work at a lower level of abstraction than VTK; creating and processing 3D visualizations with them is therefore more difficult and time-consuming than with VTK. The only disadvantage of VTK, noticed right at the beginning of learning it, is its documentation. Its manual (API) lacks precise function definitions and, furthermore, uses ambiguous argument/parameter names, which impedes understanding of what the members do and slows down the process of finding solutions.

Another objective of the MIAWARE software is to present 2D cross-sections of the generated 3D model, and at this stage of project preparation tools other than VTK were also taken into account. The ImageJ package was one such tool and was considered a potential help in this case. ImageJ [31] is a public domain, free Java image processing program. It is able to read many image formats and to create image stacks from many image slices made in one plane direction, as is the case with images taken from computed axial tomography.

3.1.2 Tools for reporting and ontology-based search engine - review

The second and third, closely related parts of the MIAWARE architecture are automated report generation and ontology-based searching over previously prepared reports. Report generation relies on a detailed definition of the pathologic changes found in the patient's body, based on analysis of the medical images and of the created 3D model. Two types of reports are to be created: the first is a normal ASCII text report and the second, actually more important, is a report in RDF format. This format (Resource Description Framework) facilitates efficient ontology-based report search according to given criteria. The detailed characteristics and definitions of the ontology and the RDF file format can be found in subsections 3.4.1 and 3.3.3.

Normalized report generation requires an accurate definition of the vocabulary which will be used to describe the disease changes. Such a vocabulary has to be kept in a flexible format which allows easy redefinition and manipulation. Moreover, it should have a structured form of representation for easier processing by different information systems (in this case, by the MIAWARE software). After a close review of the possible options, the XML format was chosen for its flexibility and ease of use. The other options were a database or a normal text file. The first one makes redefining the data difficult: doctors or radiologists could find it quite hard to edit a vocabulary kept in a database, and only specialists would be able to do it. On the other hand, the normal text file format hampers processing of such data by software. The XML format seems to be the golden mean, as it allows easy manipulation of data in any text editor and at the same time has a simple structure for processing by information systems. A detailed description of the structure of the XML file used by the MIAWARE software can be found in subsection 3.3.2.

The review of ontology development tools produced a list of interesting applications worth considering. A brief description and the final decision are presented below.
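As a brief aside on the vocabulary format chosen above, the following sketch shows how little code is needed to read such an XML vocabulary with the parser from the standard Java library. The element and attribute names (vocabulary, category, term) and the sample terms are hypothetical and serve only as an illustration; the structure actually used by MIAWARE is the one described in subsection 3.3.2.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical vocabulary loader; the XML layout below is an assumption made for illustration. */
final class VocabularySketch {

    /** Sample content that could be edited in any text editor (invented terms). */
    static final String SAMPLE_XML =
          "<vocabulary>"
        + "  <category name=\"pathology\">"
        + "    <term>nodule</term>"
        + "    <term>cyst</term>"
        + "  </category>"
        + "  <category name=\"location\">"
        + "    <term>left upper lobe</term>"
        + "  </category>"
        + "</vocabulary>";

    /** Returns the terms of each category, keyed by category name, in document order. */
    static Map<String, List<String>> load(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, List<String>> vocabulary = new LinkedHashMap<>();
            NodeList categories = doc.getElementsByTagName("category");
            for (int i = 0; i < categories.getLength(); i++) {
                Element category = (Element) categories.item(i);
                List<String> terms = new ArrayList<>();
                NodeList termNodes = category.getElementsByTagName("term");
                for (int j = 0; j < termNodes.getLength(); j++) {
                    terms.add(termNodes.item(j).getTextContent().trim());
                }
                vocabulary.put(category.getAttribute("name"), terms);
            }
            return vocabulary;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse vocabulary", e);
        }
    }

    public static void main(String[] args) {
        // Print every category with its list of allowed terms.
        for (Map.Entry<String, List<String>> entry : load(SAMPLE_XML).entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

Adding or renaming a term then only requires editing the XML text; the loading code stays unchanged, which is the property that motivated the choice of XML over a database.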

Protégé ontology editor

Protégé [16] is a free, open source ontology editor which can export ontologies in many formats: XML Schema, OWL (Web Ontology Language), RDF. Protégé is Java software with room for extensions and plugins. Ontologies are created in Protégé visually, without the need to write any code. There are special Protégé plugins, like OntoViz [37] and OWLViz [4], that provide graphical representations of ontologies and are used for presentation purposes.


Jena framework

Jena [11] is a programming environment for developing semantic web applications, and thus also ontologies. It is an open source framework implemented in the Java programming language, with distribution packages for further development. Jena provides an environment for development in RDF, RDFS, OWL and SPARQL.

Decision

The two tools presented above were considered the best for this application, with very good APIs (Application Programming Interfaces) and detailed documentation, which significantly eases and speeds up the programming process. The final decision is that the Protégé editor is used to design the ontology and, afterwards, the Jena framework is used to re-create programmatically the ontology previously designed in Protégé. As the output, a Semantic Web ontology (OWL file) is obtained. The advantage of Jena over Protégé is that it supports programmatic creation and processing of both RDF (report) and OWL (ontology) files in Java using the same distribution packages. Nevertheless, Protégé is an excellent tool for visualizing ontologies and makes the initial stage of ontology development much simpler.

3.1.3 Tools for GUI applications - review

After the specific tools for visualization and ontology development had been chosen, the tools for building Graphical User Interface applications were reviewed. The choice of the appropriate one depends on the work environment used for the software development and on the additional toolkits. A golden mean has to be found so that they can all work together. Below is a brief description of the GUI libraries/tools which were investigated.


Microsoft Foundation Classes - MFC

Microsoft Foundation Classes, originally also known as Application Framework eXtensions (AFX), is an application framework which is basically a library wrapping the Windows API in C++ classes. It can obviously be used when working with the C++ programming language and provides an object-oriented programming model for the Windows API. Furthermore, Model-View-Controller-based architectures can be created using the Document/View framework. The disadvantage of MFC is that it is not portable across operating systems.

Quasar Technologies Qt

Qt is a toolkit for application development, prepared mainly for C++, but it also provides bindings for Java, Python, Ruby, PHP and C#. It is mainly used for the development of GUI programs. Additional advantages of Qt are its embedded internationalization support and its availability.

GLUI library

GLUI is a library which provides GUI elements for OpenGL applications. It relies on the GLUT (OpenGL Utility Toolkit) library, so it is de facto a GLUT-based C++ user interface. It does not contain as many features as the other tools described in this subsection, but its advantage is its ease of use.

Java Swing toolkit

Java Swing is the GUI toolkit created for the Java programming language. It is the successor of AWT (Abstract Window Toolkit) and provides all the widgets necessary for creating advanced GUI programs. Swing is a platform-independent toolkit with a Model-View-Controller GUI framework for Java. Moreover, it is freely available and open for development, like the Java programming language itself. It is recommended to use Swing when developing applications in Java.

3.1.4 Final environment choice for the MIAWARE software

After the full review of the necessary tools, the final decision about the work environment was to be made. Two programming languages were taken into account: C++ and Java. The pros and cons of these programming languages for the development of the MIAWARE software are listed below.

1. C++ programming language
   + Usage of VTK [22] is simple, because no wrappings are needed and the VTK classes are used directly in the code.
   + There exist very good toolkits for creating GUI applications.
   - Integration of ontologies with the C++ programming language seems to be very difficult, as no editor was found for this purpose.
   - Using C++ produces platform-dependent applications.
   - The usage of non-free tools is necessary.

2. Java programming language
   + Swing can be used directly to create an advanced GUI application without any integration problem.
   + Java is considered the best language for Semantic Web development (ontologies).
   + All editors for OWL and RDF files presented above are Java distribution packages.
   + The applications created with Java are OS-independent.
   + All the tools to be used with Java are free and open source.
   + The VTK [22] toolkit can be used with Java through the provided wrappings.
   - Usage of VTK is more complicated, as the Java wrappers are used. Documentation for the Java wrappers does not exist. The integration of VTK [22] with the Java [28] environment is a time-consuming and more difficult process compared to C++.
   - A Java Virtual Machine is required to run any application.

The balance between advantages and disadvantages is much better for Java than for C++. The crucial fact was that it is almost impossible to integrate ontology data models with C++. Apart from that, both C++ and Java could be used. Summarizing the gathered facts, the Java programming language is more adequate and is chosen as the working environment for the development of the MIAWARE software. The visualization is carried out using VTK [22] (integrated with Java through the Java wrappings) and ImageJ [31] for additional graphical purposes. The reporting schema uses an XML file for vocabulary storage. Protégé and the Jena framework are used for ontology creation and RDF file processing. The Graphical User Interface for all parts of the MIAWARE software is prepared purely in Swing. The following sections describe in more detail the architecture and the stages carried out during the development of the MIAWARE software.

3.2 Image visualization and GUI development

This section gives a detailed description of the steps taken when creating the visual part of the MIAWARE software. Firstly, the installation of the VTK environment for use with Java is described. Subsequently, the stages of creating the 3D model and its cross-sections are presented. Finally, the creation of the Graphical User Interface and its features are covered.


3.2.1 Integrating VTK with Java

The first step in the MIAWARE development was to integrate the VTK classes with the Java programming language. As VTK provides wrappers for Java, the only thing to do is to configure it correctly. The entire configuration is performed using the CMake [18] system, following the VTK recommendations. Much useful information about VTK, its functionality, development and installation steps was found in the VTK User's Guide [17]. CMake [18] is a cross-platform, open-source make system. CMake is used to control the software compilation process using simple platform and compiler independent configuration files [18]. CMake is able to generate the makefiles and workspaces for a chosen compiler. VTK and CMake are both developed by Kitware, so the CMake configuration files provided with the VTK source were used to configure VTK for Java. After the configuration, in which VTK wrapping for Java was enabled, the whole VTK source was compiled using a C++ compiler (in this case Visual Studio 2005). After the lengthy compilation process, the Java classes and source files were generated for use in the further development of VTK applications in Java.

Unfortunately, the VTK installation process described above brought many unexpected problems. Moreover, there was almost no information on the Internet about installing VTK with Java. As a result, after this stage was completed successfully, a HOWTO document was created for other VTK users who are keen to install and use VTK with Java. The document was placed on the web page http://www.spinet.pl/ wilku/vtkhowto/. It can also be found in Appendix A of this thesis.

The Java application was developed using the Eclipse framework, which greatly facilitates designing and writing code. Eclipse is free and open software.

3.2.2 Creating 3D model

After successful VTK configuration, the first task was to render and display the 3D model from the set of CAT images. Figure 3.1 is a schematic representation of the steps necessary to create and visualize the three-dimensional model and the cross-sectional images.

Fig. 3.1: Steps of CAT stack processing in MIAWARE

Reading stack of CAT images

According to the information from Chapter 2, computed axial tomography gives as its output a stack of images made in the axial plane. Such images are in DICOM or processed DICOM format. The data gathered from the stack images is:




• Number of slices: $N_S$
• Slice thickness: $T_S$, distance between subsequent slices
• x, y spacing: $S_x$, $S_y$, spacing between adjacent pixels
• Rows, Columns (Width, Height): $W$ / $W_{[mm]}$, $H$ / $H_{[mm]}$, where

$$W_{[mm]} = W \cdot S_x, \qquad H_{[mm]} = H \cdot S_y$$

Moreover, two additional factors were defined and calculated:

• Stack depth: $D_{[mm]} = T_S \cdot (N_S - 1)$
• XY Ratio: $XYR = D_{[mm]} / W_{[mm]}$ or $D_{[mm]} / H_{[mm]}$ (if $W_{[mm]} = H_{[mm]}$).

Let's present possible values for CAT stack images:

• $N_S = 87$
• $T_S = 5.0\,\mathrm{mm}$
• $S_x = S_y = 0.781\,\mathrm{mm}$
• $W = H = 512$
• $W_{[mm]} = H_{[mm]} = 512 \cdot 0.781\,\mathrm{mm} = 399.872\,\mathrm{mm}$
• $D_{[mm]} = (87 - 1) \cdot 5.0\,\mathrm{mm} = 430\,\mathrm{mm}$

Having this data, the aspect ratio (AR) of the three cross-section windows can be calculated. For slices made in the z-direction (axial plane) the aspect ratio (window size) is defined as:

$$AR = W \times H = 512 \times 512$$

Window sizes for slices in the x- and y-direction (sagittal and transaxial) are calculated in the following way:

$$AR = W \times (XYR \cdot H) = W \times \left(\frac{D_{[mm]}}{W_{[mm]}} \cdot H\right) = 512 \times (1.0753 \cdot 512)$$

$$AR = 512 \times 550.58$$
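The calculation above can be sketched in a few lines of Java. The class and method names below are hypothetical and are not taken from the MIAWARE source; the sketch only restates the formulas for $W_{[mm]}$, $D_{[mm]}$, the XY ratio and the resulting window height.

```java
/** Stack geometry helper; names are illustrative and not taken from the MIAWARE source. */
final class StackGeometry {
    final int slices;            // N_S
    final double sliceThickness; // T_S in mm
    final double spacingX;       // S_x in mm
    final double spacingY;       // S_y in mm
    final int width;             // W in pixels
    final int height;            // H in pixels

    StackGeometry(int slices, double sliceThickness,
                  double spacingX, double spacingY, int width, int height) {
        this.slices = slices;
        this.sliceThickness = sliceThickness;
        this.spacingX = spacingX;
        this.spacingY = spacingY;
        this.width = width;
        this.height = height;
    }

    /** W[mm] = W * S_x */
    double widthMm() { return width * spacingX; }

    /** H[mm] = H * S_y */
    double heightMm() { return height * spacingY; }

    /** D[mm] = T_S * (N_S - 1) */
    double depthMm() { return sliceThickness * (slices - 1); }

    /** XY ratio = D[mm] / W[mm] (assumes W[mm] = H[mm], as in the example). */
    double xyRatio() { return depthMm() / widthMm(); }

    /** Height in pixels of the sagittal/transaxial window: XYR * H. */
    double resliceWindowHeight() { return xyRatio() * height; }

    public static void main(String[] args) {
        // Example values from the text: 87 slices, 5.0 mm thick, 0.781 mm pixels, 512 x 512.
        StackGeometry g = new StackGeometry(87, 5.0, 0.781, 0.781, 512, 512);
        System.out.printf("depth = %.1f mm, XY ratio = %.4f, window = %d x %.2f%n",
                g.depthMm(), g.xyRatio(), g.width, g.resliceWindowHeight());
    }
}
```

With the example values, the helper reproduces the stack depth of 430 mm and a reslice window of roughly 512 × 550.58 pixels.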

These properties are read in MIAWARE using the ImageJ [39] software package. A suitable Java class (MedicalImageScanner) was created in order to retrieve the information using the ImagePlus class (from ImageJ).

Following the diagram presented in Figure 3.1, two important actions are performed before the 3D model creation procedure starts. Firstly, the size of the GUI windows must be calculated (using the above formulas) and set according to the properties read by the MedicalImageScanner class. Secondly, the points.mwr file (created by the MIAWARE software) is read. This file stores the data about the pathologies entered by the user and their specific locations (coordinates) on the images.

Afterwards, VTK comes into play by loading the images into memory using the class vtkImageReader2. To achieve proper rendering of the image data, some properties have to be set on the instance of the vtkImageReader2 class.

Firstly, the data spacing for the three dimensions (x, y, z) is to be set. Data spacing in the x and y directions is simply the pixel distance in any 2D stack image. Data spacing in the z direction is the parameter also called slice thickness. That value is another property read by MedicalImageScanner. It is basically the distance between two consecutive images made during the CAT examination.

The second property to be set is the data extent. In VTK, this property is defined by six values, one pair for each direction. The two values of each pair are the first and last indices along the given direction, so together they determine the length of the image in that direction. Thus, for the x direction these values specify the image width (in pixels) and, for the y direction, the image height. The values for the z direction are simply the index numbers of the first and the last images in the stack which belong to the 3D model. vtkImageReader2 also allows reading a single image, not only stacks. In such a case, the two last values should be equal.

Finally, the path to the folder with the images (file prefix) and the file pattern should be set. Executing the method Update finishes the action of loading the image stack data into memory.
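The reader properties described above can be illustrated without VTK itself. The sketch below, in plain Java, only prepares the values that would then be handed to the reader (in VTK, through methods such as SetDataSpacing, SetDataExtent, SetFilePrefix and SetFilePattern of vtkImageReader2). The helper class, the prefix "scans/slice" and the printf-style pattern "%s.%d" are assumptions chosen for illustration and do not come from the MIAWARE source.

```java
import java.util.Arrays;
import java.util.Locale;

/** Illustrative holder for reader settings; this class is not part of VTK or MIAWARE. */
final class ReaderSettings {

    /** Spacing triple (x, y, z) in mm: in-plane pixel spacing plus slice thickness along z. */
    static double[] dataSpacing(double sx, double sy, double sliceThickness) {
        return new double[] {sx, sy, sliceThickness};
    }

    /** Six extent values: first and last index (inclusive) for x, y and z. */
    static int[] dataExtent(int width, int height, int firstSlice, int lastSlice) {
        return new int[] {0, width - 1, 0, height - 1, firstSlice, lastSlice};
    }

    /** File name of one slice, built printf-style from the prefix and the slice index. */
    static String sliceFileName(String pattern, String prefix, int slice) {
        return String.format(Locale.ROOT, pattern, prefix, slice);
    }

    public static void main(String[] args) {
        // Example stack from the text: 512 x 512 pixels, slices numbered 1..87.
        System.out.println(Arrays.toString(dataSpacing(0.781, 0.781, 5.0)));
        System.out.println(Arrays.toString(dataExtent(512, 512, 1, 87)));
        // Hypothetical prefix and pattern, for illustration only.
        System.out.println(sliceFileName("%s.%d", "scans/slice", 1));
    }
}
```

Reading a single image corresponds to passing the same index as firstSlice and lastSlice, so that the last two extent values are equal.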

Preparing model for display

The image data already loaded must be processed in order to create a displayable 3D model. This is achieved using VTK filters, mappers and actors, which together are known as the visualization pipeline. The contour filter is applied to the 3D image data first. Setting threshold values in this filter creates the respective isosurfaces. In the case of the MIAWARE 3D model only one threshold value is used. After the data filtering, the mapper (vtkPolyDataMapper) is used, which maps polygonal data to graphics primitives. The mapper can change the displayed colour of the model by changing the model scalar values. At present, this option is not used in MIAWARE. Finally, the actor is created, which represents the 3D object (model) in the rendered scene. Additional actor options such as opacity, color and position can be set in order to improve the appearance of the model on the screen.
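To make the role of the single threshold value more concrete, the following plain-Java sketch (not VTK code, and not the algorithm used inside the VTK contour filter) counts the cubic cells of a small volume whose eight corner values lie partly below and partly at or above the threshold. These are the cells through which an isosurface for that threshold passes. The tiny 3 × 3 × 3 volume and the threshold of 50 are invented for illustration.

```java
/** Plain-Java illustration of a single-threshold isosurface; this is not VTK code. */
final class IsoCellSketch {

    /**
     * Counts the cubic cells whose eight corners lie partly below and partly
     * at or above isoValue. The volume is indexed as volume[z][y][x] and is
     * assumed to be rectangular.
     */
    static int countSurfaceCells(int[][][] volume, int isoValue) {
        int count = 0;
        for (int z = 0; z + 1 < volume.length; z++) {
            for (int y = 0; y + 1 < volume[z].length; y++) {
                for (int x = 0; x + 1 < volume[z][y].length; x++) {
                    boolean anyInside = false;
                    boolean anyOutside = false;
                    // Bits 0, 1 and 2 of 'corner' select the x, y and z offset (0 or 1).
                    for (int corner = 0; corner < 8; corner++) {
                        int dz = (corner >> 2) & 1;
                        int dy = (corner >> 1) & 1;
                        int dx = corner & 1;
                        if (volume[z + dz][y + dy][x + dx] >= isoValue) {
                            anyInside = true;
                        } else {
                            anyOutside = true;
                        }
                    }
                    if (anyInside && anyOutside) {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // One bright voxel in the centre of an otherwise empty volume.
        int[][][] volume = new int[3][3][3];
        volume[1][1][1] = 100;
        System.out.println(countSurfaceCells(volume, 50));
    }
}
```

Raising or lowering the threshold changes which cells are crossed, and therefore which tissue boundary the resulting surface follows.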

Rendering the scene

Showing the scene with its actors (in our case, the 3D model) is done using the VTK wrapper class for Java called vtkCanvas. It is a heavyweight component in Java, which means that it always remains on top in the GUI application. This class inherits from another VTK wrapper class, vtkPanel. Both of them encapsulate the VTK objects responsible for showing the scene and interacting with it (vtkRenderer, vtkRenderWindow, vtkRenderWindowInteractor). To create the scene, it is only necessary to create an object of the vtkCanvas class and add a camera (vtkCamera), which allows viewing and manipulating the model. Some properties of the camera, like position or focal point, can be set. Finally, the model is rendered and displayed in the 3D model window. A sample model created from an arbitrary set of CAT images is presented in Figure 3.2.

Fig. 3.2: MIAWARE three-dimensional model created from CAT stack images

The vtkCanvas has a default interaction set for the mouse and the keyboard. Thanks to that, it is possible to rotate, zoom and translate the object. The interactor can be overridden so that the mouse and keyboard behavior can be changed. Unfortunately, manipulating the model only with the mouse and its buttons is not always natural and easy. That is why an additional navigation panel with buttons (for model rotation and zooming) is available in the user interface.

3.2.3 3D model cross-sections generation

The second requirement of the MIAWARE software is to make it possible to reslice the 3D model in any of three plane directions and to show the output in separate windows. This is achieved using specific VTK classes which inherit from the superclass vtk3DWidget. The widgets are presented in the 3D scene together with the model and provide the user with specific operations. They support interactive cutting of three-dimensional objects by invoking pre-defined events. Such events are invoked when the widget enters a previously defined state. In the MIAWARE software, the cutting widgets are objects of the vtkImagePlaneWidget class. This class supplies its objects with methods useful for reslicing models and retrieving slice data. The visible widgets together with the three-dimensional model are shown in Figure 3.3.

Fig. 3.3: Model manipulation with widgets in MIAWARE

In this case, three widgets are created (only two are visible in Figure 3.3), one for each plane direction. The widget input data is taken from the vtkImageReader object (see 3.2.2). Afterwards, the plane orientation of each widget is set, together with the widget's events and its activation key. When the activation key of a widget is pressed, the widget is displayed in the 3D scene, since it is initially hidden. Three types of events are defined:

• StartInteractionEvent  is called when the interaction with a widget is started. In that moment, the frame rate of the rendering window is updated to the previously dened value. Here, it is 10 frames per second.

• EndInteractionEvent  is called when the interaction with the widget is nished. The frame rate of the rendering window is updated back to desired, normal value. Here, it is 0.001 frames per second.

• InteractionEvent - is called when any operation on the scene's object is performed with some widget. Herein, the event updates the view in the windows where the three two-dimensional model 'cuts' are displayed. The slice image, obtained by the widget during execution of the InteractionEvent, is displayed in the separate window representing the respective cross-section (cut) of the model. For this purpose, the special VTK class vtkImageViewer is used. It is able to display input data from the widget in a separate window. Moreover, the display properties of the image, like color window and color level, can be set using the vtkImageViewer.

The vtkImageViewer class automatically creates a new VTK window with the reslice image, which is not desired in the Java application. It was crucial to capture only the display of the vtkImageViewer and show it in some Java component; therefore, a wrapping class was required. Such a class, called ImageViewerPanel, was found on the ImageJ Plugins web page [39]. This class, created by Jarek Sacha [39], allows the cut image to be placed on a Java Swing panel and treated as a Java component.

The ImageViewerPanel class is subclassed by the specially created CATSlice class, which encapsulates the necessary functionality for interactive selection on the displayed image. Here, the connection between the visualization and reporting parts is introduced. In order to add a pathology description, the radiologist first has to select its location on the 2D cut image. Java mouse listeners are added to all CATSlice panels, which enables detection of any left mouse button click. When a click is detected, a suitable action is performed. In this case, for every single point selection, the reporting form (Figure 3.5) is presented to the user, in order to introduce the pathology description, which is then joined with the selected image point. Figure 3.4 presents three sample cross-sections together with marked pathologies.

Fig. 3.4: Three 2D cross-section planes together with marked pathologies
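The left-click detection described above can be sketched with a plain Swing panel. This is only an illustration under stated assumptions: `SlicePanelSketch`, its callback and the `handleClick` helper are hypothetical names, and a `JPanel` stands in for `ImageViewerPanel`, whose API is not reproduced here.

```java
import java.awt.Point;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;
import java.util.function.Consumer;
import javax.swing.JPanel;

// Hypothetical stand-in for a slice panel. The real CATSlice extends
// ImageViewerPanel; a plain JPanel is used here to stay self-contained.
class SlicePanelSketch extends JPanel {
    private final Consumer<Point> onPointSelected;

    SlicePanelSketch(Consumer<Point> onPointSelected) {
        this.onPointSelected = onPointSelected;
        addMouseListener(new MouseAdapter() {
            @Override
            public void mouseClicked(MouseEvent e) {
                handleClick(e.getButton(), e.getPoint());
            }
        });
    }

    // Forwards the clicked image point only for left-button clicks;
    // in the application this is where the reporting form would be opened.
    void handleClick(int button, Point imagePoint) {
        if (button == MouseEvent.BUTTON1) {
            onPointSelected.accept(imagePoint);
        }
    }
}
```

Passing the reaction as a callback keeps the panel independent of the reporting form, so the same panel class can serve all three cross-section views.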

Any introduced pathology is represented on 2D images as well as on the 3D model, in form of small spheres. These spheres are actually simple VTK actors added to the 2D and 3D scenes. They can be later picked with mouse in order to edit previously inserted information or simply delete it.

3.2.4

Creating GUI

The whole GUI window for the MIAWARE software was designed using the Java Swing components. The modelling and placing of the components was performed in the Eclipse Visual Editor, an open development platform for creating GUI applications. Additional information about creating GUI applications and about programming in Java was found in the book Java - How to Program by Harvey M. Deitel [2] and in Sun's Java Tutorial [29]. All the VTK outputs described above, like the renderer window with the 3D model and the three cross-section windows, were placed in Java Swing panels (JPanel) and easily integrated with the GUI. As mentioned in subsection 3.2.3, special wrapper classes, which inherit from Java components, were created in order to complete the full integration with Java. The objects of the created wrapping class CATSlice (subclass of ImageViewerPanel) were connected directly to the widget event InteractionEvent (see section 3.2.3). Thanks to that, with every interaction performed on the 3D model widget, the cross-section view is changed automatically in the respective CATSlice object.

The cross-section panel size depends on the properties of the input images. The Properties object, which stores the necessary information about the format of the introduced medical data and is created by the object of the ImagePlus class (taken from the ImageJ [31] distribution, see section 3.2.2), supplies the GUI window object with the image dimensions and stack length, so that the images can be fitted completely in the cross-section panels.


3.3 Medical report generation

This section describes the medical report generation process in detail. Firstly, the medical vocabulary selection criteria are described, together with the way the vocabulary is represented in the MIAWARE software. Subsequently, the structure of the medical form for introducing data about pathologic changes and its integration in the XML medical vocabulary file is presented. Finally, the creation of normalized text and RDF reports is discussed, with some necessary theoretical basis.

3.3.1

Medical vocabulary selection and representation

Creation of a well-structured, normalized medical report demands a precise definition and arrangement of the vocabulary used. In this version of the MIAWARE software, the medical report consists of specific information about the various pathologies found, also known as processes, and their exact locations in the patient's lungs. The vocabulary set used for the definition of a disease or pathologic change is fixed and cannot be modified by the radiologist while using the software. That is one of the crucial assumptions which had to be taken into account while creating the reporting engine, in order to achieve good normalization in the end. The vocabulary was arranged and selected after consultation with doctor Miguel Castro, working in the hospital in Beja (Centro Hospitalar do Baixo Alentejo - Hospital José Joaquim Fernandes de Beja), and with the RadLex (A Lexicon for Uniform Indexing and Retrieval of Radiology Information Resources) term browser, which can be found on the RSNA (Radiological Society of North America) web page [32]. The RadLex term browser was created in order to unify the radiological vocabulary used during image analysis and reporting procedures.

The entire vocabulary used in the MIAWARE reporting engine is divided into two sets: Morphophysiological processes (all the pathologies that can be found in human lungs) and Thorax locations (the detailed parts of the lungs). That data is kept in the XML file dataform.xml, because of the usability reasons explained in section 3.1.2. The whole vocabulary for the report is enclosed in one tag, every smaller set of the vocabulary is defined in an indexed tag, and every element which belongs to a vocabulary set is marked by its own indexed tag. A small part of dataform.xml is shown below:

    ...
    ... index="1">Left lung ...
    ... index="2">Right lung ...
    ... index="3">Upper lobe ...
    ...
    ... Inferior ...
    ... Lingula ...
    ...
    ... segment ...
    ...

Up to this point, the vocabulary is divided into sets which contain words from the same category. Obviously, while introducing data, the radiologist should be able to add information step by step, each time choosing from a relatively short list of options (a subset of words) and not from the entire vocabulary set. How the vocabulary sets in the MIAWARE reporting form were divided in order to form a logical whole and speed up the reporting process is described in the next section, 3.3.2.


3.3.2

XML-based reporting form creation

A well-structured and efficient reporting form should be easy to understand for the person who fills it in and should offer groups of issues to be chosen or defined. In the MIAWARE reporting form a sequence of comboboxes is used, where every step offers a group of medical vocabulary (taken directly from one vocabulary set in our dataform.xml file) defining one basic issue.

For example, to define the location of the pathology in the lungs, one first has to determine in which lung it is placed (right or left lung), then the respective lobe of this lung, and finally the specific segment of the selected lobe. In this simple example, there are three steps (comboboxes) which use the same vocabulary set from the MIAWARE XML file (Thorax locations), but each step shows only a small subset of words. It is much simpler to define the location starting from the biggest part of the body (in this case the lungs) and to continue until the most detailed location is defined than to present the whole vocabulary set in one combobox. It was necessary to define the initial combobox and the subsequent ones, the reference comboboxes, which depend on the previously chosen options. Every combobox element has to have one reference combobox defined, which will appear next. Moreover, every combobox has to define the vocabulary set it is using. Such interconnections were set in our MIAWARE XML file, dataform.xml. A small printout is shown below and explained afterwards:

    ...
    ... index="1" setelem="3" cbref="4"> ...
    ... index="2" setelem="4" cbref="4"> ...
    ...

This printout presents the second and last part of our XML file. The first part, presented in section 3.3.1, defines the vocabulary sets, and the one presented above represents the comboboxes (here collected in one enclosing tag) and their logical interconnections in the form. The XML file is read directly into the MIAWARE software during execution and analyzed by the Java class MWRXMLReader, which is able through its methods to extract any desired structure from the dataform.xml file and process it adequately. This reader is attached to the graphical user interface of the reporting form frame, adding the necessary functionality. Figure 3.5 presents the GUI of the MIAWARE reporting form.

The initial combobox reference is read from a special tag (see line 132 of the dataform.xml file), where the attribute cbref keeps the index number of the first combobox in the form. In our case cbref="1". The respective combobox is found by scanning the combobox definitions for the one whose index attribute equals 1. Our initial combobox is defined at line 55 of our XML file. It has additional attributes assigned: title and setindex. The first one sets the title of the combobox, which is later displayed on the GUI reporting frame over the combobox. The setindex attribute defines the vocabulary set which is used in this combobox. Every combobox consists of multiple indexed option tags, which represent the options of the given combobox. Every option defines, through its attributes, the vocabulary set element number (setelem attribute) and the reference to the following combobox, which should appear when this option is chosen (cbref attribute).

Fig. 3.5: Pathology reporting form in MIAWARE

Step   Combobox title                            Selected value
1.     Morphophysiological process               Neoplastic process
2.     Neoplastic process                        Mass
3.     Location                                  Left lung
4.     Left lung location                        Upper lobe
5.     Left lung upper lobe location             Lingula
6.     Left lung upper lobe lingula location     Superior segment

Tab. 3.1: Example pathology definition steps in MIAWARE
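The way one option leads to the next combobox can be sketched as a small lookup loop. This is a minimal sketch, not the MWRXMLReader implementation: all class names are hypothetical, the form is built in code instead of being read from dataform.xml, and the "Location" combobox index (4) and its options are invented for illustration. Only comboboxes 1 to 3 and the use of cbref = -1 as the end marker follow the description in this section.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One option of a combobox: a vocabulary word plus the next combobox index.
class OptionSketch {
    final String label; // word resolved from the vocabulary set (setelem)
    final int cbref;    // index of the combobox shown next, -1 if final
    OptionSketch(String label, int cbref) { this.label = label; this.cbref = cbref; }
}

class ComboboxSketch {
    final String title;
    final List<OptionSketch> options = new ArrayList<>();
    ComboboxSketch(String title) { this.title = title; }
    ComboboxSketch option(String label, int cbref) {
        options.add(new OptionSketch(label, cbref));
        return this;
    }
}

class ReportingFormSketch {
    static final int FINAL = -1;

    // Follows the cbref links from the initial combobox; choices[i] is the
    // position of the option picked at step i (one entry per step is expected).
    static List<String> walk(Map<Integer, ComboboxSketch> form, int initial, int[] choices) {
        List<String> steps = new ArrayList<>();
        int current = initial;
        for (int i = 0; current != FINAL; i++) {
            ComboboxSketch cb = form.get(current);
            OptionSketch chosen = cb.options.get(choices[i]);
            steps.add(cb.title + ": " + chosen.label);
            current = chosen.cbref;
        }
        return steps;
    }

    // Truncated, partly invented form used only for illustration.
    static Map<Integer, ComboboxSketch> sampleForm() {
        Map<Integer, ComboboxSketch> form = new HashMap<>();
        form.put(1, new ComboboxSketch("Morphophysiological process")
                .option("Neoplastic process", 2).option("General process", 3));
        form.put(2, new ComboboxSketch("Neoplastic process").option("Mass", 4));
        form.put(3, new ComboboxSketch("General process")
                .option("Post-therapeutic alteration", 4));
        form.put(4, new ComboboxSketch("Location")
                .option("Left lung", FINAL).option("Right lung", FINAL));
        return form;
    }
}
```

Calling `ReportingFormSketch.walk(ReportingFormSketch.sampleForm(), 1, new int[] {0, 0, 0})` reproduces the first three steps of Table 3.1.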

For example, in our case, the initial combobox has the title "Morphophysiological process" and uses vocabulary set number 1, thus it refers to the "Morphophysiological process" vocabulary set. Our initial combobox has two options, the vocabulary set elements 1 and 2: "Neoplastic process" and "General process". These two options are presented when the GUI reporting form frame is started. If the user selects the first option ("Neoplastic process"), the next combobox which appears is indexed with number 2. If the second option is selected ("General process"), the next combobox has number 3. This procedure is repeated until the selected option has a cbref attribute value equal to -1. This means that the selected option is final and that it ends the reporting process. The example pathology definition steps are shown in Table 3.1.

The reporting form frame is opened when the user marks any point on the 2D image by clicking on it. At that moment the program asks if the user really wants to add information to the selected point. If approved, the reporting frame pops up and the information about the pathology can be added. When the user finishes that operation, the pathology description is kept in a HashMap class object, which implements the Map interface. Map is part of the Java Collections Framework and associates keys with values. One-to-one mapping is used here, which means that any key can map to only one value. In the case of the MIAWARE software, all the reported pathologic changes are collected in a HashMap object where the keys are objects of the class ControlPoint and the values are objects of the class ControlPointInfo. The ControlPoint class encapsulates the 3D coordinates of the point marked by the user. The ControlPointInfo class encapsulates the introduced data, stored in three dynamic containers (Vector). These vectors keep: the options selected by the user during reporting, the titles of the respective comboboxes and the names of the respective vocabulary sets used. Table 3.2 and Table 3.3 visualize what the ControlPoint and ControlPointInfo objects look like after one pathology has been reported, where the first is the key and the second is the value in the HashMap. The values in these tables are prepared for the same example as in Table 3.1. These data structures are read and processed afterwards, while creating the normalized text and RDF reports.

ControlPoint
int x    int y    int z
148      168      43

Tab. 3.2: Example of ControlPoint members association in MIAWARE

ControlPointInfo
Vector (Combobox name)                    Vector (Selected option)    Vector (Vocabulary set)
Morphophysiological process               Neoplastic process          Morphophysiological process
Neoplastic process                        Mass                        Morphophysiological process
Location                                  Left lung                   Thorax location
Left lung location                        Upper lobe                  Thorax location
Left lung upper lobe location             Lingula                     Thorax location
Left lung upper lobe lingula location     Superior segment            Thorax location

Tab. 3.3: Example of ControlPointInfo members association in MIAWARE
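The key-value layout of Tables 3.2 and 3.3 can be sketched as follows. The class names carry a "Sketch" suffix because they are illustrative stand-ins, not the MIAWARE classes. In particular, overriding `equals` and `hashCode` in the key class is an assumption made here so that a point can be looked up again by its coordinates; the text does not say how the original ControlPoint handles this.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Vector;

// Key: the 3D coordinates of the marked point (compare Table 3.2).
class ControlPointSketch {
    final int x, y, z;
    ControlPointSketch(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }

    // Assumption: value-based equality, so equal coordinates find the same entry.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ControlPointSketch)) return false;
        ControlPointSketch p = (ControlPointSketch) o;
        return x == p.x && y == p.y && z == p.z;
    }

    @Override
    public int hashCode() { return Objects.hash(x, y, z); }
}

// Value: three parallel vectors (compare Table 3.3).
class ControlPointInfoSketch {
    final Vector<String> titles = new Vector<>();
    final Vector<String> options = new Vector<>();
    final Vector<String> vocabularySets = new Vector<>();

    ControlPointInfoSketch add(String title, String option, String set) {
        titles.add(title);
        options.add(option);
        vocabularySets.add(set);
        return this;
    }
}

class PathologyStoreSketch {
    // Builds the map holding the single example pathology of Table 3.1.
    static Map<ControlPointSketch, ControlPointInfoSketch> example() {
        Map<ControlPointSketch, ControlPointInfoSketch> store = new HashMap<>();
        store.put(new ControlPointSketch(148, 168, 43), new ControlPointInfoSketch()
            .add("Morphophysiological process", "Neoplastic process", "Morphophysiological process")
            .add("Neoplastic process", "Mass", "Morphophysiological process")
            .add("Location", "Left lung", "Thorax location")
            .add("Left lung location", "Upper lobe", "Thorax location")
            .add("Left lung upper lobe location", "Lingula", "Thorax location")
            .add("Left lung upper lobe lingula location", "Superior segment", "Thorax location"));
        return store;
    }
}
```

Without value-based equality, a freshly created key with the same coordinates would not find the stored entry, which matters when a marked sphere is picked again for editing or deletion.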

3.3.3

Resource Description Framework

Resource Description Framework (RDF) is a model of data representation which belongs to the W3C (World Wide Web Consortium) family of specifications. It is a framework which enables data representation and modelling of information through many syntaxes. For instance, the first two rows of Table 3.1 can be used to show how data can be modelled. In this case Neoplastic process describes the data item Mass. Therefore, it provides a simple definition: "Mass is a neoplastic process". Moreover, if we look at the first row of Table 3.1, Neoplastic process also has its own description as Morphophysiological process. Its definition would be: "Neoplastic process is a morphophysiological process". Modelling of such data can be done simply using the RDF data model.

The RDF model introduces the description of resources by statements. The way in which a resource can be described is intuitive and resembles an ordinary sentence. The RDF data model contains three components: resources, properties and statements. These components form expressions sometimes called triples. Resources are any data items which can obtain a definition (statement, any value) through some given relation (property or predicate). Any statement can itself consist of a new resource-property-statement triple. Just as an English sentence usually comprises a subject, a verb and objects, RDF statements consist of subjects, properties and objects [5]. For example, in our sentence "Mass is a neoplastic process", Mass is the subject and neoplastic process is the value (statement) assigned to our subject through the property is.

Let's consider the following sentence: "Mass is a neoplastic process thus it is a morphophysiological process". In this case the sentence analysis has two levels. Firstly, we extract Mass as the subject, is as the property and neoplastic process thus it is a morphophysiological process as our statement. Then the statement is analysed again: Neoplastic process is the subject, is takes the role of the property and morphophysiological process is the final statement (object). This is an example of embedded statements, defined in the RDF model as reification. The RDF model has its own XML syntax, so the sentences presented above can be described in XML format. An example printout of our sentence in RDF is presented below:

    ...
    ... rdf:about="http://sentence/">
        ... rdf:resource="http://Process/Mass" />
        ... rdf:resource="http://Property/is" />
        ... rdf:resource="http://Process/Neoplastic process" />
    ...

In this case the rdf:Description tag is the representation of an RDF statement, while the rdf:Statement tag shows the definition of an RDF reified statement. RDF syntax requires the names of resources and properties to be in URI format, because RDF was created especially for Semantic Web applications. Furthermore, it should be mentioned that the RDF data model can be represented in the form of a directed graph, where the resource and the statement are nodes connected by an edge (the property). The RDF model can be used for ontology development, and in such a case RDFS (RDF Schema) is employed, as it contains more elements which allow the RDF resources to be structured and arranged better. RDFS is the base for the OWL language, which allows even better knowledge description. More details about the OWL language can be found in section 3.4, because this language is used for defining the MIAWARE lungs ontology.
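The two-level sentence analysis described in this section can be mirrored by a tiny data structure in which the object of a triple may itself be a triple. This is only a sketch of the nesting idea: `TripleSketch` is a hypothetical class, it does not use Jena, and it does not reproduce the RDF reification vocabulary.

```java
// A subject-property-object triple whose object may be another triple.
class TripleSketch {
    final String subject;
    final String property;
    final Object object; // either a plain String value or a nested TripleSketch

    TripleSketch(String subject, String property, Object object) {
        this.subject = subject;
        this.property = property;
        this.object = object;
    }

    // Renders the triple, putting a nested statement in parentheses.
    String describe() {
        String rendered = (object instanceof TripleSketch)
                ? "(" + ((TripleSketch) object).describe() + ")"
                : String.valueOf(object);
        return subject + " --" + property + "--> " + rendered;
    }

    // "Mass is a neoplastic process thus it is a morphophysiological process"
    static TripleSketch example() {
        TripleSketch inner =
                new TripleSketch("Neoplastic process", "is", "Morphophysiological process");
        return new TripleSketch("Mass", "is", inner);
    }
}
```

For the example sentence, `TripleSketch.example().describe()` yields `Mass --is--> (Neoplastic process --is--> Morphophysiological process)`, which also corresponds to reading the model as a directed graph with properties as edges.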

3.3.4

Normalized medical report generation

The MIAWARE medical report can be simply defined as: information about one or more pathologic changes found during the medical analysis of a set of CAT thorax images of a patient. In this section the normalized report generation process is described, both in RDF format and in text format. Obviously, the text report format is required for radiologists, doctors and patients. RDF-format reports are necessary for more efficient filtering of groups of reports based on some given criteria.

RDF reports

After the brief theoretical introduction to the RDF data model given in section 3.3.3, this part presents how the RDF format is used to save medical information obtained with the MIAWARE software. As already mentioned, the RDF format is required for further information processing and searching.


Section 3.3.2 explains how the description of any single pathologic change found in the patient's lungs is defined and saved by the MIAWARE software. Table 3.1 represents one example of such a pathology definition. The final medical report will usually contain more such definitions, grouped in some specific way. The data gathered in Table 3.1 can be represented as a normal, lexical group of sentences describing the pathology found. For example:

`A morphophysiological process was found. It is in the form of a neoplastic process of the type mass. It is located in the left lung, in its upper lobe, exactly in the superior segment of the lingula.'

Such a group of sentences can be represented in the resource-property-statement model, and this is what MIAWARE medical RDF reports use. In this case, the first term is a resource and the rest is a statement. As our statement consists of a group of resources, it has to be analyzed further: the first resource of the previous statement becomes the resource and the rest forms another statement. Such an embedded resource-property-statement structure is created through RDF reified statements. It should be mentioned that the properties (which connect resources with the statements) in the above example are: in form of, of the type, etc.

The MIAWARE medical report in RDF format is based on the examples shown above. It uses reified statements to connect the appropriate resources with each other. The only difference is the choice of the properties. For simplicity, every property is the combobox name of the resource in the resource-property-statement group. For example, the sentence `Morphophysiological process in form of neoplastic process' (where `in form of' is the property) is defined in the RDF report as: `Morphophysiological process morphophysiological process neoplastic process'. Another example: `Mass is located in left lung' appears in the RDF report as: `Mass location left lung'. One could say that such statements lack lexical sense, and it is true, but this solution eases further processing of the RDF file. Nevertheless, the second example, in comparison with the first, keeps its correct lexical sense. Moreover, the main task of the RDF-format reports is to be computer-readable rather than human-readable, so this solution is not considered a problem.

The whole process of RDF-format report generation is done by a specially created PlainRDFReport class. The functionality of this class is built on the methods from the Jena distribution packages, in order to create the RDF file output. Firstly, PlainRDFReport creates an empty model, an object of the Model class from the Jena package com.hp.hpl.jena.rdf.model, using the ModelFactory.createDefaultModel() method. This model is then filled in with the RDF model elements created by the PlainRDFReport class methods. PlainRDFReport reads the HashMap (see section 3.3.2) with the pathology descriptions and processes every single key-value pair step by step:

1. Retrieve the (value) ControlPointInfo object for a given key and read its members of type Vector (see Table 3.3).

2. Create the URI base names for the resources (reminder: the RDF file format requires all names in URI format). Resources in this case are the comboboxes' selected options. The MIAWARE resource URI name format is as follows: http://.../.../ For example, the URI name for Left lung is: http://ThoraxLocation/LeftLung/

3. Create the URI names for the reified sentences. The MIAWARE format of such URIs is: http://sentence

4. Create resources (objects of the Jena-defined class Resource from the package com.hp.hpl.jena.rdf.model) using the previously created resource URI names.

5. Create properties (objects of the Jena-defined class Property from the package com.hp.hpl.jena.rdf.model). The MIAWARE property URI name format is as follows: http://Property/.../ For example, the URI property name for the resource Upper lobe of left lung is: http://Property/Left_lung_location/

6. Connect the previously created resources through the adequate properties using the Jena-defined classes Statement and ReifiedStatement from the package com.hp.hpl.jena.rdf.model. At this point, the URI names of the sentences created in step 3 are used.

This step ends the algorithm, which is repeated as many times as there are key-value pairs in the HashMap. During every single execution, all the created elements (statements, resources and properties) are added to the global RDF model created in the beginning. Afterwards, when all the information has been passed to the RDF model, it is saved to a file using an ordinary Java file output stream.
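Steps 2 and 5 of the algorithm above amount to turning vocabulary words into URI names. The following sketch shows one way to do that without Jena. The class `RdfNamingSketch` and its methods are hypothetical, and the naming rules (joining capitalized words for resources, underscores for properties, the vocabulary set as the first path segment) are assumptions inferred only from the two example URIs quoted in the text. The `chain` method emits a flat list of subject-property-object triples and deliberately omits the reified nesting and the first triple rooted at the vocabulary itself.

```java
import java.util.ArrayList;
import java.util.List;

class RdfNamingSketch {
    // "Left lung" -> "LeftLung" (assumed rule, matching the example URI).
    static String camelCase(String words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words.trim().split("\\s+")) {
            if (w.isEmpty()) continue;
            sb.append(Character.toUpperCase(w.charAt(0))).append(w.substring(1));
        }
        return sb.toString();
    }

    // Assumed layout: vocabulary set name, then the selected option.
    static String resourceUri(String vocabularySet, String option) {
        return "http://" + camelCase(vocabularySet) + "/" + camelCase(option) + "/";
    }

    // "Left lung location" -> "http://Property/Left_lung_location/"
    static String propertyUri(String comboboxTitle) {
        return "http://Property/" + comboboxTitle.trim().replaceAll("\\s+", "_") + "/";
    }

    // Links each selected option to the next one through the title of the
    // combobox in which the next option was chosen.
    static List<String[]> chain(List<String> titles, List<String> options, List<String> sets) {
        List<String[]> triples = new ArrayList<>();
        for (int i = 1; i < options.size(); i++) {
            triples.add(new String[] {
                resourceUri(sets.get(i - 1), options.get(i - 1)), // subject
                propertyUri(titles.get(i)),                       // property
                resourceUri(sets.get(i), options.get(i))          // object
            });
        }
        return triples;
    }
}
```

With the vectors of Table 3.3 as input, the second triple connects Mass to Left lung through the Location property, which corresponds to the `Mass location left lung' statement discussed above.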

Text reports

The generation of normalized medical reports in ordinary text format is another objective realized by the MIAWARE software. The reports created nowadays by radiologists are sets of sentences describing the pathologic changes in the patient's body. One of the initially defined objectives for the MIAWARE software is to create a text report which is clear and legible for both the patient and the doctor. That is why the layout of the MIAWARE text reports does not resemble the layouts of contemporary reports. It is more structured, based on the relations between the specific pieces of information. An example printout of a MIAWARE report is presented here:

    ************ REPORT generated with MIAWARE software ************
    Generation date: Jun/03/2007

    ***********************************
    ***********************************
    ControlPoint no. 1: (x,y,z) = (148,168,43)
    Specifications:
    Morphophysiological process: Neoplastic process
    Neoplastic process: Mass
    Location: Left lung
    Left lung location: Upper lobe
    Left lung upper lobe location: Lingula
    Left lung upper lobe lingula location: Superior segment

    ***********************************
    ...
    ***********************************
    ControlPoint no. 2: (x,y,z) = (282,168,20)
    Specifications:
    Morphophysiological process: General process
    General process: Post-therapeutic alteration
    Location: Right lung
    Right lung location: Upper lobe
    Right lung upper lobe location: Posterior segment

    ***********************************

The information which is later presented in the report is retrieved in the same way as in the case of RDF reports (see the beginning of section 3.3.4). The difference between RDF reporting and text reporting is that in RDF reports only the data from the value part of the HashMap key-value pairs is used, while in text reports the 3D coordinates of the point selected on the 2D slice image are also mentioned. These coordinates are extracted from the ControlPoint objects (see Table 3.2) and give a reference to the pathological change found by the radiologist on the 3D model. This concludes the overview of the reporting schema in the MIAWARE software. In the next section the filtering of MIAWARE medical reports is characterized.
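One entry of the text report can be produced from the same data with simple string formatting. This is a sketch only: `TextReportSketch` and its `entry` method are hypothetical, and the layout is copied from the example printout above (without its blank lines), not from the MIAWARE source.

```java
import java.util.List;

class TextReportSketch {
    static final String SEPARATOR = "***********************************";

    // Formats one control point: coordinates first, then one
    // "combobox title: selected option" line per reporting step.
    static String entry(int number, int x, int y, int z,
                        List<String> titles, List<String> options) {
        StringBuilder sb = new StringBuilder();
        sb.append(SEPARATOR).append('\n');
        sb.append("ControlPoint no. ").append(number)
          .append(": (x,y,z) = (").append(x).append(',')
          .append(y).append(',').append(z).append(")\n");
        sb.append("Specifications:\n");
        for (int i = 0; i < titles.size(); i++) {
            sb.append(titles.get(i)).append(": ").append(options.get(i)).append('\n');
        }
        sb.append(SEPARATOR).append('\n');
        return sb.toString();
    }
}
```

The coordinates come from the key of the map entry and the two lists from its value, which is the difference from RDF reporting noted above.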

3.4 Ontology-based search engine development

In the previous section (3.3) it was mentioned that RDF reports are generated for future processing and searching. Herein, the need for effective filtering of medical reports is clarified. Firstly, it must be understood that radiological examinations are carried out quite often in hospitals, private and public surgeries and other medical institutions. As a result, a great number of medical reports is produced in a relatively short time. Such reports should be kept and gathered together for future use as references to previously encountered and defined pathologies or diseases. Manual searching of a great number of documents is very time-consuming. The automated generation of reports in a file format that is easy for a computer to read (e.g. RDF) is a milestone in the development of state-of-the-art computer-based searching. Consequently, an intelligent search engine for medical reports can significantly speed up the disease recognition process, since, for given criteria, it immediately returns sets of references to archive reports with similar pathological symptoms in other patients, together with the resultant diagnoses and applied treatments. The consecutive parts of this section define ontologies and clarify why and how a good ontology should be created to improve the searching process. Afterwards, the developed ontology-based search algorithm for filtering MIAWARE medical reports is explained, together with a presentation of the graphical user interface of the MIAWARE search engine.

3.4.1

Ontology definition and development

At the beginning of this section a simple question should be asked: what does the word 'ontology' actually mean and what does it refer to? A basic annotation should be made before starting any explanation. The words 'ontology' and 'Ontology' describe the same aspect in general, but are actually used with different references. 'Ontology', written with a capital 'O', refers to a branch of philosophy, the philosophy of being. On the other hand, 'ontology' written with a small 'o' refers to computer science and information science; simply speaking, it has a Knowledge Engineering sense. Ontology Engineering is based significantly on the assumptions made in Ontology [5]. After an introduction to Ontology as a philosophy of being, with its historical background, the ideas of ontological engineering are presented.

Philosophy of being

One of the main targets of the ancient philosophers was to define how the world is built, how to classify the objects (things) which exist in everyday life, and the concepts which are broadly understood but cannot be perceived by human senses. Across the centuries, many definitions and ways of thinking were put forward. Different opinions were given because people's knowledge about the surrounding world was increasing with time and, consequently, more complex and more detailed ideas could appear. Two important aspects of Ontology are, following the introductory chapter of the Ontological Engineering book [5], essence and existence:

The essence of something is what this something is (Gambra, 1999). However, an existence is to be present among things in the real world.

An example of something which has an essence but does not exist is a griffin. According to Greek mythology, a griffin is a lion with the head, talons and wings of an eagle. It cannot be perceived or seen by us (it has no existence), but it still has an essence. The things in this world can change some of their properties, but they still are the same things (their essence is the same). Philosophers were trying to define the essence of things, each of them in a different way. Plato and the Platonists (through the Theory of Forms) claimed that the material world as it seems to us is not the real world, but only a shadow of the real world [42]. Aristotle, in contrast, tried to understand the world empirically, and it was he who defined various states of being in order to classify anything in the world. Such classification could be made through categories like: substance, quality, quantity, relation, action, passion, place and time [5]. For instance, if two descriptions are considered, `John is Jane's brother' and `John is in the kitchen', they have different states of being: relation and place, respectively. Such sets of general properties describing things were also called universal patterns (universals). Many questions about what universals really are, and corresponding assumptions, appeared.

In the Middle Ages the main issue was universals. The term 'ontology' itself came into philosophical use later and was popularized by Christian Wolff (1679-1754), a German philosopher. In the medieval debate the main question was whether universals are actual things of the world (realism) or only words (categories) which describe actual things (nominalism). Finally, the prevailing view of those times was to define universals as symbols, definitions of things. An even more complex theory, which can be reflected in today's information systems, was created by Immanuel Kant (1724-1804). He defined another categorization to structure the world's things. Kant's framework is organized into four classes, each of which presents a triadic pattern: quantity (unity, plurality, totality), quality (reality, negation, limitation), relation (inherence, causality, community) and modality (possibility, existence, necessity) [5]. Logic was involved in creating this framework, as e.g. the (unity, plurality, totality) triple refers directly to three logical judgments (singular logic judgment, ∃ - 'exists' judgment, ∀ - 'for all' judgment).

In such a way, the defining of the world's things (Ontology) has been developed until now. Recently, the philosophy of being has become closely connected to information systems engineering. From the beginning, ontology was closely connected to other areas of life. Ontology is not a discipline which exists separately and independently from all the other scientific disciplines and also from other branches of philosophy. Rather, ontology derives the general structure of the world; it obtains the structure of the world as it really is from knowledge embodied in other disciplines [38].

Ontological engineering

The concepts expressed by Ontology as a philosophical movement are used in Knowledge-based systems, Artificial Intelligence and Computer Science, simply because they are common to today's objects and situations and because there is a growing need to develop computer systems which resemble real life more and more. The questions formulated by Ontology, like 'What is this?', 'What describes it?' or 'How can it be described?', actually represent a model where some thing (subject) is described by another thing (object) through some relationship. The RDF model described in section 3.3.3 uses an identical triple (subject-property-object) to represent data. This example allows a simple definition of the ontology data model to be given: a set of concepts within a domain and the relationships between these concepts [41]. The authors of the Ontological Engineering book [5] present the following ontology types:

• highly informal - natural language ontologies.

• semi-informal - natural language ontologies, but with some structure and restrictions in the form.

• semi-formal - artificial language (formally defined) ontologies.

There are also views that highly informal ontologies are not actually ontologies, as they cannot be understood by computers (machines). An ontology has some main components which are used during the development process. There exist many languages and many different knowledge modelling techniques for modelling ontologies. For every technique or language, the ontology components can differ somewhat. Herein, a detailed description of the Web Ontology Language (OWL) is presented, as the MIAWARE ontology for lungs (see section 3.4.2) was developed in exactly that language. OWL is a language created especially for defining Web ontologies. It uses description logics as its knowledge representation technique. Description

62

logics is used to represent the terminological knowledge of an application domain in a structured and formally well-understood way [40]. Since OWL belongs to W3C recommendation family, the original OWL description is given here: The OWL Web Ontology Language is designed for use by applications that need to process the content of information instead of just presenting information to humans. OWL facilitates greater machine interpretability of Web content than that supported by XML, RDF, and RDF Schema (RDF-S) by providing additional vocabulary along with a formal semantics. OWL has three increasingly-expressive sublanguages: OWL Lite, OWL DL, and OWL Full. [23] An OWL ontology can be considered as a network of its three main components:

• classes - define the names of the relevant domain concepts and their logical characteristics [19]. Classes are sometimes called sets of individuals, since they describe a category in which the defined instances exist. They can be organized into taxonomies (superclasses with their subclasses).

• properties (slots, attributes, roles) - define the relationships between classes [19] or establish links between defined classes (individuals). Properties can be inverse, functional, inverse functional, transitive and symmetric.

• individuals (defined classes) - are instances of the classes with specific values for the properties [19].

Below, a detailed description of the most important components and their types is presented, based on an example ontology for cars designed especially for educational purposes using the Protégé OWL plugin [16] described in section 3.1.2. This is a simple ontology which defines relations between car models and the countries where they are produced.

Fig. 3.6: OWLViz [4] visualization of the example ontology for cars (asserted hierarchy)

Fig. 3.7: OntoViz [37] visualization of inverse properties in the ontology for cars

Classes and taxonomy

All the classes in an OWL ontology are subclasses of the owl:Thing class. The class owl:Thing is the class that represents the set containing all individuals [12]. The classes in our example ontology form a taxonomy, which is shown in Figure 3.6. Such a representation is called the asserted hierarchy, because it represents exactly the structure which was created by the designer. The class Country is divided into two concepts, Asian and European countries, and then further subdivisions appear. A similar situation appears for the Car class, where the NamedCar class consists of some specific car models. Moreover, other subclasses of Car, such as EuropeanCar or JapaneseCar, define concepts related to the geographical areas where a given car is produced. One can notice that EuropeanCar exists on the same level as SpanishCar and ItalianCar, although logically it should be subclassed by them. This is no problem if the relations between the classes are set correctly. In that case, any reasoner (a system used for validating and computing logical relations between any knowledge data) is able to deduce the correct taxonomy. In ontologies, any individual can be an instance of more than one class. In our case there is no individual which can be both a Car and a Country, thus these classes are set as disjoint classes in our ontology. The same must be done for the subclasses of Country, EuropeanCountry and NamedCountry.

Properties

The next step is to define properties, which serve as links between the individuals. In OWL, properties can link two individuals (object property), link an individual with a datatype value (datatype property), or assign some information (metadata) to a class, individual or datatype/object property (annotation property) [12]. In our ontology two object properties are set: isProducedIn and hasProductionOf. They define what is produced (cars) and where (countries). These two properties are marked as inverse properties (Figure 3.7). It means that if an individual (X) is linked to another individual (Y) through the isProducedIn property, then (Y) is linked to (X) through the inverse property hasProductionOf. There are also other types of properties:

• transitive - if a transitive property connects individual X with Y, and Y with Z, then by deduction individual X is connected with individual Z. In predicate logic it is presented as:

∀X(∀Y(∀Z((tr_prop(X, Y) ∧ tr_prop(Y, Z)) ⇒ tr_prop(X, Z))))

• symmetric - a symmetric property connects two individuals in both directions. It is a bidirectional property. If X is connected with Y, then Y is automatically connected with X through this property. It can be described in predicate logic as:

∀X(∀Y(sym_prop(X, Y) ⇒ sym_prop(Y, X)))

• functional - a property which connects one individual with another individual, and only with that individual. In our case, the isProducedIn property is marked as functional, because the assumption was made that one type of car can be produced in only one country. In predicate logic it can be described as:

∀X(∀Y(∀Z((fn_prop(X, Y) ∧ fn_prop(X, Z)) ⇒ Y = Z)))

• inverse functional - if a property is inverse functional, it means that the inverse property is functional [12]. The property hasProductionOf is inverse functional, since it is the inverse property of isProducedIn.
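The property characteristics listed above can be illustrated with a minimal Java sketch. It does not use OWL or any reasoner; a property is assumed to be a plain set of (subject, object) links, and the class name PropertyCharacteristics and all method names are hypothetical, chosen only for this illustration.

```java
import java.util.*;

class PropertyCharacteristics {
    // A link "from --property--> to", stored as a two-element list (illustrative helper).
    static List<String> link(String from, String to) {
        return Arrays.asList(from, to);
    }

    // Inverse property: every link X -> Y becomes Y -> X.
    static Set<List<String>> inverse(Set<List<String>> links) {
        Set<List<String>> result = new HashSet<>();
        for (List<String> l : links) result.add(link(l.get(1), l.get(0)));
        return result;
    }

    // Symmetric: whenever X -> Y is present, Y -> X is present as well.
    static boolean isSymmetric(Set<List<String>> links) {
        return links.containsAll(inverse(links));
    }

    // Functional: each subject is linked to at most one object.
    static boolean isFunctional(Set<List<String>> links) {
        Map<String, String> target = new HashMap<>();
        for (List<String> l : links) {
            String previous = target.putIfAbsent(l.get(0), l.get(1));
            if (previous != null && !previous.equals(l.get(1))) return false;
        }
        return true;
    }

    // Transitive closure: keep adding X -> Z while X -> Y and Y -> Z exist.
    static Set<List<String>> transitiveClosure(Set<List<String>> links) {
        Set<List<String>> closure = new HashSet<>(links);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (List<String> a : new ArrayList<>(closure)) {
                for (List<String> b : new ArrayList<>(closure)) {
                    if (a.get(1).equals(b.get(0)) && closure.add(link(a.get(0), b.get(1)))) {
                        changed = true;
                    }
                }
            }
        }
        return closure;
    }

    public static void main(String[] args) {
        Set<List<String>> isProducedIn = new HashSet<>();
        isProducedIn.add(link("SeatLeon", "Spain"));
        isProducedIn.add(link("Lamborghini", "Italy"));
        System.out.println(isFunctional(isProducedIn));   // true
        // hasProductionOf is simply the inverse of isProducedIn.
        System.out.println(inverse(isProducedIn).contains(link("Spain", "SeatLeon"))); // true
    }
}
```

In a real OWL ontology these characteristics are declared on the property and the deductions are left to the reasoner; the sketch only makes the logical formulas tangible.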

Fig. 3.8: OntoViz [37] visualization of the class relationships in the ontology for cars

When creating properties in any ontology, the domain and range of these properties can be restricted. By default, the domain and range of any property is the class owl:Thing. In our ontology, the domain of isProducedIn was restricted to the Car class and its range to the Country class. Since hasProductionOf is the inverse property, its domain and range were set automatically to Country and Car, respectively. Figure 3.7 visualizes how domain and range are set for inverse properties.

Restrictions and individuals (defined classes)

After the classes and properties have been defined, the properties must be applied to the classes. This is achieved by setting suitable restrictions. Quantifier restrictions are used most often in OWL ontologies. They can be defined using two well-known quantifiers: the existential quantifier ∃ and the universal quantifier ∀. In our ontology, there exist ∃ restrictions in the Country class family, for example:

• hasProductionOf(Japan, ToyotaCorolla)
• hasProductionOf(Italy, Lamborghini)
• hasProductionOf(Spain, SeatLeon)

The universal quantifier, however, was used for defining restrictions in the Car class family (predicate logic):

• ∀X(instanceof(X, Lamborghini) ⇒ producedIn(X, Italy))
• ∀X(instanceof(X, SpanishCar) ⇒ producedIn(X, Spain))
• ∀X(instanceof(X, EuropeanCar) ⇒ producedIn(X, EuropeanCountry))

Consequently, the families of Car and Country classes became related to each other (Figure 3.8). The universal quantifier cannot be used with the Country classes, since e.g. Japan does NOT produce ONLY Toyota cars, but also other car brands. That is why the existential quantifier was used. On the other hand, Lamborghini is an Italian car (it was assumed that cars are produced in the same country where the brand was founded), therefore it is produced ONLY in Italy, so the universal quantifier was used here.
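The difference between the two quantifier restrictions can be shown with a small Java sketch. It is only an illustration under simplifying assumptions: the values of a property are given as a plain collection of names, a class is a plain set of its members, and the names QuantifierRestrictions, someValuesFrom and allValuesFrom, as well as the individuals corolla1 and civic1, are hypothetical.

```java
import java.util.*;

class QuantifierRestrictions {
    // Existential restriction: at least one property value belongs to the filler class.
    static boolean someValuesFrom(Collection<String> values, Set<String> filler) {
        for (String v : values) {
            if (filler.contains(v)) return true;
        }
        return false;
    }

    // Universal restriction: every property value belongs to the filler class.
    // It holds trivially when there are no values at all.
    static boolean allValuesFrom(Collection<String> values, Set<String> filler) {
        return filler.containsAll(values);
    }

    public static void main(String[] args) {
        // Hypothetical members of the ToyotaCorolla class.
        Set<String> toyotaCorolla = new HashSet<>(Arrays.asList("corolla1"));
        // Hypothetical values of hasProductionOf for Japan: one Corolla and one other car.
        List<String> producedInJapan = Arrays.asList("corolla1", "civic1");

        System.out.println(someValuesFrom(producedInJapan, toyotaCorolla)); // true
        System.out.println(allValuesFrom(producedInJapan, toyotaCorolla));  // false
    }
}
```

The second result reflects the argument in the text: Japan produces some ToyotaCorolla cars, but not only them, so only the existential restriction is appropriate for the Country classes.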

There are other restriction types, for example the cardinality restriction (CardinalityRestriction). It defines the exact number, or a limit on the number, of individuals which can be related to a given class through some property. The symbols used when defining cardinality restrictions are: equal (=), smaller or equal (≤) and greater or equal (≥). Three subclasses of the NamedCar class have a CardinalityRestriction specified, which states that any car can be produced by only one individual.

Finally, the definition of individuals must be performed. Every class can have necessary conditions as well as necessary and sufficient conditions. A class is defined (individual) when it has one or more necessary and sufficient conditions. A class without such conditions is always not defined (primitive). If an individual is a member of a class X which has only necessary conditions, we know that this individual has to satisfy these conditions. But this works as an implication: even if some other individual satisfies the conditions, it may or may not be a member of class X. In the case of a class Y with necessary and sufficient conditions, any individual which is a member of that class must also satisfy these conditions, but now the relation works like an equivalence: if any individual satisfies these conditions, it is surely a member of class Y.

Fig. 3.9: OWLViz [4] visualization of the ontology for cars (inferred hierarchy)

In our ontology all the classes marked in orange in Figure 3.6 are individuals (they have necessary and sufficient conditions), while the yellow ones are primitive classes. SpanishCar, EuropeanCar, ItalianCar, JapaneseCar and Italy, Japan, Spain are individuals, as they are instances with defined properties. For example, SpanishCar has the 'for all' restriction isProducedIn Spain, and for such a restriction it is obvious that all the cars produced in Spain will be Spanish cars. That is why these restrictions are necessary and sufficient conditions. The opposite situation holds for the following classes: SeatLeon, Lamborghini and ToyotaCorolla. SeatLeon has the same restriction as SpanishCar, but in this case it is not an individual, because it is not true that every car produced in Spain is a SeatLeon. There are also other car brands which can be produced in Spain, thus these restrictions are only necessary conditions.
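The contrast between necessary conditions and necessary and sufficient conditions can be sketched in plain Java. This is a deliberately simplified illustration and not how a description logic reasoner works internally; the class DefinedVersusPrimitive, the helper type CarFact and both methods are hypothetical names introduced only here.

```java
import java.util.*;

class DefinedVersusPrimitive {
    // Hypothetical facts known about one car: where it is produced and
    // which classes the designer explicitly asserted for it.
    static final class CarFact {
        final String producedIn;
        final Set<String> assertedClasses;

        CarFact(String producedIn, String... assertedClasses) {
            this.producedIn = producedIn;
            this.assertedClasses = new HashSet<>(Arrays.asList(assertedClasses));
        }
    }

    // SpanishCar is a defined class: "produced in Spain" is necessary AND sufficient,
    // so membership can be inferred from the property value alone (equivalence).
    static boolean isSpanishCar(CarFact car) {
        return "Spain".equals(car.producedIn);
    }

    // SeatLeon is a primitive class: "produced in Spain" is only necessary,
    // so satisfying it proves nothing; membership has to be asserted.
    static boolean isSeatLeon(CarFact car) {
        return car.assertedClasses.contains("SeatLeon");
    }

    public static void main(String[] args) {
        CarFact unknownCar = new CarFact("Spain");          // nothing asserted
        System.out.println(isSpanishCar(unknownCar));        // true
        System.out.println(isSeatLeon(unknownCar));          // false
    }
}
```

A car that is merely known to be produced in Spain is therefore classified as a SpanishCar, but not as a SeatLeon, which matches the reasoning given for the cars ontology.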

The final step in ontology development is the validation of the ontology with a reasoner. This ontology was successfully checked (no inconsistency was found) with the Racer reasoner [33]. An additional feature offered by reasoners is that, since they can deduce facts from ontologies, they can also create a new, inferred hierarchy for an ontology. The inferred hierarchy for our ontology is presented in Figure 3.9. Moreover, Figure 3.10 shows the classification results and the changes obtained in the inferred hierarchy after Racer reasoning. It is visible that SpanishCar and ItalianCar are now subclasses of EuropeanCar (as predicted earlier), and moreover the Lamborghini, SeatLeon and ToyotaCorolla classes are now subclasses not only of the NamedCar class, but also of ItalianCar, SpanishCar and JapaneseCar, respectively. The classes with a blue border are the classes which changed their superclasses after the reasoner's deduction.

Fig. 3.10: Protégé [16] classification results frame after reasoning on the ontology for cars

All the steps of ontology design described here were applied while developing the medical ontology for lungs. It should be mentioned that the whole idea of developing the ontologies, and the necessary theoretical basis, was obtained after a close reading of a set of scientific articles by Nicola Guarino and Christopher Welty on that subject [7] [8] [10] [9].

3.4.2 Medical ontology for lungs

The current version of the MIAWARE software allows the radiologist to define in the medical report the pathologies found and their respective locations in the lungs. The objective here was to create a special search engine which is able to find a specified pathology in a certain location, not in a lexical way (as ordinary search engines work), but in a logical way. For example, the doctor wants to find all the reports where there exists 'a polypus in the left lung'. An ordinary search would return all the reports which have sentences containing exactly the words 'left lung' and 'polypus'. In the case of the MIAWARE search engine, such a query returns all the reports which have sentences where 'polypus' is found in any part of the left lung, e.g. 'lingula', 'upper left lung lobe', 'left lung' etc. Such a filtering procedure can be created using an ontology in which all elements of the lungs are logically connected to each other. Such a search engine looks not only for the given query words (elements), but also for the logical connections of those elements with others through some specific properties. Below, the specification of the lungs ontology used by the MIAWARE search engine is presented.

Lungs ontology development with Protégé

The first step was to become acquainted with the structure of the human lungs. After consulting the doctor Miguel Castro, the RadLex webpage [32] and the Anatomy and Physiology book [35], the information necessary to start the ontology development was gathered. The basic representation of the human lungs structure is shown in Figure 3.11. Human lungs are represented as a pair: left lung and right lung. Both of them have parts called lobes. The left lung has only two lobes, upper and lower, while the right lung has one more, the middle lobe. Moreover, every lobe has its parts (segments), and the upper lobe of the left lung additionally has an element called the lingula, which is in turn divided into segments of the lingula. In anatomical discourse, the most natural path of reasoning follows part-whole relationships [26]. In the ontology for lungs, the part-whole approach is used to define the lungs and their parts. The Foundational Model of Anatomy (a computer-based knowledge source for bioinformatics) [6] was the reference during the ontology development [27] [3].

Fig. 3.11: Human lungs structure [21]

Firstly, the ontology was prepared in Protégé, as it is more intuitive and allows a graphical visualization. The hierarchy of classes for this ontology is shown in Figure 3.12 and Figure 3.13. The most important task was to create base classes which are disjoint with each other (as in the case of the Car and Country classes in section 3.4.1). In the case of the ontology for lungs the following concepts were extracted: Lung, Lobe, Lingula and Segment. These classes are disjoint, since a Lobe cannot be at the same time a Lung or a Segment, a Lingula cannot be a Segment, etc. Moreover, in the human body lobes can exist not only in the lungs, but also in other body parts like the liver or the brain. That is the reason why the subclass of Lobe (LungLobe) was created. This allows further expansion of our ontology to other parts of the body. For the same reasons the subclasses of the Segment and Lingula classes were created. Table 3.4 represents the structure of the lung lobes (their division into segments).

Fig. 3.12: Lungs ontology - hierarchy of classes (created with OWLViz [4])

Fig. 3.13: Lungs ontology - continuation (created with OWLViz [4])

Left lung
    Upper lobe: lingula; segments: apicoposterior, anterior
    Lower lobe: segments: anteromedial basal, lateral basal, posterior basal, superior
Right lung
    Upper lobe: segments: apical, posterior, anterior
    Middle lobe: segments: lateral, medial
    Lower lobe: segments: anterior basal, superior, medial basal, lateral basal, posterior basal

Tab. 3.4: Segments in lung lobes

As in the case of the cars ontology, properties need to be defined. The defined properties are presented in Figure 3.14. The two main properties, hasPart and isPartOf, have their subproperties (directly connected with specific ontology classes). Finally, the properties hasLung, hasLungLobe, hasLeftLungLobeLingula, hasLeftLungLobeLingulaSegment, hasLungLobeSegment and their respective 'is*Of' properties are used to create restrictions and connections between classes. Such restrictions are later used by the MIAWARE search engine.

Fig. 3.14: Object properties in lungs ontology - Protégé screenshot

All the restrictions created for the subsequent classes were very similar. A sample screenshot from Protégé for the class LowerLeftLungLobe is shown in Figure 3.15. First, there is a definition saying that our class is a subclass of the LowerLungLobe class. Secondly, four restrictions are formulated defining the exact segments (individuals) which are parts of the lower lobe of the left lung, and then the cardinality restriction says that there are exactly four segments in this lobe. Finally, the last restriction defines, using the property isLungLobeOf, that our lobe is a lung lobe of the left lung. The ontology was checked with the Racer reasoner and did not show any inconsistency.

Fig. 3.15: Restrictions in lungs ontology (lower lobe of left lung example) - Protégé screenshot

Lungs ontology development with Jena

The need to recreate the ontology in Jena arose for integration reasons. Protégé uses a slightly different way of expressing some ontology structures (restrictions, properties) in OWL files than Jena. Since the MIAWARE reports in RDF format are created using Jena, and these reports are compared with the ontology OWL files when searching, Jena could not recognize all the relations described in Protégé's OWL files. That is why the ontology created in Protégé had to be rewritten in Jena. As mentioned in section 3.1.2, Jena does not have a visual editor, so the ontology must be created programmatically. The steps which had to be followed to rewrite our ontology in Jena are the following:

1. Declare base classes - in our case these were the Lung, Lobe, Lingula and Segment classes. Jena's createClass method was used:

    ...
    /* CLASSES DEFINITION */
    OntClass oclLingula = m.createClass(ns + "Lingula");
    OntClass oclLobe = m.createClass(ns + "Lobe");
    OntClass oclLung = m.createClass(ns + "Lung");
    OntClass oclSegment = m.createClass(ns + "Segment");
    ...

The member ns is the namespace of the ontology:

    String ns = "http://torax.owl#";
    ...

2. Declare subclasses - the JenaUtils class, which contains a group of static methods useful during Jena ontology development, was used here. The methods of that class were created in 2006 by two Portuguese students from the University of Aveiro, Nuno Ricardo Oliveira and Filipa Ferreira. The original name of the class was xJena.java; it was changed to JenaUtils.java, as common Java style requires class names to start with a capital letter and, moreover, the name JenaUtils was considered more adequate in this case than xJena. A short example printout is presented here:

    /* SUBCLASSES DEFINITION */

    /* oclLingula subclasses */
    OntClass oclLeftLungLobeLingula = JenaUtils.createSubClass(m, oclLingula, ns + "LeftLungLobeLingula");

    /* oclLobe subclasses */
    OntClass oclLungLobe = JenaUtils.createSubClass(m, oclLobe, ns + "LungLobe");
    ...

3. Declare base properties - in our case these were the isPartOf and hasPart properties (Jena method used):

    /* PROPERTIES DEFINITION */
    OntProperty oprhasPart = m.createObjectProperty(ns + "hasPart");
    OntProperty oprisPartOf = m.createObjectProperty(ns + "isPartOf");
    ...

4. Declare subproperties - all the subproperties (shown in Figure 3.14) are defined here using the methods of JenaUtils:

    /* SUBPROPERTIES DEFINITION */

    /* oprhasPart subproperties */
    OntProperty oprhasLobe = JenaUtils.createSubObjProp(m, oprhasPart, ns + "hasLobe");
    ...
    /* oprhasSegment subproperties */
    OntProperty oprhasLeftLungLobeLingulaSegment = JenaUtils.createSubObjProp(m, oprhasSegment, ns + "hasLeftLungLobeLingulaSegment");
    ...

5. Set range and domain of the properties - in this step, Jena's methods addDomain and addRange are used:

    oprisLungLobeOf.addDomain(oclLungLobe);
    oprisLeftLungLobeLingulaOf.addRange(oclUpperLeftLungLobe);

    oprisLeftLungLobeLingulaOf.addDomain(oclLeftLungLobeLingula);
    ...

6. Construct restrictions - both quantifier restrictions and cardinality restrictions are to be formulated:

    SomeValuesFromRestriction svfrhasLungLobe_UpperLeftLungLobe = m.createSomeValuesFromRestriction(ns + "svfrhasLungLobe_UpperLeftLungLobe", oprhasLungLobe, oclUpperLeftLungLobe);
    ...
    CardinalityRestriction cardhasLungLobe_3 = m.createCardinalityRestriction(ns + "cardhasLungLobe_3", oprhasLungLobe, 3);
    ...

7. Create the lists of restrictions - in this step, the restriction lists are created, one list for every defined class. Figure 3.15 presents the group of restrictions for one class (LowerLeftLungLobe) in Protégé, which is equivalent to one restriction list in Jena. Below, an example of the restriction list for the MiddleRightLungLobe class is presented:

    RDFList listMiddleRightLungLobe = m.createList(new RDFNode[] {
        svfrhasLungLobeSegment_LateralRightLungLobeSegment,
        svfrhasLungLobeSegment_MedialRightLungLobeSegment,
        svfrisLungLobeOf_RightLung,
        cardhasLungLobeSegment_2
    });
    ...

8. Declare intersection classes - this is a new step in comparison to the Protégé ontology development process. Creating the intersection classes is an intermediate procedure which must be carried out before assigning the restriction list to the ontology class. An intersection class takes one list of restrictions, grouped for exactly one defined class. A short printout is presented below:

    IntersectionClass iclUpperRightLungLobe = m.createIntersectionClass(ns + "iclUpperRightLungLobe", listUpperRightLungLobe);
    IntersectionClass iclMiddleRightLungLobe = m.createIntersectionClass(ns + "iclMiddleRightLungLobe", listMiddleRightLungLobe);
    ...

9. Set equivalent classes - this is the final step: the ontology classes receive the restrictions through the intersection classes declared in the previous step. This is achieved by setting the ontology class to be equivalent to the appropriate intersection class:

    oclLowerLeftLungLobe.setEquivalentClass(iclLowerLeftLungLobe);
    oclLowerRightLungLobe.setEquivalentClass(iclLowerRightLungLobe);
    oclLeftLungLobeLingula.setEquivalentClass(iclLeftLungLobeLingula);
    oclAnteriorBasalLungLobeSegment.setEquivalentClass(iclAnteriorBasalLungLobeSegment);
    ...

This step finishes the creation of the ontology. The problem which arose while rewriting the ontology in Jena was that a Java file with all the steps described above consists of many class and property definitions and results in a file of almost 900 lines of source code. Many of these lines are repeated and differ only in the names of classes, properties or restrictions. Creating such a file by hand can introduce unexpected mistakes which are difficult to find. That is the reason why a generator of Java source code was needed and developed.

Generation of Java source code for ontologies

The objective of the generator is to create a Java file with the definitions of the Jena-based ontology, following the steps presented above, on the basis of the data found in a configuration file. Firstly, the configuration file was prepared. Only seven types of parameters must be defined in order to create the ontology: CLASS, SUBCLASS, PROPERTY, SUBPROPERTY, DOMAIN, RANGE and RESTRICTIONS. The syntax of the configuration file is as follows:

• base classes definition - CLASS = <class>, <class>, ...

• subclasses definition - SUBCLASS: <class> = <subclass>, <subclass>, ...

• base properties definition - PROPERTY = <property>, <property>, ...

• subproperties definition - SUBPROPERTY: <property> = <subproperty>, <subproperty>, ...

• domain setting - DOMAIN: <property> = <class>

• range setting - RANGE: <property> = <class>

• restrictions assignment - RESTRICTIONS: <class> { [email protected] <property> : <class>, ... , [email protected] <property> : <number>, ... } where [email protected] starts a quantifier ∃ restriction and [email protected] starts a cardinality restriction.

Below, a fragmentary printout of the configuration file is shown:

    CLASS = Lingula, Lobe, Lung, Segment

    SUBCLASS: Lingula = LeftLungLobeLingula

    SUBCLASS: Lobe = LungLobe
    SUBCLASS: LungLobe = LowerLungLobe, MiddleLungLobe, UpperLungLobe
    ...
    PROPERTY = hasPart, isPartOf

    SUBPROPERTY: hasPart = hasLobe, hasLingula, hasSegment, hasLung
    SUBPROPERTY: hasLobe = hasLungLobe
    ...
    DOMAIN: hasLeftLungLobeLingula = UpperLeftLungLobe
    RANGE: hasLeftLungLobeLingula = LeftLungLobeLingula
    ...
    RESTRICTIONS: UpperLeftLungLobe {
        [email protected] hasLeftLungLobeLingula : LeftLungLobeLingula
        [email protected] hasLeftLungLobeLingula : 1
        [email protected] hasLungLobeSegment : ApicoPosteriorLeftLungLobeSegment
        [email protected] hasLungLobeSegment : AnteriorLeftLungLobeSegment
        [email protected] hasLungLobeSegment : 2
        [email protected] isLungLobeOf : LeftLung
    }
    ...

Such a configuration file is read by the specially developed ontologyGenerator.exe program, which produces the corresponding Java file with the Jena-based ontology definitions. The generator was created using flex (fast lexical analyzer), free software, in the C programming environment. It performs lexical analysis of the configuration file and, according to the data found there, generates a suitable Java file. After compilation and execution of that Java file, the Jena-based ontology OWL file is produced. With both the RDF files and the ontology prepared, the last objective was to devise and implement an ontology-based search algorithm for RDF files. The next section addresses that issue.
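The idea of generating repetitive Jena definitions from configuration lines can be sketched as follows. This is not the flex-based generator described above: it is a minimal Java illustration, written under the assumption that a line has the form 'KEY = values' or 'KEY: name = values', and it handles only CLASS and SUBCLASS lines. The class name OntologyCodeGenSketch and the method generate are hypothetical; the emitted statements follow the naming pattern (ocl prefix, m, ns, JenaUtils.createSubClass) visible in the printouts of this section.

```java
import java.util.*;

class OntologyCodeGenSketch {
    // Turns one configuration line into Java statements (CLASS and SUBCLASS only).
    static List<String> generate(String line) {
        List<String> out = new ArrayList<>();
        String[] sides = line.split("=", 2);
        if (sides.length < 2) return out;            // not a "key = values" line
        String key = sides[0].trim();
        String[] values = sides[1].trim().split("\\s*,\\s*");

        if (key.equals("CLASS")) {
            for (String v : values) {
                if (v.isEmpty()) continue;
                out.add("OntClass ocl" + v + " = m.createClass(ns + \"" + v + "\");");
            }
        } else if (key.startsWith("SUBCLASS")) {
            // The parent class name follows the colon, e.g. "SUBCLASS: Lobe".
            String parent = key.substring(key.indexOf(':') + 1).trim();
            for (String v : values) {
                if (v.isEmpty()) continue;
                out.add("OntClass ocl" + v + " = JenaUtils.createSubClass(m, ocl"
                        + parent + ", ns + \"" + v + "\");");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] config = {
            "CLASS = Lingula, Lobe, Lung, Segment",
            "SUBCLASS: Lobe = LungLobe"
        };
        for (String line : config) {
            for (String statement : generate(line)) {
                System.out.println(statement);
            }
        }
    }
}
```

Properties, domains, ranges and restrictions would be handled by further branches of the same kind; they are left out here to keep the sketch short.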

3.4.3 Search algorithm for RDF files

In this section the development process of the ontology-based search engine is presented. After a brief description of the graphical user interface, the details of the search algorithm are discussed. Figure 3.16 presents the graphical user interface for the MIAWARE search engine. The whole GUI was created in Java using Swing components. It is divided into four panels:

• search criteria • ontology-based location • RDF reports folder path

82

Fig. 3.16:

Ontology-based search engine  graphical user interface

• search results The detailed guide over search engine functionality is presented in the Chapter 4, section 4.4. Despite of it, it should be mentioned that the values selected in rst three panels, thus: pathology name, lung location name and RDF reports folder are three parameters, necessary to start searching. The algorithm for searching (ltering) RDF les according to entered criteria is divided into the following parts: 1.

Read search criteria  Three aforementioned parameters for searching are obtained here. Having the name of the lung location, the appropriate ontology class object is obtained (parent class of the given search).

2.

Get location subclasses  the subclasses of the parent location class are read from the ontology. All these subclasses (together with the parent class) are saved in the memory (dynamic Java container - Vector - used).

3.

Get related classes  All the individuals in our ontology for lungs are related together through part-whole relationships (restrictions) using 83

hasPart /isPartOf

properties family. Herein, the objective is to gather

all the individuals, which are related through the property

hasPart

(or

any of its subproperties) with the classes previously saved into memory. The Java method

getAllPropertyConnectedClasses

was developed to

reach this purpose. Figures 3.17, 3.18 and 3.19 represent pseudocode owchart for the algorithm used by this method. Figure 3.17 presents the declaration part of the algorithm. Method's arguments: CLASSES and PROPERTIES are the vectors, which keep the ontology classes (retrieved in the previous step) and properties (hasPart property with its subclasses), respectively. The brief description of the declared items is presented below:

• ALL_CLASSES  this vector will keep all the classes retrieved during execution of this method, together with the CLASSES vector. Returned as the method's result.

• NEW_CLASSES  this vector will keep only the related classes retrieved from the restrictions of the CLASSES vector elements.

• hasNewClasses  this boolean ag, set by default to FALSE, informs if any related class was added to the NEW_CLASSES vector. Figure 3.18 shows the loop where all the elements of CLASSES vector are checked in order to retrieve any hasPart related classes. Every class has some number of restrictions which are grouped in the RLIST list. Every (not cardinal) restriction from such a list, is checked for presence of hasPart property. If found, the related class is retrieved, and if it is not already present in NEW_CLASSES vector, it is added to that vector. In that moment, the hasNewClasses ag is set to TRUE. Unless all the classes are checked, the loop executes. Figure 3.19 presents the last section of the method, which is executed when the CLASSES checking loop terminates. Firstly, the

Classes

hasNew-

ag is consulted, and if it is set to TRUE, it means that there

was at least one related class added to the NEW_CLASSES vector. 84

Fig. 3.17:

Ontology-based search algorithm owchart, part 1

85

Fig. 3.18:

Ontology-based search algorithm owchart, part 2 86

Fig. 3.19:

Ontology-based search algorithm owchart, part 3

87

In such a situation, that vector must be checked in the same manner as it was with the initial CLASSES vector. recursive call of the

Consequently, the

getAllPropertyConnectedClasses

method is made,

with NEW_CLASSES vector and PROPERTIES vector as the arguments. The result of such a call (a vector of classes) is added to the ALL_CLASSES vector and finally, as the method's conclusion, is returned as the final result.

4. Search selected RDF file reports: here the proper search algorithm is implemented. Every RDF file found in the given RDF reports folder is verified during the checkRDFFiles method execution. Figure 3.20 presents the pseudocode flowchart for that method. First, the empty FILTERED_FILES vector is created, to which the filenames of the RDF files that satisfy the search criteria will be added. Next, a loop is started in which the content of every RDF file is checked separately by the doSearch method (see the flowchart in Figure 3.21). It returns a boolean value: TRUE if the file fulfills the search criteria, and FALSE otherwise. In the TRUE case, the current RDF filename is added to the FILTERED_FILES vector, and the loop continues checking the rest of the files.

Fig. 3.20: Ontology-based search algorithm flowchart, part 4

Fig. 3.21: Ontology-based search algorithm flowchart, part 5

Let us now describe the functionality of the doSearch method. First, the boolean flag FOUND is set to FALSE. That flag is returned by the method and is set to TRUE whenever conditions satisfying the search criteria are encountered. Every RDF file contains one or more pathology definitions. Therefore, a loop is executed in order to compare each definition separately with the search criteria vectors. Table 3.1 on page 50 presents the example content of one pathology definition. Such a description (statement) can contain multiple pathology names (a hierarchy) as well as multiple locations. Suitable statement vectors are loaded with these values (stmt_pathologies and stmt_locations). The search criteria (pathology and the ALL_CLASSES vector) are then compared respectively with the aforementioned statement vectors. If pathology is found in stmt_pathologies and, at the same time, any value from ALL_CLASSES can be found in stmt_locations, the search criteria are satisfied, the present loop breaks and the doSearch method returns the FOUND flag set to TRUE. Otherwise, the loop continues to scan the remaining pathology definitions. If no pathology satisfies the search criteria, the doSearch method returns FALSE.
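The search steps described above can be sketched in plain Java. This is a minimal sketch under stated assumptions, not the code of the OntologyBasedSearch class: RDF parsing is left out, each report is assumed to be already loaded as a list of pathology definitions, and the names Definition, expandLocation and the sample lung classes are hypothetical. Only the method names checkRDFFiles and doSearch are taken from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class OntologySearchSketch {

    /** One pathology definition (statement) of a report; a hypothetical stand-in for the RDF content. */
    static final class Definition {
        final List<String> stmtPathologies; // pathology names (hierarchy)
        final List<String> stmtLocations;   // lung locations

        Definition(List<String> stmtPathologies, List<String> stmtLocations) {
            this.stmtPathologies = stmtPathologies;
            this.stmtLocations = stmtLocations;
        }
    }

    /** Collects the selected class and everything reachable through the child relation (ALL_CLASSES). */
    static Set<String> expandLocation(String selected, Map<String, List<String>> children) {
        Set<String> allClasses = new LinkedHashSet<>();
        collect(selected, children, allClasses);
        return allClasses;
    }

    private static void collect(String cls, Map<String, List<String>> children, Set<String> out) {
        if (!out.add(cls)) {
            return; // already visited, which also protects against cycles
        }
        for (String child : children.getOrDefault(cls, Collections.emptyList())) {
            collect(child, children, out);
        }
    }

    /** Returns true as soon as one definition matches the pathology and at least one location. */
    static boolean doSearch(List<Definition> definitions, String pathology, Set<String> allClasses) {
        boolean found = false;
        for (Definition d : definitions) {
            boolean locationMatches = !Collections.disjoint(allClasses, d.stmtLocations);
            if (d.stmtPathologies.contains(pathology) && locationMatches) {
                found = true;
                break;
            }
        }
        return found;
    }

    /** Returns the names of the reports that satisfy the search criteria (FILTERED_FILES). */
    static List<String> checkRDFFiles(Map<String, List<Definition>> reports,
                                      String pathology, Set<String> allClasses) {
        List<String> filteredFiles = new ArrayList<>();
        for (Map.Entry<String, List<Definition>> report : reports.entrySet()) {
            if (doSearch(report.getValue(), pathology, allClasses)) {
                filteredFiles.add(report.getKey());
            }
        }
        return filteredFiles;
    }

    public static void main(String[] args) {
        // Hypothetical fragment of a lung hierarchy.
        Map<String, List<String>> children = new LinkedHashMap<>();
        children.put("LeftLung", Arrays.asList("LeftUpperLobe", "LeftLowerLobe"));
        children.put("LeftUpperLobe", Arrays.asList("ApicalSegment"));

        Map<String, List<Definition>> reports = new LinkedHashMap<>();
        reports.put("report-a.rdf", Arrays.asList(
                new Definition(Arrays.asList("General", "Liquid"), Arrays.asList("ApicalSegment"))));
        reports.put("report-b.rdf", Arrays.asList(
                new Definition(Arrays.asList("General", "Liquid"), Arrays.asList("RightLung"))));

        Set<String> allClasses = expandLocation("LeftLung", children);
        System.out.println(checkRDFFiles(reports, "Liquid", allClasses));
    }
}
```

In this example only the first report is kept: its pathology is located in a subpart of the selected location, which a purely lexical comparison with "LeftLung" would miss.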

All the methods used in the steps presented above are grouped in the Java OntologyBasedSearch class. That concludes the present section on the ontology-based search engine and, at the same time, finishes the entire overview of the MIAWARE software architecture. Chapter 4 describes and summarizes the capabilities of the MIAWARE software package in the form of a practical guide for its users.

4. MEDICAL ANALYSIS AND REPORTING WITH MIAWARE

This chapter provides a detailed description of the activities that can be performed with the MIAWARE software and serves as a MIAWARE user's guide. First, the installation steps and prerequisites are specified (section 4.1). The following sections give an overview of the MIAWARE functionality together with screenshots produced by the software.

4.1 Installation notes

In order to run the MIAWARE software without problems, the installation steps presented below should always be followed. Figure 4.1 presents the file structure of the MIAWARE software package. The application should run without problems if two main prerequisites are fulfilled:

• The Java Virtual Machine (version 1.6.0_01 or higher) must be installed. The Java version can be checked by executing the Windows batch file JAVA_VERSION_CHECK.bat. It checks whether the required Java version is installed and, if not, executes the installer of the Java Virtual Machine automatically.

• The Microsoft .NET 2.0 Platform must be installed. To verify that the necessary .NET version is installed, the program called dotNetTester.exe, written by Guy Vider, is used. This program was found and downloaded from The Code Project web page [1]. Later, the source code was modified and another Windows batch program was created: DOT_NET_2.0_CHECK.bat. It checks, using the aforementioned dotNetTester.exe program, whether the Microsoft .NET 2.0 Platform is installed. If it is not, the Microsoft .NET 2.0 installer executes automatically.

Fig. 4.1: This figure presents the files and folders that belong to the MIAWARE package. The four selected elements must be executed in order to install and run the MIAWARE software properly.

Both installers, for Java VM 1.6.0u1 and the Microsoft .NET 2.0 Platform, were downloaded from the web pages of Sun and Microsoft, respectively. After the successful installation of the software, the MIAWARE application can be run. This is done simply by executing the MIAWAREv1.0.exe file. The *.exe file was created using the Launch4j software [20]. Since Java creates executable programs with the *.jar extension, it was decided to wrap it in an *.exe file. According to the Launch4j webpage, Launch4j is a cross-platform tool for wrapping Java applications distributed as jars in lightweight Windows native executables. Executing the MIAWAREv1.0.exe file will produce the MIAWARE main frame, presented in Figure 4.2. All the elements in that figure are indexed in order to be used as references in further sections of this chapter. For example, the reference to the Start analysis button is represented in this chapter as [E2 - 4.2], which means: element number 2 in Figure 4.2.
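The Java prerequisite listed above could also be verified from Java itself. The following is a minimal sketch, not the content of JAVA_VERSION_CHECK.bat (which is not shown in this thesis): the class and method names are hypothetical, and it assumes version strings of the form 1.6.0_01, as reported by the java.version system property.

```java
final class JavaVersionCheckSketch {

    /** Splits a version string such as "1.6.0_01" into the numeric parts {1, 6, 0, 1}. */
    static int[] parse(String version) {
        String[] tokens = version.split("[._-]");
        int[] parts = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            // Keep only the leading digits so that suffixes such as "ea" count as 0.
            String digits = tokens[i].replaceAll("^(\\d*).*$", "$1");
            parts[i] = digits.isEmpty() ? 0 : Integer.parseInt(digits);
        }
        return parts;
    }

    /** Returns true if the actual version is the same as or newer than the required one. */
    static boolean isAtLeast(String actual, String required) {
        int[] a = parse(actual);
        int[] r = parse(required);
        for (int i = 0; i < Math.max(a.length, r.length); i++) {
            int ai = i < a.length ? a[i] : 0; // missing parts are treated as 0
            int ri = i < r.length ? r[i] : 0;
            if (ai != ri) {
                return ai > ri;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String installed = System.getProperty("java.version");
        System.out.println(installed + " satisfies 1.6.0_01: " + isAtLeast(installed, "1.6.0_01"));
    }
}
```

Comparing the parts numerically rather than as text avoids ranking 1.10 below 1.6.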

4.2 Analysis of CAT images

The detailed examination and analysis of the radiological images is the first step in pathology definition and subsequent disease recognition. The following subsections describe all the tools and features that MIAWARE provides in order to ease the image analysis process.

4.2.1 Specifying CAT stack location

In order to start an analysis, a single folder path containing the radiological examination (CAT) images (the stack) must be specified. The path can be entered directly into the text field [E1b - 4.2]. Another option is to use the file chooser dialog, which appears when the user clicks the Open button [E1 - 4.2].

Fig. 4.2: MIAWARE graphical user interface

Fig. 4.3: This dialog with a progress bar pops up after pressing the Start analysis! button. At that moment the medical images are being processed and loaded into memory.

MIAWARE places some restrictions on the names of the stack images. The MIAWARE software is able to read only a stack in which every single image has a name matching the pattern IM%06d. This means that every image must have a name starting with the letters IM followed by the index number of the slice (image). Such an index number always has six digits. Moreover, the first image must have an index equal to zero (IM000000). For example, the third image must be named IM000002, while the 110th image must be named IM000109. The files should not have any extension.
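The naming rule can be illustrated with Java's String.format, which accepts the same IM%06d pattern. This is a small sketch with hypothetical class and method names, not MIAWARE's actual loading code.

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class StackNamingSketch {
    // "IM" followed by exactly six digits and nothing else (no extension).
    private static final Pattern SLICE_NAME = Pattern.compile("IM\\d{6}");

    /** Builds the expected file name of the slice with the given zero-based index. */
    static String sliceName(int index) {
        if (index < 0 || index > 999999) {
            throw new IllegalArgumentException("index must fit into six digits: " + index);
        }
        // Locale.ROOT keeps the digits ASCII regardless of the system locale.
        return String.format(Locale.ROOT, "IM%06d", index);
    }

    /** Checks whether a file name follows the stack naming rule. */
    static boolean isValidSliceName(String fileName) {
        return SLICE_NAME.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        System.out.println(sliceName(0));   // first image
        System.out.println(sliceName(2));   // third image
        System.out.println(sliceName(109)); // 110th image
        System.out.println(isValidSliceName("IM000002.dcm")); // has an extension
    }
}
```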

After the stack path has been specified, the Start analysis! button [E2 - 4.2] should be pressed in order to load the images into memory and process the image data (described in section 3.2.2). This is a time-consuming action and can take from a few seconds to minutes, depending on the machine. A progress bar will pop up (Figure 4.3) and show the percentage progress and a description of the action currently being performed. By default, MIAWARE sets the path to the test-photos folder, where the sample CAT stack is placed. All the screenshots in this document were created for those specific images. When loading has finished, the 3D model is shown together with three cross-section images. The actions that can be performed on them are described in the following subsections. Moreover, all the pathologies that were defined and saved for the given stack during previous software executions are loaded. The process of pathology definition, also known here as pathology reporting, is described in section 4.3 of this chapter.

Fig. 4.4: This panel presents the three-dimensional model created from the CAT slice images. The cutting widgets and the already defined pathologies (light green spheres) are visible together with the model.

4.2.2 3D model manipulation

The three-dimensional model of the patient's thorax created from the CAT slices is presented in the Model 3D panel [E3 - 4.2]. It resembles the skeleton of the human thorax (Figure 4.4). The coordinate system directions are defined here as follows: the horizontal and vertical edges of the Model 3D panel represent the x- and z-axes, respectively. The panel depth is the y-axis.

MIAWARE's user interface provides three additional panels, which allow manipulation (rotating and zooming) of the 3D model view as well as cutting it in order to obtain cross-sectional views. Such utilities provide a better view of any part of the examined body. Any single point can be seen in three different planes of reference, which can be very helpful while searching for pathologies.

Fig. 4.5: This simple panel provides a basic 3D model view manipulation toolbar (rotation and zooming utilities) as well as controls for the visibility of the cutting widgets in the 3D rendering window.

Rotation

The model view can be rotated in four directions using the Rotation panel [E4 - 4.2] (Figure 4.5). In fact, the rotation is a simple camera movement. The camera rotation about an axis placed in the center of the model in the z-direction is performed using two buttons: [E4b - 4.2] and [E4d - 4.2]. The first one (right arrow) rotates the model counter-clockwise (clockwise camera movement), and the second (left arrow) does the opposite. The respective buttons for rotation about the axis in the x-direction (up and down arrows) are [E4a - 4.2] and [E4c - 4.2]. Every single click on these buttons performs a rotation of 15 degrees. The last button [E4e - 4.2], placed in the center of the Rotation panel, resets the model to the initial view.

It should be mentioned that rotation can also be performed directly on the Model 3D panel using the VTK-defined mouse actions. According to the VTK User's Guide [17], when the left mouse button is pressed, the rotation is in the direction defined from the center of the renderer's viewport towards the mouse position. This means that if, for example, the user presses the left mouse button over the right side of the panel with the 3D model, the model will rotate to the right.
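Since the button rotation is a camera movement in fixed 15-degree steps, its geometry can be sketched without VTK. The following plain-Java sketch rotates a camera position about an axis through the model center. It is an illustration under assumptions, not the MIAWARE code: the class and method names and the starting camera position are hypothetical, and the sign convention (positive angles follow the right-hand rule) is chosen here.

```java
final class CameraRotationSketch {
    static final double STEP_DEGREES = 15.0; // rotation per button click, as stated in the text

    /** Rotates point p about the axis that passes through center and is parallel to z. */
    static double[] rotateAboutZ(double[] p, double[] center, double degrees) {
        double a = Math.toRadians(degrees);
        double dx = p[0] - center[0];
        double dy = p[1] - center[1];
        return new double[] {
            center[0] + dx * Math.cos(a) - dy * Math.sin(a),
            center[1] + dx * Math.sin(a) + dy * Math.cos(a),
            p[2]
        };
    }

    /** Rotates point p about the axis that passes through center and is parallel to x. */
    static double[] rotateAboutX(double[] p, double[] center, double degrees) {
        double a = Math.toRadians(degrees);
        double dy = p[1] - center[1];
        double dz = p[2] - center[2];
        return new double[] {
            p[0],
            center[1] + dy * Math.cos(a) - dz * Math.sin(a),
            center[2] + dy * Math.sin(a) + dz * Math.cos(a)
        };
    }

    public static void main(String[] args) {
        double[] center = {0.0, 0.0, 0.0};
        // Hypothetical start: camera in front of the model along the depth (y) axis.
        double[] camera = {0.0, -500.0, 0.0};
        for (int click = 0; click < 6; click++) { // six clicks add up to 90 degrees
            camera = rotateAboutZ(camera, center, STEP_DEGREES);
        }
        System.out.printf("%.3f %.3f %.3f%n", camera[0], camera[1], camera[2]);
    }
}
```

Because each click applies the same step to the current position, 24 clicks in one direction bring the camera back to where it started, up to floating-point rounding.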

Zoom

The 3D model can also be enlarged and shrunk (zoomed in and out). Zooming simply moves the camera closer to or further from the model. The Zoom panel [E5 - 4.2] (Figure 4.5) provides this functionality. The button with a plus sign [E5a - 4.2] zooms in on the model, while the button with a minus sign [E5b - 4.2] zooms out. Every single click on these buttons changes the view by 10%. The third button [E5e - 4.2], as in the case of rotation, resets the model to the initial view. VTK also provides built-in mouse actions for zooming. According to the VTK User's Guide [17], the model is zoomed in when the right mouse button is pressed and the mouse position is in the top half of the viewport. Analogously, the model is zoomed out if the mouse position is in the bottom half.

Cutting widgets

The last panel, the Widgets visibility panel [E6 - 4.2] (Figure 4.5), contains three buttons, one for each plane direction. Every single click on a button toggles the visibility (on/off) of the cutting plane (widget) for the respective direction in the 3D model rendering window. Initially, all the widgets are invisible. Pressing a button switches the visibility of the selected widget on, and the widget remains visible until the same button is pressed again. An alternative way to show the widgets in the 3D rendering window is to press the keyboard key 'x', 'y' or 'z' corresponding to the required plane widget. This works only if the mouse pointer is placed over the rendering window. The widgets on the 3D model (Figure 4.4) play a very important role in image data analysis. They make it possible to show the cross-section of the model in any of the three principal plane directions. The functionality of the widgets is closely connected to the manipulation of the 2D images and is described in more detail in the next subsection.

4.2.3 2D images manipulation

As described in the previous subsection, the widgets on the 3D model make it possible to cut the model in order to see 2D cross-sectional views. Such 'cuts' of the 3D model are presented in the Slices 2D panel [E7 - 4.2] (Figure 4.6). It contains three panels with two-dimensional images (slices), one for each cutting widget. From left to right, they present the cut images taken from the x-, y- and z-widgets, respectively. Each slice has its own separate slider, which moves the widget along one axis of the model [E8 - 4.2]. A numeric scale is attached to each slider and represents the number of slices that can be seen by the user. The number of slices for every widget is calculated while reading the stack image data (section 3.2.2). For the z-plane direction, the original CAT images are shown, and since the sample test-photos CAT stack contains 87 CAT scan slices, this is the number shown on the z-axis slider. The x- and y- cross-sectional views are generated from the original stack slices, which in this sample case have a resolution of 512x512 pixels, so there are 512 possible slices to be observed in each of these plane directions.

Fig. 4.6: This panel presents three cross-sections (cuts) of the 3D model made by the widgets. Each of the cuts has its own slider used to move the widget along the respective axis. The light green circles represent pathologies. The red circle is the active (selected) pathology.

This concludes the section about the analysis of the medical images. A detailed description of the view manipulation methods in MIAWARE for both the 3D model and the 2D cross-sectional views was given here. The provided tools appear to be sufficient for efficient pathology searching and thorax examination. In line with the contributions of this thesis given in the introductory chapter, MIAWARE integrates the tools for analysis of medical images with 'on the fly' pathology reporting tools. This feature of the MIAWARE software is described in the following section.

Fig. 4.7: The confirmation dialog, which appears when the user wants to add a new pathology description to the previously selected point on the 2D slice

4.3 Reporting over lung pathologies

A radiologist analyzes medical images in order to find pathological changes or signs of disease in the examined body. Such findings must be reported for further consultation with a doctor. MIAWARE provides tools for marking any critical point found (a pathology) on a specific 2D slice and attaching any required information to it (the pathology definition). Moreover, such information can be saved permanently for a given stack of images (without modifying the images) and can be viewed during future analyses, so that no previously entered data is lost. It is saved to the file points.mwr (using the Save changes button [E9 - 4.2]), which is placed together with the respective stack images (in the same folder). If a given stack of images has to be loaded without the previously defined pathologies, the points.mwr file should be removed from that folder or moved to another disk location.
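Because saved pathologies are tied to a stack only by the location of points.mwr, checking whether a stack has saved pathologies reduces to a file lookup. The following is a minimal sketch using java.nio.file; the class and method names are hypothetical, and the internal format of points.mwr is not described here, so the file is not read.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PointsFileSketch {
    static final String POINTS_FILE_NAME = "points.mwr"; // file name given in the text

    /** Returns the path where the pathologies of the given stack folder would be stored. */
    static Path pointsFile(Path stackFolder) {
        return stackFolder.resolve(POINTS_FILE_NAME);
    }

    /** Returns true if the stack folder already contains saved pathologies. */
    static boolean hasSavedPathologies(Path stackFolder) {
        return Files.isRegularFile(pointsFile(stackFolder));
    }

    public static void main(String[] args) {
        // The default sample folder name is taken from the text; pass another folder as an argument.
        Path stack = Paths.get(args.length > 0 ? args[0] : "test-photos");
        System.out.println(pointsFile(stack) + " present: " + hasSavedPathologies(stack));
    }
}
```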

4.3.1 Defining pathologies

Pathology definition with MIAWARE is a very intuitive process. The radiologist (user) has to find a 2D slice where the pathology is visible and click the left mouse button over that location. A dialog (Figure 4.7) will then pop up asking for confirmation of the action to be performed. If it is confirmed, the Pathology information frame will appear (Figure 4.8) and the description of the selected pathology can be added. The Pathologies panel [E10 - 4.2] presents a list of all defined pathologies, showing their 3D coordinates on the model and a basic description.

Fig. 4.8: The pathology reporting frame, which provides the user with a set of options (comboboxes) to choose from in order to define (in a normalized way) the pathology name and its location in the lungs.

The process of adding a pathology description is normalized (in accordance with the contributions of this thesis). Normalization is achieved by the 'choose-option' manner used for defining the name of the pathology and its location in the examined lungs. This means that the radiologist cannot add any description in his own words, but can only follow and choose the options presented in the subsequent comboboxes. When finished, a new pathology is defined and saved into program memory. The pathology appears as a small red sphere in the 3D rendering window, placed at the location corresponding to the previously selected point on the 2D slice. Moreover, all the widgets are moved to the location of the newly created point (pathology) and, consequently, the three 2D cross-sectional panels show the added pathology as a red circle. Usually, the pathologies are marked with a light green color (Figure 4.6). If a pathology appears in red, it is the 'active' pathology and all the widgets intersect at that selected point. If a pathology is 'active', it is also selected on the pathologies list in the Pathologies panel (Figure 4.9). It can be seen in Figure 4.2 that a pathology (Liquid) found in the left lung is 'active', since it is selected on the list (in green) as well as on the 2D slices and the 3D model (in red).

Fig. 4.9: This panel shows the list of all pathologies defined for one medical stack of images. The three-dimensional coordinates (3D model location) of the pathologies are shown together with the pathology name and a lung location. The pathology marked in green is the 'active' pathology, which at that moment can be deleted (Delete button) or have its description viewed or edited (Edit/View button).

4.3.2 Viewing and editing pathology descriptions

During the pathology definition process (pathology reporting), previously added descriptions may need to be consulted, edited or even deleted. In order to perform any operation on a previously defined pathology, it must be set as 'active'. A pathology can be activated in two ways. First, any pathology visible on a 2D slice (as a light green circle) can be activated by clicking it with the left mouse button. Second, the user can select the required pathology on the Pathologies list. If a pathology is active, it can be deleted or its description can be viewed or edited.

The Delete button [E11 - 4.2] removes the active pathology. This action must be confirmed by the user in a pop-up confirmation dialog. Another button, Edit/View [E12 - 4.2], allows the user to view detailed information about the active pathology. This description appears in a new Pathology information frame (Figure 4.10). At that point, the user can decide to edit the previously inserted information by clicking the Edit button. The pathology reporting frame (Figure 4.8) then appears so that the pathology can be redefined.

Fig. 4.10: This frame presents the information associated with the point (pathology) selected by the user.

It must be mentioned that changes made since the start of the software execution are not saved automatically. The Save changes button [E9 - 4.2] or the keyboard shortcut CTRL+S must be pressed in order to save all the modifications permanently.

Fig. 4.11: This dialog asks the user to define the disk location where the generated TXT and RDF reports should be placed, as well as their common name.

4.3.3 Generating medical reports

Finally, after the close image analysis and pathology definition, the entire report on the examined stack of medical images can be generated. As described in section 3.3, MIAWARE generates the reports in two formats: the ordinary TXT format and the RDF format, which is used for further report processing. Since all the pathologies together with their descriptions are kept in program memory, the Generate reports button [E13 - 4.2] can be pressed in order to obtain the aforementioned reports. The Choose report location and name dialog (Figure 4.11) then pops up, where the user can define where the generated reports should be placed and choose a specific name for them. The two reports, TXT and RDF, cannot be given different names. The user does not actually have to define a specific location and name for the reports. As can be seen in Figure 4.11, the default reports folder is set to MIAWARE-reports. Moreover, a unique name for the reports is generated using the current system date and hour. Finally, when the OK button is pressed, the reports are generated and a message dialog pops up informing the user about the successful report generation (Figure 4.12).
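A name derived from the current date and hour could be produced as follows. This is only a sketch: the exact name format used by MIAWARE is not given in the text, so the pattern yyyy-MM-dd_HH-mm-ss, the report_ prefix, the .txt and .rdf extensions and the class and method names are all assumptions. It does show both reports sharing one base name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class ReportNamingSketch {
    // Assumed timestamp layout; the real MIAWARE format is not documented here.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd_HH-mm-ss", Locale.ROOT);

    /** Builds a base name from the given date and time. */
    static String baseName(LocalDateTime now) {
        return "report_" + STAMP.format(now);
    }

    /** Returns the TXT and RDF report paths, which always share the same base name. */
    static Path[] reportFiles(Path folder, String baseName) {
        return new Path[] {
            folder.resolve(baseName + ".txt"),
            folder.resolve(baseName + ".rdf")
        };
    }

    public static void main(String[] args) {
        String name = baseName(LocalDateTime.now());
        // MIAWARE-reports is the default folder named in the text.
        for (Path report : reportFiles(Paths.get("MIAWARE-reports"), name)) {
            System.out.println(report);
        }
    }
}
```

Passing the time in as a parameter, rather than reading the clock inside baseName, keeps the naming rule easy to check with a fixed date.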

Fig. 4.12: This dialog informs the user that the TXT and RDF reports were generated successfully in the specified disk location with the chosen name.

4.4 Searching for the medical reports

The last, but not least, feature of the MIAWARE software is an intelligent search of previously created medical reports according to given criteria. As described in section 3.4, the MIAWARE search engine does not filter reports in the ordinary, lexical way, but uses logical deduction. Figure 4.13 presents the graphical user interface of the MIAWARE ontology-based search engine. It can be executed as a separate program (the MIAWARE-OB-SEARCHv1.0.exe file) or as a part of the main MIAWARE program. In the second case, the search engine can be started from the Tools → Ontology-based search menu [E14 - 4.2] or with the keyboard shortcut CTRL+F.

In order to perform a search, the user must first specify a path to the folder containing medical RDF reports in the RDF reports folder path panel [E1 - 4.13]. Initially, the default folder with the reports is set to MIAWARE-reports (the same as for creating reports with MIAWARE, section 4.3.3). Next, the user must define what kind of pathology to look for in the medical reports. This is done in the Search criteria panel [E2 - 4.13] by selecting a pathology name from the given comboboxes. It is enough to select only the pathology type in the upper combobox [E2a - 4.13], where two general types of pathologies are defined: General or Neoplastic process. In this case, all pathologies available in the lower combobox [E2b - 4.13] will belong to the search criteria. The last step is the selection of the lung location where the defined pathology must exist in order to match the search criteria. This can be done in the Finding ontology-based location panel [E3 - 4.13], which is a simple taxonomy representation of the classes from the MIAWARE lungs ontology. The search engine will look for the pathologies not only in the selected location, but also in any of its subclasses and subparts defined by the ontology (see section 3.3).

Fig. 4.13: This frame represents the graphical user interface of the ontology-based search engine. Here, the medical reports can be filtered according to given search criteria (pathology name and its location in the lungs).

Finally, the Search button [E2c - 4.13] has to be pressed. The results of the search are presented in the Search results panel [E4 - 4.13]. This panel presents the number of reports fulfilling the search criteria, together with the most recent search query. Moreover, there is a list [E4a - 4.13] displaying the names of all the reports that passed the search algorithm. Any of these reports (in both formats, TXT and RDF) can be viewed by selecting it and pressing the Show TXT report [E4b - 4.13] or Show RDF report [E4c - 4.13] button. A new frame will pop up showing the content of the selected file (Figure 4.14).

Fig. 4.14: This frame presents the content of the selected medical TXT report generated by the MIAWARE software.

This concludes the present chapter. The general summary and conclusions on the MIAWARE software performance and development process are presented in Chapter 5.

5. CONCLUSIONS

This thesis describes medical software that can be used by radiologists and doctors during the analysis of computed axial tomography (CAT) scan images. The MIAWARE application visualizes radiological images as a three-dimensional model generated from the CAT scan image stack. Moreover, two-dimensional images (cuts) of the 3D model are easily generated in order to facilitate the medical analysis. The obtained model can be easily manipulated, viewed at any angle and resliced by the cutting planes in the direction of any axis of the 3D coordinate system.

MIAWARE also contributes to lung pathology definition and its subsequent reporting. It provides an intuitive way of marking pathological changes on the obtained images (slices) and reporting their properties and characteristics. The characteristic feature of the MIAWARE reporting scheme is that it does not permit the radiologist to describe remarks in his own words. A medical vocabulary database is provided and, consequently, pathologies can be described using only that vocabulary set. As a result, normalized reports are obtained as the final output, which significantly improves their further computer processing. In my opinion, normalization of the medical reports can also improve the cooperation between radiologists and doctors. A doctor will be able to understand the radiologist's remarks faster and better if they are written in a normalized language, which can lead to better diagnoses and faster disease recognition.

The last MIAWARE feature is the intelligent search module for MIAWARE medical reports, developed on the basis of ontological engineering. First, such a search engine allows rapid filtering of the medical reports according to the pathologies defined in them. After a detailed analysis of the structure of the lungs, a hierarchical structure of their parts (an ontology) was prepared and developed. Provided with the knowledge about the part relationships in the lungs, the MIAWARE search engine is able to deduce the internal elements of the specified lung part and to search the reports for pathologies not only in the given lung location, but also in its subparts. This can be described as a logical search for pathologies in the medical reports.

The development of the aforementioned MIAWARE features was a time-consuming process, with many integration barriers and implementation problems encountered. Fortunately, thanks to the World Wide Web communities together with the bibliographical sources, it was possible to finish the presented project successfully and reach all the objectives set at the beginning. I must admit that this work and the final performance of the MIAWARE project give me a lot of inner satisfaction and contentment. I have exercised many different techniques applicable in this practical programming project, and every subsequent problem, when solved, turned into valuable experience for the future.

Finally, I would like to express my hope and willingness for further MIAWARE software development in order to contribute, more and more significantly, to the medical software engineering area. After the research and development of the MIAWARE project, I feel more dedicated to the medical area and I am seriously considering continuing my professional career in exactly that field.

BIBLIOGRAPHY

[1] CodeProject. The code project webpage, 2007. http://www.codeproject.com/.

[2] Harvey M. Deitel. Java: How to Program. Prentice Hall Professional Technical Reference, 2002.

[3] Maureen Donnelly, Thomas Bittner, and Cornelius Rosse. A formal theory for spatial representation and reasoning in biomedical ontologies. 2006.

[4] Nick Drummond. Owlviz - ontology visualization tool webpage, 2007. http://www.co-ode.org/.

[5] Asuncion Gomez-Perez, Oscar Corcho, and Mariano Fernandez-Lopez. Ontological Engineering : with examples from the areas of Knowledge Management, e-Commerce and the Semantic Web. First Edition (Advanced Information and Knowledge Processing). Springer, July 2004.

[6] Structural Informatics Group. Foundational model of anatomy specifications - webpage, 2007. http://sig.biostr.washington.edu/projects/fm/AboutFM.html.

[7] Nicola Guarino and Christopher A. Welty. A formal ontology of properties. 2000.

[8] Nicola Guarino and Christopher A. Welty. Evaluating ontological decisions with ONTOCLEAN. 2002.

[9] Nicola Guarino and Christopher A. Welty. An overview of ontoclean. 2004.

[10] Nicola Guarino, Christopher A. Welty, and Christopher Partridge. Towards a methodology for ontology based model engineering. 2000.

[11] Hewlett-Packard. Jena framework webpage, 2007. http://jena.sourceforge.net/.

[12] Matthew Horridge. A Practical Guide To Building OWL Ontologies With The Protege-OWL Plugin. University of Manchester, 1 edition, June November 2004.

[13] HowStuffWorks. How mri works webpage, 2007. http://www.howstuffworks.com/mri7.htm.

[14] Southern Health Diagnostic Imaging. Mri scanners, 2007. http://www.southernhealth.org.au/imaging/mri_mmc_equip.htm.

[15] Imaginis. Cat work principles, 2007. http://www.imaginis.com/ct-scan/how_ct.asp.

[16] Stanford Medical Informatics. Protégé webpage, 2007. http://protege.stanford.edu/.

[17] Kitware. The VTK User's Guide. Kitware, 2006.

[18] Kitware. Cmake web page, 2007. http://www.cmake.org.

[19] Olivier Holger Knublauch. Weaving the biomedical semantic web with the protege owl plugin. 2004.

[20] Grzegorz Kowal. Launch4j webpage, 2007. http://launch4j.sourceforge.net/.

[21] Williams & Wilkins Lippincott. Representation of human lungs structure, 2007. http://connection.lww.com/products/smeltzer9e/images/figurelarge19-4.gif.

[22] Ken Martin, Will Schroeder, and Bill Lorensen. Vtk web page, 2007. http://www.vtk.org.

[23] Deborah L. McGuinness and Frank van Harmelen. Owl web ontology language overview (w3c) webpage, 2007. http://www.w3.org/TR/owl-features/.

[24] On Topic Media. Sport talk - mri knee reconstruction, 2007. http://www.sporttalk.com.au/knee-reconstruction-mri/.

[25] MedicineNet. Medicinenet.com webpage, 2007. http://www.medicinenet.com/.

[26] José L.V. Mejino Jr, Augusto V. Agoncillo, Kurt L. Rickard, and Cornelius Rosse. Representing complexity in part-whole relationships within the foundational model of anatomy. 2003.

[27] Joshua Michael, José L.V. Mejino Jr, and Cornelius Rosse. The role of definitions in biomedical concept representation. 2001.

[28] Sun Microsystems. Java webpage, 2007. http://java.sun.com.

[29] Sun Microsystems. Sun's java tutorial, 2007. http://java.sun.com/docs/books/tutorial/.

[30] NetDoctor. Netdoctor.co.uk - webpage, 2007. http://www.netdoctor.co.uk/.

[31] National Institutes of Health. Imagej webpage, 2007. http://rsb.info.nih.gov/ij/.

[32] Radiological Society of North America. Rsna - radlex term browser webpage, 2007. http://radlex.com/radlex/.

[33] GmbH & Co. Racer Systems. Racer ontology reasoner webpage, 2007. http://www.racer-systems.com/.

[34] Konzept Design Realisierung Rilogistic. Rilogistic webpage, 2007. http://www.rilogistic.com/files/.

[35] Rod R. Seeley, Trent D. Stephens, and Philip Tate. Anatomy and Physiology. McGraw-Hill Higher Education, 2005.

[36] Nicholas M. Short Sr. Nasa remote sensing tutorial, 2007. http://rst.gsfc.nasa.gov/Intro/Part2_26c.html.

[37] Michael Sintek. Ontoviz - ontology visualization tool webpage, 2007. http://protege.cim3.net/cgi-bin/wiki.pl?OntoViz.

[38] Barry Smith. State university of new york at buffalo - department of philosophy webpage, 2007. http://ontology.buffalo.edu/.

[39] SourceForge.net. Imagej plugins webpage - java vtk examples, 2007. http://ij-plugins.sourceforge.net/vtk-examples/index.html.

[40] Wikimedia-Foundation. Description logic - wikipedia webpage, 2007. http://en.wikipedia.org/wiki/Description_logic.

[41] Wikimedia-Foundation. Ontology in computer science (definition) - wikipedia webpage, 2007. http://en.wikipedia.org/wiki/Ontology_(computer_science).

[42] Wikimedia-Foundation. Theory of forms - plato - wikipedia webpage, 2007. http://en.wikipedia.org/wiki/Theory_of_forms.

APPENDIX

A. HOW TO BUILD VTK ON WINDOWS WITH JAVA SUPPORT

ABSTRACT

I decided to write this document to ease, for other users, the (sometimes painful and frustrating) process of building VTK on Windows with Java support. The date of creation of this document: October 2, 2006. Of course, this is only my experience, so unfortunately I cannot give anybody any warranty that all of my remarks given here are correct, but this configuration and these steps did result in VTK working correctly with Java on my computer. The details refer to my own experience with building the VTK-5.0.2 source on Windows XP Service Pack 2, in order to use it with Sun's Java Development Kit 1.5.0_08 (JDK 1.5.0_08) in the Eclipse SDK 3.2 environment.

A.1 Required downloads and software installation A.1.1

VTK source download

The VTK source we can download from the VTK ocial site [22] . Then we proceed by entering the Download sub page and downloading the latest release of VTK for Windows (in my case it was vtk-5.0.2.zip). BE CAREFUL!!! If you want to use VTK with Java support you must download

the source,

not Windows installer. We can enable the Java wrapping only during the manual compilation of VTK - with Windows Installer it's impossible.

It is very important. You must unpack your vtk-[version].zip le on the NTFS partition. With FAT32 I had some serious errors during compilation process A.1.2

CMake download and install

To compile VTK I used CMake. CMake controls the software compilation process using simple platform- and compiler-independent configuration files. Pre-compiled binaries are available for Microsoft and Borland compilers. You can download the software from the CMake webpage [18]. In my case, I got the most recent version (2.4).

Install CMake on an NTFS partition as well (I'd recommend the same partition as VTK). I suspect it might not work on FAT32.

A.1.3 C++ compiler installation

I used the Microsoft Visual Studio 2005 compiler. As far as I know, a Borland compiler can also be used, but I did not try it.

A.1.4 Java SDK download and installation

You can download it from Sun's webpage: http://java.sun.com/. It MUST be the Java Development Kit (JDK), not the Java Runtime Environment (JRE)!! There are various ways to get it. You can go to Sun's webpage, follow the Java SE link in the Download area, then look for the most recent JDK and download it. You can also download the source files in order to use them later in Eclipse.

A.1.5 Eclipse download and installation

You can download the software from http://www.eclipse.org. I downloaded Eclipse-SDK-3.2, the most recent stable version. You may also need the EMF and GEF plugins as well as the Visual Editor (VE) for creating advanced GUIs. You will find all of these on the same webpage. Installing Eclipse is very easy, as there is no installer to run. You only need to unpack the downloaded zip file, copy any additional plugins and features to the respective folders in the Eclipse path, and start Eclipse with the Eclipse.exe file. For more information about adding other features, refer to the Eclipse webpage.

A.2 Compiling the VTK source with CMake

With the C++ compiler and CMake installed, the VTK source unpacked and the Java SDK installed in our system, we can begin compiling the VTK source. We start by running the CMake executable. A window appears as shown in figure A.1. First, we must specify the path where the VTK source is placed (our unpacked directory). In my case it was: E:\vtk-5.0.2\VTK. Then we must create a folder for the VTK binaries. I named it VTKbin, so the path was: E:\vtk-5.0.2\VTKbin. Enable the Show Advanced Values option; it is necessary for the later configuration. Finally, we press the Configure button, praying for an execution without any error. Just joking: I am almost sure (if you do it on an NTFS partition) that there will be no errors. Before configuring, CMake will ask which C++ compiler you are going to use later to build the configuration. In this case I selected the Visual Studio 2005 option in the combo box. After you accept, the configuration starts and takes a while. Afterwards we see the variables and values found in the CMake cache, as depicted in figure A.2.

Fig. A.1: CMake window before configuration

Fig. A.2: CMake window before customization

Now we customize our build. In order to compile successfully you must enable (set to ON) the following options: VTK_WRAP_JAVA, BUILD_SHARED_LIBS, and VTK_USE_RENDERING. These options are almost always necessary. You can always reopen CMake later and enable more options if you need them. After enabling and disabling all the options you consider necessary, press the Configure button again, as many times as needed until none of the variables and values are shown in red. Then you can generate the selected build files by pressing the OK button. This causes CMake to write out the build files for the selected build and then exit. If any error appears at any moment, I recommend that you do not carry on with the next step of this tutorial, but repeat all the actions described here. It is easier to repeat them than to be disappointed afterwards (the next step takes considerably longer).

A.3 Building the configuration in the C++ compiler (Microsoft Visual Studio 2005)

Now we must build the entire configuration in the C++ compiler. First, locate the folder with the binaries created earlier and open it in Windows Explorer. In my case the path was: E:\vtk-5.0.2\VTKbin. There we look for the project called ALL_BUILD; in my case it was the file ALL_BUILD.vcproj. Double-click it to open the C++ environment chosen earlier (in my case Visual Studio 2005), as depicted in figure A.3. Single-click on the Solution VTK (see the screenshot) and then choose the option Build > Build Solution. Before that, we can define the Active Configuration of the build. I used the default one (Debug), but there are also the Release, MinSizeRel and RelWithDebInfo modes. The build is a long process. At the end we expect no errors, and afterwards all the libraries and executables are usually located in the VTK binary directory specified before (E:\vtk-5.0.2\VTKbin).

Fig. A.3: Visual Studio 2005 screenshot

Unfortunately, you may encounter some errors after the build. In my case there were problems with the build of the Java files: the compiler did not compile any of my *.java files. Although this looks bad, it is good news if these are the only errors, because they are very easy to repair. You only have to compile all the Java classes with any Java compiler. I did it with Eclipse, but you can even do it from the Windows command line. In the next section I describe how I compiled them in Eclipse and how to start the first Java-VTK application.

One more very important thing is to edit your PATH environment variable. You turned on the BUILD_SHARED_LIBS option, so you must let Windows know where to find the DLLs. I used the Debug configuration, so I added the following path to the PATH variable: E:\vtk-5.0.2\VTKbin\bin\debug. It is recommended that you add this entry at the beginning of the PATH definition. Moreover, now that the JDK and VTK are installed, you need to set your CLASSPATH environment variable to include the VTK classes. You must include the vtk\java directory. In my case the path was: E:\vtk-5.0.2\VTKbin\java\.
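A forgotten or misspelled PATH entry is an easy mistake to make at this point, so a small check can save time. The following is only a sketch: the class and method names (VtkPathCheck, indexOnPath) are made up for this illustration, and the directory is the one from my own setup, so replace it with yours. It reports at which position the DLL directory appears in PATH, where position 0 means the very beginning.

```java
import java.io.File;
import java.util.regex.Pattern;

final class VtkPathCheck {
    // Remove trailing slashes or backslashes so "dir\" and "dir" compare equal.
    static String normalize(String dir) {
        String d = dir.trim();
        while (d.length() > 1 && (d.endsWith("\\") || d.endsWith("/"))) {
            d = d.substring(0, d.length() - 1);
        }
        return d;
    }

    // Returns the 0-based position of dir among the PATH entries, or -1 if absent.
    // The comparison ignores case, as Windows file names do.
    static int indexOnPath(String pathValue, String dir, String separator) {
        if (pathValue == null) {
            return -1;
        }
        String wanted = normalize(dir);
        String[] entries = pathValue.split(Pattern.quote(separator));
        for (int i = 0; i < entries.length; i++) {
            if (normalize(entries[i]).equalsIgnoreCase(wanted)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumption: the Debug build directory used in this tutorial.
        String dllDir = "E:\\vtk-5.0.2\\VTKbin\\bin\\debug";
        int pos = indexOnPath(System.getenv("PATH"), dllDir, File.pathSeparator);
        System.out.println(pos < 0
                ? "VTK DLL directory is not on PATH"
                : "VTK DLL directory found at PATH position " + pos);
    }
}
```

Remember that a change to PATH made in the Windows system settings is visible only to programs started afterwards, so restart the command prompt or Eclipse before running the check.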

A.4 Configuration of the Java environment in Eclipse

You can build the Java classes in a very basic, manual way. First, you have to locate them. In my case I found them in: E:\vtk-5.0.2\VTKbin\java. Inside this folder there is another folder, vtk, which contains all the Java source files (*.java). You only need to compile them to obtain the *.class files. The vtk folder acts as the Java package. Thus, create a new project in Eclipse (File > New > Project > Java Project). The name of the project is not important (in my case it was New), as you are going to use it only to compile the classes. After creating the new, empty project in Eclipse, add the source folder src. Then copy (better not to cut) the whole vtk folder from the java folder (E:\vtk-5.0.2\VTKbin\java) to the newly created project in the Eclipse workspace (in my case the path was: D:\Program Files\eclipse-SDK-3.2\eclipse\workspace\New\src). Then go back to Eclipse, right-click on the New project and choose the Refresh option. Afterwards, all your classes will have been built to *.class files in the bin folder of your Eclipse project. Now copy all the *.class files to the vtk folder in the VTK binaries (E:\vtk-5.0.2\VTKbin\java\vtk). You can then create a *.zip archive by packing the vtk folder into it. It will contain both source and binary files. In my case I created the vtk.zip archive.

A compact version of the steps:

1. Unzip your vtk.zip package
2. Create an empty project in Eclipse (any name)
3. Add a source folder called src to this project in Eclipse
4. Copy the unpacked vtk folder to the src folder
5. Right-click on the project in Eclipse and press Refresh (Eclipse builds the files automatically to the folder bin/vtk)
6. There can be one error in this package (for the VTKJavaWrapped file), so delete that file
7. Copy all the files from bin/vtk to src/vtk, replacing all the files with the same name
8. Zip the vtk package from the /src/vtk path to the file vtk.zip
9. Include this package in any VTK project and... SUCCESS, it should work!

Now you can run your first Java-VTK application. Let's say you want to run the Cone.java example provided with VTK. You can find it in the VTK folder, which in my case was: E:\vtk-5.0.2\VTK\Examples\Tutorial\Step1\Java


Therefore, create a new project in Eclipse called Cone. Then add the source folder as before and copy Cone.java into it. Refresh the project in Eclipse. Now you need to include the vtk.zip archive containing all the VTK classes in the project. Right-click on the project, choose Properties, and then in Java Build Path press the Add External JARs button. Find your vtk.zip archive and accept. Now you can compile and run your first Java-VTK application without any problem.
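If the Cone example fails with a class-loading error, it may help to first confirm that the VTK wrapper classes are visible to the JVM at all. The sketch below uses only the standard Class.forName call; the helper name isOnClasspath is made up for this illustration. It looks for vtk.vtkConeSource, the class used by Cone.java, and does not touch the native DLLs, so it checks only the CLASSPATH or build path side of the setup.

```java
final class VtkClasspathCheck {
    // Returns true if the named class can be located by this class's loader.
    // Passing initialize=false only locates the class; static initializers are not run.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, VtkClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("vtk.vtkConeSource visible: " + isOnClasspath("vtk.vtkConeSource"));
    }
}
```

Run it from the same Eclipse project as Cone.java, so that it sees the same build path.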

A.5 Summary

You can find more information about installing VTK on UNIX on the webpage `Building VTK on Linux with Java Support': http://www.duke.edu/~iwd/howto/VTK-Linux-Java_HOWTO.html

Examples of Java with VTK can be found at [39]. In addition, there is a 3D data viewer based on VTK and Java called CASSANDRA: http://dev.artenum.com/projects/cassandra

I would be very grateful for any feedback and comments. Please do not hesitate to send me an e-mail ([email protected]). With your suggestions this manual can get better and better. Please also send me any useful links connected with VTK. Thanks in advance.


B. ARTICLE ABOUT MIAWARE SOFTWARE SUBMITTED TO BIOSTEC 2008 - INTERNATIONAL JOINT CONFERENCE ON BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES

MIAWARE SOFTWARE
3D Medical Image Analysis With Automated Reporting Engine and Ontology-based Search

Bartłomiej Wilkowski
Department of Microelectronics and Computer Science, Technical University of Łódź, al. Politechniki 11, Poland
[email protected]

Óscar Pereira, Paulo Dias
Institute of Electronics and Telematics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, Portugal
[email protected], [email protected]

Keywords: Computed axial tomography; Ontology; Radiological report; Image visualization

Abstract: This article presents the MIAWARE software (Medical Image Analysis With Automated Reporting Engine), which was designed and developed to assist doctors and radiologists. It makes it possible to analyze an image stack from a computed axial tomography scan of the lungs (thorax) and, at the same time, to mark all the pathologies on the images and report their characteristics. The reporting process is normalized: the radiologist cannot describe pathological changes in his own words, but can only use terms from a specific vocabulary set provided by the software. Consequently, a normalized radiological report is generated automatically. Furthermore, the MIAWARE software is accompanied by an intelligent search engine for medical reports, based on the relations between the parts of the lungs. The logical structure of the lungs is introduced into the search algorithm through a specially developed ontology. As a result, a deductive report search is obtained, which may be helpful for doctors when diagnosing patients' cases.

1 INTRODUCTION

The major objective of this article is to present the MIAWARE software, which enables doctors and radiologists to examine the state of a patient's lungs through close analysis of computed axial tomography images and, in parallel, to report on the patient's health state. Secondly, an intelligent search engine for medical reports is presented, together with its advantages over ordinary search schemes. A screenshot of the MIAWARE graphical user interface is shown in Figure 1. MIAWARE is written entirely in the Java programming language, together with some embedded wrappers for native code.

Nowadays, it is very common for each radiologist to analyze radiological images in his own, favourite manner. Some radiologists report all pathological changes encountered in the radiological images by speaking into a microphone and recording their voice. Afterwards, the recording is listened to and a medical text report is produced. Other radiologists write the reports themselves while performing the analysis. Such reporting schemes have some serious shortcomings, which may affect the accuracy of medical diagnoses. The main problem is that reports differ in structure from radiologist to radiologist. Every person has a different way of thinking and a different way of expressing things, remarks and observations. This means that giving the same medical data, the same patient's case, to many radiologists for analysis will almost surely produce many different reports with different layouts and various observations on the patient's health. As a result, a doctor may interpret each of these reports differently, which is certainly not desired.

This is why the main objective of the MIAWARE software is to generate medical reports in a normalized way. Such reports should contain only pure medical data, which describes the encountered pathologies in detail using a standardized layout that is always kept the same. The radiologist does not use his own words to report a pathology; instead, he fills in a provided reporting form by choosing suitable medical terms. Moreover, pathology reporting in MIAWARE is performed at the moment of image analysis, using an interactive graphical user interface. The radiologist can mark the location of the pathology on the image and associate the necessary description with this point. This allows him to stay concentrated on the images at all times. Finally, a normalized medical report on all pathological changes is generated by the software. How the normalized reporting process is performed with MIAWARE is described in detail in section 3.

Normalization of the reports significantly improves the possibilities for their further processing. One example is the search engine developed for the MIAWARE medical reports. An efficient search engine for medical reports can be very useful and may help the doctor while making diagnoses. The following sections describe in detail the architecture of the MIAWARE software and the functionality of its modules.

Figure 1: The graphical user interface of the MIAWARE software.

2 VISUALIZATION OF IMAGES

The visualization of the CAT scan stack images and the creation of the 3D model are performed using the Visualization Toolkit (VTK). VTK is written in C++, but it provides suitable wrapper classes for Java. Moreover, ImageJ classes are used to obtain the properties of the analyzed CAT image stack. The MIAWARE graphical user interface provides a 3D view of the radiological image stack. This view is generated using VTK wrapper classes, which create a special pipeline. After the image data is loaded into memory, a contour filter is applied to it, followed by proper mapping of the polygonal data to graphics primitives. Finally, a 3D actor is added to a special panel, which is in fact a rendering window for the three-dimensional scene.

The visualization in MIAWARE also consists of three 2D cross-sectional image views. They are generated by three widgets, present in the 3D scene, which are able to cut the model in three plane directions and provide 2D image data for the cross-sectional views. The radiologist can easily move each widget along its respective axis in order to cut the model. It should be mentioned that a CAT scan provides the radiologist with an image stack in the axial plane. The image data for the two other 2D planes and the 3D model are obtained and rendered by the software after proper initial processing of the stack data.

Finally, the panels that display the cross-sectional views of the model are enhanced with a very important feature. The radiologist is able to mark any pathology encountered during the analysis directly on the 2D view, by a simple left mouse button click on that location. The clicked point is automatically marked in all three cross-sectional views (as a yellow circle) and in the 3D scene (as a yellow sphere). Afterwards, the radiologist can attach precise information and a description of that physiological change to the marked point. How the pathology information is defined and added to the specified location is presented in section 3. It should also be remembered that all the pathologies defined by the radiologist can be saved to disk and retrieved during further analysis of the same CAT stack.

Step | Combobox title | Selected value
1. | Morphophysiological process | Neoplastic process
2. | Neoplastic process | Mass
3. | Location | Left lung
4. | Left lung location | Upper lobe
5. | Left lung upper lobe location | Lingula
6. | Left lung upper lobe lingula location | Superior segment

Table 1: Example pathology definition steps in MIAWARE.
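The selection chain of Table 1, in which each chosen term decides which combobox comes next, can be pictured as a simple lookup structure. The sketch below is only an illustration and not the MIAWARE implementation: the article does not give the layout of the vocabulary XML file, so the class VocabularyCascade, its methods and the in-memory map are assumptions. The terms come from Table 1, with "General process" and "Right lung" borrowed from the sample report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class VocabularyCascade {
    // combobox title -> (term -> title of the next combobox, or null at the end of the chain)
    private final Map<String, Map<String, String>> boxes = new LinkedHashMap<>();

    void add(String title, String term, String nextTitle) {
        boxes.computeIfAbsent(title, t -> new LinkedHashMap<>()).put(term, nextTitle);
    }

    // Terms offered by the combobox with the given title (empty if the title is unknown).
    List<String> termsFor(String title) {
        Map<String, String> terms = boxes.get(title);
        return terms == null ? new ArrayList<String>() : new ArrayList<String>(terms.keySet());
    }

    // Follows the selections from the first combobox and returns "title: term" lines.
    List<String> walk(String firstTitle, List<String> selections) {
        List<String> lines = new ArrayList<>();
        String title = firstTitle;
        for (String term : selections) {
            Map<String, String> terms = boxes.get(title);
            if (terms == null || !terms.containsKey(term)) {
                throw new IllegalArgumentException("'" + term + "' is not offered by '" + title + "'");
            }
            lines.add(title + ": " + term);
            title = terms.get(term);
        }
        return lines;
    }

    // Hypothetical excerpt of the vocabulary, just large enough to replay Table 1.
    static VocabularyCascade table1Example() {
        VocabularyCascade v = new VocabularyCascade();
        v.add("Morphophysiological process", "Neoplastic process", "Neoplastic process");
        v.add("Morphophysiological process", "General process", "General process");
        v.add("Neoplastic process", "Mass", "Location");
        v.add("Location", "Left lung", "Left lung location");
        v.add("Location", "Right lung", "Right lung location");
        v.add("Left lung location", "Upper lobe", "Left lung upper lobe location");
        v.add("Left lung upper lobe location", "Lingula", "Left lung upper lobe lingula location");
        v.add("Left lung upper lobe lingula location", "Superior segment", null);
        return v;
    }

    public static void main(String[] args) {
        List<String> steps = Arrays.asList(
                "Neoplastic process", "Mass", "Left lung", "Upper lobe", "Lingula", "Superior segment");
        for (String line : table1Example().walk("Morphophysiological process", steps)) {
            System.out.println(line);
        }
    }
}
```

Because a term is accepted only if the current combobox offers it, free-text descriptions cannot enter the chain, which is the property the normalized reports rely on.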

3 PATHOLOGY REPORTING

As mentioned in the introduction, the reports generated with the MIAWARE software are normalized. This is achieved thanks to a special pathology reporting form implemented in the software. As described in the previous section, the radiologist is able to mark any location on the 2D image in order to define and describe the encountered pathology. This information is added through a combobox-based form, which provides the radiologist with the medical terms necessary for an effective specification of the pathology's name, type and location. The most innovative aspect is that, unlike in current practice, the radiologist cannot describe the findings in his own words, but can use only the specific medical vocabulary provided by the application. Consequently, the MIAWARE software is able to create normalized medical reports from the information about all pathologies entered earlier by the radiologist.

The vocabulary was selected and arranged after consultation with doctor Miguel Castro, who works in the hospital in Beja (Centro Hospitalar do Baixo Alentejo - Hospital José Joaquim Fernandes de Beja), and with the help of the RadLex (A Lexicon for Uniform Indexing and Retrieval of Radiology Information Resources) term browser, which can be found on the Radiological Society of North America web page (RSNA.org, 2007). The RadLex term browser was created in order to unify the radiological vocabulary used during image analysis and reporting procedures.

The entire vocabulary is kept in an XML file together with a declaration of the vocabulary (set of medical terms) for each of the comboboxes presented to the radiologist during pathology definition. The vocabulary set presented in any subsequent combobox depends on the radiologist's previous choice. For example, if the radiologist has defined that the pathology is located in the left lung, the next combobox will offer him all the subparts (lobes) of the left lung. Example pathology definition steps in MIAWARE are presented in Table 1.

When the analysis of the CAT stack is finished, the radiologist can generate a final medical report on all the pathologies already defined. This is done by pressing the Generate reports button. This action produces reports in two formats: plain text (TXT) and RDF, a machine-readable format. The first one can be verified and analyzed later by the doctor in order to make a diagnosis. It is easy to see that the generated text report has a defined structure and that its layout differs significantly from that of reports as currently written. The format of medical reports still requires some discussion of its layout and of how it should be created. The MIAWARE text report format is only a suggestion, intended for further improvement and development. A short fragment of a sample MIAWARE text report is presented here:

**** MIAWARE REPORT *******
Generation date: Jun/27/2007
*****************************
Control Point no. 1 : (x,y,z) = (178,282,52)
Specifications:
Morphophysiological process: General process
General process: Peribronchial condensation
Location: Right lung
Right lung location: Lower lobe
Right lung lower lobe location: ->Lateral basal segment
******************************
Control Point no. 2 : (x,y,z) = (172,220,47)
Specifications:
Morphophysiological process: General process
General process: Post-therapeutic alteration
Location: Right lung
Right lung location: Middle lobe
Right lung middle lobe location: ->Medial segment
...
****** END OF REPORT ******

The second type of report, the one in RDF format, is created for further processing of its content (report searching). It is described in detail in section 4.

4 MEDICAL REPORT SEARCHING

This section describes the structure of the RDF medical reports and the MIAWARE search engine, together with an ontology for lungs developed specially for this purpose.

4.1 RDF reports

As already mentioned, the RDF format for medical reports is required for further information processing and searching. The RDF model describes resources by means of statements, and its data model consists of three components: resources, properties and statements (called triples). Resources are items of any data type, which can obtain a value definition (statement) through some given relation (property). Any statement can itself consist of a new resource-property-statement triple. "Just as an English sentence usually comprises a subject, a verb and objects, RDF statements consist of subjects, properties and objects" (Gomez-Perez et al., 2004).

Table 1 represents one example of a pathology definition. The final medical report will usually contain more such definitions, grouped in some specific way. The data gathered in Table 1 can be represented as a normal, lexical group of sentences describing any pathology found. For example:

'A morphophysiological process was found. It is in the form of a neoplastic process of the type mass. It is located in the left lung, in its upper lobe, exactly in the superior segment of the lingula.'

Such a group of sentences can be represented in the resource-property-statement model, which is used in the MIAWARE medical RDF reports. In this case, the first underlined term (the morphophysiological process) is a resource and the rest is a statement. As this statement itself consists of a group of resources, it has to be analyzed further: the first resource of the previous statement becomes a resource and the rest of the group forms another statement. Such an embedded resource-property-statement structure is created through RDF reified statements. The properties (which connect resources with statements) in the above example are: in the form of, of the type, etc.

The RDF reports generated by the MIAWARE software keep the pathology information in the manner presented above. It should only be mentioned that in our reports the role of properties is played by the titles of the subsequent comboboxes. These names are taken from the XML file used by the pathology definition form, described in section 3.

4.2 Ontology-based report searching

It must be understood that radiological examinations are carried out quite often in places such as hospitals, private and public surgeries and other medical institutions. As a result, a great number of medical reports is produced in a relatively short time. Such reports should be kept and gathered together for future use as references to previously encountered and defined pathologies or diseases. Manually searching a great number of documents is really time-consuming. The automated generation of reports in a specific machine-readable file format (e.g. RDF) is a milestone in the development of state-of-the-art computer-based searching. Consequently, an intelligent search engine for medical reports can significantly speed up the disease recognition process, since, for given criteria, it immediately returns sets of references to archived reports describing similar pathological symptoms in other patients, together with the resulting diagnoses and applied treatments.

The search engine for medical reports developed together with the MIAWARE software is able to find all reports in which a specified pathology is defined in a given lung part (specified as a search criterion) or in any of its subparts. This adds some intelligence to the searching process, which is explained with a simple example. Let's suppose that a doctor wants to find all reports with a definition of a tumour (first search criterion) that was found in the left lung. Let's take a report with two pathologies defined:

• Polypus in Right lung

• Tumour in Lung lingula

An ordinary lexical search will respond that this report does not match the search criteria, as the first pathology is not a tumour and the second pathology, which is a tumour, is not located in the left lung. In contrast, the MIAWARE search engine will accept this report as matching the given criteria, because it can deduce that Lung lingula is a subpart of the left lung. Such a logical deduction is performed by our search engine thanks to the lungs ontology, which defines and provides the part-whole relations between the elements of the lungs. This ontology was developed using Jena and Protégé software. A sample visualization of the taxonomy of the classes taken from our lungs ontology is presented in Figure 2.

Figure 2: Lungs ontology - hierarchy of classes (created with OWLViz (Drummond, 2007)).

During the ontology development, a set of articles (Mejino Jr et al., 2003) (Donnelly et al., 2006) (Guarino and Welty, 2000) (Guarino and Welty, 2004) (Guarino et al., 2000) (Guarino and Welty, 2002) (Michael et al., 2001) (Knublauch, 2004) and books (Gomez-Perez et al., 2004) (Horridge, 2004) on ontological engineering and medical ontology creation was used as a reference. Moreover, the information about the anatomical structure of the lungs was taken from the Anatomy and Physiology book (Seeley et al., 2005).

The search algorithm takes as criteria the name of the pathology and its location in the lungs. Next, it deduces from the ontology all the subparts of the given lung location and compares every single pathology description taken from any medical report (in RDF format) with the search criteria. If at least one pathology definition agrees with the criteria, it displays the respective report as a result. Consequently, the doctor can view and read such a report very easily. We suppose that such filtering of radiological reports may improve the doctor's diagnosis and speed up his decisions.

5 CONCLUSIONS

The presented software is only a prototype, which cannot yet be applied in real life. One of the reasons is that the vocabulary used during pathology reporting is not sufficient and requires significant expansion and redefinition. However, this software can be considered a strong foundation for future development towards a final market product. The ideas presented herein are considered a potential improvement for image-based medicine and the course of radiological analysis.

The MIAWARE software enables radiologists to analyze the CAT stack images and report pathologies simultaneously, without looking away from the monitor. Consequently, the radiologist can stay concentrated on the examined images all the time. Moreover, pathologies can be marked on the images and given the necessary characteristics. Furthermore, the radiological reports generated with the MIAWARE software are always normalized, keeping an identical structure and layout regardless of the person who performs the analysis. Such normalization may help doctors to understand the reports better, and it makes room for further report processing and searching.

The intelligent search engine allows rapid filtering of medical reports according to the pathologies defined in them. Provided with knowledge about the part relationships in the lungs, the MIAWARE search engine is able to deduce the internal elements of the specified lung part and to search reports for pathologies not only in the given lung location, but also in its subparts. This can be described as a logical search for pathologies in the medical reports.

All the features presented by the MIAWARE software lead to the assumption that their implementation in real life may result in more efficient medical diagnosis and a faster disease recognition process.
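The subpart deduction of section 4.2 can be illustrated with a small, self-contained sketch. It is not the MIAWARE implementation, which works on RDF reports and on an ontology developed with Jena and Protégé. Here the part-whole relation is a plain map, and the class and method names (LungReportSearch, subpartsOf, matches) as well as the tiny part hierarchy are assumptions made only for illustration. The data replays the example from section 4.2: a tumour in the Lung lingula is found when searching for tumours in the left lung.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LungReportSearch {
    // part -> the whole it directly belongs to (illustrative excerpt, not the real ontology)
    private final Map<String, String> partOf = new HashMap<>();

    void addPart(String part, String whole) {
        partOf.put(part, whole);
    }

    // The location itself plus every part that reaches it by following part-of links.
    Set<String> subpartsOf(String location) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(location);
        while (!todo.isEmpty()) {
            String current = todo.remove();
            if (result.add(current)) {
                for (Map.Entry<String, String> e : partOf.entrySet()) {
                    if (e.getValue().equals(current)) {
                        todo.add(e.getKey());
                    }
                }
            }
        }
        return result;
    }

    // A report is a list of {pathology, location} pairs; one matching pair is enough.
    boolean matches(List<String[]> report, String pathology, String location) {
        Set<String> accepted = subpartsOf(location);
        for (String[] finding : report) {
            if (finding[0].equalsIgnoreCase(pathology) && accepted.contains(finding[1])) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical fragment of the lung hierarchy used in the example.
    static LungReportSearch example() {
        LungReportSearch s = new LungReportSearch();
        s.addPart("Left lung upper lobe", "Left lung");
        s.addPart("Lung lingula", "Left lung upper lobe");
        s.addPart("Right lung lower lobe", "Right lung");
        return s;
    }

    public static void main(String[] args) {
        List<String[]> report = Arrays.asList(
                new String[] {"Polypus", "Right lung"},
                new String[] {"Tumour", "Lung lingula"});
        LungReportSearch search = example();
        System.out.println(search.matches(report, "Tumour", "Left lung"));   // true
        System.out.println(search.matches(report, "Tumour", "Right lung"));  // false
    }
}
```

A purely lexical comparison of "Left lung" with "Lung lingula" would reject the report. The sketch accepts it only because the set of accepted locations is expanded through the part-of links before the comparison.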

REFERENCES

Donnelly, M., Bittner, T., and Rosse, C. (2006). A formal theory for spatial representation and reasoning in biomedical ontologies.
Drummond, N. (2007). Owlviz - ontology visualization tool webpage.
Gomez-Perez, A., Corcho, O., and Fernandez-Lopez, M. (2004). Ontological Engineering: with examples from the areas of Knowledge Management, e-Commerce and the Semantic Web. First Edition (Advanced Information and Knowledge Processing). Springer.
Guarino, N. and Welty, C. A. (2000). A formal ontology of properties.
Guarino, N. and Welty, C. A. (2002). Evaluating ontological decisions with ONTOCLEAN.
Guarino, N. and Welty, C. A. (2004). An overview of ontoclean.
Guarino, N., Welty, C. A., and Partridge, C. (2000). Towards a methodology for ontology based model engineering.
Horridge, M. (2004). A Practical Guide To Building OWL Ontologies With The Protege-OWL Plugin. University of Manchester, 1 edition.
Knublauch, O. H. (2004). Weaving the biomedical semantic web with the protege owl plugin.
Mejino Jr, J. L., Agoncillo, A. V., Rickard, K. L., and Rosse, C. (2003). Representing complexity in part-whole relationships within the foundational model of anatomy.
Michael, J., Mejino Jr, J. L., and Rosse, C. (2001). The role of definitions in biomedical concept representation.
RSNA.org (2007). Rsna - radlex term browser webpage.
Seeley, R. R., Stephens, T. D., and Tate, P. (2005). Anatomy and Physiology. McGraw-Hill Higher Education.
