Enabling parallel computing of a brain connectivity map using the MediGRID-Infrastructure and FSL
Romanus Grütz¹ ([email protected]), Niels K Focke², Andreas Hoheisel³, Dagmar Krefting⁴, Benjamin Löhnhardt⁵, Fred Viezens⁶, Frank Dickmann¹
¹ Department of Medical Informatics, UMG
² Department of Clinical Neurophysiology, UMG
³ Professional Services, inubit AG
⁴ University of Applied Science Berlin
⁵ Department of Information Technology, UMG
⁶ Study Deanery, UMG
Agenda
• Background and previous approaches
• Grid Environment
• Implementation
  • Generic Workflow Execution Service
  • Wrapper Scripts
  • Workflows
• Test Environment & Performance
• Conclusion & Outlook
PDP 2012 - Enabling parallel computing of a brain connectivity map using the MediGRID-Infrastructure and FSL
Medical Background
Epilepsy
• is a disorder of the brain characterized by an enduring predisposition to generate epileptic seizures […]
Epilepsy Research
• attempts to detect patterns of brain activity and structure
• scans the brains of patients using MRI/DTI technology
Solution: Diffusion Tensor Imaging (DTI) ¹
• measures the diffusion rates and preferred directions of water molecules
• based on MRI (non-invasive tissue scan)
¹ http://news.byu.edu/releases/archive08/Mar/concussion/2008WildeetalproofNeurologymildTBI%282%29.jpg
Problem with DTI
• The cellular structure of the brain is much smaller than the resolution of the scanners
• Regions / voxels can contain many crossing fibers
• Areas without water flow and areas with water flow in many directions look similar (diffusion is isotropic in both cases)
No fiber(s): Isotropic diffusion
One fiber direction: Anisotropic diffusion
Many fiber directions: Isotropic diffusion
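
The ambiguity listed above can be illustrated with fractional anisotropy (FA), a standard scalar measure computed from the three eigenvalues of a diffusion tensor. The following is a minimal sketch, not part of the presented pipeline; the eigenvalues are hypothetical and chosen only for illustration, and the "many fibers" case is idealized as a tensor whose directions average out.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Fractional anisotropy (FA) from the three diffusion tensor eigenvalues."""
    mean = (l1 + l2 + l3) / 3.0
    num = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    if den == 0.0:
        return 0.0  # no measurable diffusion at all
    return math.sqrt(1.5 * num / den)


# Hypothetical eigenvalues (arbitrary units), for illustration only.
no_fiber = fractional_anisotropy(1.0, 1.0, 1.0)     # unrestricted, isotropic
one_fiber = fractional_anisotropy(1.7, 0.3, 0.3)    # one dominant direction
many_fibers = fractional_anisotropy(0.8, 0.8, 0.8)  # crossing fibers average out
```

In this sketch both `no_fiber` and `many_fibers` come out close to 0 while `one_fiber` is clearly larger, so a single tensor per voxel cannot tell the first and third case apart.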
Connectivity Map and the previous processing approach
Connectivity Map
• depicts the degree of connectivity of each voxel within an MRI/DTI scan
• degree of connectivity: sum of the probabilities to be connected to another voxel
• based on Fiber Tracking (which is based on DTI): in-vivo tracking of (neural) fibers in tissues, e.g. the brain
• usually applied in the neuroimaging sciences
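
The definition of the degree of connectivity can be written down directly. This is a minimal sketch assuming the pairwise connection probabilities are already available as a square matrix; the matrix `prob` and the function name are hypothetical, and a real scan has far too many voxels to hold such a matrix in this naive form.

```python
def connectivity_degree(prob):
    """Per voxel, sum the probabilities of being connected to every other voxel."""
    n = len(prob)
    return [sum(prob[i][j] for j in range(n) if j != i) for i in range(n)]


# Hypothetical 3-voxel example: prob[i][j] is the probability that
# voxel i is connected to voxel j.
prob = [
    [0.0, 0.5, 0.25],
    [0.5, 0.0, 0.0],
    [0.25, 0.0, 0.0],
]
degrees = connectivity_degree(prob)
```

The connectivity map then stores one such value per voxel.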
Previous processing approach
• Hardware: 4 standard personal computers (PC)
• Software: Linux, SGE, FSL, self-made web frontend and scripts
FMRIB Software Library (FSL)
• Library of analysis tools for FMRI, MRI and DTI brain imaging data
• Mainly developed by the University of Oxford; free for non-commercial use
FMRIB’s Diffusion Toolbox (FDT)
• Bedpostx: estimates diffusion parameters
• Probtrackx: generates connectivity distributions
FSLUTILS
• Fslmaths: enables simple mathematical operations on images
• Fslslice: slices a 3D image into 2D images
• Fslmerge: combines multiple 2D images into one 3D image
Generic Workflow Execution Service (GWES)
• Developed by Fraunhofer FIRST (not free)
• Based on the Generic Workflow Description Language (GWorkflowDL)
  • based on High-Level Petri Nets (HLPN)
  • represented in XML
• Capable of performing basic instructions (+, -, …)
Wolfgang Reisig, Elements of Distributed Algorithms, Springer, Berlin, 1998
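
The Petri net idea behind the workflow description can be sketched with a plain place/transition net. This is only an illustration of the token game: it uses tokens without data (so it is not a High-Level Petri Net), it is not GWorkflowDL syntax, and all place and transition names are hypothetical.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds at least one token."""
    return all(marking.get(p, 0) >= 1 for p in transition["inputs"])


def fire(marking, transition):
    """Consume one token per input place and produce one per output place."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new = dict(marking)
    for p in transition["inputs"]:
        new[p] -= 1
    for p in transition["outputs"]:
        new[p] = new.get(p, 0) + 1
    return new


# Hypothetical two-step net: raw -> preprocess -> preprocessed -> track -> result
preprocess = {"inputs": ["raw"], "outputs": ["preprocessed"]}
track = {"inputs": ["preprocessed"], "outputs": ["result"]}

marking = {"raw": 1}  # one data set is waiting to be processed
```

Firing `preprocess` moves the token from `raw` to `preprocessed`, and only then does `track` become enabled. A workflow engine exploits the same rule: every enabled transition is an activity that may be started.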
Generic Workflow Execution Service (GWES)
http://www.gridworkflow.org/kwfgrid/gwes-web/images/workflow-refinement_0640x0480.png
Wrapper – Scripts
• Connection between application and workflow engine
• Usually a shell script that translates the parameters from the syntax of the GWorkflowDL to the syntax of the application
Common structure
1. Header
2. Parsing and checking parameters
3. Special preprocessing
4. Application call with parameters
5. Special postprocessing / cleanup
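
The five-part structure above can be sketched as follows. The slides describe the wrappers as shell scripts; Python is used here only for illustration. The application name `my_application`, the `--input`/`--output` parameters and the `execute` hook are hypothetical and do not reflect the actual MediGRID wrappers.

```python
import argparse
import os
import shutil
import tempfile

# 1. Header: the wrapped application (hypothetical placeholder name).
APPLICATION = "my_application"


def parse_parameters(argv):
    """2. Parse and check the parameters handed over by the workflow engine."""
    parser = argparse.ArgumentParser(description="Wrapper script sketch")
    parser.add_argument("--input", required=True, help="existing input file")
    parser.add_argument("--output", required=True, help="output file name")
    args = parser.parse_args(argv)
    if not os.path.isfile(args.input):
        parser.error("input file not found: " + args.input)
    return args


def run_wrapper(argv, execute=None):
    """Run steps 2 to 5 and return the command line that was built."""
    args = parse_parameters(argv)
    # 3. Special preprocessing: here only a scratch directory is created.
    workdir = tempfile.mkdtemp(prefix="wrapper_")
    try:
        # 4. Application call, expressed in the application's own syntax.
        command = [APPLICATION, args.input, args.output]
        if execute is not None:
            execute(command, workdir)
        return command
    finally:
        # 5. Special postprocessing / cleanup.
        shutil.rmtree(workdir, ignore_errors=True)
```

Without an `execute` callable the sketch only builds the command, which makes it easy to inspect. To actually start a program, one could pass for example `lambda cmd, cwd: subprocess.run(cmd, cwd=cwd, check=True)`.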
Workflow – Overview ²
1. Bedpostx preprocessing (BPx): FDT: Bedpostx; FSLUTILS: fslslice, fslmerge
2. Processing bundle generation (GenCoords)
3. Probtrackx processing (PTx): FDT: Probtrackx
4. Consolidation I + II: FSLUTILS: fslmaths
5. Connectivity Map transfer (PTxTrans)
² Romanus Grütz, Nutzung der MediGRID-Infrastruktur für die Berechnung von Connectivity-Maps des Gehirns (Using the MediGRID infrastructure to compute connectivity maps of the brain)
Workflow – Processing Bundle Generation
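
Splitting the scan into processing bundles can be sketched as follows, using the figures reported for the performance test (128x128x60 voxels, bundles of 10x10x5). This is a minimal illustration under the assumption that bundles are axis-aligned blocks covering the whole volume; the actual GenCoords step may differ, for instance by restricting bundles to brain voxels. The function names are hypothetical.

```python
from itertools import product


def generate_bundles(volume_shape, bundle_shape):
    """Split a voxel grid into axis-aligned blocks; edge blocks may be smaller."""
    ranges = [range(0, size, step)
              for size, step in zip(volume_shape, bundle_shape)]
    bundles = []
    for origin in product(*ranges):
        extent = tuple(min(step, size - start)
                       for start, step, size
                       in zip(origin, bundle_shape, volume_shape))
        bundles.append((origin, extent))
    return bundles


def bundle_coordinates(origin, extent):
    """List the voxel coordinates (x, y, z) contained in one bundle."""
    return [tuple(o + d for o, d in zip(origin, offset))
            for offset in product(*(range(e) for e in extent))]


bundles = generate_bundles((128, 128, 60), (10, 10, 5))
first_origin, first_extent = bundles[0]
seeds = bundle_coordinates(first_origin, first_extent)
```

Under these assumptions a full bundle holds 10 x 10 x 5 = 500 seed voxels, which matches the "up to 500" fiber tracking processes per job; bundles at the volume border hold fewer because 128 is not a multiple of 10.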
Workflow – Probtrackx Processing
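
Because each processing bundle can be tracked independently, the Probtrackx step lends itself to a fan-out pattern. The sketch below illustrates only that pattern with a local thread pool: in the presented setup jobs are dispatched to grid resources by the workflow engine, and `track_bundle` is a hypothetical stand-in that does not call Probtrackx.

```python
from concurrent.futures import ThreadPoolExecutor


def track_bundle(bundle_id):
    """Stand-in for one Probtrackx job; returns a hypothetical result file name."""
    return "partial_map_{:04d}.nii.gz".format(bundle_id)


def run_all(bundle_ids, max_workers=4):
    """Process independent bundles concurrently and keep results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(track_bundle, bundle_ids))


partial_maps = run_all(range(8))
```

`pool.map` returns the results in the order of the inputs, so each partial result can still be matched to its bundle afterwards.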
Workflow – Consolidation I + II
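
The slides name fslmaths as the tool for consolidation but do not spell out the two stages. The sketch below assumes that stage I adds up groups of partial maps and stage II adds up the group results; this is an assumption, as are all file names. It only builds command lines using the `fslmaths <input> -add <image> ... <output>` form and does not run FSL.

```python
def fslmaths_sum_command(inputs, output):
    """Build an fslmaths call that adds all input images voxel-wise."""
    if not inputs:
        raise ValueError("at least one input image is required")
    command = ["fslmaths", inputs[0]]
    for image in inputs[1:]:
        command += ["-add", image]
    return command + [output]


def two_stage_plan(partial_maps, group_size):
    """Stage I sums groups of partial maps; stage II sums the group results."""
    stage_one = []
    for start in range(0, len(partial_maps), group_size):
        group = partial_maps[start:start + group_size]
        out = "stage1_{:03d}".format(start // group_size)  # hypothetical name
        stage_one.append(fslmaths_sum_command(group, out))
    intermediate = [command[-1] for command in stage_one]
    stage_two = fslmaths_sum_command(intermediate, "connectivity_map")
    return stage_one, stage_two


stage_one, stage_two = two_stage_plan(["a", "b", "c"], group_size=2)
```

Summing in groups keeps each command line short and lets the stage I commands run in parallel before a single stage II call.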
ConBrain
• Easy access for the users
• Based on Java Web Start and many Java libraries licensed under a non-commercial free-to-use agreement
• Uses the Grid Security Infrastructure (GSI)
select data set
upload to a MediGRID resource
adjust the application and workflow parameters
generate and execute workflow
monitor workflow(s)
Test Environment
Local cluster

Property      Worker Node       Fat Node    Head Node         Storage Server
#machines     8                 1           1                 1
RAM           16 GB             128 GB      16 GB             8 GB
CPU           Intel Xeon 5420   AMD™ 8358   Intel Xeon 5450   Intel Xeon 5430
#CPUs         2                 8           2                 2
#cores/CPU    4                 4           4                 4
CPU speed     2.5 GHz           2.4 GHz     3.0 GHz           2.66 GHz
Performance
• Resolution of the DTI dataset: 128x128x60
• Dimension of each processing bundle: 10x10x5 = up to 500 fiber tracking processes per probtrackx job
• Total time: 27h 1min 12sec
• Total time spent in queues: 4h 38min 22sec
• Crunching Factor: 14.6
Conclusion & Outlook
• Migration of connectivity map processing from a small cluster to an inter-institutional grid implementation
• Achieved by utilizing the GWES and developing a workflow and multiple wrapper scripts
• Adaptation to a new workflow engine (e.g. via SHIWA)
• Testing the stability and performance
• Streamlining / optimizing the workflow(s) and wrapper scripts
http://www.d-grid-ggmbh.de/index.php?id=10
Contacts
University Medical Center Göttingen Department of Medical Informatics http://www.mi.med.uni-goettingen.de/
Romanus Grütz
[email protected]
Tel.: +49 551 39 - 6981
http://www.labimi-f.de
http://www.wissgrid.de