Similarity Search in Medical Data

The ongoing automation in our modern information society leads to a tremendous rise in both the amount and the complexity of collected data. In medical imaging, for example, the electronic availability of extensive data collected as part of clinical trials provides remarkable potential to detect new relevant features in complex diseases such as brain tumors. Using data mining applications to analyze such data raises several problems. One problem is the localization of outstanding observations, also called outliers, in a data set. This work proposes a technique for parameter-free outlier detection that is based on data compression and on a general data model combining the Generalized Normal Distribution (GND) with independent components; it is designed to cope with existing problems such as parameter settings and implicit assumptions about the data distribution. Another problem in many modern applications, among them medical imaging, is efficient similarity search in uncertain data. At present, adequate therapy planning for newly detected brain tumors of presumably glial origin requires an invasive biopsy, because prognosis and treatment both vary strongly for benign, low-grade, and high-grade tumors. To date, the differentiation of tumor grades is mainly based on the expertise of neuroradiologists examining contrast-enhanced Magnetic Resonance Images (MRI). To assist neuroradiologists in differentiating between tumors of different malignancy, we propose a novel, efficient similarity search technique for uncertain data. The feature vector of an object is not known exactly but is instead described by a Probability Density Function (PDF) such as a Gaussian Mixture Model (GMM). Previous work is limited to axis-parallel Gaussian distributions; hence, correlations between different features are not considered in these similarity searches.
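The limitation of axis-parallel Gaussians can be illustrated with a minimal sketch, assuming Python with NumPy. The function names `gaussian_pdf` and `gmm_pdf` and the toy covariance values are illustrative assumptions and are not taken from the dissertation. The sketch evaluates the same one-component mixture once with a full covariance matrix and once with its axis-parallel (diagonal) counterpart.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian with a full covariance matrix."""
    x, mean, cov = (np.asarray(a, dtype=float) for a in (x, mean, cov))
    d = mean.size
    diff = x - mean
    # Squared Mahalanobis distance; solve() avoids forming the explicit inverse.
    maha = diff @ np.linalg.solve(cov, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * maha) / norm)

def gmm_pdf(x, weights, means, covs):
    """Density of a Gaussian Mixture Model: weighted sum of component densities."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs))

# Hypothetical uncertain object with one component (toy values).
weights, means = [1.0], [[0.0, 0.0]]
full_cov = [[1.0, 0.9], [0.9, 1.0]]   # strongly correlated features
diag_cov = [[1.0, 0.0], [0.0, 1.0]]   # axis-parallel: same variances, correlation dropped

full_pos = gmm_pdf([1.0, 1.0], weights, means, [full_cov])
full_neg = gmm_pdf([1.0, -1.0], weights, means, [full_cov])
diag_pos = gmm_pdf([1.0, 1.0], weights, means, [diag_cov])
diag_neg = gmm_pdf([1.0, -1.0], weights, means, [diag_cov])
```

With the full covariance matrix, the point along the correlation direction receives a much higher density than the point across it. The axis-parallel model assigns both points the same density, so it cannot tell them apart.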
In this work, a novel, efficient similarity search technique for general GMMs without an independence assumption is presented. The actual components of a GMM are approximated in a conservative but tight way. The conservativity of the approach leads to a filter-refinement architecture that guarantees no false dismissals, and the tightness of the approximations yields good filter selectivity. An extensive experimental evaluation demonstrates a considerable speed-up of similarity queries on general GMMs. Additionally, applying the approach to four-dimensional Magnetic Resonance Images of glioma patients yielded promising results for improving the differentiation between brain tumors of different grades.
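The filter-refinement idea can be sketched as follows, assuming Python with NumPy. This is not the approximation developed in the dissertation, which the abstract does not specify. As a stand-in, each component is bounded from above by a spherical Gaussian whose variance is the largest eigenvalue of the covariance matrix, which is conservative because the squared Mahalanobis distance is at least the squared Euclidean distance divided by that eigenvalue. The class `Component`, the function `density_threshold_query`, the threshold `tau`, and the toy database are all hypothetical.

```python
import numpy as np

class Component:
    """One weighted Gaussian component with precomputed filter and refinement data."""
    def __init__(self, weight, mean, cov):
        self.weight = float(weight)
        self.mean = np.asarray(mean, dtype=float)
        cov = np.asarray(cov, dtype=float)
        d = self.mean.size
        self.precision = np.linalg.inv(cov)
        self.norm = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
        # eigvalsh returns eigenvalues in ascending order; the last is the largest.
        self.lambda_max = float(np.linalg.eigvalsh(cov)[-1])

    def exact(self, q):
        """Exact weighted density (refinement step, quadratic in the dimension)."""
        diff = q - self.mean
        return self.weight * self.norm * np.exp(-0.5 * diff @ self.precision @ diff)

    def upper_bound(self, q):
        """Conservative bound (filter step, linear in the dimension)."""
        diff = q - self.mean
        return self.weight * self.norm * np.exp(-0.5 * (diff @ diff) / self.lambda_max)

def density_threshold_query(q, database, tau):
    """Return all objects whose GMM density at q is at least tau."""
    q = np.asarray(q, dtype=float)
    results, refined = [], 0
    for name, components in database.items():
        # Filter: the bound never underestimates, so pruning causes no false dismissal.
        if sum(c.upper_bound(q) for c in components) < tau:
            continue
        refined += 1
        # Refinement: evaluate the exact density only for surviving candidates.
        if sum(c.exact(q) for c in components) >= tau:
            results.append(name)
    return results, refined

# Hypothetical toy database of uncertain objects.
database = {
    "A": [Component(1.0, [0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]])],
    "B": [Component(1.0, [5.0, 5.0], [[1.0, -0.5], [-0.5, 2.0]])],
    "C": [Component(0.5, [0.5, 0.5], [[0.5, 0.0], [0.0, 0.5]]),
          Component(0.5, [10.0, 0.0], [[2.0, 0.0], [0.0, 1.0]])],
}
query = [0.2, 0.1]
results, refined = density_threshold_query(query, database, tau=0.01)
```

The counter `refined` records how many objects passed the filter. The tighter the bound, the fewer exact evaluations remain, which is the selectivity effect described in the abstract.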

Glioma Grading, Data Mining, Similarity Search, Uncertain Data, Gaussian Mixture Model, Clustering

Haegler, Katrin

08. Nov. 2011

2011

English

Universitätsbibliothek der Ludwig-Maximilians-Universität München

https://nbn-resolving.org/urn:nbn:de:bvb:19-136645

Haegler, Katrin (2011): Similarity Search in Medical Data. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik

PDF: Haegler_Katrin.pdf (14MB)

DOI: 10.5282/edoc.13664

URN: urn:nbn:de:bvb:19-136645

Document type:	Dissertations (Dissertation, LMU München)
Keywords:	Glioma Grading, Data Mining, Similarity Search, Uncertain Data, Gaussian Mixture Model, Clustering
Subject areas:	000 General, computer science, information science > 004 Computer science; 000 General, computer science, information science
Faculties:	Fakultät für Mathematik, Informatik und Statistik
Language of the thesis:	English
Date of oral examination:	8 November 2011
First referee:	Böhm, Christian
MD5 checksum of the PDF file:	60ca19a7fffd2f14632e025ac5cc0698
Shelf mark of the printed edition:	0001/UMC 19916
ID Code:	13664
Deposited on:	14 Dec 2011 09:59
Last modified:	24 Oct 2020 03:21
