Efficient and Effective Similarity Search on Complex Objects

www.lmu.de | UB | Blättern | FAQ

Zur erweiterten Suche

English

Zur erweiterten Suche

Due to the rapid development of computer technology and new methods for the extraction of data in the last few years, more and more applications of databases have emerged, for which an efficient and effective similarity search is of great importance. Application areas of similarity search include multimedia, computer aided engineering, marketing, image processing and many more. Special interest adheres to the task of finding similar objects in large amounts of data having complex representations. For example, set-valued objects as well as tree or graph structured objects are among these complex object representations. The grouping of similar objects, the so-called clustering, is a fundamental analysis technique, which allows to search through extensive data sets. The goal of this dissertation is to develop new efficient and effective methods for similarity search in large quantities of complex objects. Furthermore, the efficiency of existing density-based clustering algorithms is to be improved when applied to complex objects. The first part of this work motivates the use of vector sets for similarity modeling. For this purpose, a metric distance function is defined, which is suitable for various application ranges, but time-consuming to compute. Therefore, a filter refinement technology is suggested to efficiently process range queries and k-nearest neighbor queries, two basic query types within the field of similarity search. Several filter distances are presented, which approximate the exact object distance and can be computed efficiently. Moreover, a multi-step query processing approach is described, which can be directly integrated into the well-known density-based clustering algorithms DBSCAN and OPTICS. In the second part of this work, new application ranges for density-based hierarchical clustering using OPTICS are discussed. A prototype is introduced, which has been developed for these new application areas and is based on the aforementioned similarity models and accelerated clustering algorithms for complex objects. This prototype facilitates interactive semi-automatic cluster analysis and allows visual search for similar objects in multimedia databases. Another prototype extends these concepts and enables the user to analyze multi-represented and multi-instance data. Finally, the problem of music genre classification is addressed as another application supporting multi-represented and multi-instance data objects. An extensive experimental evaluation examines efficiency and effectiveness of the presented techniques using real-world data and points out advantages in comparison to conventional approaches.

data analysis, similarity search, clustering, complex objects

Brecheisen, Stefan

22. Feb. 2007

2007

Englisch

Universitätsbibliothek der Ludwig-Maximilians-Universität München

https://nbn-resolving.org/urn:nbn:de:bvb:19-66111

Brecheisen, Stefan (2007): Efficient and Effective Similarity Search on Complex Objects. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik

Vorschau

PDF
Brecheisen_Stefan.pdf
2MB

DOI: 10.5282/edoc.6611

URN: urn:nbn:de:bvb:19-66111

Abstract

Dokumententyp:	Dissertationen (Dissertation, LMU München)
Keywords:	data analysis, similarity search, clustering, complex objects
Themengebiete:	500 Naturwissenschaften und Mathematik > 510 Mathematik 500 Naturwissenschaften und Mathematik
Fakultäten:	Fakultät für Mathematik, Informatik und Statistik
Sprache der Hochschulschrift:	Englisch
Datum der mündlichen Prüfung:	22. Februar 2007
1. Berichterstatter:in:	Kriegel, Hans-Peter
MD5 Prüfsumme der PDF-Datei:	40cb4f08ad6e5e1252ca97fd5d7e5901
Signatur der gedruckten Ausgabe:	0001/UMC 16028
ID Code:	6611
Eingestellt am:	16. Mar. 2007
Letzte Änderungen:	24. Oct. 2020 08:39

Nur für Administratoren und Editoren: Dokument bearbeiten