Development of efficient techniques for similarity search meeting user's interest in relational DBMS

Grant number: 12/08128-3
Support type: Scholarships in Brazil - Post-Doctorate
Effective date (Start): November 01, 2012
Effective date (End): June 30, 2015
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Caetano Traina Junior
Grantee: Mônica Ribeiro Porto Ferreira
Home Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil

Abstract

Database Management Systems (DBMS) were developed to store large volumes of data and to retrieve them efficiently whenever needed. Three basic techniques are employed to speed up retrieval: query rewriting, index structures, and data restructuring. However, current DBMS are designed around scalar data domains, such as numbers and short character strings. More complex data, such as images, large texts, or time series, have received little support. Indeed, most of the techniques developed over more than forty years of intense research on scalar data types cannot be applied to complex data. International research groups have therefore made a strong effort to provide the techniques required to handle complex data. The Image and Data Bases Group (GBdI) has participated in that effort, and the candidate in particular has contributed to it. This project aims to contribute to this research effort by developing techniques to speed up queries that combine similarity predicates over complex data with equality or relational predicates over scalar data. Such techniques can potentially improve query processing efficiency significantly, yet they have so far received very little attention in similarity query processing. To this end, we plan to develop new data structures, which we call "Hybrid Indexes", able to index both complex attributes subject to similarity search and scalar attributes in the same structure. These structures will take advantage of the existing pure metric access methods (to whose development GBdI has already contributed) and of the algebraic properties and the cost and selectivity estimation models developed by the candidate during her doctoral program.
Collectively, these techniques will provide a set of tools to improve the efficiency of similarity query execution over complex data. A prototype of a similarity-enabled DBMS will be implemented that, for the first time, incorporates all three main concepts useful for speeding up similarity queries.
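
As an illustration only, the kind of query the abstract describes (a similarity predicate over a complex attribute combined with an equality predicate over a scalar attribute) can be sketched in a few lines of Python. Everything in this sketch is an assumption made for the example: the Row type, the two-dimensional feature vectors standing in for images, the Euclidean distance, and the dictionary-based bucketing, which is only a toy stand-in and not the project's Hybrid Index. For a range query, the two predicates can be applied in either order with the same result, which is what makes rewriting the query plan possible.

```python
from dataclasses import dataclass
from math import dist  # Euclidean distance, Python 3.8+


@dataclass(frozen=True)
class Row:
    """Hypothetical table row used only for this sketch."""
    id: int
    category: str    # scalar attribute, tested with an equality predicate
    features: tuple  # complex attribute, reduced here to a feature vector


def range_then_filter(rows, query, radius, category):
    """Plan A: similarity predicate first, scalar predicate second."""
    near = [r for r in rows if dist(r.features, query) <= radius]
    return [r.id for r in near if r.category == category]


def filter_then_range(rows, query, radius, category):
    """Plan B: scalar predicate first, so fewer distances are computed."""
    same = [r for r in rows if r.category == category]
    return [r.id for r in same if dist(r.features, query) <= radius]


def build_buckets(rows):
    """Toy combined layout: partition rows by the scalar attribute."""
    buckets = {}
    for r in rows:
        buckets.setdefault(r.category, []).append(r)
    return buckets


def hybrid_range(buckets, query, radius, category):
    """Answer both predicates from one structure: pick the bucket for the
    scalar value, then run the range test inside that bucket only."""
    candidates = buckets.get(category, [])
    return [r.id for r in candidates if dist(r.features, query) <= radius]


# Made-up example data.
ROWS = [
    Row(1, "xray", (0.0, 0.0)),
    Row(2, "xray", (3.0, 4.0)),
    Row(3, "mri", (0.1, 0.1)),
    Row(4, "xray", (0.5, 0.5)),
]

if __name__ == "__main__":
    q, radius = (0.0, 0.0), 1.0
    print(range_then_filter(ROWS, q, radius, "xray"))
    print(filter_then_range(ROWS, q, radius, "xray"))
    print(hybrid_range(build_buckets(ROWS), q, radius, "xray"))
```

All three functions return the same row identifiers for a range query; they differ only in how many distance computations they perform. The linear scans stand in for a real metric access method, and the equivalence shown here does not carry over unchanged to k-nearest-neighbor queries, where filtering before or after the similarity step can change the answer.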