
Similarity in Big Data

Grant number: 13/01517-7
Support type: Scholarships in Brazil - Doctorate
Effective date (Start): June 01, 2013
Effective date (End): January 02, 2017
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Caetano Traina Junior
Grantee: Lúcio Fernandes Dutra Santos
Home Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil

Abstract

The data being collected and generated nowadays is growing not only in volume but also in complexity, creating the need for new query operators. Health centers collecting image exams, and remote sensing from satellites and earth-based stations, are examples of application domains where more powerful and flexible operators are required. Storing, retrieving and analyzing data that are huge in volume, structure, complexity and distribution is now referred to as big data. Representing and querying big data using only the traditional scalar data types is no longer enough. Similarity queries are the most sought-after means of retrieving complex data, but until recently they were not available in Database Management Systems. They are now starting to become available, but their first uses in developing real systems make it clear that the basic similarity query operators are not enough to meet the requirements of the target applications. The main reason is that similarity is a concept formulated considering only small numbers of data elements. When the volume of data increases, both the efficacy of the query and the efficiency of obtaining its answer (the quality and the speed of query processing) are compromised. Currently, researchers handle big data mainly through parallel architectures, and only a few studies address the efficacy of the query answers. This project aims at studying and developing variations of the basic similarity operators in order to propose better-suited similarity operators. The results will be validated in two application domains: large collections of images from medical exams, and images and time series from remote sensing data of climate and agricultural enterprises.

Academic Publications
(References retrieved automatically from State of São Paulo Research Institutions)
SANTOS, Lúcio Fernandes Dutra. Similarity in big data. 2017. Doctoral Thesis - Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação São Carlos.
