On approximate solutions to scalable data mining algorithms for complex data problems

Grant number: 10/14536-1
Support Opportunities: Scholarships in Brazil - Master
Start date: March 01, 2011
End date: September 30, 2011
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Elaine Parros Machado de Sousa
Grantee: Alexander Victor Ocsa Mamani
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil

Abstract

The increasing availability of data in many domains has motivated the development of techniques to discover knowledge in huge volumes of complex data. Recent work suggests that searching in complex data is an important research field because many data mining tasks, such as classification, clustering and motif discovery, depend on nearest neighbor search algorithms. Many deterministic approaches have been proposed to solve this problem, while probabilistic algorithms have been little explored. More recent techniques try to reduce the computational cost by relaxing the quality of the query results. In that direction, a technique named Locality Sensitive Hashing (LSH) has been shown to offer one of the best tradeoffs between query cost and quality of query results for high-dimensional data. Search methods based on approximate nearest neighbors are therefore feasible solutions for improving the performance of data mining tasks on complex data. However, current LSH implementations either incur expensive space and query costs or abandon the theoretical guarantee on the quality of query results. In this context, this project aims to: (i) study and develop solutions for the inherent LSH implementation problems; (ii) improve the performance of data mining tasks on complex data by using the proposed techniques to solve the approximate nearest neighbor problem. In particular, the project is expected to develop scalable solutions for clustering and motif discovery, initially applied to images and time series in the agrometeorology area.
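
To make the idea of trading result quality for query cost concrete, the following is a minimal sketch of an LSH index for Euclidean distance, in the style of random-projection hashing. It is an illustration only, not the method developed in this project: the class name `LSHIndex` and the parameters `num_tables`, `hashes_per_table` and `bucket_width` are hypothetical, and their default values are arbitrary assumptions.

```python
import math
import random
from collections import defaultdict


class LSHIndex:
    """Toy LSH index for Euclidean distance (illustrative assumptions only)."""

    def __init__(self, dim, num_tables=8, hashes_per_table=4,
                 bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.bucket_width = bucket_width
        self.points = []
        # Each table holds k hash functions h(v) = floor((a . v + b) / w),
        # with a drawn from a Gaussian and b uniform in [0, w).
        self.tables = []
        for _ in range(num_tables):
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)],
                 rng.uniform(0.0, bucket_width))
                for _ in range(hashes_per_table)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, v):
        # Concatenate the k hash values into one bucket key.
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b)
                       / self.bucket_width)
            for a, b in funcs
        )

    def add(self, v):
        """Store a point in every table and return its integer id."""
        idx = len(self.points)
        self.points.append(v)
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, v)].append(idx)
        return idx

    def query(self, q):
        """Return the id of the closest colliding point, or None."""
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, q), ()))
        if not candidates:
            return None  # no collision: the approximate search found nothing
        # Exact distances are computed only for the colliding candidates.
        return min(candidates, key=lambda i: math.dist(q, self.points[i]))


if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.uniform(0.0, 100.0) for _ in range(8)] for _ in range(1000)]
    index = LSHIndex(dim=8)
    for point in data:
        index.add(point)
    print(index.query(data[0]))
```

Using more tables raises the chance that a true neighbor collides with the query in at least one of them, at the price of extra memory. That is the space and query cost tradeoff the abstract refers to. A query may also return `None` or a point that is not the true nearest neighbor, which is where the quality of the results is relaxed.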


Academic Publications
(References retrieved automatically from State of São Paulo Research Institutions)
MAMANI, Alexander Victor Ocsa. On approximate solutions to scalable data mining algorithms for complex data problems using GPGPU. 2011. Master's Dissertation - Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação (ICMC/SB) São Carlos.