Evaluating similarity join algorithms for data streams: case study in the analysis of financial market data

Grant number: 17/21512-0
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Effective date (Start): February 01, 2018
Effective date (End): November 30, 2018
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Robson Leonardo Ferreira Cordeiro
Grantee: Matheus Araujo Jorge
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Associated research grant: 16/17078-0 - Mining, indexing and visualizing Big Data in clinical decision support systems (MIVisBD), AP.TEM


Given the stock prices of many companies in financial markets over time, how can we spot start-ups that behave similarly to consolidated companies in their initial development stage, so as to support the forecast of future behaviour? Can similarity joins help tackle this problem? The similarity join operation is one of the main tools used to support the analysis, understanding and extraction of knowledge from the very large collections of complex data objects generated by many real applications, such as collections of magnetic resonance images in medicine, sets of fingerprints from security systems and Web-scale graphs, among many others. Due to its importance, the current literature includes many works on the similarity join; nevertheless, the vast majority of these works focus on the analysis of static datasets. On the other hand, as in our financial market example, many real applications collect or generate data in a continuous and potentially infinite process, in which values for the attributes studied are repeatedly recorded according to events triggered at regular time steps, or at irregular steps that depend on the actions of users. The resulting data are known as multidimensional data streams. Unfortunately, the state-of-the-art literature on similarity joins in streams is considerably limited compared with the work developed for static data. Algorithms well suited to processing streams are rare, and a number of issues remain open, for example, efficiency limitations and the absence of survey-like works on the few existing algorithms. In this project, we propose to mitigate these problems by studying and comparing state-of-the-art similarity join algorithms for multidimensional data streams, with a special focus on the analysis of financial market data. (AU)
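To make the idea of a similarity join over streams concrete, the following is a minimal sketch of a naive range join between two multidimensional streams under a time-based sliding window. It is not one of the algorithms the project evaluates. The class name `WindowedSimilarityJoin`, the parameters `eps` and `window`, the Euclidean distance, and the sample keys and feature vectors are all hypothetical choices made for illustration, and the sketch assumes that timestamps arrive in non-decreasing order.

```python
import math
from collections import deque


class WindowedSimilarityJoin:
    """Naive range similarity join over two streams, "A" and "B" (illustrative only)."""

    def __init__(self, eps, window):
        self.eps = eps        # hypothetical distance threshold
        self.window = window  # hypothetical window length, in time units
        self.buffers = {"A": deque(), "B": deque()}

    def _expire(self, now):
        # Drop elements that have fallen out of the sliding window.
        for buf in self.buffers.values():
            while buf and buf[0][0] <= now - self.window:
                buf.popleft()

    def insert(self, side, timestamp, key, vector):
        """Add one element; return the pairs it forms with the other stream."""
        other = "B" if side == "A" else "A"
        self._expire(timestamp)
        matches = []
        for _, other_key, other_vec in self.buffers[other]:
            distance = math.dist(vector, other_vec)  # Euclidean distance
            if distance <= self.eps:
                matches.append((key, other_key, distance))
        self.buffers[side].append((timestamp, key, vector))
        return matches


# Hypothetical events: (stream, timestamp, key, feature vector).
join = WindowedSimilarityJoin(eps=0.5, window=3)
events = [
    ("A", 0, "startup_x", (1.0, 2.0)),
    ("B", 1, "incumbent_y", (1.2, 2.1)),
    ("B", 2, "incumbent_z", (5.0, 5.0)),
    ("A", 5, "startup_w", (1.1, 2.0)),
]
results = [pair for event in events for pair in join.insert(*event)]
print(results)
```

In this sketch, `incumbent_y` joins with `startup_x` because both fall inside the same window and lie within `eps` of each other. `startup_w` is just as close to `incumbent_y` but produces no pair, because the earlier elements have already expired by the time it arrives. The nested scan costs time proportional to the window size for every arriving element, which hints at the efficiency limitations the abstract mentions.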
