
Complexity-invariance for classification, clustering and motif discovery in time series

Grant number: 12/07295-3
Support type: Regular Research Grants
Duration: July 01, 2012 - June 30, 2014
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Gustavo Enrique de Almeida Prado Alves Batista
Grantee: Gustavo Enrique de Almeida Prado Alves Batista
Home Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Assoc. researchers: Eamonn John Keogh; Ronaldo Cristiano Prati; Solange Oliveira Rezende
Associated grant(s): 13/50379-6 - Research on geo-spatial marine biology data mining using time series, text mining and visualization, AP.R

Abstract

Interest in time series processing has grown in recent years, driven by the large number of application domains that generate temporal data. This interest is reflected in the many methods recently proposed in the literature for tasks such as classification, clustering, summarization, anomaly detection and motif discovery. Recent studies have shown that, for several of these problems, similarity-based methods achieve an efficacy that is hard to surpass, even by more sophisticated methods. This is mainly because the community has studied and proposed several invariances for time series distance measures. An invariance makes a distance measure ignore a certain undesired property of the data. The best-known example is invariance to local differences in time scale, obtained with the warping technique. Other examples include invariance to differences in amplitude and offset, to phase, and to occlusion. We recently demonstrated that similarity-based time series classification methods can benefit substantially from a new invariance: complexity invariance. The main objective of this research project is to investigate new complexity-invariant distance measures and to assess how such measures can improve the efficacy of clustering and motif discovery algorithms in particular. (AU)
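
As an illustration of what a complexity-invariant distance can look like, the following minimal sketch follows the general form of the CID measure from the Batista et al. publication listed below: a Euclidean distance multiplied by a correction factor that grows when the two series differ in complexity. It is a sketch under stated assumptions, not the project's implementation. The function names are hypothetical, the series are assumed to be equal-length and already normalized, and the handling of perfectly flat series is an assumption made here to avoid division by zero.

```python
import math


def euclidean(q, c):
    """Euclidean distance between two equal-length series."""
    if len(q) != len(c):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def complexity_estimate(series):
    """Length of the series when 'stretched out': the root of the summed
    squared differences between consecutive points."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(series, series[1:])))


def cid(q, c):
    """Euclidean distance scaled by the ratio of the larger complexity
    estimate to the smaller one (the ratio is always >= 1)."""
    ce_q = complexity_estimate(q)
    ce_c = complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    distance = euclidean(q, c)
    if low == 0:
        # Assumption for this sketch: two flat series get no correction;
        # a flat series against a non-flat one is treated as infinitely far.
        return distance if high == 0 else math.inf
    return distance * (high / low)


smooth = [0.0, 1.0, 2.0, 3.0]
jagged = [0.0, 3.0, 0.0, 3.0]
print(euclidean(smooth, jagged), cid(smooth, jagged))
```

In this toy example the jagged series has three times the complexity estimate of the smooth one, so the corrected distance is three times the Euclidean distance. Pairs of series with similar complexity are left almost unchanged.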

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
PRATI, RONALDO C.; BATISTA, GUSTAVO E. A. P. A.; SILVA, DIEGO F. Class imbalance revisited: a new experimental setup to assess the performance of treatment methods. KNOWLEDGE AND INFORMATION SYSTEMS, v. 45, n. 1, p. 247-270, OCT 2015. Web of Science Citations: 48.
BATISTA, GUSTAVO E. A. P. A.; KEOGH, EAMONN J.; TATAW, OBEN MOSES; DE SOUZA, VINICIUS M. A. CID: an efficient complexity-invariant distance for time series. DATA MINING AND KNOWLEDGE DISCOVERY, v. 28, n. 3, p. 634-669, MAY 2014. Web of Science Citations: 76.
SILVA, DIEGO FURTADO; ALVES DE SOUZA, VINICIUS MOURAO; PRADO ALVES BATISTA, GUSTAVO ENRIQUE DE ALMEIDA. A comparative study between MFCC and LSF coefficients in automatic recognition of isolated digits pronounced in Portuguese and English. ACTA SCIENTIARUM-TECHNOLOGY, v. 35, n. 4, p. 621-628, 2013. Web of Science Citations: 2.

Please report errors in the scientific publications list by writing to: cdi@fapesp.br.