Active learning in hierarchical classification of transposable elements

Grant number: 17/19264-9
Support Opportunities: Scholarships abroad - Research Internship - Master's degree
Effective date (Start): November 01, 2017
Effective date (End): April 30, 2018
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Ricardo Cerri
Grantee: Felipe Kenji Nakano
Supervisor: Celine Vens
Host Institution: Centro de Ciências Exatas e de Tecnologia (CCET). Universidade Federal de São Carlos (UFSCAR). São Carlos, SP, Brazil
Research place: University of Leuven, Kulak Kortrijk (KU Leuven), Belgium
Associated to the scholarship: 16/12489-2 - Deep learning for hierarchical classification of transposable elements, BP.MS

Abstract

Transposable Elements (TEs) are DNA sequences capable of moving within a cell's genome. Such movement causes genetic variability and changes in gene functionality. TE classification is usually performed with homology tools, which search for similar sequences by matching them in a string-like fashion; this approach, however, ignores many biochemical and hierarchical properties. Recently, TE classification has been framed as a Machine Learning (ML) problem. More specifically, TEs are classified using Hierarchical Classification (HC) methods. Unlike traditional classification, HC addresses problems whose classes are structured in a hierarchy. Such methods have proved to be more efficient and feasible than homology; however, ML methods require labelled data, and labelling TEs is not an easy task. Repbase, the most widely accepted TE repository in academia, employs extensive validation and multiple tools for TE classification. This process is computationally and financially demanding, which leaves plenty of sequences unlabelled. As a countermeasure, the field of Active Learning (AL) provides methods for exploiting unlabelled data. In essence, an AL algorithm employs strategies that select the most valuable unlabelled instances to be labelled. The cost of labelling is thereby reduced, and classifiers are likely to learn from the most representative instances. In this research, we plan to investigate AL algorithms for HC; in particular, we will integrate AL into Clus-HMC, the state-of-the-art method for HC. (AU)
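
The selection step described in the abstract can be illustrated with a minimal sketch of pool-based uncertainty sampling. This is not the method developed in the project and does not use Clus-HMC; it only assumes that some classifier has already produced, for each unlabelled sequence, one probability per class in the hierarchy. The function names, the uncertainty score, and the example probabilities are all hypothetical.

```python
from typing import List, Sequence


def uncertainty(probs: Sequence[float]) -> float:
    """Mean closeness of per-class probabilities to 0.5.

    Returns 1.0 when every class probability is exactly 0.5 (most uncertain)
    and 0.0 when every probability is 0 or 1 (fully confident).
    """
    return sum(1.0 - 2.0 * abs(p - 0.5) for p in probs) / len(probs)


def select_queries(pool_probs: List[Sequence[float]], batch_size: int) -> List[int]:
    """Return the indices of the `batch_size` most uncertain pool instances."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: uncertainty(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical predictions for three unlabelled sequences; each column is
# an assumed class in the hierarchy (for example a parent class and two
# of its descendants).
pool_probs = [
    [0.95, 0.90, 0.05],  # confident prediction
    [0.55, 0.50, 0.45],  # classifier is unsure about every class
    [0.80, 0.30, 0.10],  # partly unsure
]

# The selected instances would be sent to an expert for labelling and then
# added to the training set before the classifier is retrained.
queries = select_queries(pool_probs, batch_size=1)
print(queries)
```

In a full active learning loop, this selection would alternate with retraining until the labelling budget is exhausted; other query strategies could replace the uncertainty score without changing the loop.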

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
NAKANO, FELIPE KENJI; CERRI, RICARDO; VENS, CELINE. Active learning for hierarchical multi-label classification. DATA MINING AND KNOWLEDGE DISCOVERY, v. 34, n. 5, SI. (16/12489-2, 17/19264-9)
