Deep learning for hierarchical classification of transposable elements

Grant number: 16/12489-2
Support type: Scholarships in Brazil - Master
Effective date (Start): January 01, 2017
Effective date (End): August 31, 2018
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Cooperation agreement: Coordination of Improvement of Higher Education Personnel (CAPES)
Principal Investigator: Ricardo Cerri
Grantee: Felipe Kenji Nakano
Home Institution: Centro de Ciências Exatas e de Tecnologia (CCET). Universidade Federal de São Carlos (UFSCAR). São Carlos, SP, Brazil
Associated scholarship(s): 17/19264-9 - Active learning in hierarchical classification of transposable elements, BE.EP.MS

Abstract

Transposable Elements (TEs) are DNA sequences that can change their location within a cell's genome. They contribute directly to the genetic variety of species, and their transposition mechanisms can affect the functionality of genes. The correct identification and classification of TEs play a central role in the comprehension of genomes. Generally, identification and classification of TEs are performed using tools that rely on homology, comparing a sequence against many sequences from a labeled TE database. This approach is limited, since homology ignores the sequences' biochemical properties and the relations among different TE classes. Since the literature proposes hierarchical taxonomies that classify TEs into classes and subclasses, this project aims to develop new classification methods employing Machine Learning that take the hierarchical relationships among classes into account. More specifically, artificial neural networks trained using Deep Learning concepts will be investigated. As a first step, datasets will be constructed from TE sequences that have already been identified. To build these datasets, bioinformatics tools capable of identifying the presence of signatures and biochemical characteristics will be used. Different strategies will also be used to convert sequences into attributes suited for Machine Learning. Afterwards, the datasets will be structured hierarchically, according to TE families and superfamilies. The newly proposed classification methods will be compared to state-of-the-art methods from the literature and evaluated using measures specifically designed for hierarchical classification problems. (AU)
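
The abstract does not say which sequence-to-attribute strategy or which hierarchical evaluation measure the project adopts. As a minimal sketch under assumptions, the Python below illustrates one common option for each: normalized k-mer frequencies as attributes, and a hierarchical F-measure computed over label sets augmented with their ancestors. The function names, the "/"-separated label format, and the example labels are hypothetical and are not taken from the project.

```python
from itertools import product


def kmer_frequencies(sequence, k=3):
    """Return normalized k-mer frequencies of a DNA sequence.

    The output has one entry per k-mer over ACGT, in a fixed order,
    so every sequence maps to a vector of the same length.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def with_ancestors(label, sep="/"):
    """Expand a path label, e.g. 'I/LTR/Copia' -> {'I', 'I/LTR', 'I/LTR/Copia'}."""
    parts = label.split(sep)
    return {sep.join(parts[:i]) for i in range(1, len(parts) + 1)}


def hierarchical_f1(true_labels, predicted_labels):
    """Hierarchical F-measure: precision and recall over ancestor-augmented sets."""
    overlap = true_total = pred_total = 0
    for t, p in zip(true_labels, predicted_labels):
        t_set, p_set = with_ancestors(t), with_ancestors(p)
        overlap += len(t_set & p_set)
        true_total += len(t_set)
        pred_total += len(p_set)
    h_precision = overlap / pred_total if pred_total else 0.0
    h_recall = overlap / true_total if true_total else 0.0
    if h_precision + h_recall == 0:
        return 0.0
    return 2 * h_precision * h_recall / (h_precision + h_recall)


# Illustrative usage with made-up inputs (not project data).
features = kmer_frequencies("ACGT", k=2)
score = hierarchical_f1(["I/LTR/Copia"], ["I/LTR/Gypsy"])
```

In this sketch, a prediction that is wrong at the deepest level but correct at the upper levels still receives partial credit, which is the property that distinguishes hierarchical measures from flat accuracy.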

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
NAKANO, FELIPE KENJI; CERRI, RICARDO; VENS, CELINE. Active learning for hierarchical multi-label classification. DATA MINING AND KNOWLEDGE DISCOVERY, JUL 2020. Web of Science Citations: 0.
