Classification and analysis of bacterial non-coding RNA sequences using machine-learning techniques

Grant number: 21/08561-8
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Start date: September 01, 2021
End date: June 30, 2022
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: André Carlos Ponce de Leon Ferreira de Carvalho
Grantee: Breno Livio Silva de Almeida
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil

Abstract

When it was discovered that at least 80% of the human genome is biologically active, the non-coding portion of the genome ceased to be viewed solely as "junk" DNA. Non-coding sequences have since gained increasing attention in research because of their important roles in the physiological processes of organisms, and small non-coding RNAs have become especially relevant to understanding these processes. To identify biological sequences such as small RNAs, features can be extracted by means of mathematical descriptors, which help to capture the characteristics that distinguish different types of sequences. This task can be combined with Machine Learning techniques, which learn automatically from patterns in the descriptors, classify the sequences, and indicate which descriptors matter most for the classification. The project therefore aims to use these techniques together with feature extraction to classify and analyze bacterial small non-coding RNA sequences, studying how effective these descriptors are and how well they differentiate these sequences from others. (AU)
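
As an illustration of what extracting features "by means of mathematical descriptors" can look like, the following is a minimal sketch and not the project's actual pipeline. The abstract does not name the descriptors used, so the choice of k-mer frequencies, Shannon entropy, and GC content is an assumption made here for illustration, and the function names (`normalize`, `kmer_frequencies`, `shannon_entropy`, `describe`) are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def normalize(seq):
    """Upper-case the sequence and write it in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def kmer_frequencies(seq, k=2, alphabet="ACGU"):
    """Relative frequency of every possible k-mer over the given alphabet.

    Windows containing symbols outside the alphabet (e.g. N) are counted in
    the total but not reported, so the values may then sum to less than 1.
    """
    seq = normalize(seq)
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: counts[kmer] / total for kmer in kmers}


def shannon_entropy(freqs):
    """Shannon entropy (in bits) of a frequency distribution."""
    return -sum(p * log2(p) for p in freqs.values() if p > 0)


def describe(seq, k=2):
    """Build one feature vector (as a dict) for a single sequence.

    Illustrative descriptors only: k-mer frequencies, their entropy,
    and GC content.
    """
    seq = normalize(seq)
    features = kmer_frequencies(seq, k)
    features["entropy"] = shannon_entropy(features)
    features["gc_content"] = (seq.count("G") + seq.count("C")) / len(seq)
    return features


if __name__ == "__main__":
    # Hypothetical toy sequence, not real data.
    for name, value in describe("ACGUUGCAACGU", k=1).items():
        print(f"{name}: {value:.3f}")
```

Each sequence becomes a fixed-length numeric vector, which is the form a classifier expects. A model that reports per-feature importance could then indicate which of these descriptors contribute most to separating small non-coding RNAs from other sequences.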

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
SILVA DE ALMEIDA, BRENO LIVIO; QUEIROZ, ALVARO PEDROSO; AVILA SANTOS, ANDERSON PAULO; BONIDIA, ROBSON PARMEZAN; DA ROCHA, ULISSES NUNES; SANCHES, DANILO SIPOLI; DE LEON FERREIRA DE CARVALHO, ANDRE CARLOS PONCE; STADLER, PF; WALTER, MEMT; HERNANDEZ-ROSALES, M; et al. Feature Importance Analysis of Non-coding DNA/RNA Sequences Based on Machine Learning Approaches. Advances in Bioinformatics and Computational Biology, BSB 2021, v. 13063, 12 pp. (21/08561-8)
BONIDIA, ROBSON P.; AVILA SANTOS, ANDERSON P.; DE ALMEIDA, BRENO L. S.; STADLER, PETER F.; DA ROCHA, ULISSES NUNES; SANCHES, DANILO S.; DE CARVALHO, ANDRE C. P. L. F. Information Theory for Biological Sequence Classification: A Novel Feature Extraction Technique Based on Tsallis Entropy. Entropy, v. 24, n. 10, 17 pp. (13/07375-0, 21/08561-8)
BONIDIA, ROBSON P.; AVILA SANTOS, ANDERSON P.; DE ALMEIDA, BRENO L. S.; STADLER, PETER F.; DA ROCHA, ULISSES N.; SANCHES, DANILO S.; DE CARVALHO, ANDRE C. P. L. F. BioAutoML: automated feature engineering and metalearning to predict noncoding RNAs in bacteria. Briefings in Bioinformatics, 13 pp. (13/07375-0, 21/08561-8)