
Development of a persistent biological index based on generalized suffix arrays

Grant number: 11/15423-9
Support Opportunities: Scholarships in Brazil - Master
Start date: March 01, 2012
End date: December 31, 2013
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Cristina Dutra de Aguiar
Grantee: Felipe Alves da Louza
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Associated scholarship(s): 13/01752-6 - Suffix arrays and genome assembly, BE.EP.MS

Abstract

Owing to technological advances, the amount of biological data (i.e., DNA and protein sequences) collected, stored in biological databases (BDBs), and made available for analysis has grown exponentially. Since many advances in medicine have been obtained through similarity search, searching these voluminous BDBs efficiently is a challenge. In bioinformatics, similarity search is usually supported by indices built on a data structure called the suffix array. The challenges in using suffix arrays include constructing the suffix array for large input sequences, storing and handling this data structure efficiently on disk, and indexing several biological sequences through generalized suffix arrays. Existing works that use suffix arrays to index biological sequences have limitations that motivate new research. On the one hand, works that use generalized suffix arrays target main memory and therefore do not address the persistent storage of the suffix array on disk. On the other hand, works that manipulate suffix arrays on disk organize the index to reduce the number of disk accesses, but do not address generalized suffix arrays. These limitations motivate this master's research, which aims to propose a new persistent biological index based on generalized suffix arrays. The proposed index advances the state of the art by addressing this gap in the literature. (AU)
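
For readers unfamiliar with the data structure, the following is a minimal in-memory sketch of a generalized suffix array over several sequences, with a binary search for exact pattern occurrences. It is only an illustration under simplifying assumptions: construction is naive (sorting all suffixes), everything is held in main memory, and the names build_gsa and find are hypothetical. It is not the persistent, disk-based index proposed by this project.

```python
from typing import List, Tuple

# One entry per suffix: (index of the sequence, offset inside that sequence).
Entry = Tuple[int, int]


def build_gsa(seqs: List[str]) -> List[Entry]:
    """Naive generalized suffix array: sort every suffix of every sequence."""
    entries = [(i, j) for i, s in enumerate(seqs) for j in range(len(s))]
    # Order by suffix text; equal suffixes are ordered by (sequence, offset).
    entries.sort(key=lambda e: (seqs[e[0]][e[1]:], e))
    return entries


def find(seqs: List[str], gsa: List[Entry], pattern: str) -> List[Entry]:
    """Return every (sequence, offset) where pattern occurs, via binary search."""
    m = len(pattern)

    def prefix(k: int) -> str:
        # First m characters of the k-th suffix in sorted order.
        i, j = gsa[k]
        return seqs[i][j:j + m]

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(gsa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(gsa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(gsa[start:lo])


if __name__ == "__main__":
    sequences = ["ACGT", "CGTA"]
    gsa = build_gsa(sequences)
    print(find(sequences, gsa, "CGT"))
```

Because all suffixes that share a prefix are contiguous in the sorted array, a single pair of binary searches locates the occurrences of a pattern across all indexed sequences. The naive construction shown here materializes every suffix while sorting, which is exactly the kind of cost that becomes impractical for large inputs and motivates the construction and storage challenges described in the abstract.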

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
LOUZA, FELIPE A.; TELLES, GUILHERME P.; HOFFMANN, STEVE; CIFERRI, CRISTINA D. A. Generalized enhanced suffix array construction in external memory. Algorithms for Molecular Biology, v. 12. (11/23904-7, 17/09105-0, 11/15423-9)