Advanced search
Start date

Suffix sorting and string similarity measures

Grant number: 17/09105-0
Support type:Scholarships in Brazil - Post-Doctorate
Effective date (Start): August 01, 2017
Effective date (End): August 13, 2019
Field of knowledge:Physical Sciences and Mathematics - Computer Science
Principal Investigator:Zhao Liang
Grantee:Felipe Alves da Louza
Home Institution: Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP). Universidade de São Paulo (USP). Ribeirão Preto , SP, Brazil
Associated research grant:15/50122-0 - Dynamic phenomena in complex networks: basics and applications, AP.TEM
Associated scholarship(s):18/21509-2 - Burrows-Wheeler transform and succinct de Bruijn graphs, BE.EP.PD


Suffix sorting is a fundamental problem in string processing that arises in many applications of Bioinformatics, Information Retrieval and Text Mining. This problem is related to the suffix array construction and the Burrows-Wheeler transform, which play an important role in text indexing (e.g. FM-index) and data compression (e.g. bzip). Several algorithms have been proposed to construct the suffix array for a single string and to obtain the Burrows-Wheeler transform. However, in many situations we are interested in processing a string collection and obtain directly the Burrows-Wheeler transform, that is, without constructing the suffix array. Less emphasis has been put on specific algoritmos for this problems. Therefore, the main goal of this project is to investigate the direct computation of the Burrows-Wheeler transform for a collection of strings. Moreover, we will investigate string similarity measures that use the Burrows-Wheeler transform to compare and classify strings.

Scientific publications (4)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
LOUZA, FELIPE A.; TELLES, GUILHERME P.; GOG, SIMON; ZHAO, LIANG. Algorithms to compute the Burrows-Wheeler Similarity Distribution. THEORETICAL COMPUTER SCIENCE, v. 782, p. 145-156, AUG 23 2019. Web of Science Citations: 1.
EGIDI, LAVINIA; LOUZA, FELIPE A.; MANZINI, GIOVANNI; TELLES, GUILHERME P. External memory BWT and LCP computation for sequence collections with applications. Algorithms for Molecular Biology, v. 14, MAR 8 2019. Web of Science Citations: 1.
LOUZA, FELIPE A.; SMYTH, W. F.; MANZINIC, GIOVANNI; TELLES, GUILHERME P. Lyndon array construction during Burrows-Wheeler inversion. JOURNAL OF DISCRETE ALGORITHMS, v. 50, p. 2-9, MAY 2018. Web of Science Citations: 0.
LOUZA, FELIPE A.; TELLES, GUILHERME P.; HOFFMANN, STEVE; CIFERRI, CRISTINA D. A. Generalized enhanced suffix array construction in external memory. Algorithms for Molecular Biology, v. 12, DEC 7 2017. Web of Science Citations: 3.

Please report errors in scientific publications list by writing to: