Advanced search
Start date
(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

Algorithms to compute the Burrows-Wheeler Similarity Distribution

Full text
Louza, Felipe A. [1] ; Telles, Guilherme P. [2] ; Gog, Simon [3] ; Zhao, Liang [1]
Total Authors: 4
[1] Univ Sao Paulo, Dept Comp & Math, Ribeirao Preto - Brazil
[2] Univ Estadual Campinas, Inst Comp, Campinas, SP - Brazil
[3] eBay Inc, San Jose, CA - USA
Total Affiliations: 3
Document type: Journal article
Source: THEORETICAL COMPUTER SCIENCE; v. 782, p. 145-156, AUG 23 2019.
Web of Science Citations: 0

The Burrows-Wheeler transform (BWT) is a well studied text transformation widely used in data compression and text indexing. The BWT of two strings can also provide similarity measures between them, based on the observation that the more their symbols are intermixed in the transformation, the more the strings are similar. In this article we present two new algorithms to compute similarity measures based on the BWT for string collections. In particular, we present practical and theoretical improvements to the computation of the Burrows-Wheeler Similarity Distribution for all pairs of strings in a collection. Our algorithms take advantage of the BWT computed for the concatenation of all strings, and use compressed data structures that allow reducing the running time with a small memory footprint, as shown by a set of experiments with real and artificial datasets. (C) 2019 Elsevier B.V. All rights reserved. (AU)

FAPESP's process: 18/21509-2 - Burrows-Wheeler transform and succinct de Bruijn graphs
Grantee:Felipe Alves da Louza
Support type: Scholarships abroad - Research Internship - Post-doctor
FAPESP's process: 17/09105-0 - Suffix sorting and string similarity measures
Grantee:Felipe Alves da Louza
Support type: Scholarships in Brazil - Post-Doctorate
FAPESP's process: 15/50122-0 - Dynamic phenomena in complex networks: basics and applications
Grantee:Elbert Einstein Nehrer Macau
Support type: Research Projects - Thematic Grants