Advanced search
Start date
Betweenand

Space-efficient computation of minimal absent words of a collection of strings

Grant number: 25/08034-9
Support Opportunities:Scholarships abroad - Research Internship - Doctorate
Start date: January 01, 2026
End date: December 31, 2026
Field of knowledge:Physical Sciences and Mathematics - Computer Science - Theory of Computation
Principal Investigator:João Meidanis
Grantee:Lucas Peres Oliveira
Supervisor: Solon Pissis
Host Institution: Instituto de Computação (IC). Universidade Estadual de Campinas (UNICAMP). Campinas , SP, Brazil
Institution abroad: Centrum Wiskunde & Informatica, Netherlands  
Associated to the scholarship:23/05887-5 - Alignment-free and targeted assembly methods for gene fusions detection from RNA-seq data, BP.DR

Abstract

Minimal absent words (MAWs) of a string are absent strings whose proper substrings occur in the string. MAWs have been widely studied due to their combinatorial properties and applications in data compression and bioinformatics. A related concept, specific words, refers to MAWs of a reference string that occur in a target string and has been used to detect disease-related mutations. While several algorithms exist for computing MAWs, they are often impractical for large datasets due to high space requirements and do not handle multiple strings. For specific words, however, space-efficient solutions do exist, but they often incur a slowdown in processing time, whereas recent solutions that achieve optimal time require excessive space. Therefore, this project aims to develop space-efficient algorithms for computing MAWs and specific words across multiple strings, extending the current state of the art suffix array-based algorithms using space reduction techniques. We hope that these advances will enable the application of MAWs and specific words to larger datasets obtained from high-throughput sequencing in the context of bioinformatics.

News published in Agência FAPESP Newsletter about the scholarship:
More itemsLess items
Articles published in other media outlets ( ):
More itemsLess items
VEICULO: TITULO (DATA)
VEICULO: TITULO (DATA)