(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

GenSeed-HMM: A Tool for Progressive Assembly Using Profile HMMs as Seeds and its Application in Alpavirinae Viral Discovery from Metagenomic Data

Author(s):
Alves, Joao M. P. [1] ; de Oliveira, Andre L. [1] ; Sandberg, Tatiana O. M. [1] ; Moreno-Gallego, Jaime L. [2] ; de Toledo, Marcelo A. F. [1] ; de Moura, Elisabeth M. M. [3] ; Oliveira, Lilane S. [1, 4] ; Durham, Alan M. [4] ; Mehnert, Dolores U. [3] ; Zanotto, Paolo M. de A. [3] ; Reyes, Alejandro [5, 6] ; Gruber, Arthur [1]
Total Authors: 12
Affiliation:
[1] Univ Sao Paulo, Inst Biomed Sci, Dept Parasitol, Sao Paulo - Brazil
[2] Univ Los Andes, Grad Program Computat Biol, Bogota - Colombia
[3] Univ Sao Paulo, Inst Biomed Sci, Dept Microbiol, Sao Paulo - Brazil
[4] Univ Sao Paulo, Inst Math & Stat, Dept Comp Sci, Sao Paulo - Brazil
[5] Univ Los Andes, Dept Biol Sci, Bogota - Colombia
[6] Washington Univ, Ctr Genome Sci & Syst Biol, Dept Pathol & Immunol, St Louis, MO - USA
Total Affiliations: 6
Document type: Journal article
Source: FRONTIERS IN MICROBIOLOGY; v. 7, MAR 4 2016.
Web of Science Citations: 10
Abstract

This work reports the development of GenSeed-HMM, a program that implements seed-driven progressive assembly, an approach to reconstruct specific sequences from unassembled data, starting from short nucleotide or protein seed sequences or profile Hidden Markov Models (HMMs). The program can use any one of a number of sequence assemblers. Assembly is performed in multiple steps and relatively few reads are used in each cycle; consequently, the program demands low computational resources. As a proof-of-concept and to demonstrate the power of HMM-driven progressive assemblies, GenSeed-HMM was applied to metagenomic datasets in the search for diverse ssDNA bacteriophages from the recently described Alpavirinae subfamily. Profile HMMs were built using Alpavirinae-specific regions from multiple sequence alignments (MSA) using either the viral protein 1 (VP1; major capsid protein) or VP4 (genome replication initiation protein). These profile HMMs were used by GenSeed-HMM (running Newbler assembler) as seeds to reconstruct viral genomes from sequencing datasets of human fecal samples. All contigs obtained were annotated and taxonomically classified using similarity searches and phylogenetic analyses. The most specific profile HMM seed enabled the reconstruction of 45 partial or complete Alpavirinae genomic sequences. A comparison with conventional (global) assembly of the same original dataset, using Newbler in a standalone execution, revealed that GenSeed-HMM outperformed global genomic assembly in several metrics employed. This approach is capable of detecting organisms that have not been used in the construction of the profile HMM, which opens up the possibility of diagnosing novel viruses, without previous specific information, constituting a de novo diagnosis. Additional applications include, but are not limited to, the specific assembly of extrachromosomal elements such as plastid and mitochondrial genomes from metagenomic data. Profile HMM seeds can also be used to reconstruct specific protein coding genes for gene diversity studies, and to determine all possible gene variants present in a metagenomic sample. Such surveys could be useful to detect the emergence of drug-resistance variants in sensitive environments such as hospitals and animal production facilities, where antibiotics are regularly used. Finally, GenSeed-HMM can be used as an adjunct for gap closure on assembly finishing projects, by using multiple contig ends as anchored seeds. (AU)
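
The general idea of seed-driven progressive assembly described in the abstract can be illustrated with a toy sketch. This is not GenSeed-HMM's code: it assumes error-free reads on a single strand, and it replaces both the read-recruitment search and the third-party assembler with exact string matching. The function names and the parameters `k` and `max_cycles` are hypothetical, chosen only for illustration.

```python
def extend_right(contig, reads, k):
    """Return the longest extension any read offers past the contig's right end."""
    anchor = contig[-k:]                  # last k bases act as the right-hand seed
    best = ""
    for read in reads:
        idx = read.find(anchor)
        if idx == -1:
            continue                      # read does not touch this contig end
        end = idx + k                     # read position just past the anchor
        overlap = min(len(contig), end)
        # The read must agree with the contig over the whole overlap.
        if contig[-overlap:] == read[end - overlap:end] and len(read) - end > len(best):
            best = read[end:]
    return best


def extend_left(contig, reads, k):
    """Return the longest extension any read offers before the contig's left end."""
    anchor = contig[:k]                   # first k bases act as the left-hand seed
    best = ""
    for read in reads:
        idx = read.find(anchor)
        if idx == -1:
            continue
        overlap = min(len(contig), len(read) - idx)
        if contig[:overlap] == read[idx:idx + overlap] and idx > len(best):
            best = read[:idx]
    return best


def progressive_assembly(reads, seed, k=12, max_cycles=50):
    """Grow a contig outward from a seed, one cycle at a time (toy version)."""
    if len(seed) < k:
        raise ValueError("seed must be at least k bases long")
    contig = seed
    for _ in range(max_cycles):
        left = extend_left(contig, reads, k)
        right = extend_right(contig, reads, k)
        if not left and not right:
            break                         # no read extends either end: stop
        contig = left + contig + right    # the new ends seed the next cycle
    return contig
```

Calling `progressive_assembly(reads, seed)` with a list of read strings and a seed of at least `k` bases returns the extended contig. Only reads that overlap the current contig ends contribute in a given cycle, which mirrors the abstract's point that few reads are used per step. A real tool would also have to handle sequencing errors, reverse complements, and repeats, all of which this sketch ignores.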

FAPESP's process: 10/04609-1 - GenSeed-HMM: development of a platform for sequence reconstruction and application on novel virus discovery
Grantee: André Luiz de Oliveira
Support Opportunities: Scholarships in Brazil - Master
FAPESP's process: 13/14622-3 - Comparative genomics of Trypanosomatidae
Grantee: João Marcelo Pereira Alves
Support Opportunities: Research Grants - Young Investigators Grants