(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

MARVEL, a Tool for Prediction of Bacteriophage Sequences in Metagenomic Bins

Author(s):
Amgarten, Deyvid [1] ; Braga, Lucas P. P. [1, 2] ; da Silva, Aline M. [1] ; Setubal, Joao C. [1, 3]
Total Authors: 4
Affiliation:
[1] Univ Sao Paulo, Dept Bioquim, Inst Quim, Sao Paulo - Brazil
[2] INRA, UMR 1347, Agroecol, Dijon - France
[3] Virginia Tech, Biocomplex Inst, Blacksburg, VA 24061 - USA
Total Affiliations: 3
Document type: Journal article
Source: FRONTIERS IN GENETICS; v. 9, AUG 7 2018.
Web of Science Citations: 13
Abstract

Here we present MARVEL, a tool for prediction of double-stranded DNA bacteriophage sequences in metagenomic bins. MARVEL uses a random forest machine learning approach. We trained the program on a dataset with 1,247 phage and 1,029 bacterial genomes, and tested it on a dataset with 335 bacterial and 177 phage genomes. We show that three simple genomic features extracted from contig sequences were sufficient to achieve a good performance in separating bacterial from phage sequences: gene density, strand shifts, and fraction of significant hits to a viral protein database. We compared the performance of MARVEL to that of VirSorter and VirFinder, two popular programs for predicting viral sequences. Our results show that all three programs have comparable specificity, but MARVEL achieves much better performance on the recall (sensitivity) measure. This means that MARVEL should be able to identify many more phage sequences in metagenomic bins than heretofore has been possible. In a simple test with real data, containing mostly bacterial sequences, MARVEL classified 58 out of 209 bins as phage genomes; other evidence suggests that 57 of these 58 bins are novel phage sequences. (AU)
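As a rough illustration of the three features named in the abstract, the sketch below computes them for a single contig from a list of predicted genes, and adds the two measures used in the comparison (recall and specificity). It is not MARVEL's code: the `Gene` record, the function names, and the exact normalizations (genes per kilobase, strand changes per gene, fraction of genes with a hit) are assumptions made for this example. The resulting three-value vector is the kind of input a random forest classifier would be trained on.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gene:
    """Hypothetical record for one predicted gene on a contig."""
    start: int        # start coordinate on the contig
    end: int          # end coordinate on the contig
    strand: str       # "+" or "-"
    viral_hit: bool   # True if the protein had a significant hit to a viral protein database


def contig_features(genes: List[Gene], contig_length: int) -> Tuple[float, float, float]:
    """Return (gene density, strand-shift rate, viral-hit fraction) for one contig.

    The normalizations are assumptions for illustration only.
    """
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    n = len(genes)
    if n == 0:
        return (0.0, 0.0, 0.0)
    ordered = sorted(genes, key=lambda g: g.start)
    # Gene density: number of genes per kilobase of contig.
    gene_density = n / (contig_length / 1000.0)
    # Strand shifts: changes of coding strand between consecutive genes, per gene.
    shifts = sum(1 for a, b in zip(ordered, ordered[1:]) if a.strand != b.strand)
    strand_shift_rate = shifts / n
    # Fraction of genes with a significant hit to the viral protein database.
    hit_fraction = sum(1 for g in ordered if g.viral_hit) / n
    return (gene_density, strand_shift_rate, hit_fraction)


def recall(true_pos: int, false_neg: int) -> float:
    """Recall (sensitivity): share of real phage sequences that were predicted as phage."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Specificity: share of bacterial sequences that were correctly not predicted as phage."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Toy contig of 2,000 bp with four genes (made-up values).
    toy = [
        Gene(1, 400, "+", True),
        Gene(450, 900, "+", True),
        Gene(950, 1400, "-", False),
        Gene(1450, 1900, "-", True),
    ]
    print(contig_features(toy, 2000))
```

The toy contig and its coordinates are invented for the example; real input would come from a gene predictor and a database search run on each bin.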

FAPESP's process: 14/16450-8 - Diversity analysis of bacteriophages associated to microbial community during the composting process
Grantee: Deyvid Emanuel Amgarten
Support Opportunities: Scholarships in Brazil - Master
FAPESP's process: 11/50870-6 - Studies of microbial diversity in the Zoological Park of the State of São Paulo
Grantee: João Carlos Setubal
Support Opportunities: BIOTA-FAPESP Program - Thematic Grants