(Reference retrieved automatically from Web of Science, based on the FAPESP funding information and the corresponding grant number that the authors included in the publication.)

MARVEL, a Tool for Prediction of Bacteriophage Sequences in Metagenomic Bins

Author(s):
Amgarten, Deyvid [1] ; Braga, Lucas P. P. [1, 2] ; da Silva, Aline M. [1] ; Setubal, Joao C. [1, 3]
Total number of authors: 4
Author affiliation(s):
[1] Univ Sao Paulo, Dept Bioquim, Inst Quim, Sao Paulo - Brazil
[2] INRA, UMR 1347, Agroecol, Dijon - France
[3] Virginia Tech, Biocomplex Inst, Blacksburg, VA 24061 - USA
Total number of affiliations: 3
Document type: Journal article
Source: FRONTIERS IN GENETICS; v. 9, AUG 7 2018.
Web of Science citations: 13
Abstract

Here we present MARVEL, a tool for prediction of double-stranded DNA bacteriophage sequences in metagenomic bins. MARVEL uses a random forest machine learning approach. We trained the program on a dataset with 1,247 phage and 1,029 bacterial genomes, and tested it on a dataset with 335 bacterial and 177 phage genomes. We show that three simple genomic features extracted from contig sequences were sufficient to achieve a good performance in separating bacterial from phage sequences: gene density, strand shifts, and fraction of significant hits to a viral protein database. We compared the performance of MARVEL to that of VirSorter and VirFinder, two popular programs for predicting viral sequences. Our results show that all three programs have comparable specificity, but MARVEL achieves much better performance on the recall (sensitivity) measure. This means that MARVEL should be able to identify many more phage sequences in metagenomic bins than heretofore has been possible. In a simple test with real data, containing mostly bacterial sequences, MARVEL classified 58 out of 209 bins as phage genomes; other evidence suggests that 57 of these 58 bins are novel phage sequences. (AU)
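As a rough illustration of the three features named in the abstract, the sketch below computes them for a single contig from a list of predicted genes. It is not MARVEL's code: the `Gene` record, the `contig_features` function, the `has_viral_hit` flag, and the exact normalizations (genes per kilobase, strand changes per adjacent gene pair) are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Gene:
    """Hypothetical record for one predicted gene on a contig."""
    start: int           # start coordinate on the contig
    end: int             # end coordinate on the contig
    strand: str          # "+" or "-"
    has_viral_hit: bool  # assumed flag: significant hit to a viral protein database


def contig_features(genes: List[Gene], contig_length: int) -> Dict[str, float]:
    """Compute three simple per-contig features (assumed definitions)."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")

    # Order genes by position so that "adjacent" means neighbors on the contig.
    ordered = sorted(genes, key=lambda g: g.start)
    n = len(ordered)

    # Gene density: predicted genes per kilobase of contig.
    gene_density = n / (contig_length / 1000)

    # Strand shifts: fraction of adjacent gene pairs on opposite strands.
    shifts = sum(1 for a, b in zip(ordered, ordered[1:]) if a.strand != b.strand)
    strand_shift_rate = shifts / (n - 1) if n > 1 else 0.0

    # Fraction of genes with a significant hit to the viral protein database.
    viral_hit_fraction = sum(g.has_viral_hit for g in ordered) / n if n else 0.0

    return {
        "gene_density": gene_density,
        "strand_shift_rate": strand_shift_rate,
        "viral_hit_fraction": viral_hit_fraction,
    }


# Toy contig with made-up coordinates, for illustration only.
example_genes = [
    Gene(1, 300, "+", True),
    Gene(350, 900, "+", False),
    Gene(950, 1500, "-", True),
    Gene(1600, 2000, "+", False),
]
example_features = contig_features(example_genes, contig_length=2000)
```

A feature vector of this kind, one per contig or bin, is the sort of input a random forest classifier could be trained on to separate phage from bacterial sequences.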

FAPESP grant: 14/16450-8 - Analysis of bacteriophage diversity associated with the microbial community during the composting process
Grantee: Deyvid Emanuel Amgarten
Support type: Scholarships in Brazil - Master's
FAPESP grant: 11/50870-6 - Studies of microbial diversity at the São Paulo State Zoological Park
Grantee: João Carlos Setubal
Support type: Research Grants - BIOTA Program - Thematic