Development of a bioinformatics pipeline for the i... - BV FAPESP

Development of a bioinformatics pipeline for the identification of emerging infectious diseases in patients submitted to chronic transfusion

Author(s):
Ian Nunes Valença
Total Authors: 1
Document type: Master's Dissertation
Press: Ribeirão Preto.
Institution: Universidade de São Paulo (USP). Faculdade de Medicina de Ribeirão Preto (PCARP/BC)
Defense date:
Examining board members:
Svetoslav Nanev Slavov; Marta Giovanetti; Flávia Leite Souza Santos
Advisor: Svetoslav Nanev Slavov
Abstract

The emergence of new infectious diseases has been a cause for concern for several decades. Currently, many conditions, such as the high density of animals raised for meat consumption, increased air travel, and the rate of deforestation, contribute to a reality in which new diseases may emerge in increasingly short periods of time, especially in countries located in tropical regions, where viruses of zoonotic origin find even more favorable conditions to emerge because of the very close contact between the population and the environment. In this scenario, the threat posed by the emergence of these viruses challenges biosafety measures in health systems, including blood banks, because of the possibility of transfusion transmission of these agents. This study aims to develop and improve a pipeline for the computational analysis of Next Generation Sequencing (NGS) data from patients with hemoglobinopathies undergoing a chronic transfusion regimen. The pipeline successfully completed the main steps of bioinformatics analysis using the following programs: quality control (FastQC), filtration (Trimmomatic, Prinseq, Deconseq), classification and assembly (Kraken 2, CLARK, SPAdes, BLASTn, BLASTx), and phylogenetic analysis (MAFFT, IQ-Tree, Tree-Puzzle, Bioedit, FigTree). In this project, we evaluated 75 samples from patients with hemoglobinopathies who underwent a chronic transfusion regimen (30 from patients with beta-thalassemia and 45 from patients with sickle cell anemia), 76 samples from blood donors as a control group, and 5 samples from patients with hemophilia treated with plasma factors. The implemented pipeline identified two predominant viral families: Anelloviridae and Flaviviridae. First, anelloviruses were the most represented viruses in patients with beta-thalassemia. Second, we identified the flavivirus human pegivirus 1, or HPgV-1 (formerly referred to as GBV-C or hepatitis G virus).
In the group of patients with beta-thalassemia, we identified hepatitis B and hepatitis C viruses. In the group of patients with sickle cell anemia, HPgV-1 was the most represented virus. It was also possible to identify contaminating agents such as plasmids, most of which corresponded to cloning vectors used in the laboratory. Phylogenetic analysis confirmed the most prevalent genotypes of each of the identified viral groups. There were no significant differences between the viral families identified in the groups of poly-transfused patients and blood donors. (AU)
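
The classification step described in the abstract, and the comparison of viral families between groups, could be summarized roughly as in the sketch below. This is an illustration only, not the dissertation's code: it assumes classifier output in the standard six-column, tab-separated Kraken 2 report layout (percentage, clade reads, direct reads, rank code, taxid, name). The function names, the group labels, and the toy report with its read counts and placeholder taxids are all hypothetical and are not results from the study.

```python
from collections import defaultdict

# Stages and tools exactly as listed in the abstract.
PIPELINE_STAGES = {
    "quality_control": ["FastQC"],
    "filtration": ["Trimmomatic", "Prinseq", "Deconseq"],
    "classification_and_assembly": ["Kraken 2", "CLARK", "SPAdes", "BLASTn", "BLASTx"],
    "phylogenetic_analysis": ["MAFFT", "IQ-Tree", "Tree-Puzzle", "Bioedit", "FigTree"],
}


def family_read_counts(report_text):
    """Return {family name: clade read count} from one report.

    Assumes six tab-separated columns per line; only rows whose rank
    code is exactly "F" (family) are kept.
    """
    counts = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        if fields[3].strip() != "F":
            continue  # not a family-level row
        counts[fields[5].strip()] = int(fields[1])
    return counts


def families_by_group(reports_by_group):
    """Sum family-level read counts over all reports in each group.

    reports_by_group maps a group label to a list of report texts.
    """
    totals = {}
    for group, reports in reports_by_group.items():
        group_totals = defaultdict(int)
        for report in reports:
            for family, reads in family_read_counts(report).items():
                group_totals[family] += reads
        totals[group] = dict(group_totals)
    return totals


# Hypothetical toy report: invented read counts, taxid column set to 0.
TOY_REPORT = "\n".join([
    "10.00\t10\t10\tU\t0\tunclassified",
    "60.00\t60\t0\tF\t0\t      Anelloviridae",
    "55.00\t55\t55\tG\t0\t        Alphatorquevirus",
    "30.00\t30\t0\tF\t0\t      Flaviviridae",
])

if __name__ == "__main__":
    summary = families_by_group({"toy_group": [TOY_REPORT]})
    for group, families in summary.items():
        print(group, families)
```

Keeping the per-report parser separate from the per-group aggregation means the same parser can be reused for any cohort (patients, donors) before the family-level totals are compared.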

FAPESP's process: 19/07861-8 - Development of a bioinformatic pipeline for identification of emerging infectious diseases among patients in regimen of chronic transfusion
Grantee: Ian Nunes Valença
Support Opportunities: Scholarships in Brazil - Master