Development and validation of an integrated system for analysis of next generation sequencing (NGS) data, using cystic fibrosis as a model of study

Grant number: 16/01022-6
Support type: Research Grants - Innovative Research in Small Business - PIPE
Duration: September 01, 2016 - May 31, 2017
Field of knowledge: Biological Sciences - Genetics
Principal Investigator: Patricia de Campos Pieri
Grantee: Patricia de Campos Pieri
Company: Pensabio Instrumentos de Biotecnologia Ltda
City: São Paulo
Co-Principal Investigators: Raul Torrieri
Assoc. researchers: Antonio Marcondes Lerario; Danuza Rossi

Abstract

Next generation sequencing (NGS) has revolutionized genetic analysis, assisting in the molecular characterization of various diseases. However, analyzing the massive amount of information generated in a single experiment has limited its implementation in diagnostic laboratories. NGS data are usually analyzed on large-capacity computers running the Linux operating system, using academic software operated from the command line. Although extremely flexible, Linux command lines are not intuitive, require prior knowledge, and leave room for error. This reality poses enormous challenges for the reproducibility and traceability of analyses. The objective of this project is to create and validate a computational tool that integrates every step and software package required for a complete analysis of NGS data, using the CFTR gene (associated with cystic fibrosis) as a model. The tool should allow the systematization of analyses involving multiple samples, the traceability of all stages, and the optimization of available computing resources. Cystic fibrosis (CF) was chosen as a model because it is a serious and incurable disease with a high therapeutic cost, which justifies neonatal screening policies, and because earlier diagnosis leads to a better prognosis. The diagnosis of CF is clinical, but requires confirmation either by the identification of mutations in both alleles or by the "sweat test", which identifies only the typical cases. The molecular diagnosis available today is based on screening for a limited number of mutations. The selection of these mutations affects detection power in ethnically heterogeneous populations such as Brazil's, and it omits recognized clinically relevant variants such as the intronic VNTR TG(n)T(n), which is difficult to analyze.
Sequencing the entire CFTR gene by the traditional Sanger method, the standard for identifying and confirming mutations, has several limitations: a large number of PCR reactions is needed to study the full gene (~40-60 reactions/sample), the analysis is laborious and depends entirely on a technically trained professional, and the cost is high. NGS technology overcomes these difficulties by allowing the entire gene to be sequenced in multiple samples in a single run. However, this technology still poses challenges for data analysis, demonstrating the need for a bioinformatics pipeline and an integrated tool, especially for the analysis of VNTR regions in multiple samples. Thus, the development of a commercial NGS-based test that overcomes the technical and clinical challenges imposed by the tools and resources available today is essential for implementing diagnosis and neonatal screening policies. Therefore, this project aims to develop a genetic panel targeting the 27 exons of the CFTR gene, its promoter region, the IVS22 region, and the TG(n)T(n) VNTR. The results of sequencing this gene in 192 patients, obtained on a MiSeq (Illumina), will be used to create and validate an integrated bioinformatics tool capable of simultaneously processing all the data generated and delivering an easily interpreted report at the end. Thus, this research project aims to develop, optimize, and validate an analytical pipeline capable of delivering "results in 1-click", not just for CFTR but also as a diagnostic tool for systematizing any NGS data analysis. (AU)