
Open source software statistical tools to aid in analyzing and integrating large cancer epigenomic datasets in order to decipher and understand regulatory networks


Advances in DNA sequencing technologies have recently uncovered genomic and epigenomic features in coding and non-coding DNA. Large multinational consortia (The Cancer Genome Atlas (TCGA), NIH Roadmap and ENCODE) have spent millions of US dollars in the hope of advancing our understanding of the human genome across commonly used research cell lines (e.g. MCF-7, HMEC), primary normal tissues (e.g. human stem cells) and disease tissues (e.g. brain cancer). The multi-dimensional genomic data consist of more than 10,000 experiments (>100 terabases of data from thousands of assays, ranging from whole-genome sequencing and RNA-seq to ChIP-seq and Methyl-seq) profiled across more than 10,000 cell lines/tissues. All of these data have been deposited in the public domain, providing an invaluable resource for research laboratories, which can compare and contrast these genomic and epigenomic features with their own sequencing experiments. Despite their wide availability, the data are deposited in different repositories and formats, making it a challenge to locate and identify relevant features. Many computational researchers, from novice to advanced and including our own team, have successfully harnessed some of these freely available data and, through advanced integration and scientific insight, identified biologically relevant epigenomic changes (Berman et al. Nature Genetics 2012, Coetzee et al. Nucleic Acids Research 2012 and Noushmehr et al. Springer 2013). However, among the many issues facing most researchers is the lack of proper bioinformatic tools or skills to effectively integrate their own sequencing data with these invaluable public datasets. In partnership with our national collaborators (Life Science/Health co-PI), we will generate more than 200 methylomic and transcriptomic datasets.
With our international collaborators, we will develop automated tools for unifying the various gene regulatory databases, and develop powerful yet user-friendly methylation pipelines using the open-source R/Bioconductor framework and the web-based RStudio Shiny system. Standard workflows will use the methods we have developed for the TCGA, Roadmap and ENCODE projects to import and analyze large numbers of raw methylation data files from either the Illumina Infinium or Bisulfite-seq platforms. We will also allow import of arbitrary sample metadata so that users can perform two-way or multi-way comparisons between cancer subtypes or clinical covariates. Our workflows will be driven by the most current understanding of the chromatin landscape, which includes using histone modification and DNase hypersensitivity data to define focal chromatin state. Recent work by our lab and others suggests that methylation changes at cis-regulatory elements such as enhancers and insulators are driven primarily by the binding of individual transcription factors, and thus reflect direct targeting of genes by specific transcriptional networks. We will use combined ChIP-seq and DNA binding motif analyses available from ENCODE to analyze user methylation data at the level of the individual protein/DNA interaction site. Finally, because the success of this effort will be measured by the degree of adoption within the cancer genomics community, we will engage several large-scale cancer genomics groups to act as beta testers and help us improve our workflows. (AU)


Scientific publications (14)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
COLAPRICO, ANTONIO; SILVA, TIAGO C.; OLSEN, CATHARINA; GAROFANO, LUCIANO; CAVA, CLAUDIA; GAROLINI, DAVIDE; SABEDOT, THAIS S.; MALTA, TATHIANE M.; PAGNOTTA, STEFANO M.; CASTIGLIONI, ISABELLA; et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Research, v. 44, n. 8, . (15/02844-7, 14/02245-3, 15/07925-5)
MALTA, TATHIANE M.; DE SOUZA, CAMILA F.; SABEDOT, THAIS S.; SILVA, TIAGO C.; MOSELLA, MARITZA S.; KALKANIS, STEVEN N.; SNYDER, JAMES; CASTRO, ANA VALERIA B.; NOUSHMEHR, HOUTAN. Glioma CpG island methylator phenotype (G-CIMP): biological and clinical implications. NEURO-ONCOLOGY, v. 20, n. 5, p. 608-620, . (16/06488-3, 16/15485-8, 14/08321-3, 16/12329-5, 16/10436-9, 15/07925-5, 16/01975-3, 14/02245-3, 16/01389-7, 16/11039-3)
GARCIA-ROSA, SHEILA; TRIVELLA, DANIELA B. B.; MARQUES, VANESSA D.; SERAFIM, RODOLFO B.; PEREIRA, JOSE G. C.; LORENZI, JULIO C. C.; MOLFETTA, GREICE A.; CHRISTO, PAULO P.; OLIVAL, GUILHERME S.; MARCHITTO, VANIA B. T.; et al. A non-functional galanin receptor-2 in a multiple sclerosis patient. PHARMACOGENOMICS JOURNAL, v. 19, n. 1, p. 72-82, . (15/07925-5, 16/06488-3, 13/24293-7)
CECCARELLI, MICHELE; BARTHEL, FLORIS P.; MALTA, TATHIANE M.; SABEDOT, THAIS S.; SALAMA, SOFIE R.; MURRAY, BRADLEY A.; MOROZOVA, OLENA; NEWTON, YULIA; RADENBAUGH, AMIE; PAGNOTTA, STEFANO M.; et al. Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma. Cell, v. 164, n. 3, p. 550-563, . (14/08321-3, 15/02844-7, 14/02245-3, 15/07925-5)
DE SOUZA, CAMILA FERREIRA; SABEDOT, THAIS S.; MALTA, TATHIANE M.; STETSON, LINDSAY; MOROZOVA, OLENA; SOKOLOV, ARTEM; LAIRD, PETER W.; WIZNEROWICZ, MACIEJ; IAVARONE, ANTONIO; SNYDER, JAMES; et al. A Distinct DNA Methylation Shift in a Subset of Glioma CpG Island Methylator Phenotypes during Tumor Recurrence. CELL REPORTS, v. 23, n. 2, p. 637-651, . (14/03989-6, 16/15485-8, 16/06488-3, 14/08321-3, 16/12329-5, 16/01975-3, 14/02245-3, 15/07925-5)
GUSEV, ALEXANDER; LAWRENSON, KATE; LIN, XIANZHI; LYRA, JR., PAULO C.; KAR, SIDDHARTHA; VAVRA, KEVIN C.; SEGATO, FELIPE; FONSECA, MARCOS A. S.; LEE, JANET M.; PEJOVIC, TANYA; et al. A transcriptome-wide association study of high-grade serous epithelial ovarian cancer identifies new susceptibility genes and splice variants. Nature Genetics, v. 51, n. 5, p. 815+, . (15/07925-5)
LAWRENSON, KATE; SONG, FENGJU; HAZELETT, DENNIS J.; KAR, SIDDHARTHA P.; TYRER, JONATHAN; PHELAN, CATHERINE M.; CORONA, ROSARIO I.; RODRIGUEZ-MALAVE, NORMA I.; SEO, JI-HEI; ADLER, EMILY; et al. Genome-wide association studies identify susceptibility loci for epithelial ovarian cancer in east Asian women. GYNECOLOGIC ONCOLOGY, v. 153, n. 2, p. 343-355, . (15/07925-5)
SILVA, TIAGO C.; COETZEE, SIMON G.; GULL, NICOLE; YAO, LIJING; HAZELETT, DENNIS J.; NOUSHMEHR, HOUTAN; LIN, DE-CHEN; BERMAN, BENJAMIN R.. ELMER v.2: an R/Bioconductor package to reconstruct gene regulatory networks from DNA methylation and transcriptome profiles. Bioinformatics, v. 35, n. 11, p. 1974-1977, . (16/01389-7, 15/07925-5)
MELISO, FABIANA M.; MICALI, DANILO; SILVA, CAMILA T.; SABEDOT, THAIS S.; COETZEE, SIMON G.; KOCH, ADRIAN; FAHLBUSCH, FABIAN B.; NOUSHMEHR, HOUTAN; SCHNEIDER-STOCK, REGINE; JASIULIONIS, MIRIAM G.. SIRT1 regulates Mxd1 during malignant melanoma progression. ONCOTARGET, v. 8, n. 70, p. 114540-114553, . (16/06488-3, 15/07925-5, 11/12306-1, 14/13663-0)
MOUNIR, MOHAMED; LUCCHETTA, MARTA; SILVA, TIAGO C.; OLSEN, CATHARINA; BONTEMPI, GIANLUCA; CHEN, XI; NOUSHMEHR, HOUTAN; COLAPRICO, ANTONIO; PAPALEO, ELENA. New functionalities in the TCGAbiolinks package for the study and integration of cancer data from GDC and GTEx. PLOS COMPUTATIONAL BIOLOGY, v. 15, n. 3, . (16/01389-7, 15/07925-5)
ALDAPE, KENNETH; AMIN, SAMIRKUMAR B.; ASHLEY, DAVID M.; BARNHOLTZ-SLOAN, JILL S.; BATES, AMANDA J.; BEROUKHIM, RAMEEN; BOCK, CHRISTOPH; BRAT, DANIEL J.; CLAUS, ELIZABETH B.; COSTELLO, JOSEPH F.; et al. Glioma through the looking GLASS: molecular evolution of diffuse gliomas and the Glioma Longitudinal Analysis Consortium. NEURO-ONCOLOGY, v. 20, n. 7, p. 873-884, . (16/15485-8, 14/08321-3, 15/07925-5)
MAZOR, TALI; CHESNELONG, CHARLES; PANKOV, ALEKSANDR; JALBERT, LLEWELLYN E.; HONG, CHIBO; HAYES, JOSIE; SMIRNOV, IVAN V.; MARSHALL, ROXANNE; SOUZA, CAMILA F.; SHEN, YAOQING; et al. Clonal expansion and epigenetic reprogramming following deletion or amplification of mutant IDH1. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, v. 114, n. 40, p. 10743-10748, . (16/15485-8, 14/08321-3, 15/07925-5)
MALTA, TATHIANE M.; SOKOLOV, ARTEM; GENTLES, ANDREW J.; BURZYKOWSKI, TOMASZ; POISSON, LAILA; WEINSTEIN, JOHN N.; KAMINSKA, BOZENA; HUELSKEN, JOERG; OMBERG, LARSSON; GEVAERT, OLIVIER; et al. Machine Learning Identifies Stemness Features Associated with Oncogenic Dedifferentiation. Cell, v. 173, n. 2, p. 338+, . (16/06488-3, 14/08321-3, 15/07925-5, 16/01975-3, 16/01389-7, 16/15485-8, 14/02245-3, 16/10436-9, 16/12329-5)
