
Gene selection and Outliers in microarray data

Grant number: 12/15751-9
Support Opportunities: Scholarships abroad - Research Internship - Doctorate
Start date: February 01, 2013
End date: January 31, 2014
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Roseli Aparecida Francelin Romero
Grantee: Pablo Andretta Jaskowiak
Supervisor: Jörg Sander
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Institution abroad: University of Alberta, Canada
Associated to the scholarship: 11/04247-5 - Gene Selection and Clustering Validation in Gene Expression Data, BP.DR

Abstract

Microarray technology enables the expression levels of thousands of genes to be measured in parallel. The genomic picture obtained with microarrays may help researchers gain knowledge and insight into diverse biological phenomena. Microarray data, also known as gene expression data, support different kinds of analysis, such as the classification of patient samples, usually into different types or subtypes of cancer. Because gene expression datasets comprise a large number of genes (features) and a small number of patient samples (objects), gene selection methods are essential to their analysis. Even though various gene selection methods have recently been introduced in the gene expression literature, several aspects of the problem remain poorly explored. With this in mind, this project focuses mainly on two aspects of the gene selection task. The first concerns the development and analysis of hybrid gene selection methods that combine the advantages of filters and wrappers. The second concerns outliers and their effect on the stability of gene selection methods, i.e., how outlier samples affect the subsets of genes selected from datasets of a particular cancer type. In the realm of biological data, the stability of such processes is fundamental not only for identifying cancer biomarkers but also for establishing the degree of reliability of, i.e., the confidence in, gene selection methods. This is a central motivation of the present research proposal. (AU)

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
JASKOWIAK, PABLO A.; CAMPELLO, RICARDO J. G. B.; COSTA, IVAN G. On the selection of appropriate distances for gene expression data clustering. BMC Bioinformatics, v. 15, n. 2. (11/04247-5, 12/15751-9)
JASKOWIAK, PABLO A.; MOULAVI, DAVOUD; FURTADO, ANTONIO C. S.; CAMPELLO, RICARDO J. G. B.; ZIMEK, ARTHUR; SANDER, JOERG. On strategies for building effective ensembles of relative clustering validity criteria. Knowledge and Information Systems, v. 47, n. 2, p. 329-354. (12/15751-9, 10/20032-6, 11/04247-5)