(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

A comparative study of statistical methods used to identify dependencies between gene expression signals

Author(s):
Santos, Suzana de Siqueira ; Takahashi, Daniel Yasumasa ; Nakata, Asuka ; Fujita, Andre [1]
Total Authors: 4
Affiliation:
[1] Univ Sao Paulo, Dept Comp Sci, Inst Math & Stat, BR-05508090 Sao Paulo, SP - Brazil
Total Affiliations: 1
Document type: Journal article
Source: BRIEFINGS IN BIOINFORMATICS; v. 15, n. 6, p. 906-918, NOV 2014.
Web of Science Citations: 29
Abstract

One major task in molecular biology is to understand the dependencies among genes in order to model gene regulatory networks. Pearson's correlation is the most common method used to measure dependence between gene expression signals, but it works well only when the data are linearly associated. For other types of association, such as non-linear or non-functional relationships, methods based on rank correlation and information theory are more adequate than Pearson's correlation, but they are used less often in applications, most probably because of a lack of clear guidelines for their use. This work summarizes the main methods (Pearson's, Spearman's and Kendall's correlations; distance correlation; Hoeffding's D measure; Heller-Heller-Gorfine measure; mutual information and maximal information coefficient) used to identify dependency between random variables, especially in gene expression data, and evaluates the strengths and limitations of each method. Systematic Monte Carlo simulation analyses covering sample size, local dependence, and linear, non-linear and non-functional relationships are presented. Moreover, the methods are compared on actual gene expression data. Finally, we provide a suggested list of methods that can be used for each type of data set. (AU)
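
As a rough illustration of the abstract's central point, the sketch below compares three of the listed measures on toy data. It is not the authors' code or simulation design: the data (`x`, `y_exp`, `y_quad`), the function names and the sample size are assumptions made for this example, and the distance correlation is the plain biased sample estimate written with the standard library only.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def _centered_distances(v):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]

def distance_correlation(x, y):
    """Biased sample distance correlation for one-dimensional samples."""
    a, b = _centered_distances(x), _centered_distances(y)
    n = len(x)

    def mean_prod(p, q):
        return sum(p[i][j] * q[i][j]
                   for i in range(n) for j in range(n)) / (n * n)

    denom = math.sqrt(mean_prod(a, a) * mean_prod(b, b))
    if denom == 0:
        return 0.0
    # Clamp at zero to absorb floating-point round-off.
    return math.sqrt(max(mean_prod(a, b), 0.0) / denom)

# Toy data (hypothetical, not gene expression measurements).
x = [i / 50 for i in range(-50, 51)]
y_exp = [math.exp(5 * a) for a in x]   # monotonic but non-linear
y_quad = [a * a for a in x]            # non-monotonic

for name, y in (("exp(5x)", y_exp), ("x^2", y_quad)):
    print(name, pearson(x, y), spearman(x, y), distance_correlation(x, y))
```

On the monotonic relationship the rank-based measure reaches its maximum while Pearson's correlation does not; on the symmetric parabola both Pearson's and Spearman's correlations are essentially zero, whereas the distance correlation remains clearly positive.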

FAPESP's process: 13/03447-6 - Combinatorial structures, optimization, and algorithms in theoretical Computer Science
Grantee: Carlos Eduardo Ferreira
Support Opportunities: Research Projects - Thematic Grants
FAPESP's process: 12/25417-9 - Development of statistical and computational methods for the analysis of graphs with applications in biological networks
Grantee: Suzana de Siqueira Santos
Support Opportunities: Scholarships in Brazil - Master
FAPESP's process: 11/07762-8 - Granger causality for sets of time series: development of methodologies to model selection and extensions in the frequency domain with applications to molecular biology and neuroscience
Grantee: André Fujita
Support Opportunities: Regular Research Grants
FAPESP's process: 11/50761-2 - Models and methods of e-Science for life and agricultural sciences
Grantee: Roberto Marcondes Cesar Junior
Support Opportunities: Research Projects - Thematic Grants