(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

A systematic comparative evaluation of biclustering techniques

Author(s):
Padilha, Victor A. ; Campello, Ricardo J. G. B.
Total Authors: 2
Document type: Journal article
Source: BMC Bioinformatics; v. 18, JAN 23 2017.
Web of Science Citations: 23
Abstract

Background: Biclustering techniques are capable of simultaneously clustering rows and columns of a data matrix. These techniques have become very popular for the analysis of gene expression data, since a gene can take part in multiple biological pathways, which in turn may be active only under specific experimental conditions. Several biclustering algorithms have been developed in recent years. In order to provide guidance regarding their choice, a few comparative studies were conducted and reported in the literature. In these studies, however, the performances of the methods were evaluated through external measures that have more recently been shown to have undesirable properties. Furthermore, they considered a limited number of algorithms and datasets.

Results: We conducted a broader comparative study involving seventeen algorithms, which were run on three synthetic data collections and two real data collections with a more representative number of datasets. For the experiments with synthetic data, five different experimental scenarios were studied: different levels of noise, different numbers of implanted biclusters, different levels of symmetric bicluster overlap, different levels of asymmetric bicluster overlap and different bicluster sizes, for which the results were assessed with more suitable external measures. For the experiments with real datasets, the results were assessed by gene set enrichment and clustering accuracy.

Conclusions: We observed that each algorithm achieved satisfactory results in some of the biclustering tasks in which it was investigated. The choice of the best algorithm for a given application thus depends on the task at hand and the types of patterns that one wants to detect. (AU)
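To make the synthetic-data setup in the abstract concrete, the following is a minimal Python sketch of implanting a single bicluster in a noisy matrix and scoring a recovered bicluster against it. It is an illustration only: the constant-value bicluster model, the function names (implant_bicluster, cells, jaccard) and the cell-level Jaccard index are assumptions made here, and they are not the data generators or the external measures used in the study.

```python
import random


def implant_bicluster(n_rows, n_cols, rows, cols, value=5.0, noise=0.1, seed=0):
    """Build a matrix of Gaussian background noise and implant a
    constant-valued bicluster (plus small noise) at the given rows/columns.
    The constant model is an assumption chosen for illustration."""
    rng = random.Random(seed)
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
    for i in rows:
        for j in cols:
            matrix[i][j] = value + rng.gauss(0.0, noise)
    return matrix


def cells(rows, cols):
    """Represent a bicluster as the set of matrix cells it covers."""
    return {(i, j) for i in rows for j in cols}


def jaccard(bic_a, bic_b):
    """Cell-level Jaccard index between two biclusters, each given as a
    (rows, cols) pair. A simple stand-in for an external measure."""
    a, b = cells(*bic_a), cells(*bic_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Ground-truth bicluster: 4 rows x 3 columns.
truth = ({0, 1, 2, 3}, {0, 1, 2})
matrix = implant_bicluster(10, 8, *truth)

# Hypothetical output of some biclustering algorithm: 3 rows x 4 columns.
found = ({0, 1, 2}, {0, 1, 2, 3})

# 9 shared cells out of 15 covered in total.
score = jaccard(truth, found)
```

A score of 1.0 means the recovered bicluster covers exactly the implanted cells; partial overlaps, as in this example, score lower.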

FAPESP's process: 13/18698-4 - Methods and algorithms in unsupervised and semi-supervised machine learning
Grantee: Ricardo José Gabrielli Barreto Campello
Support type: Regular Research Grants
FAPESP's process: 14/08840-0 - Systematic evaluation of bi-clustering techniques
Grantee: Victor Alexandre Padilha
Support type: Scholarships in Brazil - Master