Systematic evaluation of bi-clustering techniques

Grant number: 14/08840-0
Support type: Scholarships in Brazil - Master
Effective date (Start): August 01, 2014
Effective date (End): April 30, 2016
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Ricardo José Gabrielli Barreto Campello
Grantee: Victor Alexandre Padilha
Home Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil

Abstract

Cluster analysis is a fundamental problem in unsupervised machine learning, where the objective is to determine categories that describe a set of objects according to their similarities and inter-relationships. In the traditional formulation of the problem, one seeks partitions, or hierarchies of partitions, containing groups whose objects are in some way similar to one another and dissimilar to objects of other groups. Similarity is assessed by some direct or indirect measure of (dis)similarity that takes into account all the attributes describing the objects in the database under analysis. It is thus assumed that all groups are characterized in the same space, that is, according to the same attributes. However, despite decades of successful applications, there are situations in which the nature of the groups contained in the data cannot be represented by this type of formulation. In particular, groups of objects may be characterized as such only under a subset of the attributes that describe them, and this subset can be different for each group.
Unlike traditional clustering algorithms, bi-clustering algorithms are able to cluster the rows and columns of an n x d data matrix simultaneously, where the matrix can represent a set of n objects described as d-dimensional attribute vectors. Such algorithms produce bi-clusters formed by subsets of objects and subsets of attributes that are strongly correlated in some sense. These algorithms began to attract the attention of the scientific community when the importance of bi-clustering became evident in problems such as gene expression data analysis in bioinformatics and the analysis of transactions in recommender systems, among others.
This project aims to conduct a comprehensive comparative study involving a wide range of bi-clustering algorithms and a representative collection of both real and simulated application scenarios, with particular emphasis on the analysis of gene expression data.
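
The idea of a group that is coherent only under a subset of attributes can be illustrated with a minimal sketch. The abstract does not name any specific coherence criterion; the mean squared residue used here is one widely used choice and is an assumption of this example, as are the toy matrix and the function name `mean_squared_residue`.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows x cols.

    A value of 0 means the selected rows differ from one another only
    by a constant shift across the selected columns.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical 4 x 4 data matrix (n = 4 objects, d = 4 attributes).
# Objects 0-2 follow a shifted common pattern on attributes 0-1 only.
data = [
    [1.0, 2.0, 9.0, 0.0],
    [3.0, 4.0, 1.0, 7.0],
    [5.0, 6.0, 4.0, 2.0],
    [8.0, 0.0, 3.0, 5.0],
]

# Candidate bi-cluster: a subset of objects and a subset of attributes.
bicluster_score = mean_squared_residue(data, [0, 1, 2], [0, 1])

# The same measure over all objects and all attributes.
full_score = mean_squared_residue(data, [0, 1, 2, 3], [0, 1, 2, 3])

print(bicluster_score, full_score)
```

In this toy matrix the candidate bi-cluster scores far lower than the full matrix, which is the situation described above: the group is visible under two of the attributes but not when all attributes are considered.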

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
PADILHA, VICTOR A.; CAMPELLO, RICARDO J. G. B. A systematic comparative evaluation of biclustering techniques. BMC Bioinformatics, v. 18, JAN 23 2017. Web of Science Citations: 22.
Academic Publications
(References retrieved automatically from State of São Paulo Research Institutions)
PADILHA, Victor Alexandre. A systematic comparative evaluation of biclustering techniques. 2016. Master's Dissertation - Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação São Carlos.
