
Enumerative algorithms for biclustering: expanding and exploring their potential in bioinformatics and neuroscience

Grant number: 17/21174-8
Support type: Scholarships in Brazil - Post-Doctorate
Effective date (Start): January 01, 2018
Effective date (End): December 31, 2020
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Fernando José von Zuben
Grantee: Rosana Veroneze
Home Institution: Faculdade de Engenharia Elétrica e de Computação (FEEC). Universidade Estadual de Campinas (UNICAMP). Campinas, SP, Brazil
Associated research grant: 13/07559-3 - BRAINN - The Brazilian Institute of Neuroscience and Neurotechnology, AP.CEPID


Biclustering has proven to be a powerful data analysis technique, with success in a variety of application domains. During the doctoral research of the scholarship candidate, we proposed a family of biclustering algorithms with unique properties and features that have not yet been properly explored in application areas where they could contribute substantially. The algorithms in this family can enumerate all maximal biclusters with (i) constant values on columns, (ii) constant values on rows, and (iii) coherent values in numerical data matrices. We also extended one of these algorithms to enumerate biclusters in mixed-data matrices (which may contain numerical and/or categorical attributes). This new algorithm has additional, more general features but retains the performance characteristics of its predecessor. To broaden the scope of application of these algorithms, the first two challenges of this project are to make them more computationally efficient and to implement them for parallel execution on Graphics Processing Units (GPUs). Because enumerative algorithms may return a very large number of biclusters, not all of which are relevant to the data analysis, the second major focus of this project is the selection and ranking of biclusters, so that we can provide the user with a compact set of biclusters that combines high relevance with low redundancy. We will also further explore the connection between biclustering and frequent pattern mining to develop rule-based classifiers, more specifically classifiers based on associations drawn from the biclusters. Finally, the last major objective of this project is to explore the application potential of this family of algorithms, focusing on the analysis of gene expression data and brain activity data. For the first case, we have a partnership with Professor Raquel Scarel-Caminaga from FOAr-UNESP. The second case is linked to the CEPID project BRAINN (Brazilian Institute of Neuroscience and Neurotechnology), and there is already a recent history of partnership with other groups linked to BRAINN.
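
The three bicluster types named in the abstract can be illustrated with a minimal Python sketch. It is not the project's enumerative algorithm: it only checks whether a given choice of rows and columns forms a bicluster of each type. The function names, the tolerance parameter `eps`, and the toy matrix are hypothetical, and "coherent values" is read here as the additive (shifting) model, which is an assumption, since the abstract does not define it.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def submatrix(data: Matrix, rows: Sequence[int], cols: Sequence[int]) -> Matrix:
    """Extract the values at the selected rows and columns."""
    return [[data[r][c] for c in cols] for r in rows]


def is_constant_on_columns(data: Matrix, rows: Sequence[int],
                           cols: Sequence[int], eps: float = 0.0) -> bool:
    """Each selected column spans at most eps over the selected rows.

    Assumes rows and cols are non-empty (illustrative helper only).
    """
    sub = submatrix(data, rows, cols)
    for j in range(len(cols)):
        column = [row[j] for row in sub]
        if max(column) - min(column) > eps:
            return False
    return True


def is_constant_on_rows(data: Matrix, rows: Sequence[int],
                        cols: Sequence[int], eps: float = 0.0) -> bool:
    """Each selected row spans at most eps over the selected columns."""
    sub = submatrix(data, rows, cols)
    return all(max(row) - min(row) <= eps for row in sub)


def is_coherent_additive(data: Matrix, rows: Sequence[int],
                         cols: Sequence[int], eps: float = 0.0) -> bool:
    """Assumed additive model: each row is the first row plus one offset.

    For every row, the differences to the first selected row must be
    (nearly) the same in all selected columns.
    """
    sub = submatrix(data, rows, cols)
    base = sub[0]
    for row in sub:
        diffs = [a - b for a, b in zip(row, base)]
        if max(diffs) - min(diffs) > eps:
            return False
    return True


# Toy data matrix (hypothetical values chosen for illustration).
data = [
    [1, 5, 9, 2],
    [1, 5, 9, 7],
    [3, 7, 11, 0],
    [4, 4, 4, 4],
]

const_cols = is_constant_on_columns(data, [0, 1], [0, 1, 2])
const_rows = is_constant_on_rows(data, [0, 1], [0, 1, 2])
coherent = is_coherent_additive(data, [0, 1, 2], [0, 1, 2])
coherent_all_cols = is_coherent_additive(data, [0, 1], [0, 1, 2, 3])
```

In this toy matrix, rows 0 and 1 over columns 0 to 2 are constant on columns but not on rows, and rows 0 to 2 over the same columns satisfy the additive model because row 2 equals row 0 plus 2. An enumerative algorithm would search for all maximal row and column sets that pass such a check, rather than testing one candidate at a time.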