Gene Networks Inference by Reinforcement Learning

Bonini, Rodrigo Cesar; Martins Jr., David Correa

Author(s):	Bonini, Rodrigo Cesar ; Martins Jr., David Correa. Total Authors: 2
Document type:	Journal article
Source:	ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, BSB 2023; v. 13954, 12 pp., 2023-01-01.
Abstract
Gene Regulatory Network inference from gene expression data is an important problem in the systems biology field. It involves estimating gene-gene indirect dependencies and the regulatory functions among these interactions to provide a model that explains the gene expression dataset. The main goal is to understand the global molecular mechanisms underlying diseases in order to develop medical treatments and drugs. However, this remains an open problem, since it is difficult to obtain a satisfactory estimate of the dependencies from a very limited number of samples subject to experimental noise. Many gene network inference methods exist in the literature; some of them use heuristics or model-based algorithms to find interesting networks that explain the data by encoding whole networks as solutions. In general, however, these models are slow, do not scale to real-sized networks (thousands of genes), or require many parameters, specialist knowledge, or a large number of samples to be feasible. Reinforcement Learning is an adaptable, goal-oriented approach that does not require large labeled datasets or many parameters, can give good-quality solutions in a feasible execution time, and can run automatically for long periods without the need for a specialist. We therefore propose a way to adapt Reinforcement Learning to the Gene Regulatory Network inference domain in order to obtain networks with quality comparable to that achieved by exhaustive search, but in a much shorter execution time. Our experimental evaluation shows that our proposal is promising: it learns and successfully finds good solutions across different tasks automatically and in a reasonable time. However, scalability to networks with thousands of genes remains a limitation of our RL approach because of excessive memory consumption, although we foresee some possible improvements that could address this limitation in future versions of the proposed method. (AU)

FAPESP's process:	18/21934-5 - Network statistics: theory, methods, and applications
Grantee:	André Fujita
Support Opportunities:	Research Projects - Thematic Grants


FAPESP's process:	18/18560-6 - Data integration to identify biological markers of neurodevelopmental disorders
Grantee:	Helena Paula Brentani
Support Opportunities:	Research Projects - Thematic Grants
