Systems biology of long non-coding RNAs


It is estimated that thousands of long non-coding RNAs (lncRNAs) are transcribed from the genomes of several organisms, yet only a small fraction of them have been functionally characterized. One of the main reasons is our poor understanding of their expression patterns under different biological conditions, together with the difficulty of identifying their possible gene targets. Most studies that use high-throughput techniques, such as microarrays and next-generation sequencing (RNA-seq), aim to monitor the expression of protein-coding genes. Some of the resulting data, although unexplored in the original publications, are derived from lncRNAs. In this project, we propose to use publicly available high-throughput data to study the systems biology of lncRNAs under different perturbations and the involvement of lncRNAs in the regulation of candidate target genes. For example, we re-annotated the most widely used Affymetrix chip, with which more than 2,300 studies have been published, and found 10,248 probe sets that correspond to putative lncRNAs. Studying the expression profiles of lncRNAs, and how they correlate with the expression of protein-coding genes across tissues and conditions, should reveal mechanisms of gene regulation. These analyses will involve the identification of co-expression modules, the prediction of transcription factors responsible for lncRNA transcription, and the association of lncRNAs with diseases and infections. We also propose to experimentally validate the regulatory role of some of these lncRNAs. (AU)
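
As an illustration of the co-expression idea described above, the following is a minimal sketch of how candidate target genes might be ranked for a lncRNA by correlating expression profiles across samples. It rests on assumptions that the abstract does not state: Pearson correlation as the similarity measure, a fixed cutoff on its absolute value, and the hypothetical names `pearson`, `candidate_targets`, and `min_abs_r`. The expression values are invented toy numbers, not results from the project.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2 samples)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        # A constant profile has no defined correlation; treated as 0 here.
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def candidate_targets(lnc_profiles, coding_profiles, min_abs_r=0.9):
    """For each lncRNA, list coding genes with |r| >= min_abs_r, strongest first.

    The cutoff min_abs_r is an illustrative assumption, not a project setting.
    """
    result = {}
    for lnc, lnc_values in lnc_profiles.items():
        hits = []
        for gene, gene_values in coding_profiles.items():
            r = pearson(lnc_values, gene_values)
            if abs(r) >= min_abs_r:
                hits.append((gene, r))
        hits.sort(key=lambda hit: abs(hit[1]), reverse=True)
        result[lnc] = hits
    return result


# Toy expression values across five samples (hypothetical, for illustration only).
lnc_profiles = {"lnc_A": [1.0, 2.0, 3.0, 4.0, 5.0]}
coding_profiles = {
    "gene_up": [2.1, 3.9, 6.2, 8.0, 9.8],    # rises with lnc_A
    "gene_down": [9.0, 7.1, 5.0, 3.2, 1.0],  # falls as lnc_A rises
    "gene_flat": [4.0, 4.1, 3.9, 4.0, 4.0],  # essentially unrelated
}

targets = candidate_targets(lnc_profiles, coding_profiles)
for gene, r in targets["lnc_A"]:
    print(f"lnc_A ~ {gene}: r = {r:+.3f}")
```

Both positively and negatively correlated genes are kept, since a regulatory lncRNA could in principle activate or repress its targets. Correlation alone does not establish regulation, which is why experimental validation remains part of the proposal.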

