Explainable Artificial Intelligence Framework for Discovering Latent Knowledge in Medical Papers

Grant number: 25/07981-4
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Start date: August 01, 2025
End date: July 31, 2026
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computer Systems
Principal Investigator: Tiago Agostinho de Almeida
Grantee: Tiago Ribeiro Silvério
Host Institution: Centro de Ciências em Gestão e Tecnologia (CCGT). Universidade Federal de São Carlos (UFSCAR). Campus de Sorocaba. Sorocaba, SP, Brazil

Abstract

The volume of information produced and accessed via the Internet is vast and continually growing. The amount of data available in scientific articles follows the same trend, making manual analysis of all existing content unfeasible. Over the past decade, various strategies and artificial neural network architectures have emerged to represent texts using dense vectors, called word vectors. These techniques have evolved continuously and can process increasingly large text collections with fewer computational resources. As a result, text representation models have been developed for specific knowledge domains, such as PubMedBERT, whose single-domain training corpus allows it to better capture relationships between words.

By creating representation models from abstracts of scientific articles in materials science, Tshitoyan et al. (2019) observed that knowledge of certain relationships between elements was latent. Building on these findings, Berto et al. (2024) adapted the method and applied it to the medical field, demonstrating that relationships between compounds and a target disease, Acute Myeloid Leukemia, were already present years before the compounds were discovered as treatments.

In this context, this project aims to extend the study by Berto et al. (2024), proposing an explainable artificial intelligence framework centered on word vectors trained from scientific articles in a given medical domain. The goal is to capture, analyze, contextualize, and explain whether latent knowledge can be extracted to accelerate the discovery of new diagnoses, prognoses, and treatments.
