Developing Specialized Language Models for Real-Time Public Health Data Extraction and Analysis

Grant number: 25/03214-9
Support Opportunities: Scholarships in Brazil - Master
Start date: May 01, 2025
End date: April 30, 2027
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Elbert Einstein Nehrer Macau
Grantee: Pedro Henrique de Moraes
Host Institution: Instituto de Ciência e Tecnologia (ICT). Universidade Federal de São Paulo (UNIFESP). Campus São José dos Campos. São José dos Campos, SP, Brazil
Associated research grant: 21/10599-3 - The Antimicrobial Resistance Institute of São Paulo (The Aries Project), AP.CEPID

Abstract

The rapid advancement of Large Language Models (LLMs) has transformed natural language processing, enabling machines to understand and generate human language with unprecedented accuracy. Models like GPT-3 and GPT-4 have demonstrated exceptional capabilities in tasks such as text generation, translation, and question answering. However, directly applying these general-purpose models to specialized domains like public health poses challenges due to domain-specific terminology and complexities. This project aims to evaluate LLMs capable of extracting data from public health texts, including academic literature and social media content, to power a real-time system for automatic data extraction. The focus will be on fine-tuning the models for Named Entity Recognition (NER), enabling them to identify and classify relevant entities such as diseases, medications, healthcare providers, and public health trends. By leveraging a curated dataset of public health documents and social media data, the project seeks to enhance the models' ability to capture the nuances of medical language and improve the accuracy of information extraction. The ultimate goal is to develop a robust tool that supports researchers, healthcare professionals, and policymakers by providing timely and accurate analysis of public health data. This will contribute to a deeper understanding of health trends, inform the development of effective interventions, and enhance the overall health of populations.
