Exploring Methodological Pipelines and Advanced Machine Learning for Population-Scale Health Datasets at the Planetary Health Informatics Lab at the University of Oxford

Grant number: 25/20808-0
Support Opportunities: Scholarships abroad - Research Internship - Scientific Initiation
Start date: January 04, 2026
End date: February 28, 2026
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Ana Estela Antunes da Silva
Grantee: Lais Azevedo Soares
Supervisor: Sara Khalid
Host Institution: Faculdade de Tecnologia (FT). Universidade Estadual de Campinas (UNICAMP). Limeira, SP, Brazil
Institution abroad: University of Oxford, England
Associated to the scholarship: 24/12628-9 - Feature Analysis in Individuals with 22q11.2 Deletion Syndrome using Clustering and Association Rules in Machine Learning, BP.IC

Abstract

The 22q11.2 Deletion Syndrome (22q11.2DS) is the most common chromosomal microdeletion disorder and presents wide phenotypic variability, making early diagnosis challenging and contributing to increased healthcare costs. This project aims to apply machine learning techniques to the Brazilian Database of Craniofacial Anomalies (BBAC) to identify clinical patterns associated with 22q11.2DS. The related scientific initiation project is already investigating clustering algorithms, such as K-means and hierarchical clustering, to group patients by similarity, followed by association rule mining to identify relevant feature combinations that can improve triage efficiency and support early diagnostic decisions. However, the preprocessed dataset includes over 900 attributes derived from clinical forms and physical exams, making the preprocessing phase critical to ensuring data quality and analytical precision. To address this, the proposed research internship will be conducted at the Planetary Health Informatics Lab at the University of Oxford, a leading center in the analysis of multimorbidity using genetic and clinical data. The lab is recognized for its expertise in applying advanced statistical and machine learning techniques to high-dimensional, real-world health datasets, such as those from the UK Biobank and the NHS. This internship will provide hands-on experience with robust data pipelines, preprocessing workflows, and analytical models, contributing directly to the refinement of methodologies applicable to rare disease research. In addition to enhancing the technical quality of the ongoing project, the experience is expected to foster future international collaborations and strengthen the integration between clinical insight and data-driven approaches. (AU)
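
As an illustration only, the two-stage idea described in the abstract (cluster patients by similarity, then mine association rules inside each cluster) might be sketched as follows. This is not the project's code: the data are synthetic, the feature names f1 to f4 are hypothetical placeholders rather than BBAC attributes, the K-means initialization is a simple deterministic farthest-point choice, and the rule miner only considers pairs of features (a simplified stand-in for a full association rule algorithm such as Apriori).

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=20):
    """Lloyd's k-means; returns one cluster label per row.

    Assumption: centers are initialized deterministically by a
    farthest-point heuristic instead of random sampling.
    """
    centers = [list(rows[0])]
    while len(centers) < k:
        # Next center: the row farthest from its nearest existing center.
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c])) for r in rows]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def pair_rules(rows, names, min_support=0.5, min_confidence=0.8):
    """Pairwise rules A -> B as (A, B, support, confidence) tuples.

    support    = fraction of rows where A and B are both present
    confidence = fraction of rows with A that also have B
    """
    n = len(rows)
    rules = []
    if n == 0:
        return rules
    for i, j in combinations(range(len(names)), 2):
        both = sum(1 for r in rows if r[i] and r[j])
        for a, b in ((i, j), (j, i)):
            count_a = sum(1 for r in rows if r[a])
            if count_a == 0:
                continue
            support, confidence = both / n, both / count_a
            if support >= min_support and confidence >= min_confidence:
                rules.append((names[a], names[b], support, confidence))
    return rules


# Synthetic binary records (1 = feature present); NOT real patient data.
NAMES = ["f1", "f2", "f3", "f4"]
ROWS = [
    [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 0, 0],
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1],
]

if __name__ == "__main__":
    labels = kmeans(ROWS, k=2)
    for cluster in sorted(set(labels)):
        members = [r for r, lab in zip(ROWS, labels) if lab == cluster]
        print("cluster", cluster, "size", len(members))
        for lhs, rhs, sup, conf in pair_rules(members, NAMES):
            print(f"  {lhs} -> {rhs} (support={sup:.2f}, confidence={conf:.2f})")
```

With around 900 attributes, as in the dataset described above, a sketch like this would need feature selection or dimensionality reduction before clustering, which is the preprocessing concern the internship is meant to address.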
