Development of Advanced Methods for Local Genetic Ancestry Inference Focusing on Rare Alleles in Brazilian Populations

Grant number: 24/15110-0
Support Opportunities: Scholarships in Brazil - Doctorate
Start date: March 01, 2025
End date: August 31, 2027
Field of knowledge: Biological Sciences - Genetics - Human and Medical Genetics
Principal Investigator: Tábita Hünemeier
Grantee: Jose Franklin Calderon Tantalean
Host Institution: Instituto de Biociências (IB). Universidade de São Paulo (USP). São Paulo, SP, Brazil

Abstract

Local ancestry inference (LAI) is essential for understanding genetic diversity and population dynamics in admixed populations. Historical events such as migration, colonization, and slavery have created a mosaic of ancestral lineages, complicating the study of genetic associations with diseases. Methods such as principal component analysis (PCA) and admixture mapping are necessary to infer these genetic relationships with greater precision. LAI tools, including models based on Hidden Markov Models (HMMs) and machine learning techniques, have evolved significantly, with notable examples such as SABER, HAPMIX, MOSAIC, LAMP, and WINPOP. Recent innovations, such as RFMix, LAI-Net, and SALAI-Net, have improved LAI accuracy. Convolutional neural networks (CNNs) and transfer learning provide robust solutions for diverse genetic data, while autoencoders help with dimensionality reduction and with capturing complex genetic information. However, challenges remain in inferring ancestry from ancient admixture events or among closely related populations. To address these challenges, Autoencoder-Augmented Deep Neural Networks (AADNNs) and other advanced methods will be explored to improve LAI accuracy in regions with rare alleles. The autoencoder will be trained on Brazilian genomic data, particularly from the "DNA of Brazil" (DNABR) project, which comprises whole-genome sequencing (WGS) data from 2,723 individuals across the five geographic regions of Brazil. These data, sequenced with Illumina NovaSeq 6000 technology at 35X coverage, will be compressed to feed a deep neural network with convolutional layers for feature extraction, recurrent layers (LSTM or GRU) for sequence learning, and dense layers for ancestry classification. The results are expected to contribute to the Brazilian genomic landscape, advancing medical genetics and public health by facilitating the precise identification of genetic risk factors and supporting the development of personalized medicine approaches. They should also provide a deeper understanding of admixture patterns and historical migration flows, improving the effectiveness of preventive and therapeutic strategies tailored to the genetic characteristics of the Brazilian population.
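
As a rough illustration of what the HMM-based LAI tools mentioned in the abstract compute, the sketch below labels each site of one haplotype with its most likely ancestry using Viterbi decoding. It is a toy example, not the method of this project or of any named tool: the function name, the ancestry labels "A" and "B", the allele frequencies, and the constant switch probability are all assumptions made for illustration.

```python
import math


def viterbi_local_ancestry(haplotype, allele_freqs, switch_prob=0.05):
    """Most likely ancestry label per site for one haplotype (toy HMM).

    haplotype    : list of 0/1 alleles at biallelic sites
    allele_freqs : dict mapping ancestry label -> list of P(allele == 1)
                   per site; values must lie strictly between 0 and 1
    switch_prob  : assumed probability of an ancestry change between
                   adjacent sites (needs at least two ancestries)
    """
    ancestries = list(allele_freqs)
    n_anc = len(ancestries)
    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob / (n_anc - 1))

    def log_emit(anc, site):
        p_alt = allele_freqs[anc][site]
        return math.log(p_alt if haplotype[site] == 1 else 1.0 - p_alt)

    def log_trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Initialise with a uniform prior over ancestries at the first site.
    score = {a: -math.log(n_anc) + log_emit(a, 0) for a in ancestries}
    backpointers = []

    # Forward pass: keep the best-scoring predecessor for every state.
    for site in range(1, len(haplotype)):
        new_score, pointer = {}, {}
        for cur in ancestries:
            best_prev = max(
                ancestries, key=lambda prev: score[prev] + log_trans(prev, cur)
            )
            new_score[cur] = (
                score[best_prev] + log_trans(best_prev, cur) + log_emit(cur, site)
            )
            pointer[cur] = best_prev
        score = new_score
        backpointers.append(pointer)

    # Backward pass: trace the best path from the last site to the first.
    state = max(ancestries, key=lambda a: score[a])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical reference frequencies for two ancestries over six sites.
freqs = {
    "A": [0.9, 0.9, 0.9, 0.1, 0.1, 0.1],
    "B": [0.1, 0.1, 0.1, 0.9, 0.9, 0.9],
}
path = viterbi_local_ancestry([1, 1, 1, 1, 1, 1], freqs)
print(path)  # ['A', 'A', 'A', 'B', 'B', 'B']
```

In this toy setting the decoded path switches ancestry once, where the allele frequencies flip. Alleles that are rare in every reference panel give nearly identical emission probabilities across ancestries, which is one way to see why regions with rare alleles are hard for this kind of model.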
