Natural selection on HLA genes: a molecular investigation of the location and timing of selection events

Author(s):
Bárbara Domingues Bitarello
Total Authors: 1
Document type: Master's Dissertation
Press: São Paulo.
Institution: Universidade de São Paulo (USP). Instituto de Biociências (IBIOC/SB)
Defense date:
Examining board members:
Diogo Meyer; Carlos Eduardo Guerra Schrago; Tatiana Teixeira Torres
Advisor: Diogo Meyer
Abstract

Comparing non-synonymous (dN) and synonymous (dS) substitution rates allows us to infer the selection regimes that have operated on coding regions. Genes with ω = dN/dS > 1 are candidates for positive selection, and genome scans for this signature have proved to be a powerful tool. The method is robust because synonymous and non-synonymous sites are assumed to be interspersed within the region under study (and therefore to share the same demographic history), and because it focuses on variation in specific genes, which removes ambiguity about the target of selection. On the other hand, requiring ω > 1 for a whole gene to be considered under positive selection is very conservative. Usually only a few codons are under positive selection, while most non-synonymous mutations are deleterious and are therefore under purifying selection. It has thus become conventional to search for selection in subsets of codons, either by restricting the data set or by using models that estimate different ω values for different subsets of codons, making it possible to infer which of them are under positive selection. The genes encoding MHC molecules show several patterns of variation that indicate some form of balancing selection has acted on them (high diversity, large differentiation between alleles and the existence of trans-specific polymorphisms). HLA genes are a subset of human MHC genes and are located on the short arm of chromosome 6. The classical class I genes (HLA-A, HLA-B and HLA-C) are expressed in most somatic cells and play a central role in the adaptive immune response, capturing peptides and presenting them on the cell surface. The region of the MHC molecule that binds antigens for presentation to T lymphocytes, thereby initiating the adaptive immune response, is known as the antigen recognition site (ARS).
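As an illustration of the ω = dN/dS ratio described above, the following is a minimal sketch that computes ω from counts of differences and sites, using a counting approach with the Jukes-Cantor correction. This is not necessarily the estimation method used in the dissertation; the function names and the input counts are hypothetical and chosen only for illustration.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def omega(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return dN/dS for one pair of sequences.

    All four arguments are assumed to be precomputed counts
    (hypothetical inputs; counting them from codons is not shown here).
    """
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; omega is undefined")
    return dn / ds


# Hypothetical counts: more non-synonymous than synonymous change per site.
print(omega(nonsyn_diffs=20, nonsyn_sites=100, syn_diffs=5, syn_sites=100))
```

A value above 1 from such a calculation would only flag a candidate for positive selection; as the abstract notes, whole-gene averages tend to be conservative because most codons are under purifying selection.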
It is well established that ARS codons have higher non-synonymous than synonymous substitution rates at these three loci, consistent with balancing selection producing greater variability in the functional region of the molecule that interacts with the peptide. The classical HLA genes have hundreds of alleles, which form clades (lineages) grouping phylogenetically related and functionally similar alleles. Although the existence of selection on HLA genes is not in dispute, there is no consensus on the relative importance of selection on allelic lineages versus selection on individual alleles in the diversification of HLA alleles, and this is the question we set out to investigate. Our null hypothesis was that lineages were the targets of selection, and the alternative hypothesis was that individual alleles were the targets of selection during the evolutionary history of HLA genes. We first sought to validate the dN/dS inference method, using the ARS codons as a case study and examining how they relate to the method's inferential power. All but one of the codons we found to be under selection either belong to the classical ARS classification or lie within ±1 codon of a classical ARS codon, showing that there is evidence of selection at sites adjacent to the ARS. An expanded classification that includes the selected codons not commonly used in ARS classifications should therefore increase the statistical power of tests of selection models on HLA genes. By comparing the results of phylogenetic analyses on data sets with and without recombinants, we found that removing recombinant alleles alters the parameter estimates, the identification of codons with evidence of selection and the significance of model comparison tests.
Our analysis showed that ω is significantly higher for pairs of alleles from different lineages than for pairs of alleles from the same lineage, and that there is a significant positive correlation between the divergence time of alleles and estimates of ω. We also verified that, for HLA-C, it is possible to reject a null model in which a single ω is estimated for all branches of the phylogenetic tree in favor of a model in which ω is estimated separately for branches within and between lineages. In HLA-A and HLA-C, ω is significantly greater than 1 between lineages. We also show that, for these same loci, ω is significantly greater than 1 for internal branches. In HLA-C, the model that estimates ω separately for terminal and internal branches was favored. Our results show that the intensity of selection between lineages is greater than within them. However, even within lineages there is strong evidence of deviation from neutrality, suggesting the action of natural selection. (AU)
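The comparison between a one-ω null model and a model with separate ω values for two classes of branches can be framed as a likelihood ratio test with one degree of freedom. The sketch below assumes that framing and assumes the log-likelihoods of both models have already been obtained from a codon-model program; the function name and the numerical values are hypothetical and are not results from the dissertation.

```python
import math


def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic 2 * (lnL_alt - lnL_null) and its p-value
    under a chi-square distribution with 1 degree of freedom, for which
    the survival function equals erfc(sqrt(x / 2)).
    """
    stat = max(2 * (lnl_alt - lnl_null), 0.0)  # guard against tiny negative values
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical log-likelihoods: one omega for all branches (null) versus
# separate omegas for within-lineage and between-lineage branches.
stat, p = lrt_one_df(lnl_null=-1000.0, lnl_alt=-995.0)
print(f"2*delta lnL = {stat:.2f}, p = {p:.4f}")
```

The one-degree-of-freedom shortcut only applies when the alternative model adds exactly one free parameter; comparisons involving more extra parameters would need the general chi-square distribution.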