Lineage-specific differences in the amino acid substitution process.
Ontology highlight
ABSTRACT: In Darwinian evolution, mutations occur approximately at random in a gene, turned into amino acid mutations by the genetic code. Some mutations are fixed to become substitutions and some are eliminated from the population. Partitioning pairs of closely related species with complete genome sequences by average population size of each pair, we looked at the substitution matrices generated for these partitions and compared the substitution patterns between species. We estimated a population genetic model that relates the relative fixation probabilities of different types of mutations to the selective pressure and population size. Parameterizations of the average and distribution of selective pressures for different amino acid substitution types in different population size comparisons were generated with a Bayesian framework. We found that partitions in population size as well as in substitution type are required to explain the substitution data. Selection coefficients were found to decrease with increasingly radical amino acid substitution and with increasing effective population size. To further explore the role of underlying processes in amino acid substitution, we analyzed embryophyte (plant) gene families from TAED (The Adaptive Evolution Database), where solved structures for at least one member exist in the Protein Data Bank. Using PAML, we assigned branches to three categories: strong negative selection, moderate negative selection/neutrality, and positive diversifying selection. Focusing on the first and third categories, we identified sites changing along gene family lineages and observed the spatial patterns of substitution. Selective sweeps were expected to create primary sequence clustering under positive diversifying selection. Co-evolution through direct physical interaction was expected to cause tertiary structural clustering. Under both positive and negative selection, the substitution patterns were found to be nonrandom. Under positive diversifying selection, significant independent signals were found for primary and tertiary sequence clustering, suggesting roles for both selective sweeps and direct physical interaction. Under strong negative selection, the signals were not found to be independent. All together, a complex interplay of population genetic and protein thermodynamics forces is suggested.
SUBMITTER: Huzurbazar S
PROVIDER: S-EPMC2850115 | biostudies-literature | 2010 Mar
REPOSITORIES: biostudies-literature
ACCESS DATA