Variation in the analysis of positively selected sites using nonsynonymous/synonymous rate ratios: an example using influenza virus.
ABSTRACT: Sites in a gene with a nonsynonymous/synonymous rate ratio (ω) >1 have frequently been identified as being under positive selection. To examine the performance of this analysis, sites with ω >1 in the HA1 gene of human influenza viruses of the H3N2 subtype were identified from seven overlapping sequence data sets. The sites with ω >1 varied significantly among the data sets even though the data sets targeted similar clusters, indicating that the analysis is likely to have either low sensitivity or low specificity in identifying sites under positive selection. Most (43/45) of the sites showing ω >1 in at least one data set are involved in B-cell epitopes, which cover fewer than half of the sites in the protein, suggesting that the analysis has low sensitivity rather than low specificity. The sensitivity could not be improved by including more sequences or by covering longer time intervals. Some previous reports likewise appear to have identified only a portion of the sites under positive selection in the viral gene using the ω ratio. The low sensitivity may arise because some sites under positive selection are simultaneously under negative (purifying) selection owing to functional constraints, so their ω ratios can be <1. Theoretically, sites under the two opposing selective forces at the same time favor only certain nonsynonymous changes, e.g. those that change antigenicity while maintaining gene function. This study also suggests that more sites under positive selection can sometimes be identified with the ω ratio by integrating the positively selected sites estimated from multiple data sets.
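The integration step mentioned at the end of the abstract could be sketched roughly as follows. This is a minimal illustration that assumes "integrating" means taking the union of the sites with ω >1 across data sets; the function name `integrate_sites`, the data set labels, and the site numbers are hypothetical and are not results or code from the study.

```python
from collections import Counter


def integrate_sites(sites_by_dataset):
    """Combine per-data-set site lists (assumed: a simple union).

    Returns the sorted union of all sites reported with omega > 1,
    plus the number of data sets that reported each site.
    """
    support = Counter()
    for sites in sites_by_dataset.values():
        # set() guards against a site being listed twice in one data set
        support.update(set(sites))
    return sorted(support), dict(support)


# Hypothetical site numbers, for illustration only (not from the study).
example = {
    "dataset_A": {3, 10, 25},
    "dataset_B": {10, 25, 40},
    "dataset_C": {25, 58},
}

union, support = integrate_sites(example)
print(union)        # every site flagged in at least one data set
print(support[25])  # how many data sets flagged site 25
```

Keeping the per-site support count alongside the union makes it possible to separate sites detected consistently from those detected in only one data set, which is the kind of variation the abstract describes.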
SUBMITTER: Chen J
PROVIDER: S-EPMC3101217 | biostudies-literature | 2011
REPOSITORIES: biostudies-literature