Dataset Information

Diversity and complexity of HIV-1 drug resistance: a bioinformatics approach to predicting phenotype from genotype.

ABSTRACT: Drug resistance testing has been shown to be beneficial for clinical management of HIV type 1 infected patients. Whereas phenotypic assays directly measure drug resistance, the commonly used genotypic assays provide only indirect evidence of drug resistance, the major challenge being the interpretation of the sequence information. We analyzed the significance of sequence variations in the protease and reverse transcriptase genes for drug resistance and derived models that predict phenotypic resistance from genotypes. For 14 antiretroviral drugs, both genotypic and phenotypic resistance data from 471 clinical isolates were analyzed with a machine learning approach. Information profiles were obtained that quantify the statistical significance of each sequence position for drug resistance. For the different drugs, patterns of varying complexity were observed, including between one and nine sequence positions with substantial information content. Based on these information profiles, decision tree classifiers were generated to identify genotypic patterns characteristic of resistance or susceptibility to the different drugs. We obtained concise and easily interpretable models to predict drug resistance from sequence information. The prediction quality of the models was assessed in leave-one-out experiments in terms of the prediction error. We found prediction errors of 9.6-15.5% for all drugs except for zalcitabine, didanosine, and stavudine, with prediction errors between 25.4% and 32.0%. A prediction service is freely available at http://cartan.gmd.de/geno2pheno.html.
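The "information profiles" mentioned in the abstract score each sequence position by how informative it is about resistance. The abstract does not state the exact measure, so the sketch below assumes mutual information between the amino acid at an aligned position and a binary resistance label. The function names (`mutual_information`, `information_profile`) and the toy sequences are hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter
from math import log2


def mutual_information(residues, labels):
    """Mutual information (in bits) between the residues observed at one
    alignment position and the class labels of the same isolates."""
    n = len(residues)
    joint = Counter(zip(residues, labels))
    residue_counts = Counter(residues)
    label_counts = Counter(labels)
    mi = 0.0
    for (residue, label), count in joint.items():
        # p(r, y) * log2( p(r, y) / (p(r) * p(y)) ), written with raw counts
        mi += (count / n) * log2(
            count * n / (residue_counts[residue] * label_counts[label])
        )
    return mi


def information_profile(sequences, labels):
    """Return one mutual-information value per column of an alignment."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to equal length")
    if len(sequences) != len(labels):
        raise ValueError("need exactly one label per sequence")
    return [
        mutual_information([seq[i] for seq in sequences], labels)
        for i in range(length)
    ]


# Made-up toy alignment: only the middle position tracks the label.
toy_sequences = ["MKV", "MKI", "MRV", "MRI"]
toy_labels = ["resistant", "resistant", "susceptible", "susceptible"]
profile = information_profile(toy_sequences, toy_labels)
```

In this toy alignment only the second position separates the two classes, so it carries all of the information; positions with a high score would be natural candidates for the split variables of a decision tree classifier.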

SUBMITTER: Beerenwinkel N

PROVIDER: S-EPMC123057 | biostudies-literature | 2002 Jun

REPOSITORIES: biostudies-literature

Publications

Diversity and complexity of HIV-1 drug resistance: a bioinformatics approach to predicting phenotype from genotype.

Beerenwinkel N, Schmidt B, Walter H, Kaiser R, Lengauer T, Hoffmann D, Korn K, Selbig J

Proceedings of the National Academy of Sciences of the United States of America, 2002 Jun 1, issue 12

PMID: 12060770

Similar Datasets

Project description:

Background: Maturation inhibitors are a new class of antiretroviral drugs. Bevirimat (BVM) was the first substance in this class of inhibitors entering clinical trials. While the inhibitory function of BVM is well established, the molecular mechanisms of action and resistance are not well understood. It is known that mutations in the regions CS p24/p2 and p2 can cause phenotypic resistance to BVM. We have investigated a set of p24/p2 sequences of HIV-1 of known phenotypic resistance to BVM to test whether BVM resistance can be predicted from sequence, and to identify possible molecular mechanisms of BVM resistance in HIV-1.

Results: We used artificial neural networks and random forests with different descriptors for the prediction of BVM resistance. Random forests with hydrophobicity as descriptor performed best and classified the sequences with an area under the Receiver Operating Characteristics (ROC) curve of 0.93 +/- 0.001. For the collected data we find that p2 sequence positions 369 to 376 have the highest impact on resistance, with positions 370 and 372 being particularly important. These findings are in partial agreement with other recent studies. Apart from the complex machine learning models we derived a number of simple rules that predict BVM resistance from sequence with surprising accuracy. According to computational predictions based on the data set used, cleavage sites are usually not shifted by resistance mutations. However, we found that resistance mutations could shorten and weaken the alpha-helix in p2, which hints at a possible resistance mechanism.

Conclusions: We found that BVM resistance of HIV-1 can be predicted well from the sequence of the p2 peptide, which may prove useful for personalized therapy if maturation inhibitors reach clinical practice. Results of secondary structure analysis are compatible with a possible route to BVM resistance in which mutations weaken a six-helix bundle discovered in recent experiments, and thus ease Gag cleavage by the retroviral protease.

Project description:

Background: Washington DC has a high burden of HIV with a 2.0% HIV prevalence. The city is a national and international hub potentially containing a broad diversity of HIV variants; yet few sequences from DC are available on GenBank to assess the evolutionary history of HIV in the US capital. Towards this general goal, here we analyze extensive sequence data and investigate HIV diversity, phylodynamics, and drug resistant mutations (DRM) in DC.

Methods: Molecular HIV-1 sequences were collected from participants infected through 2015 as part of the DC Cohort, a longitudinal observational study of HIV+ patients receiving care at 13 DC clinics. Sequences were paired with Cohort demographic, risk, and clinical data and analyzed using maximum likelihood, Bayesian and coalescent approaches of phylogenetic, network and population genetic inference. We analyzed 601 sequences from 223 participants for int (~864 bp) and 2,810 sequences from 1,659 participants for PR/RT (~1497 bp).

Results: Ninety-nine and 94% of the int and PR/RT sequences, respectively, were identified as subtype B, with 14 non-B subtypes also detected. Phylodynamic analyses of US born infected individuals showed that HIV population size varied little over time with no significant decline in diversity. Phylogenetic analyses grouped 13.5% of the int sequences into 14 clusters of 2 or 3 sequences, and 39.0% of the PR/RT sequences into 203 clusters of 2-32 sequences. Network analyses grouped 3.6% of the int sequences into 4 clusters of 2 sequences, and 10.6% of the PR/RT sequences into 76 clusters of 2-7 sequences. All network clusters were detected in our phylogenetic analyses. Higher proportions of clustered sequences were found in zip codes where HIV prevalence is highest (r = 0.607; P<0.00001). We detected a high prevalence of DRM for both int (17.1%) and PR/RT (39.1%), but only 8 int and 12 PR/RT amino acids were identified as under adaptive selection. We observed a significant (P<0.0001) association between main risk factors (men who have sex with men and heterosexuals) and genotypes in the five well-supported clusters with sufficient sample size for testing.

Discussion: Pairing molecular data with clinical and demographic data provided novel insights into HIV population dynamics in Washington, DC. Identification of populations and geographic locations where clustering occurs can inform and complement active surveillance efforts to interrupt HIV transmission.
