
Dataset Information


Automated prediction of HIV drug resistance from genotype data.


ABSTRACT:

Background

HIV/AIDS is a serious threat to public health. The emergence of drug resistance mutations diminishes the effectiveness of drug therapy for HIV/AIDS. Developing a computational prediction of drug resistance phenotype will enable efficient and timely selection of the best treatment regimens.

Results

A unified encoding of protein sequence and structure was used as the feature vector for predicting phenotypic resistance from genotype data. Two machine learning algorithms, Random Forest and k-nearest neighbor, were used, and prediction accuracy was assessed by five-fold cross-validation on the genotype-phenotype datasets. The resulting supervised machine learning approach handles genotype-phenotype datasets of HIV protease (PR) and reverse transcriptase (RT), and predicts both the drug resistance phenotype and its relative severity from a query sequence. Classification accuracy was higher than 0.973 for eight PR inhibitors and higher than 0.986 for ten RT inhibitors. The overall cross-validated regression R² values for the severity of drug resistance were 0.772-0.953 for the eight PR inhibitors and 0.773-0.995 for the ten RT inhibitors.
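
The evaluation protocol described above (a feature vector per sequence, a k-nearest neighbor classifier, five-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. Everything specific in it is an assumption made for illustration: the one-hot encoding stands in for the unified sequence and structure encoding, which is not specified here; the reference sequence, the "marker" mutations, and the data are synthetic; and the function names are hypothetical. It is not the authors' implementation, and Random Forest and the severity regression are omitted.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
REFERENCE = "ACDEFGHIKL"  # arbitrary toy reference, NOT a real HIV sequence


def one_hot(seq):
    """Flat 0/1 vector, 20 entries per position (stand-in encoding)."""
    vec = []
    for residue in seq:
        vec.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vec


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training vectors closest to x."""
    order = sorted(range(len(train_X)), key=lambda i: sq_dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def five_fold_accuracy(X, y, k=3, folds=5, seed=0):
    """Shuffle once, hold out each fold in turn, return overall accuracy."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]  # every folds-th shuffled index forms one fold
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test)
    return correct / len(X)


def make_toy_data(n=100, seed=1):
    """Synthetic sequences; 'resistant' ones carry two made-up marker mutations."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        seq = list(REFERENCE)
        resistant = rng.random() < 0.5
        if resistant:
            seq[2], seq[6] = "V", "M"  # hypothetical markers, illustration only
        pos = rng.choice([0, 1, 3, 4, 5, 7, 8, 9])  # background variation
        seq[pos] = rng.choice(AMINO_ACIDS)
        X.append(one_hot("".join(seq)))
        y.append("resistant" if resistant else "susceptible")
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("five-fold accuracy on synthetic data:", five_fold_accuracy(X, y))
```

The accuracy printed by this sketch reflects only the synthetic rule used to label the toy sequences and says nothing about the reported results.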

Conclusions

Machine learning using a unified encoding of sequence and protein structure as a feature vector provides an accurate prediction of drug resistance from genotype data. A practical webserver for clinicians has been implemented.

SUBMITTER: Shen C 

PROVIDER: S-EPMC5009519 | biostudies-literature | 2016 Aug

REPOSITORIES: biostudies-literature


Publications

Automated prediction of HIV drug resistance from genotype data.

Shen ChenHsiang, Yu Xiaxia, Harrison Robert W, Weber Irene T

BMC Bioinformatics, 2016-08-31



Similar Datasets

| S-EPMC4120140 | biostudies-literature
| S-EPMC7290575 | biostudies-literature
| S-EPMC6668108 | biostudies-literature
| S-EPMC4045614 | biostudies-literature
| S-EPMC6017644 | biostudies-literature
| S-EPMC123057 | biostudies-literature
| S-EPMC5694023 | biostudies-literature
| PRJEB30947 | ENA
| S-EPMC3224585 | biostudies-literature
| S-EPMC9580932 | biostudies-literature