Dataset Information

IDEPI: rapid prediction of HIV-1 antibody epitopes and other phenotypic features from sequence data using a flexible machine learning platform.


ABSTRACT: Since its identification in 1983, HIV-1 has been the focus of a research effort unprecedented in scope and difficulty, whose ultimate goals--a cure and a vaccine--remain elusive. One of the fundamental challenges in accomplishing these goals is the tremendous genetic variability of the virus, with some genes differing at as many as 40% of nucleotide positions among circulating strains. Because of this, the genetic bases of many viral phenotypes, most notably the susceptibility to neutralization by a particular antibody, are difficult to identify computationally. Drawing upon open-source general-purpose machine learning algorithms and libraries, we have developed a software package IDEPI (IDentify EPItopes) for learning genotype-to-phenotype predictive models from sequences with known phenotypes. IDEPI can apply learned models to classify sequences of unknown phenotypes, and also identify specific sequence features which contribute to a particular phenotype. We demonstrate that IDEPI achieves performance similar to or better than that of previously published approaches on four well-studied problems: finding the epitopes of broadly neutralizing antibodies (bNab), determining coreceptor tropism of the virus, identifying compartment-specific genetic signatures of the virus, and deducing drug-resistance associated mutations. The cross-platform Python source code (released under the GPL 3.0 license), documentation, issue tracking, and a pre-configured virtual machine for IDEPI can be found at https://github.com/veg/idepi.

SUBMITTER: Hepler NL 

PROVIDER: S-EPMC4177671 | biostudies-literature | 2014 Sep

REPOSITORIES: biostudies-literature

Publications

IDEPI: rapid prediction of HIV-1 antibody epitopes and other phenotypic features from sequence data using a flexible machine learning platform.

Hepler NL, Scheffler K, Weaver S, Murrell B, Richman DD, Burton DR, Poignard P, Smith DM, Kosakovsky Pond SL

PLoS Computational Biology, 2014-09-25, issue 9

Similar Datasets

| S-EPMC8402714 | biostudies-literature
2013-01-01 | E-GEOD-29210 | biostudies-arrayexpress
| S-EPMC7714352 | biostudies-literature
| S-EPMC6774822 | biostudies-literature
| S-EPMC7417437 | biostudies-literature
| S-EPMC8477400 | biostudies-literature
| S-EPMC9397377 | biostudies-literature
2013-01-01 | GSE29210 | GEO
| S-EPMC4836714 | biostudies-literature
| S-EPMC7015433 | biostudies-literature