Dataset Information

HIV-1 coreceptor usage prediction without multiple alignments: an application of string kernels.


ABSTRACT:

Background

Human immunodeficiency virus type 1 (HIV-1) infects cells by means of ligand-receptor interactions. This lentivirus uses the CD4 receptor in conjunction with a chemokine coreceptor, either CXCR4 or CCR5, to enter a target cell. HIV-1 is characterized by high sequence variability. Nonetheless, within this extensive variability, certain features must be conserved to define functions and phenotypes. Determining the coreceptor usage of HIV-1 from its envelope protein sequence is an instance of a well-studied machine learning problem known as classification. The support vector machine (SVM) with string kernels has proven very effective on a wide class of classification problems, ranging from text categorization to protein homology detection. In this paper, we investigate how well the SVM can predict HIV-1 coreceptor usage when it is equipped with an appropriate string kernel.
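
To make the idea of a string kernel concrete, the following is a minimal sketch of a k-mer spectrum kernel, one of the standard string kernels from the protein homology literature. It is not the distant segments kernel proposed in the paper, whose definition is not given in this abstract. The function names, the choice k=2, and the toy sequences are all assumptions made for illustration. The point it shows is that the kernel compares sequences of different lengths directly, with no multiple alignment.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=2):
    """Inner product of the k-mer count vectors of two sequences."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(n * cb[kmer] for kmer, n in ca.items())


def normalized_kernel(a, b, k=2):
    """Cosine-normalized kernel; returns 0.0 if either sequence has no k-mers."""
    denom = sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


# Toy sequences of unequal length (hypothetical, not real envelope sequences).
seqs = ["CTRPNNNTRKS", "CTRPNNTRK", "CIRPGNKTRQ"]

# Gram matrix: entry [i][j] is the similarity between sequence i and sequence j.
gram = [[normalized_kernel(x, y) for y in seqs] for x in seqs]
```

A Gram matrix of this kind is what a kernel SVM consumes. For instance, scikit-learn's `SVC(kernel="precomputed")` accepts one in place of feature vectors, although the paper does not say which SVM implementation the authors used.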

Results

Three string kernels were compared. Accuracies of 96.35% (CCR5), 94.80% (CXCR4), and 95.15% (CCR5 and CXCR4) were achieved by the SVM equipped with the distant segments kernel on a test set of 1425 examples, using a classifier built on a training set of 1425 examples. Our datasets are built from sequences in the Los Alamos National Laboratory HIV Databases. A web server is available at http://genome.ulaval.ca/hiv-dskernel.

Conclusion

We examined string kernels that have been used successfully for protein homology detection and propose a new one, which we call the distant segments kernel. We also show how to extract the most relevant features for HIV-1 coreceptor usage. Among the methods described to date, the SVM with the distant segments kernel performs best.

SUBMITTER: Boisvert S 

PROVIDER: S-EPMC2637298 | biostudies-literature | 2008 Dec

REPOSITORIES: biostudies-literature

Publications

HIV-1 coreceptor usage prediction without multiple alignments: an application of string kernels.

Sébastien Boisvert, Mario Marchand, François Laviolette, Jacques Corbeil

Retrovirology, 2008-12-04



Similar Datasets

| S-EPMC3938935 | biostudies-literature
| S-EPMC3523352 | biostudies-literature
| S-EPMC1848001 | biostudies-literature
| S-EPMC3228774 | biostudies-literature
| S-EPMC3436800 | biostudies-literature
| S-EPMC4747591 | biostudies-literature
| S-EPMC3599735 | biostudies-literature
| S-EPMC5571954 | biostudies-literature
| S-EPMC2683948 | biostudies-literature
| S-EPMC6022556 | biostudies-other