Dataset Information

Feature selection using a one dimensional naïve Bayes' classifier increases the accuracy of support vector machine classification of CDR3 repertoires.


ABSTRACT:

Motivation

Somatic DNA recombination, the hallmark of vertebrate adaptive immunity, has the potential to generate a vast diversity of antigen receptor sequences. How this diversity captures antigen specificity remains incompletely understood. In this study we use high throughput sequencing to compare the global changes in T cell receptor β chain complementarity determining region 3 (CDR3β) sequences following immunization with ovalbumin administered with complete Freund's adjuvant (CFA) or CFA alone.

Results

The CDR3β sequences were deconstructed into short stretches of overlapping contiguous amino acids. The motifs were ranked according to a one-dimensional Bayesian classifier score comparing their frequency in the repertoires of the two immunization classes. The top ranking motifs were selected and used to create feature vectors which were used to train a support vector machine. The support vector machine achieved high classification scores in a leave-one-out validation test reaching >90% in some cases.
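
The deconstruction and ranking steps described above can be sketched in Python. This is a minimal illustration and not the authors' implementation: the abstract does not give the scoring formula, so the smoothed log frequency ratio used here is an assumed stand-in for the one-dimensional Bayesian classifier score. The function names (`kmers`, `motif_counts`, `rank_motifs`, `feature_vector`), the pseudocount, and the toy sequences are all hypothetical.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Return every overlapping contiguous k-mer of an amino acid sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def motif_counts(repertoire, k=3):
    """Count k-mer occurrences over all CDR3 sequences in one repertoire."""
    counts = Counter()
    for seq in repertoire:
        counts.update(kmers(seq, k))
    return counts


def rank_motifs(class_a, class_b, k=3, pseudocount=1.0):
    """Rank motifs by the absolute log ratio of their smoothed frequencies.

    ASSUMPTION: this log ratio stands in for the paper's one-dimensional
    Bayesian classifier score, which the abstract does not specify.
    """
    pooled_a, pooled_b = Counter(), Counter()
    for repertoire in class_a:
        pooled_a.update(motif_counts(repertoire, k))
    for repertoire in class_b:
        pooled_b.update(motif_counts(repertoire, k))
    vocab = set(pooled_a) | set(pooled_b)
    total_a = sum(pooled_a.values()) + pseudocount * len(vocab)
    total_b = sum(pooled_b.values()) + pseudocount * len(vocab)
    scores = {}
    for motif in vocab:
        freq_a = (pooled_a[motif] + pseudocount) / total_a
        freq_b = (pooled_b[motif] + pseudocount) / total_b
        scores[motif] = math.log(freq_a / freq_b)
    # Most discriminative motifs first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-abs(item[1]), item[0]))


def feature_vector(repertoire, motifs, k=3):
    """Relative frequency of each selected motif within one repertoire."""
    counts = motif_counts(repertoire, k)
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in motifs]


# Invented toy repertoires, one per immunization class (illustration only).
ova_cfa = [["CASSLGQAYEQYF", "CASSPGQGNTEVFF"]]
cfa_only = [["CASRDWGSQNTLYF", "CAWSLGGREQYF"]]
top_motifs = [motif for motif, _ in rank_motifs(ova_cfa, cfa_only)[:5]]
print(top_motifs, feature_vector(ova_cfa[0], top_motifs))
```

Each repertoire is reduced to a short vector of motif frequencies, which is the form a support vector machine expects as input.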

Summary

The study describes a novel two-stage classification strategy combining a one-dimensional Bayesian classifier with a support vector machine. Using this approach we demonstrate that the frequency of a small number of linear motifs three amino acids in length can accurately identify a CD4 T cell response to ovalbumin against a background response to the complex mixture of antigens which characterize Complete Freund's Adjuvant.
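
The second stage, a support vector machine evaluated by leave-one-out validation, can be sketched as follows. The study used the R package e1071; the pure-Python linear SVM below, trained by subgradient descent on the regularised hinge loss, is a swapped-in stand-in rather than the authors' method. The kernel, the hyperparameters (`lam`, `lr`, `epochs`), the omission of an intercept (features are assumed to be centred), and the toy feature vectors are all assumptions made for illustration.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * |w|^2 + mean hinge loss by full-batch subgradient descent.

    Labels must be +1 or -1. No intercept is fitted (ASSUMPTION: centred features).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # only margin violators contribute to the subgradient
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def leave_one_out_accuracy(X, y, **svm_params):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        w = train_linear_svm(X_train, y_train, **svm_params)
        correct += predict(w, X[i]) == y[i]
    return correct / len(X)


# Invented, already-centred feature vectors for six samples (illustration only).
X = [[1.0, 2.0], [2.0, 1.0], [1.5, 1.5],
     [-1.0, -2.0], [-2.0, -1.0], [-1.5, -1.5]]
y = [1, 1, 1, -1, -1, -1]
print(leave_one_out_accuracy(X, y))
```

The abstract does not say whether motif selection was repeated inside each fold. Repeating it on the training samples only keeps the held-out sample from influencing which features are chosen.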

Availability and implementation

The sequence data is available at www.ncbi.nlm.nih.gov/sra/?term=SRP075893 . The Decombinator package is available at github.com/innate2adaptive/Decombinator . The R package e1071 is available at the CRAN repository https://cran.r-project.org/web/packages/e1071/index.html .

Contact

b.chain@ucl.ac.uk.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Cinelli M 

PROVIDER: S-EPMC5860388 | biostudies-literature | 2017 Apr

REPOSITORIES: biostudies-literature

Publications

Feature selection using a one dimensional naïve Bayes' classifier increases the accuracy of support vector machine classification of CDR3 repertoires.

Mattia Cinelli, Yuxin Sun, Katharine Best, James M. Heather, Shlomit Reich-Zeliger, Eric Shifrut, Nir Friedman, John Shawe-Taylor, Benny Chain

Bioinformatics (Oxford, England), 2017-04-01, issue 7


Similar Datasets

| S-EPMC6480413 | biostudies-literature
| S-EPMC1635426 | biostudies-literature
| S-EPMC5783520 | biostudies-literature
| S-EPMC5627885 | biostudies-literature
| S-EPMC8382032 | biostudies-literature
| S-EPMC6567606 | biostudies-literature
| S-EPMC5564130 | biostudies-literature
| S-EPMC4538581 | biostudies-literature
| S-EPMC8249850 | biostudies-literature
| S-EPMC3789547 | biostudies-literature