
Dataset Information

Organism-specific training improves performance of linear B-cell epitope prediction.


ABSTRACT: In silico identification of linear B-cell epitopes represents an important step in the development of diagnostic tests and vaccine candidates, by providing potential high-probability targets for experimental investigation. Current predictive tools were developed under a generalist approach, training models with heterogeneous data sets to develop predictors that can be deployed for a wide variety of pathogens. However, continuous advances in processing power and the increasing amount of epitope data for a broad range of pathogens indicate that training organism- or taxon-specific models may become a feasible alternative, with unexplored potential gains in predictive performance. This paper shows how organism-specific training of epitope prediction models can yield substantial performance gains across several quality metrics when compared to models trained with heterogeneous and hybrid data, and with a variety of widely-used predictors from the literature. These results suggest a promising alternative for the development of custom-tailored predictive models with high predictive power, which can be easily implemented and deployed for the investigation of specific pathogens. The data underlying this article, as well as the full reproducibility scripts, are available at https://github.com/fcampelo/OrgSpec-paper. The R package that implements the organism-specific pipeline functions is available at https://github.com/fcampelo/epitopes. Supplementary materials are available at Bioinformatics online.

SUBMITTER: Ashford J 

PROVIDER: S-EPMC8665745 | biostudies-literature | 2021 Jul

REPOSITORIES: biostudies-literature

Publications

Organism-specific training improves performance of linear B-cell epitope prediction.

Jodie Ashford, João Reis-Cunha, Igor Lobo, Francisco Lobo, Felipe Campelo

Bioinformatics (Oxford, England), 2021-12-01, issue 24


Similar Datasets

| S-EPMC10199762 | biostudies-literature
| S-EPMC7371472 | biostudies-literature
| S-EPMC3646881 | biostudies-literature
| S-EPMC8004178 | biostudies-literature
| S-EPMC10524249 | biostudies-literature
| S-EPMC8652027 | biostudies-literature
| S-EPMC6072840 | biostudies-literature
| PRJEB24387 | ENA
| S-EPMC8724920 | biostudies-literature
| S-EPMC9319239 | biostudies-literature