Dataset Information

A MeSH-based text mining method for identifying novel prebiotics.


ABSTRACT: Prebiotics contribute to the well-being of their host by altering the composition of the gut microbiota. Discovering new prebiotics is a challenging and arduous task due to strict inclusion criteria; thus, highly limited numbers of prebiotic candidates have been identified. Notably, the large numbers of published studies may contain substantial information attached to various features of known prebiotics that can be used to predict new candidates. In this paper, we propose a medical subject headings (MeSH)-based text mining method for identifying new prebiotics with structured texts obtained from PubMed. We defined an optimal feature set for prebiotics prediction using a systematic feature-ranking algorithm with which a variety of carbohydrates can be accurately classified into different clusters in accordance with their chemical and biological attributes. The optimal feature set was used to separate positive prebiotics from other carbohydrates, and a cross-validation procedure was employed to assess the prediction accuracy of the model. Our method achieved a specificity of 0.876 and a sensitivity of 0.838. Finally, we identified a high-confidence list of candidates of prebiotics that are strongly supported by the literature. Our study demonstrates that text mining from high-volume biomedical literature is a promising approach in searching for new prebiotics.
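The abstract reports model quality as sensitivity and specificity under cross-validation. As a minimal sketch of how those two figures are derived from binary predictions, the Python snippet below uses made-up toy labels (1 = prebiotic, 0 = other carbohydrate). The function name and the data are assumptions for illustration only and are not taken from the paper or its dataset.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity: fraction of actual positives that were predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of actual negatives that were predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels, NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In a cross-validation setting, the same counts would typically be accumulated over the held-out folds before the two ratios are taken.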

SUBMITTER: Shan G 

PROVIDER: S-EPMC5266046 | biostudies-literature | 2016 Dec

REPOSITORIES: biostudies-literature

Publications

A MeSH-based text mining method for identifying novel prebiotics.

Shan Guangyu, Lu Yiming, Min Bo, Qu Wubin, Zhang Chenggang

Medicine, 2016 Dec 1, 49


Similar Datasets

| S-EPMC6550425 | biostudies-literature
| S-EPMC10762847 | biostudies-literature
| S-EPMC6829803 | biostudies-other
| S-EPMC3110144 | biostudies-literature
| S-EPMC8060081 | biostudies-literature
| S-EPMC7018134 | biostudies-literature
2020-04-30 | GSE142100 | GEO
| S-EPMC9369164 | biostudies-literature
| S-EPMC8163238 | biostudies-literature
| S-EPMC2765257 | biostudies-literature