Dataset Information

e-Sweet: A Machine-Learning Based Platform for the Prediction of Sweetener and Its Relative Sweetness.

ABSTRACT: Artificial sweeteners (AS) elicit a strong sweet sensation with low or zero calories and are widely used to replace nutritive sugar in the food and beverage industry. However, the safety of current AS remains controversial, so it is imperative to develop safer and more potent AS. Because experimental screening of AS is costly and laborious, in-silico sweetener/sweetness prediction offers a good avenue for identifying potential sweetener candidates before experiment. In this work, we curate the largest dataset of 530 sweeteners and 850 non-sweeteners, and collect the second largest dataset of 352 sweeteners with relative sweetness (RS) from the literature. Using these experimental datasets, we adopt five machine-learning methods and conformation-independent molecular fingerprints to derive classification and regression models for predicting sweeteners and their RS, respectively, via a consensus strategy. Our best classification model achieves the following 95% confidence intervals on the test set: accuracy (0.91 ± 0.01), precision (0.90 ± 0.01), specificity (0.94 ± 0.01), sensitivity (0.86 ± 0.01), F1-score (0.88 ± 0.01), and NER (non-error rate: 0.90 ± 0.01), which outperforms the model of Rojas et al. (NER = 0.85) in terms of NER. Our best regression model gives 95% confidence intervals for R²(test set) and ΔR² [defined as |R²(test set) − R²(cross-validation)|] of 0.77 ± 0.01 and 0.03 ± 0.01, respectively, which is also better than other works based on conformation-independent 2D descriptors (e.g., 2D Dragon) according to R²(test set) and ΔR².
Our models are obtained by averaging over nineteen data-splitting schemes and fully comply with the guidelines of the Organization for Economic Cooperation and Development (OECD). Previous related works, all of which are based on only one random data-splitting scheme for the cross-validation set and test set, do not completely follow these guidelines. Finally, we develop a user-friendly platform, "e-Sweet", for the automatic prediction of sweeteners and their corresponding RS. To the best of our knowledge, it is the first free platform that enables experimental food scientists to exploit current machine-learning methods to boost the discovery of more AS with low or zero calorie content.
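
The metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the usual confusion-matrix definitions, takes NER to be the mean of sensitivity and specificity (an assumption here, though it agrees with the reported 0.86, 0.94 and 0.90), and computes ΔR² as defined in the abstract. The confusion-matrix counts and the cross-validation R² of 0.74 are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts (sweetener = positive class)."""
    tp: int  # sweeteners predicted as sweeteners
    fp: int  # non-sweeteners predicted as sweeteners
    tn: int  # non-sweeteners predicted as non-sweeteners
    fn: int  # sweeteners predicted as non-sweeteners


def classification_metrics(c: Confusion) -> dict:
    """Return the six metrics listed in the abstract.

    Assumes every denominator is non-zero.
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Assumption: NER (non-error rate) is the mean of the per-class recalls.
    ner = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "precision": precision,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "f1": f1,
        "ner": ner,
    }


def delta_r2(r2_test: float, r2_cv: float) -> float:
    """|R2(test set) - R2(cross-validation)|, as defined in the abstract."""
    return abs(r2_test - r2_cv)


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(Confusion(tp=86, fp=6, tn=94, fn=14))
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")

# 0.74 is a made-up cross-validation R2 used only to show the calculation.
print(f"delta R2: {delta_r2(0.77, 0.74):.2f}")
```

A small ΔR² indicates that test-set performance stays close to cross-validation performance, which is why the abstract reports it alongside R²(test set).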

SUBMITTER: Zheng S

PROVIDER: S-EPMC6363693 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

e-Sweet: A Machine-Learning Based Platform for the Prediction of Sweetener and Its Relative Sweetness.

Zheng Suqing, Chang Wenping, Xu Wenxin, Xu Yong, Lin Fu

Frontiers in Chemistry, 2019-01-30

PMID: 30761295

Similar Datasets

Machine learning-based e-commerce platform repurchase customer prediction model.
S-EPMC7714352 | biostudies-literature

Prediction of Breast Cancer Estrogen Receptor Status using Machine Learning
2013-01-01 | E-GEOD-29210 | biostudies-arrayexpress

Gene function prediction in five model eukaryotes exclusively based on gene relative location through machine learning.
S-EPMC9270439 | biostudies-literature

Machine learning-based prediction of transfusion.
S-EPMC7540018 | biostudies-literature

A machine learning breast cancer prediction model based on a panel from circulating exosomal miRNAs
2022-02-21 | GSE197020 | GEO

Machine learning-based prediction of the activity and specificity of Cas9 variants in gene editing
2024-08-22 | GSE231840 | GEO

A Machine Learning-Based Prediction Platform for P-Glycoprotein Modulators and Its Validation by Molecular Docking.
S-EPMC6829872 | biostudies-literature

MutPred Splice: machine learning-based prediction of exonic variants that disrupt splicing.
S-EPMC4054890 | biostudies-other

Machine learning-enabled renal cell carcinoma status prediction using multi-platform urine-based metabolomics (part-I)
2021-02-11 | ST001705 | MetabolomicsWorkbench

Prediction of Breast Cancer Estrogen Receptor Status using Machine Learning
2013-01-01 | GSE29210 | GEO