Dataset Information

Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task.


ABSTRACT: A dataset of liquid chromatography-mass spectrometry (LC-MS) measurements of medicinal plant extracts from 74 species was generated and used to train and validate plant species identification algorithms. Various strategies for data handling and feature space extraction were tested. Constrained Tucker decomposition, large-scale (more than 1500 variables) discrete Bayesian networks, and autoencoder-based dimensionality reduction coupled with a continuous Bayes classifier and logistic regression were optimized to achieve the best accuracy. Even with all retention time values eliminated, accuracies of up to 96% and 92% were achieved on the validation set for plant species and plant organ identification, respectively. The benefits and drawbacks of the algorithms used are discussed. A preliminary test showed that the developed approaches tolerate changes in the data caused by different extraction methods and/or equipment. The dataset, containing more than 2200 chromatograms, was published in an open repository.
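
The abstract does not spell out how retention time values were eliminated or how the classifiers were configured. As a rough illustration only, the sketch below assumes one plausible reading: each chromatogram is a retention time by m/z intensity matrix, the retention-time axis is summed away to leave an m/z fingerprint, and a plain binary logistic regression is trained on the fingerprints. The function names, the toy data, and the two-class setup are all hypothetical and are not taken from the published pipeline.

```python
import math


def collapse_retention_time(chromatogram):
    """Sum intensities over the retention-time axis (assumed preprocessing).

    chromatogram: list of scans, one per retention time, each a list of
    intensities over the same m/z bins. Returns one value per m/z bin.
    """
    n_bins = len(chromatogram[0])
    return [sum(scan[j] for scan in chromatogram) for j in range(n_bins)]


def normalize(spectrum):
    """Scale a fingerprint so its intensities sum to 1."""
    total = sum(spectrum)
    return [x / total for x in spectrum] if total > 0 else list(spectrum)


def train_logistic(X, y, lr=1.0, epochs=500):
    """Binary logistic regression fitted by batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return class 1 if the linear score is non-negative, else class 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


def toy_chromatogram(peak_bin, peak_scan, n_scans=4, n_bins=3):
    """Hypothetical chromatogram: flat baseline plus one peak."""
    chrom = [[1.0] * n_bins for _ in range(n_scans)]
    chrom[peak_scan][peak_bin] += 20.0
    return chrom


# Two made-up "species": class 0 peaks in m/z bin 0, class 1 in m/z bin 2.
# The peak elutes at a different scan in every training sample.
train = [(toy_chromatogram(0, s), 0) for s in range(3)]
train += [(toy_chromatogram(2, s), 1) for s in range(3)]
X = [normalize(collapse_retention_time(c)) for c, _ in train]
y = [label for _, label in train]
w, b = train_logistic(X, y)

# Held-out samples whose peak elutes at a scan never seen in training.
held_out_0 = normalize(collapse_retention_time(toy_chromatogram(0, 3)))
held_out_1 = normalize(collapse_retention_time(toy_chromatogram(2, 3)))
print(predict(w, b, held_out_0), predict(w, b, held_out_1))
```

Because the fingerprint ignores when a peak elutes, a shift in retention time leaves the features unchanged in this toy setup. This hints at why retention-time-free features might tolerate changes in equipment or extraction method, though it says nothing about how the published models behave.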

SUBMITTER: Kharyuk P 

PROVIDER: S-EPMC6243014 | biostudies-literature | 2018 Nov

REPOSITORIES: biostudies-literature

Publications

Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task.

Kharyuk Pavel, Nazarenko Dmitry, Oseledets Ivan, Rodin Igor, Shpigun Oleg, Tsitsilin Andrey, Lavrentyev Mikhail

Scientific Reports, 2018-11-19, issue 1


Similar Datasets

2022-04-15 | MTBLS688 | MetaboLights
| S-EPMC3293931 | biostudies-literature
| S-EPMC3697863 | biostudies-literature
| S-EPMC10659119 | biostudies-literature
| S-EPMC8642397 | biostudies-literature
2022-08-09 | PXD027824 | JPOST Repository
| S-EPMC6885708 | biostudies-literature
| S-EPMC7023254 | biostudies-literature
| S-EPMC9821958 | biostudies-literature
| S-EPMC9047440 | biostudies-literature