
Dataset Information


Development of Database Assisted Structure Identification (DASI) Methods for Nontargeted Metabolomics.


ABSTRACT: Metabolite structure identification remains a significant challenge in nontargeted metabolomics research. One commonly used strategy relies on searching biochemical databases using exact mass. However, this approach fails when the database does not contain the unknown metabolite (i.e., for unknown-unknowns). For these cases, constrained structure generation with combinatorial structure generators provides a potential option. Here we evaluated structure generation constraints based on the specification of: (1) substructures required (i.e., seed structures); (2) substructures not allowed; and (3) filters to remove incorrect structures. Our approach (database assisted structure identification, DASI) used predictive models in MolFind to find candidate structures with chemical and physical properties similar to the unknown. These candidates were then used for seed structure generation using eight different structure generation algorithms. One algorithm was able to generate correct seed structures for 21/39 test compounds. Eleven of these seed structures were large enough to constrain the combinatorial structure generator to fewer than 100,000 structures. In 35/39 cases, at least one algorithm was able to generate a correct seed structure. The DASI method has several limitations and will require further experimental validation and optimization. At present, it seems most useful for identifying the structure of unknown-unknowns with molecular weights <200 Da.
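The exact-mass database search mentioned in the abstract can be illustrated with a minimal sketch. It is not the DASI or MolFind implementation. The toy database, the function name `search_by_exact_mass`, and the 10 ppm default tolerance are all assumptions made for illustration. The example shows why the strategy returns nothing for an unknown-unknown: a mass with no database entry inside the tolerance window yields an empty candidate list.

```python
from typing import Dict, List

# Hypothetical toy database: compound name -> monoisotopic mass in Da.
# A real biochemical database would hold many thousands of entries.
TOY_DB: Dict[str, float] = {
    "glucose": 180.06339,
    "fructose": 180.06339,
    "alanine": 89.04768,
    "citric acid": 192.02700,
}


def search_by_exact_mass(
    mass: float, db: Dict[str, float], ppm: float = 10.0
) -> List[str]:
    """Return names whose mass lies within +/- ppm of the query mass."""
    # Convert the relative tolerance (parts per million) to an absolute window.
    tolerance = mass * ppm / 1e6
    return sorted(name for name, m in db.items() if abs(m - mass) <= tolerance)


if __name__ == "__main__":
    # Known mass: isomers share the same exact mass, so both are returned.
    print(search_by_exact_mass(180.0634, TOY_DB))
    # Mass absent from the database (the unknown-unknown case): no candidates.
    print(search_by_exact_mass(150.0, TOY_DB))
```

The first query also shows a second limit of exact-mass search: isomers cannot be distinguished by mass alone, which is why additional predicted properties are needed to rank candidates.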

SUBMITTER: Menikarachchi LC 

PROVIDER: S-EPMC4931548 | biostudies-literature | 2016 May

REPOSITORIES: biostudies-literature


Publications

Development of Database Assisted Structure Identification (DASI) Methods for Nontargeted Metabolomics.

Lochana C. Menikarachchi, Ritvik Dubey, Dennis W. Hill, Daniel N. Brush, David F. Grant

Metabolites, 2016-05-31, issue 2



Similar Datasets

| S-EPMC8378237 | biostudies-literature
| S-EPMC3819714 | biostudies-literature
| S-EPMC3376006 | biostudies-literature
| S-EPMC8404481 | biostudies-literature
| S-EPMC4570942 | biostudies-literature
| S-EPMC4112935 | biostudies-literature
| S-EPMC5925027 | biostudies-literature
| S-EPMC6237353 | biostudies-literature
| S-EPMC3837024 | biostudies-literature
| S-EPMC5851804 | biostudies-literature