Dataset Information

A critical assessment of gene catalogs for metagenomic analysis.


ABSTRACT:

Motivation

Microbial gene catalogs are data structures that organize genes found in microbial communities, providing a reference for standardized analysis of the microbes across samples and studies. Although gene catalogs are commonly used, they have not been critically evaluated for their effectiveness as a basis for metagenomic analyses.

Results

As a case study, we investigate one such catalog, the Integrated Gene Catalog (IGC); however, our observations apply broadly to most gene catalogs constructed to date. We focus on both the approach used to construct this catalog and its effectiveness when used as a reference for microbiome studies. Our results highlight important limitations of the approach used to construct the IGC and call into question the broad usefulness of gene catalogs more generally. We also recommend best practices for the construction and use of gene catalogs in microbiome studies and highlight opportunities for future research.

Availability

All supporting scripts for our analyses can be found on GitHub: https://github.com/SethCommichaux/IGC.git. The supporting data can be downloaded from: https://obj.umiacs.umd.edu/igc-analysis/IGC_analysis_data.tar.gz.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Commichaux S 

PROVIDER: S-EPMC8479683 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC7506068 | biostudies-literature
| S-EPMC6446249 | biostudies-literature
| PRJEB31567 | ENA
| S-EPMC10714869 | biostudies-literature
| S-EPMC526221 | biostudies-literature
| S-EPMC4223665 | biostudies-literature
| S-EPMC4715287 | biostudies-literature
| S-EPMC3906045 | biostudies-literature
| S-EPMC3660777 | biostudies-literature
2009-12-01 | GSE14276 | GEO