ABSTRACT:
SUBMITTER: Katz KS
PROVIDER: S-EPMC8450716 | biostudies-literature | 2021 Sep
REPOSITORIES: biostudies-literature
Kenneth S Katz, Oleg Shutov, Richard Lapoint, Michael Kimelman, J Rodney Brister, Christopher O'Sullivan
Genome Biology, 2021 Sep 20, issue 1
Sequence Read Archive submissions to the National Center for Biotechnology Information often lack useful metadata, which limits the utility of these submissions. We describe the Sequence Taxonomic Analysis Tool (STAT), a scalable k-mer-based tool for fast assessment of taxonomic diversity intrinsic to submissions, independent of metadata. We show that our MinHash-based k-mer tool is accurate and scalable, offering reliable criteria for efficient selection of data for further analysis by the scie ...
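
The abstract names two building blocks, k-mers and MinHash, without showing how they fit together. The Python sketch below illustrates the general idea only: it is a generic bottom-s MinHash over k-mers, not STAT's implementation. The function names, the k-mer length `k=8`, the sketch size `size=16`, and the choice of BLAKE2b as the hash are all assumptions made for illustration and do not come from the record above.

```python
import hashlib


def kmers(seq, k):
    """Yield every length-k substring (k-mer) of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer with a stable hash (BLAKE2b, an assumed choice)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_sketch(seq, k=8, size=16):
    """Bottom-`size` sketch: the `size` smallest distinct k-mer hashes of seq."""
    hashes = {hash_kmer(km) for km in kmers(seq.upper(), k)}
    return sorted(hashes)[:size]


def jaccard_estimate(sketch_a, sketch_b, size=16):
    """Estimate the Jaccard similarity of two k-mer sets from their sketches."""
    # Take the `size` smallest hashes of the union, then count how many
    # of them appear in both sketches.
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)
```

Comparing a read set's sketch against sketches of reference sequences in this way gives a similarity estimate from a fixed number of hashes per sequence, regardless of sequence length, which is the general reason MinHash-style methods scale to large collections.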