Dataset Information

Metagenome fragment classification based on multiple motif-occurrence profiles.

ABSTRACT: A vast amount of metagenomic data has been obtained by extracting multiple genomes simultaneously from microbial communities, including genomes from uncultivable microbes. By analyzing these metagenomic data, novel microbes are discovered and new microbial functions are elucidated. The first step in analyzing these data is sequenced-read classification into reference genomes from which each read can be derived. The Naïve Bayes Classifier is a method for this classification. To identify the derivation of the reads, this method calculates a score based on the occurrence of a DNA sequence motif in each reference genome. However, large differences in the sizes of the reference genomes can bias the scoring of the reads. This bias might cause erroneous classification and decrease the classification accuracy. To address this issue, we have updated the Naïve Bayes Classifier method using multiple sets of occurrence profiles for each reference genome by normalizing the genome sizes, dividing each genome sequence into a set of subsequences of similar length and generating profiles for each subsequence. This multiple profile strategy improves the accuracy of the results generated by the Naïve Bayes Classifier method for simulated and Sargasso Sea datasets.
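The multiple-profile idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function names, the add-one smoothing, the `chunk_len` parameter, and the choice to assign a read to the genome whose best-scoring subsequence profile wins are all assumptions made for the example; the abstract only states that each genome is divided into subsequences of similar length and that a profile is generated for each.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-length motifs in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_profile(seq, k):
    """Motif-occurrence profile: k-mer counts plus the total k-mer count."""
    counts = Counter(kmers(seq, k))
    return counts, sum(counts.values())


def split_genome(seq, chunk_len):
    """Divide a genome into subsequences of similar length (about chunk_len)."""
    n_chunks = max(1, round(len(seq) / chunk_len))
    size = math.ceil(len(seq) / n_chunks)
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def log_score(read, profile, k):
    """Naive Bayes log-likelihood of a read under one profile.

    Add-one smoothing over the 4**k possible DNA motifs is an assumption
    of this sketch; it keeps unseen motifs from producing log(0).
    """
    counts, total = profile
    vocab = 4 ** k
    return sum(math.log((counts[m] + 1) / (total + vocab)) for m in kmers(read, k))


def classify(read, genomes, k, chunk_len):
    """Assign the read to the genome owning the best-scoring subsequence profile.

    Taking the maximum over subsequence profiles is an assumed combination rule.
    Profiles are rebuilt on every call for brevity; a real tool would cache them.
    """
    best_name, best_score = None, -math.inf
    for name, seq in genomes.items():
        for chunk in split_genome(seq, chunk_len):
            score = log_score(read, build_profile(chunk, k), k)
            if score > best_score:
                best_name, best_score = name, score
    return best_name


# Toy reference genomes of very different sizes (made-up sequences).
genomes = {"A": "ACGT" * 100, "B": "TTGGCCAA" * 10}
label_a = classify("ACGTACGTACGT", genomes, k=3, chunk_len=80)
label_b = classify("GGCCAATTGG", genomes, k=3, chunk_len=80)
```

Because every profile is built from a subsequence of roughly the same length, the 400-base genome and the 80-base genome in this toy example contribute profiles with comparable total counts, which is the size normalization the abstract describes.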

SUBMITTER: Matsushita N

PROVIDER: S-EPMC4157293 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Metagenome fragment classification based on multiple motif-occurrence profiles.

Matsushita Naoki, Seno Shigeto, Takenaka Yoichi, Matsuda Hideo

PeerJ, 2014-09-04

PMID: 25210663

Similar Datasets

Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification.
| S-EPMC4769744 | biostudies-literature

Fragment-Based Analysis of Ligand Dockings Improves Classification of Actives.
| S-EPMC5023760 | biostudies-literature

Motif comparison based on similarity of binding affinity profiles.
| S-EPMC5181567 | biostudies-literature

Predicting drug side-effect profiles: a chemical fragment-based approach.
| S-EPMC3125260 | biostudies-literature

Fragment Finder: a web-based software to identify similar three-dimensional structural motif.
| S-EPMC1160114 | biostudies-literature

Classification of Multiple Sclerosis Clinical Profiles via Graph Convolutional Neural Networks.
| S-EPMC6581753 | biostudies-literature

Multiple cancer type classification by small RNA expression profiles with plasma samples from multiple facilities.
| S-EPMC9207371 | biostudies-literature

Fragment-based modelling of single stranded RNA bound to RNA recognition motif containing proteins.
| S-EPMC4889956 | biostudies-literature

Environmental metagenome classification for constructing a microbiome fingerprint.
| S-EPMC6854650 | biostudies-literature

RNAMotifProfile: a graph-based approach to build RNA structural motif profiles.
| S-EPMC11426329 | biostudies-literature