Dataset Information

DNApod: DNA polymorphism annotation database from next-generation sequence read archives.

ABSTRACT: With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information.

SUBMITTER: Mochizuki T

PROVIDER: S-EPMC5325239 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature

Publications

DNApod: DNA polymorphism annotation database from next-generation sequence read archives.

Mochizuki Takako, Tanizawa Yasuhiro, Fujisawa Takatomo, Ohta Tazro, Nikoh Naruo, Shimizu Tokurou, Toyoda Atsushi, Fujiyama Asao, Kurata Nori, Nagasaki Hideki, Kaminuma Eli, Nakamura Yasukazu

PLoS One, 2017-02-24, issue 2

PMID: 28234924

Similar Datasets

Project description: BACKGROUND: Renewable energy production is currently a major issue worldwide. Biogas is a promising renewable energy carrier as the technology of its production combines the elimination of organic waste with the formation of a versatile energy carrier, methane. Because of the complexity of the microbial communities and metabolic pathways involved, the biotechnology of the microbiological process leading to biogas production is poorly understood. Metagenomic approaches are suitable means of addressing related questions. In the present work a novel high-throughput technique was tested for its benefits in resolving the functional and taxonomical complexity of such microbial consortia.
RESULTS: It was demonstrated that the extremely parallel SOLiD™ short-read DNA sequencing platform is capable of providing sufficient useful information to decipher the systematic and functional contexts within a biogas-producing community. Although this technology has not been employed to address such problems previously, the data obtained compare well with those from similar high-throughput approaches such as 454-pyrosequencing GS FLX or Titanium. The predominant microbes contributing to the decomposition of organic matter include members of the Eubacteria, class Clostridia, order Clostridiales, family Clostridiaceae. Bacteria belonging to other systematic groups contribute to the diversity of the microbial consortium. Archaea comprise a remarkably small minority in this community, given their crucial role in biogas production. Among the Archaea, the predominant order is the Methanomicrobiales and the most abundant species is Methanoculleus marisnigri. The Methanomicrobiales are hydrogenotrophic methanogens. Besides corroborating earlier findings on the significance of the contribution of the Clostridia to organic substrate decomposition, the results demonstrate the importance of the metabolism of hydrogen within the biogas-producing microbial community.
CONCLUSIONS: Both microbiological diversity and the regulatory role of the hydrogen metabolism appear to be the driving forces optimizing biogas-producing microbial communities. The findings may allow a rational design of these communities to promote greater efficacy in large-scale practical systems. The composition of an optimal biogas-producing consortium can be determined through the use of this approach, and this systematic methodology allows the design of the optimal microbial community structure for any biogas plant. In this way, metagenomic studies can contribute to significant progress in the efficacy and economic improvement of biogas production.
