Dataset Information

BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology.


ABSTRACT: The biotechnology revolution is generating a plethora of omics data at an exponentially growing pace. Biological data mining therefore demands automatic, 'high quality' curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-annotated omics data that aims to promote and accelerate the reuse of public data. We followed the same preprocessing pipeline for each biological mart (microarray gene expression, RNA-Seq gene expression and DNA methylation) to produce datasets ready for downstream analysis, and automatically annotated them with disease-ontology terms. We also flag datasets that share common samples and automatically discover control samples in case-control studies. Currently, BioDataome includes ~5600 datasets and ~260 000 samples spanning ~500 diseases, and can be readily used in large-scale experiments and meta-analyses. All datasets are publicly available for querying and downloading via the BioDataome web application. We demonstrate BioDataome's utility by presenting exploratory data analysis examples. We have also developed the BioDataome R package, available at https://github.com/mensxmachina/BioDataome/. Database URL: http://dataome.mensxmachina.org/.

SUBMITTER: Lakiotaki K 

PROVIDER: S-EPMC5836265 | biostudies-literature | 2018 Jan

REPOSITORIES: biostudies-literature

Publications

BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology.

Kleanthi Lakiotaki, Nikolaos Vorniotakis, Michail Tsagris, Georgios Georgakopoulos, Ioannis Tsamardinos

Database: the journal of biological databases and curation, 2018-01-01


Similar Datasets

| S-EPMC5820610 | biostudies-literature
| S-EPMC8328063 | biostudies-literature
| S-EPMC7917102 | biostudies-literature
| S-EPMC6296361 | biostudies-other
| S-EPMC3531282 | biostudies-literature
| S-EPMC6812467 | biostudies-literature
| S-EPMC3437820 | biostudies-literature
| S-EPMC4720990 | biostudies-literature
| S-EPMC6278692 | biostudies-literature
| S-EPMC7404612 | biostudies-literature