Dataset Information

BioTEA: Containerized Methods of Analysis for Microarray-Based Transcriptomics Data.


ABSTRACT: Tens of thousands of gene expression data sets describing a variety of model organisms in many different pathophysiological conditions are currently stored in publicly available databases such as the Gene Expression Omnibus (GEO) and ArrayExpress (AE). As microarray technology is giving way to RNA-seq, it becomes strategic to develop high-level analysis tools that preserve access to this huge amount of information through the most sophisticated methods of data preparation and processing developed over the years, while ensuring, at the same time, the reproducibility of the results. To meet this need, here we present bioTEA (biological Transcript Expression Analyzer), a novel software tool that combines ease of use with the versatility and power of an R/Bioconductor-based differential expression analysis, from raw data retrieval and preparation to gene annotation. BioTEA is an R-coded pipeline, wrapped in a Python-based command line interface and containerized with Docker technology. The user can choose among multiple options (including gene filtering, batch effect handling, sample pairing, and statistical test type) to adapt the algorithm flow to the structure of the particular data set. All these options are saved in a single text file, which can be easily shared between different laboratories to deterministically reproduce the results. In addition, a detailed log file provides accurate information about each step of the analysis. Overall, these features make bioTEA an invaluable tool for both bioinformaticians and wet-lab biologists interested in transcriptomics. BioTEA is free and open-source.

SUBMITTER: Visentin L 

PROVIDER: S-EPMC9495986 | biostudies-literature | 2022 Sep

REPOSITORIES: biostudies-literature

Publications

BioTEA: Containerized Methods of Analysis for Microarray-Based Transcriptomics Data.

Luca Visentin, Giorgia Scarpellino, Giorgia Chinigò, Luca Munaron, Federico Alessandro Ruffinatti

Biology, 2022-09-13, issue 9


Similar Datasets

| S-EPMC3489090 | biostudies-other
| S-EPMC7228135 | biostudies-literature
| S-EPMC10072060 | biostudies-literature
| S-EPMC2712751 | biostudies-literature
| S-EPMC5002968 | biostudies-literature
| S-EPMC133449 | biostudies-literature
| S-EPMC8951701 | biostudies-literature
| S-EPMC4497424 | biostudies-literature
| S-EPMC5581932 | biostudies-literature
| S-EPMC4042480 | biostudies-literature