
Dataset Information

TCGA-assembler 2: software pipeline for retrieval and processing of TCGA/CPTAC data.


ABSTRACT: Motivation: The Cancer Genome Atlas (TCGA) program has produced huge amounts of cancer genomics data, providing unprecedented opportunities for research. In 2014, we developed TCGA-Assembler, a software pipeline for retrieval and processing of public TCGA data. In 2016, TCGA data were transferred from the TCGA data portal to the Genomic Data Commons (GDC), which is supported by a different set of data storage and retrieval mechanisms. In addition, new proteomics data of TCGA samples have been generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) program, and these data were not available for download through TCGA-Assembler. It is desirable to acquire and integrate data from both GDC and CPTAC. Results: We develop TCGA-assembler 2 (TA2) to automatically download and integrate data from GDC and CPTAC. We make substantial improvements to the functionality of TA2 to enhance user experience and software performance. TA2, together with its previous version, has helped more than 2000 researchers from 64 countries to access and use TCGA and CPTAC data in their research. The availability of TA2 will continue to allow existing and new users to conduct reproducible research based on TCGA and CPTAC data. Availability and implementation: http://www.compgenome.org/TCGA-Assembler/ or https://github.com/compgenome365/TCGA-Assembler-2. Contact: zhuyitan@gmail.com or koaeraser@gmail.com. Supplementary information: Supplementary data are available at Bioinformatics online.

SUBMITTER: Wei L 

PROVIDER: S-EPMC5925773 | biostudies-literature | 2018 May

REPOSITORIES: biostudies-literature

Publications

TCGA-assembler 2: software pipeline for retrieval and processing of TCGA/CPTAC data.

Wei Lin, Jin Zhilin, Yang Shengjie, Xu Yanxun, Zhu Yitan, Ji Yuan

Bioinformatics (Oxford, England), 2018-05-01, issue 9



Similar Datasets

| S-EPMC4387197 | biostudies-literature
| S-EPMC5835235 | biostudies-literature
| S-EPMC5117628 | biostudies-literature
| S-EPMC2649129 | biostudies-literature
| S-EPMC9854277 | biostudies-literature
| S-EPMC9652453 | biostudies-literature
| S-EPMC4706059 | biostudies-literature
2015-10-29 | E-GEOD-73204 | biostudies-arrayexpress
2015-10-29 | GSE73204 | GEO
| S-EPMC3051320 | biostudies-literature