Dataset Information

Assembling the Community-Scale Discoverable Human Proteome.


ABSTRACT: The increasing throughput and sharing of proteomics mass spectrometry data have now yielded over one-third of a million public mass spectrometry runs. However, these discoveries are not continuously aggregated in an open and error-controlled manner, which limits their utility. To facilitate the reusability of these data, we built the MassIVE Knowledge Base (MassIVE-KB), a community-wide, continuously updating knowledge base that aggregates proteomics mass spectrometry discoveries into an open reusable format with full provenance information for community scrutiny. Reusing >31 TB of public human data stored in a mass spectrometry interactive virtual environment (MassIVE), the MassIVE-KB contains >2.1 million precursors from 19,610 proteins (48% larger than before; 97% of the total) and doubles proteome coverage to 6 million amino acids (54% of the proteome) with strict library-scale false discovery controls, thereby providing evidence for 430 proteins for which sufficient protein-level evidence was previously missing. Furthermore, MassIVE-KB can inform experimental design, helps identify and quantify new data, and provides tools for community construction of specialized spectral libraries.

SUBMITTER: Wang M 

PROVIDER: S-EPMC6279426 | biostudies-literature | 2018 Oct

REPOSITORIES: biostudies-literature

Publications

Assembling the Community-Scale Discoverable Human Proteome.

Wang Mingxun, Wang Jian, Carver Jeremy, Pullman Benjamin S, Cha Seong Won, Bandeira Nuno

Cell Systems, 2018-08-29, issue 4


Similar Datasets

| S-EPMC5141276 | biostudies-literature
| S-EPMC4266588 | biostudies-literature
| S-EPMC8769072 | biostudies-literature
| S-EPMC7447446 | biostudies-literature
| S-EPMC5502489 | biostudies-literature
| S-EPMC4520187 | biostudies-literature
2012-07-05 | E-GEOD-31368 | biostudies-arrayexpress
| S-EPMC3285560 | biostudies-literature
| S-EPMC5387268 | biostudies-literature
| PRJEB8330 | ENA