Dataset Information

pmparser and PMDB: resources for large-scale, open studies of the biomedical literature.


ABSTRACT: PubMed is an invaluable resource for the biomedical community. Although PubMed is freely available, the existing API is not designed for large-scale analyses and the XML structure of the underlying data is inconvenient for complex queries. We developed an R package called pmparser to convert the data in PubMed to a relational database. Our implementation of the database, called PMDB, currently contains data on over 31 million PubMed Identifiers (PMIDs) and is updated regularly. Together, pmparser and PMDB can enable large-scale, reproducible, and transparent analyses of the biomedical literature. pmparser is licensed under GPL-2 and available at https://pmparser.hugheylab.org. PMDB is available in both PostgreSQL (DOI 10.5281/zenodo.4008109) and Google BigQuery (https://console.cloud.google.com/bigquery?project=pmdb-bq&d=pmdb).
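The idea of turning nested XML into relational tables can be illustrated with a minimal sketch. It is not pmparser itself (which is an R package) and does not reproduce the PMDB schema: the table names `article` and `author`, the helper functions, and the sample records are all hypothetical, and the XML is a heavily simplified PubMed-style fragment. It uses SQLite from the Python standard library rather than PostgreSQL or BigQuery.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified PubMed-style records (not real data).
SAMPLE_XML = """
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <ArticleTitle>Example title A</ArticleTitle>
        <AuthorList>
          <Author><LastName>Doe</LastName><Initials>J</Initials></Author>
          <Author><LastName>Roe</LastName><Initials>R</Initials></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>2</PMID>
      <Article>
        <ArticleTitle>Example title B</ArticleTitle>
        <AuthorList>
          <Author><LastName>Doe</LastName><Initials>J</Initials></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
"""


def load_articles(xml_text, conn):
    """Flatten nested article XML into two tables keyed by PMID.

    The table layout is an assumption made for illustration only.
    """
    conn.execute("CREATE TABLE article (pmid INTEGER PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE author "
        "(pmid INTEGER, author_pos INTEGER, last_name TEXT, initials TEXT)"
    )
    root = ET.fromstring(xml_text)
    for art in root.iter("PubmedArticle"):
        pmid = int(art.findtext("MedlineCitation/PMID"))
        title = art.findtext("MedlineCitation/Article/ArticleTitle")
        conn.execute("INSERT INTO article VALUES (?, ?)", (pmid, title))
        authors = art.iterfind("MedlineCitation/Article/AuthorList/Author")
        # One row per (article, author), preserving author order.
        for pos, au in enumerate(authors, start=1):
            conn.execute(
                "INSERT INTO author VALUES (?, ?, ?, ?)",
                (pmid, pos, au.findtext("LastName"), au.findtext("Initials")),
            )
    conn.commit()


def papers_per_author(conn):
    """Count distinct PMIDs per last name: awkward in XML, one query in SQL."""
    return conn.execute(
        "SELECT last_name, COUNT(DISTINCT pmid) FROM author "
        "GROUP BY last_name ORDER BY last_name"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load_articles(SAMPLE_XML, conn)
print(papers_per_author(conn))
```

Once the records are in tables, cross-article questions such as the per-author count become a single aggregate query instead of a traversal of every XML document.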

SUBMITTER: Schoenbachler JL 

PROVIDER: S-EPMC7955988 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature

Publications

pmparser and PMDB: resources for large-scale, open studies of the biomedical literature.

Schoenbachler JL, Hughey JJ

PeerJ, 2021-03-11



Similar Datasets

| S-EPMC7237030 | biostudies-literature
| S-EPMC2276192 | biostudies-literature
| S-EPMC7400038 | biostudies-literature
| S-EPMC7951980 | biostudies-literature
| S-EPMC3165456 | biostudies-other
2013-12-23 | E-GEOD-53091 | biostudies-arrayexpress
| S-EPMC3771067 | biostudies-literature
| S-EPMC3621846 | biostudies-literature
| S-EPMC5996850 | biostudies-literature
| S-EPMC11543612 | biostudies-literature