Dataset Information

Data discovery with DATS: exemplar adoptions and lessons learned.


ABSTRACT: The DAta Tag Suite (DATS) is a model supporting dataset description, indexing, and discovery. It is available as an annotated serialization with schema.org, a vocabulary used by major search engines, thus making the datasets discoverable on the web. DATS underlies DataMed, the National Institutes of Health Big Data to Knowledge Data Discovery Index prototype, which aims to provide a "PubMed for datasets." The experience gained while indexing a heterogeneous range of >60 repositories in DataMed helped in evaluating DATS's entities, attributes, and scope. In this work, 3 additional exemplary and diverse data sources were mapped to DATS by their representatives or experts, offering a deep scan of DATS fitness against a new set of existing data. The procedure, including feedback from users and implementers, resulted in DATS implementation guidelines and best practices, and identification of a path for evolving and optimizing the model. Finally, the work exposed additional needs when defining datasets for indexing, especially in the context of clinical and observational information.

SUBMITTER: Gonzalez-Beltran AN 

PROVIDER: S-EPMC6481379 | biostudies-other | 2018 Jan

REPOSITORIES: biostudies-other

Publications

Data discovery with DATS: exemplar adoptions and lessons learned.

Alejandra N Gonzalez-Beltran, John Campbell, Patrick Dunn, Diana Guijarro, Sanda Ionescu, Hyeoneui Kim, Jared Lyle, Jeffrey Wiser, Susanna-Assunta Sansone, Philippe Rocca-Serra

Journal of the American Medical Informatics Association (JAMIA), 2018-01-01, issue 1


Similar Datasets

| S-EPMC9214331 | biostudies-literature
| S-EPMC9708951 | biostudies-literature
| S-EPMC8756524 | biostudies-literature
| S-EPMC6936121 | biostudies-literature
| S-EPMC3976103 | biostudies-literature
| S-EPMC9391706 | biostudies-literature
| S-EPMC4581358 | biostudies-other
| S-EPMC6721610 | biostudies-literature
| S-EPMC8390785 | biostudies-literature
| S-EPMC5424123 | biostudies-literature