Dataset Information

PhenDisco: phenotype discovery system for the database of genotypes and phenotypes.


ABSTRACT: The database of genotypes and phenotypes (dbGaP) developed by the National Center for Biotechnology Information (NCBI) is a resource that contains information on various genome-wide association studies (GWAS) and is currently available via NCBI's dbGaP Entrez interface. The database is an important resource, providing GWAS data that can be used for new exploratory research or cross-study validation by authorized users. However, finding studies relevant to a particular phenotype of interest is challenging, as phenotype information is presented in a non-standardized way. To address this issue, we developed PhenDisco (phenotype discoverer), a new information retrieval system for dbGaP. PhenDisco consists of two main components: (1) text processing tools that standardize phenotype variables and study metadata, and (2) information retrieval tools that support queries from users and return ranked results. In a preliminary comparison involving 18 search scenarios, PhenDisco showed promising performance for both unranked and ranked search comparisons with dbGaP's search engine Entrez. The system can be accessed at http://pfindr.net.

SUBMITTER: Doan S 

PROVIDER: S-EPMC3912702 | biostudies-literature | 2014 Jan-Feb

REPOSITORIES: biostudies-literature

Publications

PhenDisco: phenotype discovery system for the database of genotypes and phenotypes.

Son Doan, Ko-Wei Lin, Mike Conway, Lucila Ohno-Machado, Alex Hsieh, Stephanie Feudjio Feupe, Asher Garland, Mindy K Ross, Xiaoqian Jiang, Seena Farzaneh, Rebecca Walker, Neda Alipanah, Jing Zhang, Hua Xu, Hyeon-Eui Kim

Journal of the American Medical Informatics Association : JAMIA, 2013-08-29, issue 1


Similar Datasets

| S-EPMC3965052 | biostudies-literature
| S-EPMC2031016 | biostudies-literature
| S-EPMC4326710 | biostudies-literature
| S-EPMC8489415 | biostudies-literature
| S-EPMC2790299 | biostudies-literature
| S-EPMC4384019 | biostudies-literature
| S-EPMC5716007 | biostudies-literature
| S-EPMC6093839 | biostudies-other
| S-EPMC4861960 | biostudies-other
| S-EPMC3386880 | biostudies-other