Dataset Information

Beegle: from literature mining to disease-gene discovery.


ABSTRACT: Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.
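
The "recall in the top 100 returned genes" quoted in the abstract can be read as recall at a cutoff k: the share of known disease genes that appear among the first k genes returned for a query. The following is a minimal sketch of that metric only, not Beegle's evaluation code; the function name `recall_at_k` and the toy gene lists are hypothetical and chosen purely for illustration.

```python
def recall_at_k(ranked_genes, known_genes, k=100):
    """Fraction of known genes found among the first k ranked genes."""
    known = set(known_genes)
    if not known:
        raise ValueError("known_genes must not be empty")
    top_k = set(ranked_genes[:k])
    return len(known & top_k) / len(known)


# Hypothetical toy data: a ranked result list and a reference gene set.
ranked = ["BRCA1", "TP53", "BRCA2", "EGFR", "ATM"]
known = ["BRCA1", "BRCA2", "CHEK2", "ATM"]

print(recall_at_k(ranked, known, k=3))  # 2 of 4 known genes in the top 3 -> 0.5
```

An average figure such as the one in the abstract would come from computing this value per query and taking the mean across queries.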

SUBMITTER: ElShal S 

PROVIDER: S-EPMC4737179 | biostudies-literature | 2016 Jan

REPOSITORIES: biostudies-literature

Publications

Beegle: from literature mining to disease-gene discovery.

ElShal S, Tranchevent LC, Sifrim A, Ardeshirdavani A, Davis J, Moreau Y

Nucleic Acids Research, 2015-09-17, issue 2


Similar Datasets

| S-EPMC3508881 | biostudies-literature
| S-EPMC2929241 | biostudies-literature
| S-EPMC5975655 | biostudies-literature
| S-EPMC11399685 | biostudies-literature
2017-09-26 | GSE103413 | GEO
| S-EPMC2944780 | biostudies-literature
| S-EPMC5844215 | biostudies-literature
| S-EPMC3325219 | biostudies-literature
| S-EPMC11297645 | biostudies-literature
| S-EPMC5181534 | biostudies-literature