Dataset Information

Turtling: a time-aware neural topic model on NIH grant data.


ABSTRACT:

Motivation

Recent initiatives for federal grant transparency allow direct knowledge extraction from large volumes of grant texts, serving as a powerful alternative to traditional surveys. However, computational modeling of grant text is challenging because grants are usually multifaceted and their topics evolve constantly.

Results

We propose Turtling, a time-aware neural topic model with three distinguishing characteristics. First, Turtling uses pretrained biomedical word embeddings to extract research topics. Second, it uses a probabilistic time-series model so that topics evolve smoothly and coherently. Third, it adds a topic diversity loss and a funding institute classification loss to improve topic quality and facilitate funding institute prediction. We apply Turtling to publicly available NIH grant text and show that it significantly outperforms other methods on topic quality metrics. We also demonstrate that Turtling can provide insights into research topic evolution by detecting topic trends across decades. In summary, Turtling may be a valuable tool for grant text analysis.
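
The abstract names the model's components but gives no formulas. As a rough illustration only, the sketch below shows generic forms these ingredients often take in embedding-based topic models: a topic-word distribution computed from topic and word embeddings, a diversity penalty on topic embeddings, and a smoothness penalty across time steps. All function names, array shapes, and the specific penalty definitions are assumptions made for this example; they are not taken from the Turtling paper or its repository.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def topic_word_distribution(topic_emb, word_emb):
    """Map (K, D) topic embeddings and (V, D) word embeddings to a (K, V)
    matrix whose rows are probability distributions over the vocabulary.
    Assumed form: softmax over topic-word inner products."""
    return softmax(topic_emb @ word_emb.T, axis=1)


def diversity_penalty(topic_emb):
    """Mean pairwise cosine similarity between distinct topics.
    Assumed form of a diversity loss: lower means more diverse topics."""
    unit = topic_emb / np.linalg.norm(topic_emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    k = sim.shape[0]
    off_diag = sim[~np.eye(k, dtype=bool)]  # drop self-similarities
    return float(off_diag.mean())


def smoothness_penalty(topic_emb_by_time):
    """Mean squared change between consecutive time steps for a
    (T, K, D) array of topic embeddings. This is a simple stand-in for
    a probabilistic time-series prior, not the model's actual prior."""
    diffs = np.diff(topic_emb_by_time, axis=0)
    return float((diffs ** 2).mean())
```

In a training loop, penalties like these would typically be added to the reconstruction objective with tunable weights. The weights and the classification head for funding institutes are omitted here because the abstract does not specify them.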

Availability and implementation

Turtling is freely available as open-source software at https://github.com/aicb-ZhangLabs/Turtling.

SUBMITTER: Zhang R 

PROVIDER: S-EPMC11216609 | biostudies-literature | 2023

REPOSITORIES: biostudies-literature

Publications

Turtling: a time-aware neural topic model on NIH grant data.

Zhang Ruiyi, Duan Ziheng, Lee CheYu, Riffle Dylan, Min Martin Renqiang, Zhang Jing

Bioinformatics Advances 2023-07-24 1


Similar Datasets

| S-EPMC7891716 | biostudies-literature
| S-EPMC9809243 | biostudies-literature
| S-EPMC10909194 | biostudies-literature
| S-EPMC11009615 | biostudies-literature
| S-EPMC10673642 | biostudies-literature
| S-EPMC2219790 | biostudies-other
| S-EPMC5866547 | biostudies-literature
| S-EPMC6358090 | biostudies-literature
| S-EPMC11371832 | biostudies-literature
| S-EPMC10171858 | biostudies-literature