Dataset Information

Asymmetric author-topic model for knowledge discovering of big data in toxicogenomics.


ABSTRACT: Advances in high-throughput screening technologies have facilitated the generation of massive amounts of biological data, a big data phenomenon in biomedical science. Yet researchers still rely heavily on keyword search and/or literature review to navigate the databases, and analyses are often done on a rather small scale. As a result, the rich information in a database is not fully utilized; in particular, the information embedded in the interactions between data points is largely ignored and buried. Over the past 10 years, probabilistic topic modeling has been recognized as an effective machine learning approach for annotating the hidden thematic structure of massive collections of documents. The analogy between a text corpus and large-scale genomic data enables the application of text mining tools, such as probabilistic topic models, to explore hidden patterns in genomic data and, by extension, altered biological functions. In this paper, we developed a generalized probabilistic topic model to analyze a toxicogenomics dataset that consists of a large number of gene expression profiles from the livers of rats treated with drugs at multiple doses and time points. We discovered hidden patterns in gene expression associated with the effects of dose and time point of treatment. Finally, we illustrated the ability of our model to identify evidence for a potential reduction in animal use.

SUBMITTER: Chung MH 

PROVIDER: S-EPMC4403303 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Asymmetric author-topic model for knowledge discovering of big data in toxicogenomics.

Chung Ming-Hua, Wang Yuping, Tang Hailin, Zou Wen, Basinger John, Xu Xiaowei, Tong Weida

Frontiers in Pharmacology, 2015-04-20


Similar Datasets

| S-EPMC7288990 | biostudies-literature
| S-EPMC6138951 | biostudies-literature
| S-EPMC4090507 | biostudies-literature
| S-EPMC6779380 | biostudies-literature
| S-EPMC7306473 | biostudies-literature
| S-EPMC3469446 | biostudies-literature
| S-EPMC4477241 | biostudies-other
| S-EPMC7006671 | biostudies-literature
| S-EPMC10673642 | biostudies-literature
2023-08-15 | GSE239996 | GEO