
Dataset Information

A Bayesian approach to accurate and robust signature detection on LINCS L1000 data.


ABSTRACT: MOTIVATION: The LINCS L1000 dataset contains a large volume of cellular expression data induced by large sets of perturbagens. Although it is an invaluable resource for drug discovery and for understanding disease mechanisms, the existing peak deconvolution algorithms cannot recover accurate gene expression levels in many cases, which introduces severe noise into the dataset and limits its applications in biomedical studies. RESULTS: Here, we present a novel Bayesian-based peak deconvolution algorithm that gives unbiased likelihood estimates for peak locations and characterizes the peaks with probability-based z-scores. Based on this algorithm, we build a pipeline that processes raw data from the L1000 assay into signatures that represent the features of perturbagens. The performance of the proposed pipeline is evaluated using the similarity between the signatures of bio-replicates and between the signatures of drugs with shared targets, and the results show that signatures derived from our pipeline give a substantially more reliable and informative representation of perturbagens than existing methods. Thus, the new pipeline may significantly boost the performance of L1000 data in downstream applications such as drug repurposing, disease modeling and gene function prediction. AVAILABILITY AND IMPLEMENTATION: The code and the precomputed data for LINCS L1000 Phase II (GSE70138) are available at https://github.com/njpipeorgan/L1000-bayesian. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
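
The replicate-based evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors use, so Pearson correlation is an assumption here, and the signature values and the names `pearson`, `rep_a`, `rep_b` and `unrelated` are hypothetical. The idea is only that signatures of bio-replicates of the same perturbagen should be more similar to each other than to a signature of an unrelated perturbagen.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length signature vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("signatures must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant signature has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-gene z-score signatures (illustrative values only,
# not taken from the dataset).
rep_a = [2.1, -0.3, 1.5, -1.8, 0.2]       # replicate 1 of a perturbagen
rep_b = [1.9, -0.1, 1.2, -2.0, 0.4]       # replicate 2 of the same perturbagen
unrelated = [-1.0, 1.7, -0.6, 0.9, -0.2]  # a different perturbagen

replicate_similarity = pearson(rep_a, rep_b)
unrelated_similarity = pearson(rep_a, unrelated)

# A reliable signature pipeline should rank replicates above unrelated pairs.
print(replicate_similarity > unrelated_similarity)
```

In practice the same comparison would be run over many replicate pairs and many unrelated pairs, and the two distributions of similarity scores compared.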

SUBMITTER: Qiu Y 

PROVIDER: S-EPMC7203754 | biostudies-literature | 2020 May

REPOSITORIES: biostudies-literature

Publications

A Bayesian approach to accurate and robust signature detection on LINCS L1000 data.

Qiu Yue, Lu Tianhuan, Lim Hansaim, Xie Lei

Bioinformatics (Oxford, England), 2020-05-01, issue 9



Similar Datasets

| S-EPMC4333019 | biostudies-literature
| S-EPMC4965635 | biostudies-literature
| S-EPMC5532784 | biostudies-literature
| S-EPMC6588157 | biostudies-literature
| S-EPMC4086130 | biostudies-literature
| S-EPMC5389891 | biostudies-literature
| S-EPMC7788947 | biostudies-literature
| S-EPMC6175892 | biostudies-literature
| S-EPMC7324992 | biostudies-literature
| PRJEB21102 | ENA