Dataset Information

RNA-Seq Count Data Modelling by Grey Relational Analysis and Nonparametric Gaussian Process.

ABSTRACT: This paper introduces an approach to the classification of RNA-seq read counts using grey relational analysis (GRA) and Bayesian Gaussian process (GP) models. Read counts are transformed to microarray-like data so that normal-based statistical methods can be applied. GRA selects differentially expressed genes by integrating the outcomes of five individual feature selection methods: the two-sample t-test, the entropy test, the Bhattacharyya distance, the Wilcoxon test and the receiver operating characteristic curve. GRA acts as an aggregate filter method, combining the advantages of the individual methods to produce significant feature subsets that are then fed into a nonparametric GP model for classification. The proposed approach is verified on two real benchmark datasets using five-fold cross-validation. Experimental results show that the GRA-based feature selection method and the GP classifier outperform their competing methods. Moreover, the results demonstrate that GRA-GP considerably outperforms the sparse Poisson linear discriminant analysis classifiers, which were introduced specifically for read counts, across different numbers of features. The proposed approach can therefore be applied effectively in practice to read count data analysis, which is useful in many applications including understanding disease pathogenesis, diagnosis and treatment monitoring at the molecular level.
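The aggregation step described in the abstract can be illustrated with a minimal sketch. This listing does not give the paper's exact GRA formulation, so the code below assumes the textbook form of grey relational analysis: min-max normalisation of each criterion, an ideal reference sequence of ones, and a distinguishing coefficient of 0.5. The function name `grey_relational_grade`, the `zeta` parameter and the toy score matrix are hypothetical and are not taken from the paper.

```python
import numpy as np


def grey_relational_grade(scores, zeta=0.5):
    """Combine per-gene scores from several filter criteria into one grade.

    scores: array of shape (n_genes, n_criteria); larger means more
            discriminative under that criterion (an assumption of this sketch).
    zeta:   distinguishing coefficient; 0.5 is a common default, assumed here.
    """
    scores = np.asarray(scores, dtype=float)

    # Min-max normalise each criterion to [0, 1] so the criteria are comparable.
    lo = scores.min(axis=0)
    span = scores.max(axis=0) - lo
    span[span == 0] = 1.0  # constant column: avoid division by zero
    norm = (scores - lo) / span

    # Deviation of every gene from the ideal reference sequence (all ones).
    delta = np.abs(1.0 - norm)
    d_min, d_max = delta.min(), delta.max()

    # Grey relational coefficient per gene and criterion, then the mean
    # across criteria gives the grey relational grade of each gene.
    coeff = (d_min + zeta * d_max) / (delta + zeta * d_max)
    return coeff.mean(axis=1)


# Toy example: 4 genes scored by 3 hypothetical criteria.
toy_scores = np.array([
    [5.0, 0.9, 2.0],
    [1.0, 0.6, 0.5],
    [4.0, 0.8, 1.8],
    [0.5, 0.5, 0.1],
])
grades = grey_relational_grade(toy_scores)
ranking = np.argsort(-grades)  # gene indices, best grade first
top_two = ranking[:2]          # a selected feature subset of size 2
```

In a pipeline like the one the abstract outlines, the columns of the score matrix would hold the outputs of the five individual filter methods, and the top-ranked genes would be passed on to the classifier. Criteria where smaller values are better (for example, raw p-values) would need to be inverted before aggregation.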

SUBMITTER: Nguyen T

PROVIDER: S-EPMC5082617 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature


Publications

RNA-Seq Count Data Modelling by Grey Relational Analysis and Nonparametric Gaussian Process.

Nguyen Thanh T, Bhatti Asim A, Yang Samuel S, Nahavandi Saeid S

PLoS One, 2016-10-26, issue 10

PMID: 27783633

Similar Datasets

Flexible link functions in nonparametric binary regression with Gaussian process priors.
| S-EPMC4914475 | biostudies-literature

LFCseq: a nonparametric approach for differential expression analysis of RNA-seq data.
| S-EPMC4304217 | biostudies-other

Morphometric Gaussian Process for Landmarking on Grey Matter Tetrahedral Models.
| S-EPMC8112202 | biostudies-literature

ComBat-seq: batch effect adjustment for RNA-seq count data.
| S-EPMC7518324 | biostudies-literature

Data from fitting Gaussian process models to various data sets using eight Gaussian process software packages.
| S-EPMC5995745 | biostudies-literature

NPEBseq: nonparametric empirical bayesian-based procedure for differential expression analysis of RNA-seq data.
| S-EPMC3765716 | biostudies-literature

A sparse negative binomial mixture model for clustering RNA-seq count data.
| S-EPMC9766880 | biostudies-literature

Latent periodic process inference from single-cell RNA-seq data.
| S-EPMC7080821 | biostudies-literature

Error estimates for the analysis of differential expression from RNA-seq count data.
| S-EPMC4179614 | biostudies-literature

A Gaussian Process Model of Human Electrocorticographic Data.
| S-EPMC7472198 | biostudies-literature