Dataset Information

A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data.

ABSTRACT: Motivation: Gene expression data represents a unique challenge in predictive model building, because of the small number of samples (n) compared with the huge amount of features (p). This 'n≪p' property has hampered application of deep learning techniques for disease outcome classification. Sparse learning by incorporating external gene network information could be a potential solution to this issue. Still, the problem is very challenging because (i) there are tens of thousands of features and only hundreds of training samples, (ii) the scale-free structure of the gene network is unfriendly to the setup of convolutional neural networks. Results: To address these issues and build a robust classification model, we propose the Graph-Embedded Deep Feedforward Networks (GEDFN), to integrate external relational information of features into the deep neural network architecture. The method is able to achieve sparse connection between network layers to prevent overfitting. To validate the method's capability, we conducted both simulation experiments and real data analysis using a breast invasive carcinoma RNA-seq dataset and a kidney renal clear cell carcinoma RNA-seq dataset from The Cancer Genome Atlas. The resulting high classification accuracy and easily interpretable feature selection results suggest the method is a useful addition to the current graph-guided classification models and feature selection procedures. Availability and implementation: The method is available at https://github.com/yunchuankong/GEDFN. Supplementary information: Supplementary data are available at Bioinformatics online.
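
The abstract's idea of a sparse connection between layers guided by a feature graph can be illustrated with a small sketch. This is not code from the GEDFN repository. It assumes one plausible realization, in which the weights of a layer are multiplied elementwise by a 0/1 adjacency matrix so that only connections present in the graph contribute. The function name `graph_embedded_layer`, the toy 4-gene graph, the ReLU activation and the random weights are all hypothetical and chosen only for illustration.

```python
import random


def graph_embedded_layer(x, weights, bias, adjacency):
    """Forward pass of one layer whose weights are masked by a feature graph.

    x:         list of p input feature values
    weights:   p x p list of lists; weights[i][j] links input i to hidden unit j
    bias:      list of p bias terms
    adjacency: p x p 0/1 matrix; adjacency[i][j] == 1 keeps the connection
    (All names are illustrative assumptions, not the authors' API.)
    """
    p = len(x)
    out = []
    for j in range(p):
        z = bias[j]
        for i in range(p):
            # Connections absent from the graph are multiplied by 0.
            z += x[i] * weights[i][j] * adjacency[i][j]
        out.append(max(0.0, z))  # ReLU activation (assumed)
    return out


# Toy feature graph on 4 genes: edges 0-1 and 2-3, plus self-loops.
adjacency = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

rng = random.Random(0)
p = 4
weights = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(p)]
bias = [0.0] * p
x = [0.5, -1.2, 2.0, 0.3]

hidden = graph_embedded_layer(x, weights, bias, adjacency)
active = sum(sum(row) for row in adjacency)  # connections kept by the mask
print(hidden)
print(active, "of", p * p, "connections are active")
```

In this toy graph only 8 of the 16 possible connections survive the mask, and hidden units 0 and 1 do not depend on genes 2 and 3. With tens of thousands of genes and a sparse gene network, the same masking would remove the large majority of first-layer weights, which is the kind of sparsity the abstract credits with reducing overfitting.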

SUBMITTER: Kong Y

PROVIDER: S-EPMC6198851 | biostudies-literature | 2018 Nov

REPOSITORIES: biostudies-literature

Publications

A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data.

Kong Yunchuan, Yu Tianwei

Bioinformatics (Oxford, England), 2018 Nov 1, issue 21

PMID: 29850911

Similar Datasets

Efficient feature selection and classification for microarray data. | S-EPMC6101392 | biostudies-literature

forgeNet: a graph deep neural network model using tree-based ensemble classifiers for feature graph construction. | S-EPMC7267822 | biostudies-literature

Speech emotion classification using attention based network and regularized feature selection. | S-EPMC10368662 | biostudies-literature

A Deep Neural Network Model using Random Forest to Extract Feature Representation for Gene Expression Data Classification. | S-EPMC6220289 | biostudies-literature

A kernel-based multivariate feature selection method for microarray data classification. | S-EPMC4105478 | biostudies-literature

Interaction-based feature selection and classification for high-dimensional biological data. | S-EPMC3577111 | biostudies-literature

DeepFeature: feature selection in nonimage data using convolutional neural network. | S-EPMC8575039 | biostudies-literature

Optimization of mine ventilation network feature graph. | S-EPMC7668600 | biostudies-literature

The classification of brain network for major depressive disorder patients based on deep graph convolutional neural network. | S-EPMC9908753 | biostudies-literature

An Innovative Multi-Model Neural Network Approach for Feature Selection in Emotion Recognition Using Deep Feature Clustering. | S-EPMC7374326 | biostudies-literature