Dataset Information

A kernel-based multivariate feature selection method for microarray data classification.

ABSTRACT: High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore, a feature selection technique should be applied prior to data classification to enhance prediction performance. In general, filter methods can serve as the principal or an auxiliary selection mechanism because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples shows that filter methods yield less accurate performance because they ignore the dependencies among features. Although a few publications have sought to reveal the relationships among features with multivariate methods, these methods describe those relationships only linearly, and simple linear combinations restrict the improvement in performance. In this paper, we used a kernel method to discover inherent nonlinear correlations among features as well as between features and the target. Moreover, the number of orthogonal components was determined by kernel Fisher's linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. To demonstrate the effectiveness of our method, we performed several experiments and compared its results with those of other competitive multivariate feature selectors. In our comparison, we used two classifiers (support vector machine and k-nearest neighbor) on two groups of datasets, namely two-class and multi-class datasets. Experimental results demonstrate that our method performs better than the others, especially on three hard-to-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma, and Pomeroy's Medulloblastoma.
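
The abstract's point that filter methods ignore feature dependencies can be illustrated with the classic XOR case. The sketch below is not the authors' method: it is a toy example, assuming synthetic binary features and using Pearson correlation as a stand-in univariate filter score. The names `pearson`, `samples`, and `interaction` are hypothetical. The product of the two centered features acts as a simple nonlinear (degree-2) term of the kind a kernel can capture.

```python
from itertools import product
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, used here as a stand-in univariate filter score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Synthetic data (assumption): every combination of two binary features,
# repeated five times, with the class label defined as their XOR.
samples = list(product([0, 1], repeat=2)) * 5
labels = [a ^ b for a, b in samples]

f1 = [a for a, _ in samples]
f2 = [b for _, b in samples]

# A nonlinear interaction term: the product of the centered features.
interaction = [(a - 0.5) * (b - 0.5) for a, b in samples]

score_f1 = pearson(f1, labels)             # each feature alone scores zero
score_f2 = pearson(f2, labels)
score_joint = pearson(interaction, labels)  # the pair together is decisive

print(f"feature 1 alone: {score_f1:+.2f}")
print(f"feature 2 alone: {score_f2:+.2f}")
print(f"interaction:     {score_joint:+.2f}")
```

A univariate filter would discard both features here, although together they determine the label exactly. This is the kind of dependency that multivariate and nonlinear selectors aim to recover.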

SUBMITTER: Sun S

PROVIDER: S-EPMC4105478 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

A kernel-based multivariate feature selection method for microarray data classification.

Shiquan Sun, Qinke Peng, Adnan Shakoor

PLoS ONE, 2014-07-21, issue 7

PMID: 25048512

Similar Datasets

Efficient feature selection and classification for microarray data. | S-EPMC6101392 | biostudies-literature

Genetic algorithm-based feature selection with manifold learning for cancer classification using microarray data. | S-EPMC10082986 | biostudies-literature

An entropy-based gene selection method for cancer classification using microarray data. | S-EPMC1087831 | biostudies-literature

Entropy based sub-dimensional evaluation and selection method for DNA microarray data classification. | S-EPMC2639693 | biostudies-literature

Novel feature selection method via kernel tensor decomposition for improved multi-omics data analysis. | S-EPMC8876179 | biostudies-literature

A robust gene selection method for microarray-based cancer classification. | S-EPMC2834377 | biostudies-literature

Feature selection and classification for microarray data analysis: evolutionary methods for identifying predictive genes. | S-EPMC1181625 | biostudies-literature

Classification of heterogeneous microarray data by maximum entropy kernel. | S-EPMC1994960 | biostudies-literature

Voxel-Wise Feature Selection Method for CNN Binary Classification of Neuroimaging Data. | S-EPMC8093438 | biostudies-literature

Evaluation of data integration strategies based on kernel method of clinical and microarray data. | S-EPMC3283887 | biostudies-literature