
Dataset Information


A robust and accurate method for feature selection and prioritization from multi-class OMICs data.


ABSTRACT: Selecting relevant features is a common task in most OMICs data analyses, where the aim is to identify a small set of key features to be used as biomarkers. To this end, two alternative but equally valid approaches are mainly available: the univariate (filter) and the multivariate (wrapper) approach. The stability of the selected lists of features is an often neglected but very important requirement. If the same features are selected in multiple independent iterations, they are more likely to be reliable biomarkers. In this study, we developed and evaluated the performance of a novel method for feature selection and prioritization, aiming at generating robust and stable sets of features with high predictive power. The proposed method uses fuzzy logic for a first unbiased feature selection and a Random Forest built from conditional inference trees to prioritize the candidate discriminant features. Analyzing several multi-class gene expression microarray data sets, we demonstrate that our technique provides equal or better classification performance and greater stability than other Random Forest-based feature selection methods.
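The stability requirement described in the abstract can be illustrated with a minimal sketch. It assumes that each independent iteration yields a set of selected feature names, and it scores agreement with the mean pairwise Jaccard index plus a per-feature selection frequency. Both measures, the function names, and the gene names are illustrative assumptions; the abstract does not state which stability measure the authors used.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two feature sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(feature_sets):
    """Mean pairwise Jaccard index across independent selection runs."""
    pairs = list(combinations(feature_sets, 2))
    if not pairs:
        raise ValueError("need at least two feature sets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def selection_frequency(feature_sets):
    """Fraction of runs in which each feature was selected."""
    counts = {}
    for selected in feature_sets:
        for feature in set(selected):
            counts[feature] = counts.get(feature, 0) + 1
    n_runs = len(feature_sets)
    return {feature: c / n_runs for feature, c in counts.items()}


# Hypothetical output of three independent selection iterations.
runs = [
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneB", "geneD"},
    {"geneA", "geneB", "geneC"},
]

stability = selection_stability(runs)   # closer to 1.0 means more stable
frequency = selection_frequency(runs)   # features near 1.0 recur in every run
```

Under this sketch, features with a selection frequency close to 1.0 are the ones that recur across iterations and would therefore be the more plausible biomarker candidates.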

SUBMITTER: Fortino V 

PROVIDER: S-EPMC4172658 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature


Publications

A robust and accurate method for feature selection and prioritization from multi-class OMICs data.

Vittorio Fortino, Pia Kinaret, Nanna Fyhrquist, Harri Alenius, Dario Greco

PLoS ONE. 2014-09-23; 9



Similar Datasets

| S-EPMC8876179 | biostudies-literature
| S-EPMC10944569 | biostudies-literature
| S-EPMC5738110 | biostudies-literature
| S-EPMC9533501 | biostudies-literature
| S-EPMC8896606 | biostudies-literature
| S-EPMC6157248 | biostudies-literature
| S-EPMC9146727 | biostudies-literature
| S-EPMC10701104 | biostudies-literature
| S-EPMC8288516 | biostudies-literature
| S-EPMC3737526 | biostudies-literature