
Dataset Information


Artificial Intelligence based wrapper for high dimensional feature selection.


ABSTRACT:

Background

Feature selection is important in high-dimensional data analysis. The wrapper approach is one way to perform feature selection, but it is computationally intensive because it builds and evaluates a model for each of many feature subsets. Existing wrapper algorithms focus primarily on shortening the search path to an optimal feature set. However, they underutilize the information in the feature subset models, which limits both feature selection and predictive performance.
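To make the cost of the wrapper approach concrete, the following is a minimal sketch of a greedy forward-selection wrapper. It is an illustration only, not the algorithm from the study: the function names (`holdout_mse`, `forward_wrapper`), the ordinary least squares model, the hold-out split and the synthetic data are all assumptions chosen for brevity. The point it shows is that every candidate subset triggers a full model fit.

```python
import numpy as np

def holdout_mse(X_train, y_train, X_test, y_test, features):
    """Fit ordinary least squares on the chosen columns and return hold-out MSE."""
    cols = list(features)
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, cols]])
    A_test = np.column_stack([np.ones(len(X_test)), X_test[:, cols]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((y_test - A_test @ coef) ** 2))

def forward_wrapper(X_train, y_train, X_test, y_test, max_features):
    """Greedy forward selection: one model fit per candidate feature per step."""
    selected, best_score, n_fits = [], float("inf"), 0
    while len(selected) < max_features:
        scores = {}
        for j in range(X_train.shape[1]):
            if j in selected:
                continue
            # Every candidate subset requires building and evaluating a model.
            scores[j] = holdout_mse(X_train, y_train, X_test, y_test, selected + [j])
            n_fits += 1
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best_score:
            break  # no candidate improves the hold-out error
        selected.append(j_best)
        best_score = scores[j_best]
    return selected, best_score, n_fits

# Synthetic data (assumption): only features 0 and 5 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = 3.0 * X[:, 0] - 2.0 * X[:, 5] + rng.normal(scale=0.1, size=200)

selected, score, n_fits = forward_wrapper(X[:150], y[:150], X[150:], y[150:], max_features=5)
print("selected:", selected, "hold-out MSE:", round(score, 4), "model fits:", n_fits)
```

Even in this small example with 20 features, the first two selection steps alone need 39 model fits; the count grows quickly as the number of features increases.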

Method and results

This study proposes a novel Artificial Intelligence based Wrapper (AIWrap) algorithm that integrates Artificial Intelligence (AI) with the existing wrapper algorithm. AIWrap uses AI to develop a Performance Prediction Model that predicts the model performance of any feature set, which lets the wrapper evaluate a feature subset without building the corresponding model. This can make the wrapper approach more practical for high-dimensional data. We evaluate the algorithm in simulation studies and on real research data. AIWrap shows feature selection and model prediction performance comparable to or better than standard penalized feature selection algorithms and wrapper algorithms.
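The idea of a Performance Prediction Model can be sketched as a surrogate that maps a feature-inclusion mask to an expected model score. The abstract does not say which AI model AIWrap uses or how it is trained, so everything below is an assumption for illustration: a ridge-regularized linear surrogate, hypothetical helper names (`random_masks`, `fit_surrogate`, `predict_surrogate`), fixed-size subsets and synthetic data. It is not the authors' implementation.

```python
import numpy as np

def holdout_mse(X_train, y_train, X_test, y_test, features):
    """Fit ordinary least squares on the chosen columns and return hold-out MSE."""
    cols = list(features)
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, cols]])
    A_test = np.column_stack([np.ones(len(X_test)), X_test[:, cols]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((y_test - A_test @ coef) ** 2))

def random_masks(rng, n_masks, n_features, subset_size):
    """Draw random 0/1 inclusion masks, each with exactly subset_size features."""
    masks = np.zeros((n_masks, n_features))
    for row in masks:
        row[rng.choice(n_features, size=subset_size, replace=False)] = 1.0
    return masks

def fit_surrogate(masks, scores, ridge=1.0):
    """Ridge regression from inclusion mask to score (intercept not penalized)."""
    A = np.column_stack([np.ones(len(masks)), masks])
    penalty = ridge * np.eye(A.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(A.T @ A + penalty, A.T @ scores)

def predict_surrogate(weights, masks):
    """Predict the score of each mask without fitting any feature subset model."""
    return weights[0] + masks @ weights[1:]

# Synthetic data (assumption): only features 0 and 5 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = 3.0 * X[:, 0] - 2.0 * X[:, 5] + rng.normal(scale=0.1, size=200)
X_tr, y_tr, X_te, y_te = X[:150], y[:150], X[150:], y[150:]

# Step 1: build real models for a limited sample of subsets to get training data.
train_masks = random_masks(rng, n_masks=80, n_features=20, subset_size=4)
train_scores = np.array(
    [holdout_mse(X_tr, y_tr, X_te, y_te, np.flatnonzero(m)) for m in train_masks]
)

# Step 2: fit the surrogate on (mask, score) pairs.
weights = fit_surrogate(train_masks, train_scores)

# Step 3: score many new subsets of the same size using the surrogate only.
candidates = random_masks(rng, n_masks=500, n_features=20, subset_size=4)
predicted = predict_surrogate(weights, candidates)
best_features = np.flatnonzero(candidates[np.argmin(predicted)])

# Step 4: confirm the top candidate with a single real model fit.
actual = holdout_mse(X_tr, y_tr, X_te, y_te, best_features)
print("best predicted subset:", best_features.tolist(), "actual MSE:", round(actual, 4))
```

In this sketch, 80 real model fits are used to screen 500 further subsets at the cost of one matrix product. A linear surrogate is the simplest possible choice; a more flexible model would be needed wherever features interact.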

Conclusion

The AIWrap approach provides an alternative to existing feature selection algorithms. The current study focuses on applying AIWrap to continuous cross-sectional data. However, it could be applied to other data types, such as longitudinal, categorical and time-to-event biological data.

SUBMITTER: Jain R 

PROVIDER: S-EPMC10585895 | biostudies-literature | 2023 Oct

REPOSITORIES: biostudies-literature


Publications

Artificial Intelligence based wrapper for high dimensional feature selection.

Rahi Jain, Wei Xu

BMC Bioinformatics, 2023 Oct 18, issue 1



Similar Datasets

| S-EPMC5738058 | biostudies-literature
| S-EPMC4342225 | biostudies-literature
| S-EPMC6885241 | biostudies-literature
| S-EPMC3445441 | biostudies-literature
| S-EPMC3577111 | biostudies-literature
| S-EPMC10119907 | biostudies-literature
| S-EPMC10193239 | biostudies-literature
| S-EPMC7092448 | biostudies-literature
| S-EPMC4349086 | biostudies-literature
| S-EPMC7886179 | biostudies-literature