Dataset Information

Comparative analysis of feature selection techniques for COVID-19 dataset.


ABSTRACT: In the context of early disease detection, machine learning (ML) has emerged as a vital tool. Feature selection (FS) algorithms play a crucial role in ensuring the accuracy of predictive models by identifying the most influential variables. This study, focusing on a retrospective cohort of 4778 COVID-19 patients from Iran, explores the performance of various FS methods, including filter, embedded, and hybrid approaches, in predicting mortality outcomes. The researchers leveraged 115 routine clinical, laboratory, and demographic features and employed 13 ML models to assess the effectiveness of these FS methods based on classification accuracy, predictive accuracy, and statistical tests. The results indicate that a Hybrid Boruta-VI model combined with the Random Forest algorithm demonstrated superior performance, achieving an accuracy of 0.89, an F1 score of 0.76, and an AUC value of 0.95 on test data. Key variables identified as important predictors of adverse outcomes include age, oxygen saturation levels, albumin levels, neutrophil counts, platelet levels, and markers of kidney function. These findings highlight the potential of advanced FS techniques and ML models in enhancing early disease detection and informing clinical decision-making.
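The three test-set metrics quoted in the abstract (accuracy, F1 score, AUC) can be illustrated with a minimal pure-Python sketch. This is not the study's code. The labels and scores below are made-up toy values, the 0.5 decision threshold is an assumption, and the function names are hypothetical. AUC is computed here as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only, not from the study): 1 = died, 0 = survived.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.7, 0.3]  # hypothetical model outputs
y_pred = [int(s >= 0.5) for s in scores]           # assumed 0.5 threshold

print(accuracy(y_true, y_pred))   # 0.75
print(f1_score(y_true, y_pred))   # 0.75
print(roc_auc(y_true, scores))    # 0.9375
```

Accuracy and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank the cases. This is one reason a model can report a high AUC alongside a lower F1 score.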

SUBMITTER: Mohtasham F 

PROVIDER: S-EPMC11317481 | biostudies-literature | 2024 Aug

REPOSITORIES: biostudies-literature

Publications

Comparative analysis of feature selection techniques for COVID-19 dataset.

Farideh Mohtasham, MohamadAmin Pourhoseingholi, Seyed Saeed Hashemi Nazari, Kaveh Kavousi, Mohammad Reza Zali

Scientific Reports, 2024 Aug 11 (issue 1)



Similar Datasets

S-EPMC4137534 | biostudies-other
S-EPMC10101453 | biostudies-literature
S-EPMC8485143 | biostudies-literature
S-EPMC11380031 | biostudies-literature
S-EPMC6195279 | biostudies-literature
S-EPMC8282111 | biostudies-literature
S-EPMC8979841 | biostudies-literature
S-EPMC7888282 | biostudies-literature
S-EPMC11914577 | biostudies-literature
S-EPMC10522575 | biostudies-literature