Dataset Information

Integrated COVID-19 Predictor: Differential expression analysis to reveal potential biomarkers and prediction of coronavirus using RNA-Seq profile data.


ABSTRACT:

Background

The world has been battling the ongoing COVID-19 pandemic, caused by the SARS-CoV-2 virus, for the last two years. Predicting viral disease has long been a matter of interest in virology and in the study of disease transmission.

Objective

In this study, we aimed to perform genome association studies using COVID-19 RNA-Seq data, to reveal highly expressed gene biomarkers, and to predict COVID-19 with a machine learning model, in order to help combat this pandemic.

Method

We collected RNA-Seq gene count data for both healthy (Control) and non-healthy (Treated) COVID-19 cases. In this experiment, a sequence of bioinformatics strategies and statistical techniques, such as fold-change and adjusted p-value, was applied to identify differentially expressed genes (DEGs). We filtered biomarker sets of high DEGs, moderate DEGs, and low DEGs using the DESeq2, Limma Trend, and Limma Voom methods, combining their results through intersection and union operations. We then applied machine learning techniques to these biomarker sets to predict COVID-19.
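
The filtering and set-theory consensus step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: DESeq2 and limma are R packages, so the sketch assumes their per-gene results (log2 fold change and adjusted p-value) have already been exported. The gene names, the numeric values, and the helper `select_degs` are hypothetical; the default cutoffs are the ones the abstract states for the moderate DEGs set, where |log2FC| ≥ 2 corresponds to at least a four-fold change in expression.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-method results: gene -> (log2 fold change, adjusted p-value).
Results = Dict[str, Tuple[float, float]]


def select_degs(results: Results, lfc_cutoff: float = 2.0,
                padj_cutoff: float = 0.05) -> Set[str]:
    """Return genes with |log2FC| >= lfc_cutoff and adjusted p-value < padj_cutoff."""
    return {
        gene
        for gene, (log2fc, padj) in results.items()
        if abs(log2fc) >= lfc_cutoff and padj < padj_cutoff
    }


# Toy values, invented for illustration only.
deseq2 = {"GENE_A": (3.1, 0.001), "GENE_B": (-2.4, 0.010), "GENE_C": (1.2, 0.020)}
limma_trend = {"GENE_A": (2.8, 0.003), "GENE_B": (-2.1, 0.040), "GENE_C": (2.2, 0.030)}
limma_voom = {"GENE_A": (2.9, 0.002), "GENE_B": (-1.9, 0.030), "GENE_C": (2.5, 0.010)}

per_method = [select_degs(r) for r in (deseq2, limma_trend, limma_voom)]

consensus = set.intersection(*per_method)  # genes selected by every method
candidates = set.union(*per_method)        # genes selected by at least one method
```

With these toy inputs, only `GENE_A` passes the cutoffs under all three methods, so the intersection is the stricter set and the union the more permissive one.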

Result

Through experimental analysis, 67 potential biomarkers were extracted, comprising 49 up-regulated and 18 down-regulated genes, using statistical techniques and a set-theory consensus strategy. We trained the machine learning models on 12 different biomarker sets and found that the SVM model performed better than the other classifiers with 99.07% classification accuracy for moderate DEGs.
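
The split into up-regulated and down-regulated genes can be illustrated with a short sketch. It assumes the usual convention that a positive log2 fold change means higher expression in the Treated group relative to Control; the abstract does not state the direction explicitly. The helper `split_by_direction`, the gene names, and the values are hypothetical.

```python
from typing import Dict, List, Tuple


def split_by_direction(log2fc: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Partition genes by the sign of their log2 fold change.

    Assumed convention: positive = up-regulated in Treated vs. Control.
    """
    up = sorted(gene for gene, value in log2fc.items() if value > 0)
    down = sorted(gene for gene, value in log2fc.items() if value < 0)
    return up, down


# Toy values, invented for illustration only.
toy_log2fc = {"GENE_A": 3.1, "GENE_B": -2.4, "GENE_D": 2.2}
up_genes, down_genes = split_by_direction(toy_log2fc)
```

Applied to the 67 reported biomarkers, a partition of this kind would give the 49 up-regulated and 18 down-regulated genes.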

Conclusion

Our study revealed that the differentially expressed genes in the moderate DEGs biomarker set (|log2FC| ≥ 2 with adjusted p-value < 0.05) serve well as input features for a kernel-based SVM model that predicts COVID-19.

SUBMITTER: Iqbal N 

PROVIDER: S-EPMC9162937 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC9354357 | biostudies-literature
| S-EPMC8496852 | biostudies-literature
| S-EPMC6299935 | biostudies-literature
| S-EPMC10057503 | biostudies-literature
| S-EPMC9322779 | biostudies-literature
| S-EPMC10406383 | biostudies-literature
| S-EPMC4132698 | biostudies-literature
| S-EPMC4393055 | biostudies-literature
| S-EPMC6450261 | biostudies-literature
| S-EPMC4924883 | biostudies-literature