Dataset Information

MSIFinder: a python package for detecting MSI status using random forest classifier.


ABSTRACT:

Background

Microsatellite instability (MSI) is a common genomic alteration in colorectal cancer, endometrial carcinoma, and other solid tumors. MSI is characterized by a high degree of polymorphism in microsatellite lengths caused by a deficiency in the mismatch repair system. Based on the degree of instability, tumors can be classified as microsatellite instability-high (MSI-H) or microsatellite stable (MSS). MSI is a predictive biomarker for immunotherapy efficacy in advanced/metastatic solid tumors, especially in colorectal cancer patients. Several computational approaches based on targeted panel sequencing data have been used to detect MSI; however, they are considerably affected by the sequencing depth and panel size.

Results

We developed MSIFinder, a Python package for automatic MSI classification from genome sequencing data using a random forest classifier (RFC), a machine learning method. The training set comprised 19 MSI-H and 25 MSS samples. We selected 54 feature markers from the training set, built an RFC model, and validated the classifier on a test set comprising 21 MSI-H and 379 MSS samples. On this test set, MSIFinder achieved a sensitivity (recall) of 1.0, a specificity of 0.997, an accuracy of 0.998, a positive predictive value of 0.954, an F1 score of 0.977, and an area under the curve of 0.999. To further verify the robustness and effectiveness of the model, we used a prospective cohort consisting of 18 MSI-H samples and 122 MSS samples, on which MSIFinder achieved a sensitivity (recall) of 1.0 and a specificity of 1.0. We found that MSIFinder is less affected by low sequencing depth, achieving a concordance of 0.993 at a sequencing depth of 100×. Furthermore, MSIFinder is less affected by panel size, achieving a concordance of 0.99 with a panel size of 0.5 M (million bases).
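
The test-set metrics above can be related to each other through a confusion matrix. The sketch below is illustrative only and is not part of MSIFinder. The abstract gives the class sizes (21 MSI-H, 379 MSS) but not the individual counts, so the values used here (21 true positives, 1 false positive, 378 true negatives, 0 false negatives) are an assumption inferred from the reported sensitivity and specificity. The names `ConfusionCounts` and `binary_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary MSI-H (positive) vs. MSS (negative) call."""
    tp: int  # MSI-H samples called MSI-H
    fp: int  # MSS samples called MSI-H
    tn: int  # MSS samples called MSS
    fn: int  # MSI-H samples called MSS


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute standard classification metrics from confusion counts.

    Assumes every denominator is non-zero, which holds for the counts below.
    """
    sensitivity = c.tp / (c.tp + c.fn)            # recall
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    ppv = c.tp / (c.tp + c.fp)                    # precision
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "ppv": ppv,
        "f1": f1,
    }


# ASSUMPTION: counts inferred from the abstract, not reported in it.
test_counts = ConfusionCounts(tp=21, fp=1, tn=378, fn=0)
test_metrics = binary_metrics(test_counts)

for name, value in test_metrics.items():
    print(f"{name}: {value:.4f}")
```

Under this assumed single false positive, the computed values agree with the reported figures to within rounding at the third decimal place (for example, 21/22 for the positive predictive value and 42/43 for the F1 score). The area under the curve cannot be recovered from these counts, since it depends on the classifier's per-sample scores.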

Conclusion

These results indicate that MSIFinder is a robust and effective MSI classification tool that can provide reliable MSI detection for scientific and clinical purposes.

SUBMITTER: Zhou T 

PROVIDER: S-EPMC8042960 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC5548337 | biostudies-literature
| S-EPMC4822295 | biostudies-literature
| S-EPMC4402517 | biostudies-literature
| S-EPMC4957112 | biostudies-literature
| S-EPMC3724815 | biostudies-literature
| S-EPMC4682995 | biostudies-literature
| S-EPMC6102638 | biostudies-literature
| S-EPMC8017839 | biostudies-literature
| S-EPMC8236179 | biostudies-literature
| S-EPMC7257555 | biostudies-literature