Project description:Using a public reference data set of 82 unique entities, 382 nanopore-sequenced brain tumor samples were classified based on their methylation status through an ad hoc random forest algorithm. As a measure of confidence, score recalibration was performed and platform-specific thresholds were defined.
Project description:A Random Forest model is developed to incorporate tumor mutation data within the context of the biological process known as leukocyte proliferation regulation. This model aims to predict a patient's response to anti-PD1 treatment.
The authors conducted experiments using four different types of classifiers: Random Forest, Gradient Boosting, Feed Forward Neural Network, and Long Short-Term Memory (LSTM) recurrent neural network. Among these classifiers, the Random Forest algorithm yielded the best predictive performance when modeling gene mutation data associated with the 'leukocyte proliferation regulation' biological process. Hence, this curated version of the model focuses on the Random Forest model trained specifically on the 'Leukocyte Proliferation Regulation' process.
In this model, a value of '0' is assigned to NonResponders, while a value of '1' is assigned to Responders. Please note that to obtain predictions, users should provide mutation data containing only the genes corresponding to the 'GO_REGULATION_OF_LEUKOCYTE_PROLIFERATION' process keyword, as specified in the 'GO_test_genes_dict_intersection' dictionary.
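The input preparation described above can be sketched as follows. This is a minimal illustration, assuming the 'GO_test_genes_dict_intersection' dictionary maps a process keyword to a list of gene symbols and that each sample is a mapping from gene symbol to a 0/1 mutation flag. The gene names are placeholders and `select_process_features` is a hypothetical helper, not part of the published code.

```python
# Hypothetical stand-in for the 'GO_test_genes_dict_intersection' dictionary.
# The real dictionary ships with the model; its structure is assumed here,
# and the gene names are placeholders, not real members of the GO term.
GO_test_genes_dict_intersection = {
    "GO_REGULATION_OF_LEUKOCYTE_PROLIFERATION": ["GENE_A", "GENE_B", "GENE_C"],
}

PROCESS_KEYWORD = "GO_REGULATION_OF_LEUKOCYTE_PROLIFERATION"

# Label encoding stated in the model description.
LABELS = {0: "NonResponder", 1: "Responder"}


def select_process_features(sample_mutations, gene_dict, keyword=PROCESS_KEYWORD):
    """Return a feature vector restricted to the genes listed for `keyword`.

    Genes outside the process are dropped; genes of the process that are
    missing from the sample are encoded as 0 (assumed to mean "not mutated").
    """
    genes = gene_dict[keyword]
    return [int(sample_mutations.get(gene, 0)) for gene in genes]


# Example: GENE_X is not part of the process, so it is ignored.
sample = {"GENE_A": 1, "GENE_X": 1, "GENE_C": 1}
features = select_process_features(sample, GO_test_genes_dict_intersection)
print(features)
```

Keeping the gene order fixed by the dictionary, rather than by the sample, means every sample yields a vector whose columns line up with the ones the model was trained on.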
Project description:Immunotherapy has improved the prognosis of patients with advanced non-small cell lung cancer (NSCLC), but only a small subset of patients achieve clinical benefit. The purpose of our study was to integrate multidimensional data using a machine learning method to predict the therapeutic efficacy of immune checkpoint inhibitor (ICI) monotherapy in patients with advanced NSCLC. The authors retrospectively enrolled 112 patients with stage IIIB-IV NSCLC receiving ICI monotherapy. The random forest (RF) algorithm was used to establish efficacy prediction models based on five different input datasets: precontrast computed tomography (CT) radiomic data, postcontrast CT radiomic data, the combination of the two CT radiomic datasets, clinical data, and a combination of radiomic and clinical data. Five-fold cross-validation was used to train and test the random forest classifier. Model performance was assessed by the area under the receiver operating characteristic (ROC) curve (AUC). Among these models (RF, MLP, LR, XGBoost), our reproduced ONNX models show better performance, especially the random forest. The response variable takes the value 1 for efficacy and 0 for inefficacy of PD-1/PD-L1 monotherapy in patients with advanced NSCLC.
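The evaluation scheme described above (5-fold cross-validation scored by AUC) can be illustrated with a small standard-library sketch. It is not the authors' code. The fold splitter uses contiguous, unshuffled folds, which is an assumption, since the description does not say whether the folds were shuffled or stratified. The function names `k_fold_indices` and `roc_auc` are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def roc_auc(labels, scores):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 112 patients split into 5 folds; each patient is tested exactly once.
folds = list(k_fold_indices(112, k=5))
print([len(test) for _, test in folds])

# Toy scores: 1 = efficacy, 0 = inefficacy (encoding from the description).
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3]))
```

In practice a classifier would be fitted on each `train` split and its predicted probabilities on the matching `test` split passed to `roc_auc`. With only 112 patients, stratified folds are a common choice to keep the responder ratio similar across folds.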
Project description:This is a Random Forest algorithm-based machine learning model to distinguish lncRNAs from coding mRNAs in plant transcriptomic data. The model assigns 1 for coding sequences and 2 for long non-coding sequences. The prediction is performed using a combination of Open Reading Frame (ORF)-based, sequence-based, and codon-bias features. To make predictions on Zea mays sequence data, users need to download the curated ONNX model and convert the sequences into a feature matrix as described in the PLIT paper (Deshpande et al. 2019).
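To give a sense of what an ORF-based feature looks like, the sketch below computes the length of the longest open reading frame in the three forward frames of a sequence. This is only an illustration. The exact feature definitions used by the model are those of the PLIT paper and are not reproduced here, and `longest_orf_length` is a hypothetical helper. The label mapping is taken from the description above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Output encoding stated in the model description.
PLIT_LABELS = {1: "coding", 2: "long non-coding"}


def longest_orf_length(sequence):
    """Length in nucleotides of the longest ATG...stop ORF (forward frames only).

    The stop codon is included in the length. ORFs without an in-frame stop
    codon are ignored, which is a simplifying assumption of this sketch.
    """
    seq = sequence.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


# The ORF ATG-AAA-TAG starts at offset 2 and spans 9 nucleotides.
print(longest_orf_length("CCATGAAATAGCC"))
print(PLIT_LABELS[2])
```

Coding transcripts tend to carry longer ORFs than lncRNAs, which is why ORF-derived quantities are commonly used alongside sequence composition and codon-bias features in classifiers of this kind.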
Project description:We examined published microarray data from 104 acute lymphoblastic leukaemia patient specimens that represent six different subgroups defined by cytogenetic features and immunophenotypes. Using the decision-tree-based supervised learning algorithm Random Forest (RF), we determined a small set of genes for optimal subgroup distinction and subsequently validated their predictive power in an independent cohort of 68 specimens that were assessed using Affymetrix HG-U133A arrays.
Project description:This is a Random Forest algorithm-based machine learning model called RF16, which incorporates a total of 16 genomic, molecular, demographic, and clinical features to predict the immunotherapy response for a patient. The model assigns a value of 0 for NonResponder and 1 for Responder. Please be aware that the column names in the GitHub code and the downloaded dataset from the publication may vary. Users are advised to make minor adjustments to either the code or the dataset to ensure compatibility. The curated version of the model has modified the column names in the training code to align with the dataset.
GitHub repository: https://github.com/CCF-ChanLab/MSK-IMPACT-IO
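The column-name adjustment mentioned above can be sketched as a simple renaming step applied to the dataset before it is passed to the training or prediction code. The column names below are placeholders, not the real names from the publication or the repository, and `rename_columns` is a hypothetical helper.

```python
# Hypothetical mapping from dataset column names to the names the code expects.
# Replace both sides with the actual names found in the dataset and the code.
COLUMN_RENAMES = {
    "dataset_feature_1": "code_feature_1",
    "dataset_feature_2": "code_feature_2",
}


def rename_columns(rows, renames):
    """Return copies of `rows` with keys renamed; unmapped columns are kept."""
    return [{renames.get(name, name): value for name, value in row.items()} for row in rows]


def missing_columns(rows, expected):
    """List expected column names that are absent from the first row."""
    present = set(rows[0]) if rows else set()
    return [name for name in expected if name not in present]


rows = [{"dataset_feature_1": 0.3, "dataset_feature_2": 12, "other": "x"}]
aligned = rename_columns(rows, COLUMN_RENAMES)
print(aligned)
print(missing_columns(aligned, ["code_feature_1", "code_feature_2"]))
```

Checking for missing columns after renaming makes a mismatch between the dataset and the code visible before any prediction is attempted.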