MiRBind: a Deep Learning Method for miRNA Binding Classification
ABSTRACT: Here we present miR-eCLIP analysis of AGO2 in HEK293 cells to address the small RNA repertoire and uncover their physiological targets. We developed an optimized bioinformatics approach of chimeric read identification to detect chimeras of high confidence, which were used as biologically validated input for miRBind, a deep learning method and web server that can be used to accurately predict the potential of miRNA:target site binding.
Project description:Here we present the CLASH analysis of AGO2 in HEK293 cells to address the small RNA repertoire and uncover their physiological targets. We developed an optimized bioinformatics approach of chimeric read identification to detect chimeras of high confidence. We report thousands of Ago2 target sites driven by microRNAs, but also a substantial number of Ago2 ‘drivers’ derived from fragments of other small RNAs such as tRNAs, snoRNAs, rRNAs and others. Targets of several miRNAs were validated by 3’ QuantSeq RNA-Seq.
Project description:The binding of microRNAs (miRNAs) to their target sites is a complex process, mediated by the Argonaute (Ago) family of proteins. The prediction of miRNA:target site binding is an important first step for any miRNA target prediction algorithm. To date, the potential for miRNA:target site binding has been evaluated using either co-folding free energy measures or heuristic approaches based on the identification of binding 'seeds', i.e., continuous stretches of binding corresponding to specific parts of the miRNA. The limitations of both these families of methods have produced generations of miRNA target prediction algorithms that are primarily focused on 'canonical' seed targets, even though unbiased experimental methods have shown that only approximately half of in vivo miRNA targets are 'canonical'. Herein, we present miRBind, a deep learning method and web server that can be used to accurately predict the potential of miRNA:target site binding. We trained our method using seed-agnostic experimental data and show that it outperforms both seed-based approaches and co-fold free energy approaches. The full code for the development of miRBind and a web server are freely available.
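To make the seed-based heuristic mentioned above concrete, the following is a minimal sketch of a canonical seed check. It is not the miRBind method, which is a deep learning model. The sketch assumes a common convention in which the seed is miRNA positions 2-7 (1-based) and a site counts as a match when the target contains the exact reverse complement of that seed; the function names and the default window are illustrative assumptions, and G:U wobble pairs are ignored.

```python
# Minimal sketch of a canonical seed-match heuristic (NOT the miRBind model).
# Assumption: seed = miRNA positions 2-7 (1-based), Watson-Crick pairs only.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def has_seed_match(mirna: str, target: str, start: int = 2, end: int = 7) -> bool:
    """True if the target contains a perfect match to the miRNA seed.

    start/end are 1-based, inclusive miRNA positions (hypothetical defaults).
    """
    seed = mirna.upper()[start - 1:end]
    site = reverse_complement(seed)
    return site in target.upper()


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"  # hsa-let-7a-5p
    print(has_seed_match(let7a, "AAAACUACCUCAGGG"))
    print(has_seed_match(let7a, "AAAAAAAAAAAAAAA"))
```

A rule of this kind only detects sites with an uninterrupted seed, which is why it cannot recover the non-canonical targets the description refers to.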
Project description:Isocitrate dehydrogenase (IDH) mutation status has emerged as an important prognostic marker in gliomas. This study sought to develop deep learning networks for non-invasive IDH classification using T2w MR images while comparing their performance to a multi-contrast network. Methods: Multi-contrast brain tumor MRI and genomic data were obtained from The Cancer Imaging Archive (TCIA) and The Erasmus Glioma Database (EGD). Two separate 2D networks were developed using nnU-Net, a T2w-image-only network (T2-net) and a multi-contrast network (MC-net). Each network was separately trained using TCIA (227 subjects) or TCIA + EGD data (683 subjects combined). The networks were trained to classify IDH mutation status and implement single-label tumor segmentation simultaneously. The trained networks were tested on over 1100 held-out datasets including 360 cases from UT Southwestern Medical Center, 136 cases from New York University, 175 cases from the University of Wisconsin-Madison, 456 cases from EGD (for the TCIA-trained network), and 495 cases from the University of California, San Francisco public database. A receiver operating characteristic curve (ROC) was drawn to calculate the AUC value to determine classifier performance. Results: T2-net trained on TCIA and TCIA + EGD datasets achieved an overall accuracy of 85.4% and 87.6% with AUCs of 0.86 and 0.89, respectively. MC-net trained on TCIA and TCIA + EGD datasets achieved an overall accuracy of 91.0% and 92.8% with AUCs of 0.94 and 0.96, respectively. We developed reliable, high-performing deep learning algorithms for IDH classification using both a T2-image-only and a multi-contrast approach. The networks were tested on more than 1100 subjects from diverse databases, making this the largest study on image-based IDH classification to date.
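The AUC reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition with ties counted as one half. The function name and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
# Minimal sketch: ROC AUC via pairwise comparison of positive and negative scores.
# Labels and scores in the example are made-up toy values.

def roc_auc(labels, scores):
    """AUC = P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; a rank-based formulation gives the same value more efficiently on large test sets.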
Project description:Leukemia is a cancer of blood cells in the bone marrow that affects both children and adolescents. The rapid growth of unusual lymphocyte cells leads to bone marrow failure, which may slow down the production of new blood cells, and hence increases patient morbidity and mortality. Age is a crucial clinical factor in leukemia diagnosis, since if leukemia is diagnosed in the early stages, it is highly curable. Incidence is increasing globally, as around 412,000 people worldwide are likely to be diagnosed with some type of leukemia, of which acute lymphoblastic leukemia accounts for approximately 12% of all leukemia cases worldwide. Thus, the reliable and accurate detection of normal and malignant cells is of major interest. Automatic detection with computer-aided diagnosis (CAD) models can assist medics, and can be beneficial for the early detection of leukemia. In this paper, a single center study, we aimed to build an aggregated deep learning model for Leukemic B-lymphoblast classification. To make a reliable and accurate deep learner, data augmentation techniques were applied to tackle the limited dataset size, and a transfer learning strategy was employed to accelerate the learning process, and further improve the performance of the proposed network. The results show that our proposed approach was able to fuse features extracted from the best deep learning models, and outperformed individual networks with a test accuracy of 96.58% in Leukemic B-lymphoblast diagnosis.
Project description:Background: Flat foot deformity is a prevalent and challenging condition often leading to various clinical complications. Accurate identification of abnormal foot types is essential for appropriate interventions. Method: A dataset consisting of 1573 plantar pressure images from 125 individuals was collected. The performance of the You Only Look Once v5 (YOLO-v5) model, an improved YOLO-v5 model, and a multi-label classification model was evaluated for foot type identification using the collected images. A new dataset was also collected to verify and compare the models. Results: The multi-label classification algorithm based on ResNet-50 outperformed the other algorithms. The improved YOLO-v5 model with Squeeze-and-Excitation (SE), the improved YOLO-v5 model with Convolutional Block Attention Module (CBAM), and the multi-label classification model based on ResNet-50 achieved accuracies of 0.652, 0.717, and 0.826, respectively, which are significantly higher than those obtained using the ordinary plantar-pressure system and the standard YOLO-v5 model. Conclusion: These results indicate that the proposed DL-based multi-label classification model based on ResNet-50 is superior in flat foot type detection and can be used to evaluate the clinical rehabilitation status of patients with abnormal foot types and various foot pathologies when more data on patients with various diseases are available for training.
Project description:Elevated blood sugar can harm individuals and their vital organs, potentially leading to blindness as well as kidney and heart diseases. Globally, diabetic patients face an average annual mortality rate of 38%. This study employs Chi-square, mutual information, and sequential feature selection (SFS) to choose features for training multiple classifiers: an artificial neural network (ANN), a random forest (RF), a gradient boosting (GB) algorithm, TabNet, and a support vector machine (SVM). The goal is to predict the onset of diabetes at an earlier age and thereby enable early diagnosis. The PIMA and early-risk diabetes datasets were used to evaluate the developed system. The feature selection technique is applied to focus on the most important and relevant features for model training. In the first experiment, the ANN achieved an accuracy of 99.35% on the PIMA dataset. The second experiment, conducted on the early-risk diabetes dataset using selected features, showed that RF achieved an accuracy of 99.36%. Based on our experimental results, our suggested method significantly outperformed baseline machine learning algorithms already employed for diabetes prediction on both datasets.
Project description:This study aimed to assess the utility of optic nerve head (ONH) en-face images, captured with scanning laser ophthalmoscopy (SLO) during standard optical coherence tomography (OCT) imaging of the posterior segment, and to demonstrate the potential of a deep learning (DL) ensemble method that operates in a low data regime to differentiate glaucoma patients from healthy controls. The two groups of subjects were initially categorized based on a range of clinical tests including measurements of intraocular pressure, visual fields, OCT-derived retinal nerve fiber layer (RNFL) thickness and dilated stereoscopic examination of the ONH. A total of 227 SLO images of 227 subjects (105 glaucoma patients and 122 controls) were used. A new task-specific convolutional neural network architecture was developed for SLO image-based classification. To benchmark the results of the proposed method, a range of classifiers were tested, including five machine learning methods that classify glaucoma based on RNFL thickness (a well-known biomarker in glaucoma diagnostics), an ensemble classifier based on the Inception v3 architecture, and classifiers based on features extracted from the image. The study shows that the cross-validation DL ensemble based on SLO images achieved good discrimination performance, with a balanced accuracy of up to 0.962, outperforming all of the other tested classifiers.
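For readers unfamiliar with the balanced accuracy metric cited above, it is the mean of sensitivity and specificity for a binary classifier, which keeps the score meaningful when the two groups differ in size. The sketch below shows the calculation; the function name and the toy labels are illustrative assumptions and do not reproduce the study's data.

```python
# Minimal sketch: balanced accuracy for binary labels (1 = positive, 0 = negative).
# The example labels are made-up toy values.

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on 1s) and specificity (recall on 0s)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    print(balanced_accuracy([1, 1, 1, 0], [1, 1, 0, 0]))
```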