Project description:We present a language model, Affordable Cancer Interception and Diagnostics (ACID), that achieves high classification performance in the diagnosis of cancer using only raw cfDNA sequencing reads. We formulate ACID as an autoregressive language model. ACID is pretrained on sentences obtained by concatenating raw sequencing reads and diagnostic labels. We benchmark ACID against three methods. On a test set subjected to whole-genome sequencing, ACID significantly outperforms the best benchmarked method in the diagnosis of cancer [area under the receiver operating characteristic curve (AUROC), 0.924 versus 0.853; P < 0.001] and the detection of hepatocellular carcinoma (AUROC, 0.981 versus 0.917; P < 0.001). ACID can achieve high accuracy with just 10 000 reads per sample. ACID also achieves the best performance among the benchmarked methods on test sets subjected to bisulfite sequencing. In summary, we present an affordable, simple yet efficient end-to-end paradigm for cancer detection using raw cfDNA sequencing reads.
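As a rough illustration of how a training sentence could be assembled from raw reads and a diagnostic label, the following is a minimal sketch. The label tokens, the space separator, and the function name `build_sentence` are hypothetical; the abstract does not specify ACID's actual sentence format or vocabulary. The default cap of 10 000 reads only echoes the read count mentioned in the abstract.

```python
from typing import List

# Hypothetical label tokens; the real ACID vocabulary is not given in the abstract.
LABEL_TOKENS = {"cancer": "<CANCER>", "control": "<CONTROL>"}


def build_sentence(reads: List[str], label: str, max_reads: int = 10_000) -> str:
    """Concatenate raw reads and append a diagnostic label token.

    The separator (a single space) and the position of the label
    (at the end of the sentence) are assumptions made for this sketch.
    """
    if label not in LABEL_TOKENS:
        raise ValueError(f"unknown label: {label!r}")
    # Keep at most max_reads reads, dropping empty lines and normalising case.
    kept = [r.strip().upper() for r in reads[:max_reads] if r.strip()]
    return " ".join(kept + [LABEL_TOKENS[label]])


if __name__ == "__main__":
    print(build_sentence(["acgt", "TTGA"], "cancer"))
```

With the label placed last, an autoregressive model would be trained to predict it after reading the concatenated reads; whether ACID orders its sentences this way is not stated in the abstract.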
Project description:Pancreatic cancer has the worst prognosis among all cancers. Cancer screening of body fluids may improve the survival time prognosis of patients, who are often diagnosed too late at an incurable stage. Several studies report the dysregulation of lipid metabolism in tumor cells, suggesting that changes in the blood lipidome may accompany tumor growth. Here we show that the comprehensive mass spectrometric determination of a wide range of serum lipids reveals statistically significant differences between pancreatic cancer patients and healthy controls, as visualized by multivariate data analysis. Three phases of biomarker discovery research (discovery, qualification, and verification) are applied for 830 samples in total, which shows the dysregulation of some very long chain sphingomyelins, ceramides, and (lyso)phosphatidylcholines. The sensitivity and specificity to diagnose pancreatic cancer are over 90%, which outperforms CA 19-9, especially at an early stage, and is comparable to established diagnostic imaging methods. Furthermore, selected lipid species indicate a potential as prognostic biomarkers.
Project description:Highlight
• In 5hmC sequencing data, cancer samples contained a lower proportion of ultra-long fragments than controls, and ultra-long fragments showed the largest deviation from controls in coverage profile.
• cfDNA hydroxymethylation and fragmentomic markers for cancer detection can be detected simultaneously in low-pass 5hmC sequencing data.
• An integrated model combining fragmentomic features and hydroxymethylation signatures, built on low-pass 5hmC sequencing data, achieved high sensitivity and specificity for pan-cancer detection.
• Features related to ultra-long fragments dominated the pan-cancer detection model.
Background Using epigenetic markers and fragmentomics of cell-free DNA for cancer detection has been proven applicable.
Methods We further investigated the diagnostic potential of combining two features (epigenetic markers and fragmentomic information) of cell-free DNA for detecting various types of cancers. To do this, we extracted cfDNA fragmentomic features from 191 whole-genome sequencing data and studied them in 396 low-pass 5hmC sequencing data, which included four common cancer types and control samples.
Results In our analysis of 5hmC sequencing data from cancer samples, we observed aberrant ultra-long fragments (220–500 bp) that differed from normal samples in terms of both size and coverage profile. These fragments played a significant role in predicting cancer. Leveraging the ability to detect cfDNA hydroxymethylation and fragmentomic markers simultaneously in low-pass 5hmC sequencing data, we developed an integrated model that incorporated 63 features representing both fragmentomic features and hydroxymethylation signatures. This model achieved high sensitivity and specificity for pan-cancer detection (88.52% and 82.35%, respectively).
Conclusion We showed that fragmentomic information in 5hmC sequencing data is an ideal marker for cancer detection and that it shows high performance in low-pass sequencing data.
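To make the notion of an ultra-long fragment proportion concrete, here is a minimal sketch that computes the fraction of fragments in the 220–500 bp range from a list of fragment lengths. The function name, the inclusive treatment of both bounds, and the assumption that fragment lengths have already been extracted (for example from paired-end alignments) are illustrative choices, not details taken from the study.

```python
from typing import Iterable

# Ultra-long range in bp, as quoted in the abstract; inclusive bounds are an assumption.
ULTRA_LONG_MIN = 220
ULTRA_LONG_MAX = 500


def ultra_long_fraction(fragment_lengths: Iterable[int]) -> float:
    """Return the fraction of fragments whose length lies in the ultra-long range."""
    lengths = list(fragment_lengths)
    if not lengths:
        raise ValueError("no fragment lengths supplied")
    n_ultra = sum(ULTRA_LONG_MIN <= n <= ULTRA_LONG_MAX for n in lengths)
    return n_ultra / len(lengths)


if __name__ == "__main__":
    # Toy lengths: two typical mononucleosomal fragments and two ultra-long ones.
    print(ultra_long_fraction([167, 150, 230, 500]))
```

A per-sample value of this kind is one simple fragmentomic feature; the 63 features of the integrated model are not enumerated in the abstract.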
Project description:Although circulating cell-free DNA (cfDNA) is a promising biomarker for the diagnosis and prognosis of various tumors, the clinical correlation of cfDNA with gastric cancer has not been fully understood. To address this, we developed a highly sensitive cfDNA capture system by integrating polydopamine (PDA) and silica. PDA-silica hybrids incorporated different molecular interactions into a single system, enhancing cfDNA capture by 1.34-fold compared to the conventional silica-based approach (p = 0.001), which was confirmed using cell culture supernatants. A clinical study using human plasma samples revealed the diagnostic accuracy of the new system to be superior to that of the commercially available cfDNA kit, as well as other serum antigen tests. Among the cancer patients, plasma cfDNA levels exhibited a good correlation with the size of a tumor. cfDNA was also predictive of distant metastasis, as the median cfDNA levels of patients with metastatic cancer were ~60-fold higher than those of patients without metastasis (p = 0.008). Furthermore, high concordance between tissue biopsy and cfDNA genomic analysis was found, as HER2 expression in cfDNA demonstrated an area under the ROC curve (AUC) of 0.976 (p < 0.001) for detecting patients with HER2-positive tumors. The new system also revealed high prognostic capability of cfDNA, as the concentration of cfDNA was highly associated with survival outcomes. Our novel technology demonstrates the potential to achieve efficient detection of cfDNA that may serve as a reliable biomarker for gastric tumors.
Project description:In the field of medical imaging, deep learning has made considerable strides, particularly in the diagnosis of brain tumors. The Internet of Medical Things (IoMT) has made it possible to combine these deep learning models into advanced medical devices for more accurate and efficient diagnosis. Convolutional neural networks (CNNs) are a popular deep learning technique for brain tumor detection because they can be trained on vast medical imaging datasets to recognize tumors in new images. Despite its benefits, which include greater accuracy and efficiency, deep learning has disadvantages, such as high computing costs and the possibility of skewed findings due to inadequate training data. Further study is needed to fully understand the potential and limitations of deep learning in brain tumor detection in the IoMT and to overcome the obstacles associated with real-world implementation. In this study, we propose a new CNN-based deep learning model for brain tumor detection. The proposed model is end-to-end, which reduces the system's complexity in comparison to earlier deep learning models. In addition, our model is lightweight, as it is built from a small number of layers compared to previous models, which makes it suitable for real-time applications. The accuracy achieved (99.48% for binary classification and 96.86% for multi-class classification) demonstrates that the proposed model outperforms other CNNs for detecting brain tumors. Additionally, the study provides a framework for secure data transfer of medical lab results, with recommendations to ensure security in the IoMT.
Project description:Comprehensive descriptions of animal behavior require precise three-dimensional (3D) measurements of whole-body movements. Although two-dimensional approaches can track visible landmarks in restrictive environments, performance drops in freely moving animals, due to occlusions and appearance changes. Therefore, we designed DANNCE to robustly track anatomical landmarks in 3D across species and behaviors. DANNCE uses projective geometry to construct inputs to a convolutional neural network that leverages learned 3D geometric reasoning. We trained and benchmarked DANNCE using a dataset of nearly seven million frames that relates color videos and rodent 3D poses. In rats and mice, DANNCE robustly tracked dozens of landmarks on the head, trunk, and limbs of freely moving animals in naturalistic settings. We extended DANNCE to datasets from rat pups, marmosets, and chickadees, and demonstrate quantitative profiling of behavioral lineage during development.
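The projective geometry referred to above relates 3D positions to 2D image coordinates through a calibrated camera. The sketch below shows the standard pinhole projection of a 3D point with a 3x4 projection matrix; it illustrates the general operation only and is not DANNCE's implementation. The function name and the toy camera matrix are hypothetical.

```python
from typing import Sequence, Tuple


def project_point(P: Sequence[Sequence[float]],
                  point3d: Tuple[float, float, float]) -> Tuple[float, float]:
    """Project a 3D point to pixel coordinates with a 3x4 camera matrix P.

    The point is lifted to homogeneous coordinates (x, y, z, 1), multiplied
    by P, and divided by the third component of the result.
    """
    x, y, z = point3d
    hom = (x, y, z, 1.0)
    u, v, w = (sum(row[k] * hom[k] for k in range(4)) for row in P)
    if w == 0:
        raise ValueError("point lies in the camera's principal plane")
    return u / w, v / w


if __name__ == "__main__":
    # Toy camera: focal length 100 px, principal point (50, 50), no rotation or translation.
    P = [[100.0, 0.0, 50.0, 0.0],
         [0.0, 100.0, 50.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    print(project_point(P, (1.0, 2.0, 4.0)))
```

Applying such a projection for each camera view is one way to look up the image features that correspond to a given location in 3D space.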
Project description:Unlike for DNA and RNA, accurate and high-throughput sequencing methods for proteins are lacking, hindering the utility of proteomics in applications where the sequences are unknown including variant calling, neoepitope identification, and metaproteomics. We introduce Spectralis, a de novo peptide sequencing method for tandem mass spectrometry. Spectralis leverages several innovations including a convolutional neural network layer connecting peaks in spectra spaced by amino acid masses, proposing fragment ion series classification as a pivotal task for de novo peptide sequencing, and a peptide-spectrum confidence score. On spectra for which database search provided a ground truth, Spectralis surpassed 40% sensitivity at 90% precision, nearly doubling state-of-the-art sensitivity. Application to unidentified spectra confirmed its superiority and showcased its applicability to variant calling. Altogether, these algorithmic innovations and the substantial sensitivity increase in the high-precision range constitute an important step toward broadly applicable peptide sequencing.
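The idea of connecting peaks spaced by amino acid masses can be illustrated without any neural network: two fragment peaks whose m/z values differ by the mass of one residue are candidates for consecutive ions in a series. The sketch below finds such pairs. It assumes singly charged ions, uses only a partial table of monoisotopic residue masses, and picks an arbitrary tolerance; the function name is hypothetical, and this is not the Spectralis layer itself.

```python
from typing import Dict, List, Tuple

# Monoisotopic residue masses in Da for a few amino acids (partial table for illustration).
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
}


def linked_peaks(mz: List[float], tol: float = 0.02) -> List[Tuple[int, int, str]]:
    """Return (i, j, residue) for peak pairs whose m/z gap matches a residue mass.

    Indices refer to positions in the input list; i is the lighter peak.
    Assumes singly charged fragment ions, so that m/z gaps equal mass gaps.
    """
    order = sorted(range(len(mz)), key=lambda idx: mz[idx])
    links = []
    for pos, i in enumerate(order):
        for j in order[pos + 1:]:
            gap = mz[j] - mz[i]
            for residue, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tol:
                    links.append((i, j, residue))
    return links


if __name__ == "__main__":
    # Toy spectrum: consecutive peaks separated by a glycine and then an alanine.
    print(linked_peaks([100.0, 157.02146, 228.05857]))
```

Chains of such links spell out candidate partial sequences, which is why residue-mass spacing is a natural structure to build into a model of fragment spectra.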
Project description:Cell-free DNA (cfDNA) sequencing has demonstrated great potential for early cancer detection. However, most large-scale studies have focused only on either targeted methylation sites or whole-genome sequencing, limiting comprehensive analysis that integrates both epigenetic and genetic signatures. In this study, we present a platform that enables simultaneous analysis of whole-genome methylation, copy number, and fragmentomic patterns of cfDNA in a single assay. Using a total of 950 plasma (361 healthy and 589 cancer) and 240 tissue samples, we demonstrate that a multifeature cancer signature ensemble (CSE) classifier integrating all features outperforms single-feature classifiers. At 95.2% specificity, the cancer detection sensitivity with methylation, copy number, and fragmentomic models was 77.2%, 61.4%, and 60.5%, respectively, but sensitivity was significantly increased to 88.9% with the CSE classifier (p value < 0.0001). For tissue of origin, the CSE classifier enhanced the accuracy beyond the methylation classifier, from 74.3% to 76.4%. Overall, this work proves the utility of a signature ensemble integrating epigenetic and genetic information for accurate cancer detection.
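Reporting sensitivity at a fixed specificity means choosing the score threshold from the healthy controls and then counting how many cancer samples exceed it. The following is a minimal sketch of that calculation, assuming that higher classifier scores indicate cancer and that a sample is called positive when its score is strictly above the threshold. The function name and the toy scores are hypothetical, and the study's exact thresholding procedure is not described in the abstract.

```python
import math
from typing import Sequence, Tuple


def sensitivity_at_specificity(control_scores: Sequence[float],
                               cancer_scores: Sequence[float],
                               target_specificity: float = 0.952) -> Tuple[float, float]:
    """Return (threshold, sensitivity) at a specificity of at least the target.

    The threshold is the smallest control score such that the required
    fraction of controls score at or below it.
    """
    if not control_scores or not cancer_scores:
        raise ValueError("both score lists must be non-empty")
    if not 0.0 < target_specificity <= 1.0:
        raise ValueError("target_specificity must be in (0, 1]")
    controls = sorted(control_scores)
    # Number of controls that must be called negative to reach the target.
    n_negative = math.ceil(target_specificity * len(controls))
    threshold = controls[n_negative - 1]
    n_detected = sum(score > threshold for score in cancer_scores)
    return threshold, n_detected / len(cancer_scores)


if __name__ == "__main__":
    controls = [0.1, 0.2, 0.3, 0.9]
    cancers = [0.25, 0.5, 0.6, 0.95]
    print(sensitivity_at_specificity(controls, cancers, target_specificity=0.75))
```

Fixing the threshold on controls in this way puts classifiers with different score scales on an equal footing, which is how single-feature and ensemble models can be compared at the same specificity.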
Project description:A cell-free DNA (cfDNA) assay would be a promising approach to early cancer diagnosis, especially for patients with dense tissues. Consistent cfDNA signatures have been observed for many carcinogens. Recently, investigations of cfDNA as a reliable early detection bioassay have presented a powerful opportunity for detecting dense tissue screening complications early. We performed a prospective study to evaluate the potential of characterizing cfDNA as a central element in the early detection of dense tissue breast cancer (BC). Plasma samples were collected from 32 consenting subjects with dense tissue and positive mammograms, 20 with positive biopsies and 12 with negative biopsies. After screening and before biopsy, cfDNA was extracted, and whole-genome next-generation sequencing (NGS) was performed on all samples. Copy number alteration (CNA) and single nucleotide polymorphism (SNP)/insertion/deletion (Indel) analyses were performed to characterize cfDNA. In the positive-positive subjects (cases), a total of 5 CNAs overlapped with 5 previously reported BC-related oncogenes (KSR2, MAP2K4, MSI2, CANT1 and MSI2). In addition, 1 SNP was detected in KMT2C, a BC oncogene, and 9 others were detected in or near 10 genes (SERAC1, DAGLB, MACF1, NVL, FBXW4, FANK1, KCTD4, CAVIN1; ATP6V0A1 and ZBTB20-AS1) previously associated with non-BC cancers. For the positive-negative subjects (screening), 3 CNAs were detected in BC genes (ACVR2A, CUL3 and PIK3R1), and 5 SNPs were identified in 6 non-BC cancer genes (SNIP1, TBC1D10B, PANK1, PRKCA and RUNX2; SUPT3H). This study presents evidence of the potential of using cfDNA somatic variants as dense tissue BC biomarkers from a noninvasive liquid bioassay for early cancer detection.
Project description:We report an approach for cancer phenotyping based on targeted sequencing of cell-free DNA (cfDNA) for small cell lung cancer (SCLC). In SCLC, differential activation of transcription factors (TFs), such as ASCL1, NEUROD1, POU2F3, and REST defines molecular subtypes. We designed a targeted capture panel that identifies chromatin organization signatures at 1535 TF binding sites and 13,240 gene transcription start sites and detects exonic mutations in 842 genes. Sequencing of cfDNA from SCLC patient-derived xenograft models captured TF activity and gene expression and revealed individual highly informative loci. Prediction models of ASCL1 and NEUROD1 activity using informative loci achieved areas under the receiver operating characteristic curve (AUCs) from 0.84 to 0.88 in patients with SCLC. As non-SCLC (NSCLC) often transforms to SCLC following targeted therapy, we applied our framework to distinguish NSCLC from SCLC and achieved an AUC of 0.99. Our approach shows promising utility for SCLC subtyping and transformation monitoring, with potential applicability to diverse tumor types.
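For readers less familiar with the AUC values quoted above, the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition. The function name and the toy scores are hypothetical, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
from typing import Sequence


def auroc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample scores higher; tied pairs contribute one half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Toy scores: five of the six positive-negative pairs are ranked correctly.
    print(auroc([0.9, 0.8, 0.4], [0.5, 0.3]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.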