Project description:The field of machine learning has allowed researchers to generate and analyse vast amounts of data using a wide variety of methodologies. Artificial Neural Networks (ANN) are some of the most commonly used statistical models and have been successful in biomarker discovery studies in multiple disease types. This review seeks to explore and evaluate an integrated ANN pipeline for biomarker discovery and validation in Alzheimer's disease, the most common form of dementia worldwide with no proven cause and no available cure. The proposed pipeline consists of analysing public data with a categorical and continuous stepwise algorithm and further examination through network inference to predict gene interactions. This methodology can reliably generate novel markers and further examine known ones and can be used to guide future research in Alzheimer's disease.
Project description:Hepatocellular carcinoma (HCC), the most prevalent form of liver cancer, is the third leading cause of mortality globally. Patients with HCC have a poor prognosis due to the fact that the emergence of symptoms typically occurs at a late stage of the disease. In addition, conventional biomarkers perform suboptimally when identifying HCC in its early stages, heightening the need for the identification of new and more effective biomarkers. Using metabolomics and lipidomics approaches, this study aims to identify serum biomarkers for identification of HCC in patients with liver cirrhosis (LC). Serum samples from 20 HCC cases and 20 patients with LC were analyzed using ultra-high-performance liquid chromatography-Q Exactive mass spectrometry (UHPLC-Q-Exactive-MS). Metabolites and lipids that are significantly altered between HCC cases and patients with LC were identified. These include organic acids, amino acids, TCA cycle intermediates, fatty acids, bile acids, glycerophospholipids, sphingolipids, and glycerolipids. The most significant variability was observed in the concentrations of bile acids, fatty acids, and glycerophospholipids. In the context of HCC cases, there was a notable increase in the levels of phosphatidylethanolamine and triglycerides, but the levels of fatty acids and phosphatidylcholine exhibited a substantial decrease. In addition, it was observed that all of the identified metabolites exhibited a superior area under the receiver operating characteristic (ROC) curve in comparison to alpha-fetoprotein (AFP). The pathway analysis of these metabolites revealed fatty acid, lipid, and energy metabolism as the most impacted pathways. Putative biomarkers identified in this study will be validated in future studies via targeted quantification.
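The comparison of each candidate metabolite with AFP by area under the ROC curve can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function name `roc_auc` and all numeric values are hypothetical and made up for demonstration. It computes the AUC as the probability that a randomly chosen HCC case has a higher marker value than a randomly chosen LC patient, with ties counted as one half.

```python
def roc_auc(case_values, control_values):
    """Return the ROC AUC for a single marker.

    Computed as the fraction of (case, control) pairs in which the case
    value exceeds the control value; tied pairs contribute 0.5.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical, made-up values for illustration only (not study data).
hcc_marker = [5.1, 6.3, 4.8, 7.0]   # candidate metabolite, HCC cases
lc_marker = [3.9, 4.2, 5.0, 3.5]    # candidate metabolite, LC patients
hcc_afp = [20, 8, 15, 9]            # AFP, HCC cases
lc_afp = [10, 12, 7, 9]             # AFP, LC patients

candidate_auc = roc_auc(hcc_marker, lc_marker)
afp_auc = roc_auc(hcc_afp, lc_afp)
print(f"candidate AUC = {candidate_auc:.3f}, AFP AUC = {afp_auc:.3f}")
```

For a marker that decreases in HCC (as reported for fatty acids and phosphatidylcholine), this function returns a value below 0.5; swapping the two arguments gives the AUC for the decreasing direction.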
Project description:Nonalcoholic steatohepatitis (NASH) is a major cause of liver fibrosis with increasing prevalence worldwide. Currently there are no approved drugs available. The development of new therapies is difficult as diagnosis and staging requires biopsies. Consequently, predictive plasma biomarkers would be useful for drug development. Here we present a multi-omics approach to characterize the molecular pathophysiology and to identify new plasma biomarkers in a choline-deficient L-amino acid-defined diet rat NASH model. We analyzed liver samples by RNA-Seq and proteomics, revealing disease relevant signatures and a high correlation between mRNA and protein changes. Comparison to human data showed an overlap of inflammatory, metabolic, and developmental pathways. Using proteomics analysis of plasma we identified mainly secreted proteins that correlate with liver RNA and protein levels. We developed a multi-dimensional attribute ranking approach integrating multi-omics data with liver histology and prior knowledge uncovering known human markers, but also novel candidates. Using regression analysis, we show that the top-ranked markers were highly predictive for fibrosis in our model and hence can serve as preclinical plasma biomarkers. Our approach presented here illustrates the power of multi-omics analyses combined with plasma proteomics and is readily applicable to human biomarker discovery.
Project description:Three osteosarcoma (OS) cell lines (MG-63, Saos-2 and U-2 OS) and one osteoblastic cell line (hFOB1.19) were collected for this work. MG-63 was kindly provided by Dr. Agi Grigoriadis from University College London. Saos-2, U-2 OS and hFOB1.19 were purchased from ATCC. All cells were kept in the exponential phase of growth. Total RNA was extracted using the RNeasy Total RNA Isolation kit (QIAGEN). The quality and purity of the products were assessed with the Agilent 2100. The final synthesized biotinylated cDNAs were hybridized to Affymetrix GeneChip® U133A 2.0 arrays in strict accordance with the protocol. Arrays were scanned with the Affymetrix scanner 3000. Data analysis was performed with Microarray Suite 5.0 after pre-standard procedure. Link-test on datasets from both SELDI-TOF-MS and microarray high-throughput analysis platforms can accelerate the identification of tumor biomarkers. The results confirmed that CYC-1, which has an important biomedical function, was an effective candidate biomarker for early diagnosis of osteosarcoma.
Project description:Inflammatory bowel disease (IBD) represents a group of progressive disorders characterized by recurrent chronic inflammation of the gut. Ulcerative colitis and Crohn's disease are the major manifestations of IBD. While our understanding of IBD has progressed in recent years, its etiology is far from being fully understood, resulting in suboptimal treatment options. Complementing other biological endpoints, bioanalytical "omics" methods that quantify many biomolecules simultaneously have great potential in the dissection of the complex pathogenesis of IBD. In this review, we focus on the rapidly evolving proteomics and lipidomics technologies and their broad applicability to IBD studies; these range from investigations of immune-regulatory mechanisms and biomarker discovery to studies dissecting host-microbiome interactions and the role of intestinal epithelial cells. Future studies can leverage recent advances, including improved analytical methodologies, additional relevant sample types, and integrative multi-omics analyses. Proteomics and lipidomics could effectively accelerate the development of novel targeted treatments and the discovery of complementary biomarkers, enabling continuous monitoring of the treatment response of individual patients; this may allow further refinement of treatment and, ultimately, facilitate a personalized medicine approach to IBD.
Project description:The current coronary artery disease (CAD) risk scores for predicting future cardiovascular events rely on well-recognized traditional cardiovascular risk factors derived at the population level but often fail individuals, with up to 25% of first-time heart attack patients having no risk factors. Non-invasive imaging technology can directly measure coronary artery plaque burden. With an advanced lipidomic measurement methodology, for the first time, we aim to identify lipidomic biomarkers to enable intervention before cardiovascular events. With 994 participants from the BioHEART-CT Discovery Cohort, we collected clinical data and performed high-performance liquid chromatography with mass spectrometry to determine concentrations of 683 plasma lipid species. Statin-naive participants were selected based on subclinical CAD (sCAD) categories as the analytical cohort (n = 580), with sCAD+ (n = 243) compared to sCAD- (n = 337). Through a machine learning approach, we built a lipid risk score (LRS) and compared its performance with that of the existing Framingham Risk Score (FRS) in predicting sCAD+. We obtained individual classifiability scores and determined Body Mass Index (BMI) as the modifying variable. FRS and LRS models achieved similar areas under the receiver operating characteristic curve (AUC) in predicting the validation cohort. LRS enhanced the prediction of sCAD+ in the healthy-weight group (BMI < 25 kg/m²), where FRS performed poorly, and identified at-risk individuals that FRS missed. Lipid features have strong potential as biomarkers to predict CAD plaque burden and can identify residual risk not captured by traditional risk factors/scores. LRS complements FRS in prediction and has the most significant benefit in healthy-weight individuals.
Project description:Biomarkers lie at the heart of precision medicine. Surprisingly, while rapid genomic profiling is becoming ubiquitous, the development of biomarkers usually involves the application of bespoke techniques that cannot be directly applied to other datasets. There is an urgent need for a systematic methodology to create biologically-interpretable molecular models that robustly predict key phenotypes. Here we present SIMMS (Subnetwork Integration for Multi-Modal Signatures): an algorithm that fragments pathways into functional modules and uses these to predict phenotypes. We apply SIMMS to multiple data types across five diseases, and in each it reproducibly identifies known and novel subtypes, and makes superior predictions to the best bespoke approaches. To demonstrate its ability on a new dataset, we profile 33 genes/nodes of the PI3K pathway in 1734 FFPE breast tumors and create a four-subnetwork prediction model. This model out-performs a clinically-validated molecular test in an independent cohort of 1742 patients. SIMMS is generic and enables systematic data integration for robust biomarker discovery.
Project description:Breath collection and analysis can be used to discover volatile biomarkers in a number of infectious and non-infectious diseases, such as malaria, tuberculosis, lung cancer, and liver disease. This protocol describes a reproducible method for sampling breath in children and then stabilizing breath samples for further analysis with gas chromatography-mass spectrometry (GC-MS). The goal of this method is to establish a standardized protocol for the acquisition of breath samples for further chemical analysis, from children aged 4-15 years. First, breath is sampled using a cardboard mouthpiece attached to a 2-way valve, which is connected to a 3 L bag. Breath analytes are then transferred to a thermal desorption tube and stored at 4-5 °C until analysis. This technique has been previously used to capture breath of children with malaria for successful breath biomarker identification. Subsequently, we have successfully applied this technique to additional pediatric cohorts. The advantage of this method is that it requires minimal cooperation on the part of the patient (of particular value in pediatric populations), has a short collection period, does not require trained staff, and can be performed with portable equipment in resource-limited field settings.
Project description:Background: Biomarker discovery holds promise for advancing personalized medicine, as biomarkers can help match patients to optimal treatment to improve patient outcomes. However, serious concerns have been raised because very few molecular biomarkers or signatures discovered from high dimensional array data can be successfully validated and applied to clinical use. We propose good practice guidelines as well as a novel tool for biomarker discovery and use breast cancer prognosis as a case study to illustrate the proposed approach. Results: We applied the proposed approach to a publicly available breast cancer prognosis dataset and identified small numbers of predictive markers for patient subpopulations stratified by clinical variables. Results from an independent cross-platform validation set show that our model compares favorably to other gene signature and clinical variable based prognostic tools. About half of the discovered candidate markers can individually achieve very good performance, which further demonstrates the high quality of feature selection. These candidate markers perform extremely well for young patients with estrogen receptor-positive, lymph node-negative early stage breast cancers, suggesting that a distinct subset of these patients identified by these markers is actually at high risk of recurrence and may benefit from more aggressive treatment than current practice. Conclusion: The results show that by following good practice guidelines, we can identify highly predictive genes in high dimensional breast cancer array data. These predictive genes have been successfully validated using an independent cross-platform dataset.