
Dataset Information


Towards precision medicine: discovering novel gynecological cancer biomarkers and pathways using linked data.


ABSTRACT: Next Generation Sequencing (NGS) plays a key role in therapeutic decision making for cancer prognosis and treatment. NGS technologies produce massive amounts of sequencing data, and these datasets are often published by separate, isolated sequencing facilities. Consequently, sharing and aggregating multisite sequencing datasets is thwarted by issues such as the need to discover relevant data from different sources, build scalable repositories, automate data linkage, handle the volume of data, provide efficient querying mechanisms, and deliver information-rich, intuitive visualisation.
We present an approach to link and query different sequencing datasets (TCGA, COSMIC, REACTOME, KEGG and GO) to indicate risks for four cancer types - Ovarian Serous Cystadenocarcinoma (OV), Uterine Corpus Endometrial Carcinoma (UCEC), Uterine Carcinosarcoma (UCS), and Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC) - covering the 16 healthy tissue-specific genes from Illumina Human Body Map 2.0. The differentially expressed genes from Illumina Human Body Map 2.0 are analysed together with the gene expressions reported in the COSMIC and TCGA repositories, leading to the discovery of potential biomarkers for a tissue-specific cancer.
We analyse the tissue expression of genes, copy number variation (CNV), somatic mutation, and promoter methylation to identify associated pathways and find novel biomarkers. We discovered twenty (20) mutated genes and three (3) potential pathways causing promoter changes in different gynaecological cancer types. We propose a data-interlinked platform called BIOOPENER that glues together heterogeneous cancer and biomedical repositories. The key approach is to find correspondences (or data links) among genetic, cellular and molecular features across isolated cancer datasets, giving insight into cancer progression from normal to diseased tissues.
The proposed BIOOPENER platform enriches mutations by filling in missing links from the TCGA, COSMIC, REACTOME, KEGG and GO datasets and provides an interlinking mechanism for understanding cancer progression from normal to diseased tissues with pathway components, which in turn helps to map mutations, associated phenotypes, pathways, and mechanisms.

SUBMITTER: Jha A 

PROVIDER: S-EPMC5606033 | biostudies-literature | 2017 Sep

REPOSITORIES: biostudies-literature


Publications

Towards precision medicine: discovering novel gynecological cancer biomarkers and pathways using linked data.

Alokkumar Jha, Yasar Khan, Muntazir Mehdi, Md Rezaul Karim, Qaiser Mehmood, Achille Zappa, Dietrich Rebholz-Schuhmann, Ratnesh Sahay

Journal of Biomedical Semantics, 2017-09-19, issue 1



Similar Datasets

2019-02-27 | GSE127208 | GEO
| S-EPMC5605187 | biostudies-literature
| S-EPMC10472042 | biostudies-literature
| S-EPMC6477790 | biostudies-literature
| S-EPMC7192849 | biostudies-literature
| S-EPMC4456621 | biostudies-other
2019-03-15 | GSE125216 | GEO
| S-EPMC10762256 | biostudies-literature
| S-EPMC7607071 | biostudies-literature
| PRJNA524343 | ENA