Dataset Information

A Robust Approach for Identification of Cancer Biomarkers and Candidate Drugs.

ABSTRACT: Background and objectives: Identification of cancer biomarkers that are differentially expressed (DE) between two biological conditions is an important task in many microarray studies. Several methods exist in the literature for this purpose, but most are designed for unpaired samples and are not suitable for paired samples. Furthermore, traditional methods use either p-values or fold change (FC) values to detect DE genes. However, p-value based results sometimes disagree with FC based results because of a small pooled variance of gene expressions, which occurs when the variance within each individual condition is small. Some methods combine p-values and FC values to solve this problem, but they also perform poorly for small samples in the presence of outlying expressions. To overcome this problem, this paper proposes a hybrid robust SAM-FC approach, designed for paired samples, that combines the ranks of FC values with the ranks of p-values computed by the SAM statistic using the minimum β-divergence method. Materials and Methods: The proposed method introduces a weight function known as the β-weight function. This weight function produces larger weights for usual expressions and smaller weights for unusual expressions. The β-weight function plays a significant role in the performance of the proposed method. The proposed method uses the β-weight function as a measure for outlier detection, setting β = 0.2. We unify classical and robust estimates using the β-weight function, such that maximum likelihood estimators (MLEs) are used in the absence of outliers and minimum β-divergence estimators are used in the presence of outliers, to obtain reasonable p-values and FC values. 
Results: We compared the performance of the proposed method with that of several popular methods (t-test, SAM, LIMMA, Wilcoxon, WAD, RP, and FCROS) using both simulated and real gene expression profiles for small and large sample cases. From the simulation and real spike-in data analysis results, we observed that the proposed method outperforms the other methods for small sample cases in the presence of outliers and otherwise performs almost as well as the other robust methods (Wilcoxon, RP, and FCROS). From the head and neck cancer (HNC) gene expression dataset, the proposed method identified two additional genes (CYP3A4 and NOVA1) that are significantly enriched in linoleic acid metabolism, drug metabolism, steroid hormone biosynthesis and metabolic pathways. Survival analysis using Kaplan-Meier curves revealed that the combined effect of these two genes has prognostic capability and that they might be promising biomarkers of HNC. Moreover, we retrieved 12 candidate drugs based on gene interactions from GLAD4U and DrugBank literature-based gene associations. Conclusions: Using pathway analysis, a disease association study, protein-protein interactions and survival analysis, we found that the two additional genes we propose might be involved in critical cancer pathways. Furthermore, the identified drugs showed statistical significance, which indicates that the proteins associated with these genes might be therapeutic targets in cancer.
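
The β-weight idea described under Materials and Methods can be sketched as follows. This is a minimal illustration and not the authors' implementation: the Gaussian form of the weight, the iterative update for the mean and variance, the median/MAD starting values, the sample data, and the function names `beta_weights` and `robust_mean_variance` are all assumptions. Only the setting β = 0.2 comes from the abstract.

```python
import math
import statistics


def beta_weights(values, mean, variance, beta=0.2):
    """Assumed Gaussian-type beta-weight: near 1 for usual values, near 0 for outliers."""
    return [math.exp(-0.5 * beta * (v - mean) ** 2 / variance) for v in values]


def robust_mean_variance(values, beta=0.2, iterations=50):
    """Iteratively reweighted mean and variance (assumed form of the robust update)."""
    # Robust starting values: median and squared scaled MAD (an illustrative choice).
    mean = statistics.median(values)
    mad = statistics.median([abs(v - mean) for v in values])
    variance = max((1.4826 * mad) ** 2, 1e-8)
    for _ in range(iterations):
        weights = beta_weights(values, mean, variance, beta)
        total = sum(weights)
        # Weighted mean: outlying values contribute almost nothing.
        mean = sum(w * v for w, v in zip(weights, values)) / total
        spread = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
        # The (1 + beta) factor offsets the shrinkage caused by down-weighting.
        variance = max((1.0 + beta) * spread, 1e-8)
    return mean, variance


# Hypothetical paired differences for one gene; the last value is an outlier.
paired_differences = [5.1, 4.9, 5.0, 5.2, 12.0]
mean, variance = robust_mean_variance(paired_differences)
weights = beta_weights(paired_differences, mean, variance)
```

With β = 0 every weight equals 1 and the update reduces to the ordinary mean and variance, which mirrors the abstract's point that classical estimates are used when no outliers are present.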

SUBMITTER: Shahjaman M

PROVIDER: S-EPMC6631768 | biostudies-literature | 2019 Jun

REPOSITORIES: biostudies-literature

Publications

A Robust Approach for Identification of Cancer Biomarkers and Candidate Drugs.

Shahjaman Md, Rahman Md Rezanur, Islam S M Shahinul, Mollah Md Nurul Haque

Medicina (Kaunas, Lithuania), 2019-06-11, issue 6

PMID: 31212673

Similar Datasets

Project description: Background and aims: Glioblastoma (GBM) is a common and aggressive primary brain tumor, and the prognosis for GBM patients remains poor. This study aimed to identify the key genes associated with the development of GBM and to provide new diagnostic and therapeutic approaches for GBM. Methods: Three microarray datasets (GSE111260, GSE103227, and GSE104267) were selected from the Gene Expression Omnibus (GEO) database for integrated analysis. The differentially expressed genes (DEGs) between GBM and normal tissues were identified. Then, prognosis-related DEGs were screened by survival analysis, followed by functional enrichment analysis. A protein-protein interaction (PPI) network was constructed to explore the hub genes associated with GBM. The mRNA and protein expression levels of the hub genes were validated in silico using The Cancer Genome Atlas (TCGA) and Human Protein Atlas (HPA) databases, respectively. Subsequently, small molecule drugs for GBM were predicted using the Connectivity Map (CMAP) database. Results: A total of 78 prognosis-related DEGs were identified, of which 10 hub genes with higher degree were obtained by PPI analysis. CETN2, MKI67, ARL13B, and SETDB1 were overexpressed at the mRNA and protein levels in GBM tissues, while CALN1, ELAVL3, ADCY3, SYN2, SLC12A5, and SOD1 were down-regulated in GBM tissues. Additionally, these genes were significantly associated with the prognosis of GBM. We eventually predicted the 10 most vital small molecule drugs, which potentially imitate or reverse the GBM carcinogenic status. Cycloserine and 11-deoxy-16,16-dimethylprostaglandin E2 might be considered potential therapeutic drugs for GBM. Conclusions: Our study provided 10 key genes for the diagnosis, prognosis, and therapy of GBM. These findings might contribute to a better comprehension of the molecular mechanisms of GBM development and provide a new perspective for further GBM research. 
However, the specific regulatory mechanisms of these genes need further elaboration.

Project description: Background: Non-small-cell lung cancer (NSCLC) remains the leading cause of cancer morbidity and mortality worldwide. In the present study, we identified novel biomarkers associated with the pathogenesis of NSCLC, aiming to provide new diagnostic and therapeutic approaches for NSCLC. Methods: The microarray datasets GSE18842, GSE30219, GSE31210, GSE32863 and GSE40791 were downloaded from the Gene Expression Omnibus database. The differentially expressed genes (DEGs) between NSCLC and normal samples were identified with the limma package. Construction of the protein-protein interaction (PPI) network, module analysis and enrichment analysis were performed using bioinformatics tools. The expression and prognostic values of hub genes were validated using the GEPIA database and real-time quantitative PCR. Based on these DEGs, candidate small molecules for NSCLC were identified using the CMap database. Results: A total of 408 overlapping DEGs including 109 up-regulated and 296 down-regulated genes were identified; 300 nodes and 1283 interactions were obtained from the PPI network. The most significant biological process and pathway enrichments of DEGs were response to wounding and cell adhesion molecules, respectively. Six DEGs (PTTG1, TYMS, ECT2, COL1A1, SPP1 and CDCA5), which were significantly up-regulated in NSCLC tissues, were selected as hub genes according to the results of module analysis. The GEPIA database further confirmed that patients with higher expression levels of these hub genes experienced shorter overall survival. Additionally, CMap predicted the 20 most significant small molecules as potential therapeutic drugs for NSCLC. DL-thiorphan was the most promising small molecule to reverse the NSCLC gene expression. Conclusions: Based on the gene expression profiles of 696 NSCLC samples and 237 normal samples, we revealed for the first time that PTTG1, TYMS, ECT2, COL1A1, SPP1 and CDCA5 could act as promising novel diagnostic and therapeutic targets for NSCLC. 
Our work will contribute to clarifying the molecular mechanisms of NSCLC initiation and progression.
