Dataset Information

Identification of differentially expressed genes by means of outlier detection.

ABSTRACT: An important issue in microarray data analysis is to select, from thousands of genes, a small number of informative differentially expressed (DE) genes which may be key elements for a disease. If each gene is analyzed individually, there is a large number of hypotheses to test and a multiple comparison correction method must be used. Consequently, the resulting cut-off value may be too small. Another important issue is the replicability of the selection of DE genes. We present a new method, called ORdensity, to obtain a reproducible selection of DE genes. Unlike the techniques usually applied to DE gene selection, it is not a gene-by-gene approach: it takes into account the relation between all genes. The proposed method returns three measures, related to the concepts of outlier and density of false positives in a neighbourhood, which allow us to identify the DE genes with high classification accuracy. To assess the performance of ORdensity, we used simulated microarray data and four real microarray cancer data sets. The results indicated that the method correctly detects the DE genes; it is competitive with other well-accepted methods; the list of DE genes that it obtains is useful for the correct classification or diagnosis of new samples; and, in general, it is more stable than other procedures. ORdensity is a new method for identifying DE genes that avoids some of the shortcomings of individual gene identification, and it is stable when the original sample is replaced by subsamples.
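To illustrate why the cut-off "may be too small" when every gene is tested individually, the following is a minimal sketch of two standard multiple-comparison corrections (Bonferroni and Benjamini-Hochberg). It is not part of ORdensity and is not taken from the paper; the function names and the example p-values are assumptions made for illustration only.

```python
def bonferroni_cutoff(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected by the BH step-up rule."""
    m = len(p_values)
    # Sort hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Largest rank whose p-value lies under the BH line rank * alpha / m.
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    return sorted(order[:k_max])


# With 10,000 genes tested one by one, the Bonferroni per-gene cut-off
# shrinks to alpha / 10,000.
cutoff = bonferroni_cutoff(0.05, 10_000)

# Hypothetical p-values, invented for this example.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(example_p, alpha=0.05)
```

With thousands of genes the per-gene threshold becomes very strict, which is the shortcoming of gene-by-gene testing that the abstract refers to.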

SUBMITTER: Irigoien I

PROVIDER: S-EPMC6131896 | biostudies-other | 2018 Sep

REPOSITORIES: biostudies-other

Publications

Identification of differentially expressed genes by means of outlier detection.

Itziar Irigoien, Concepción Arenas

BMC Bioinformatics, 2018 Sep 10, issue 1

PMID: 30200879

Similar Datasets

Project description: Osteocytes represent the most abundant cellular component of mammalian bones, with important functions in bone mass maintenance and remodeling. To elucidate the differential gene expression between osteoblasts and osteocytes, we completed a comprehensive analysis of their gene profiles. Selective identification of these two mature populations was achieved by using visual markers of bone lineage cells. We used dual GFP reporter mice in which osteocytes express GFP (topaz) directed by the DMP1 promoter, while osteoblasts are identified by expression of GFP (cyan) driven by 2.3 kb of the Col1a1 promoter. Histological analysis of 7-day-old neonatal calvaria confirmed the expression pattern of DMP1GFP in osteocytes and Col2.3 in osteoblasts and osteocytes. To isolate distinct populations of cells we used fluorescence-activated cell sorting (FACS). Cell suspensions were subjected to RNA extraction, in vitro transcription and labeling of cDNA, and gene expression was analyzed using the Illumina WG-6v1 BeadChip. Following normalization of raw data from four biological replicates, 3444 genes were called present in all three sorted cell populations: GFP negative, Col2.3cyan(+) (osteoblasts), and DMP1topaz(+) (preosteocytes and osteocytes). We present the genes that showed more than a 2-fold change in gene expression between DMP1topaz(+) and Col2.3cyan(+) cells. The selected genes were classified and grouped according to their associated gene ontology terms. Genes clustered to osteogenesis and skeletal development, such as Bmp4, Bmp8a, Dmp1, Enpp1, Phex and Ank, were highly expressed in DMP1topaz(+) cells. Most of the genes encoding extracellular matrix components and secreted proteins had lower expression in DMP1topaz(+) cells, while most of the genes encoding plasma membrane proteins were increased. Interestingly, a large number of genes associated with muscle development and function and with neuronal phenotype were increased in DMP1topaz(+) cells, indicating some new aspects of osteocyte biology. Although a large number of the genes differentially expressed in DMP1topaz(+) and Col2.3cyan(+) cells in our study have already been assigned to bone development and physiology, for most of them we still lack any substantial data. Therefore, isolation of osteocyte and osteoblast cell populations and their subsequent microarray analysis allowed us to identify a number of genes and pathways with potential roles in regulation of bone mass.
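The 2-fold change criterion mentioned above can be sketched as follows. This is only an illustration, assuming each gene has one positive mean expression value per cell population; the function name, gene labels and numbers are hypothetical and do not come from the study.

```python
import math


def fold_change_filter(expr_a, expr_b, min_fold=2.0):
    """Keep genes whose expression differs by more than min_fold
    between two populations, in either direction.

    Returns a dict mapping gene -> log2(expr_a / expr_b).
    """
    selected = {}
    # Only genes measured in both populations can be compared.
    for gene in expr_a.keys() & expr_b.keys():
        a, b = expr_a[gene], expr_b[gene]
        if a <= 0 or b <= 0:
            continue  # the ratio is undefined for non-positive values
        log2_ratio = math.log2(a / b)
        # Strict inequality: "in excess of" a min_fold change.
        if abs(log2_ratio) > math.log2(min_fold):
            selected[gene] = log2_ratio
    return selected


# Hypothetical mean expression values, invented for this example.
population_a = {"geneA": 800.0, "geneB": 100.0, "geneC": 300.0}
population_b = {"geneA": 200.0, "geneB": 450.0, "geneC": 280.0}
changed = fold_change_filter(population_a, population_b)
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a gene that is 4 times higher and one that is 4 times lower get log-ratios of equal magnitude and opposite sign.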

Project description: Background: Current diagnosis and treatment of urinary bladder cancer (BC) has shown great progress with the utilization of microarrays. Purpose: Our goal was to identify common differentially expressed (DE) genes among clinically relevant subclasses of BC using microarrays. Methodology/principal findings: BC samples and controls, both experimental and publicly available datasets, were analyzed by whole genome microarrays. We grouped the samples according to their histology and defined the DE genes in each sample individually, as well as in each tumor group. A dual analysis strategy was followed. First, experimental samples were analyzed and conclusions were formulated; second, experimental sets were combined with publicly available microarray datasets and were further analyzed in search of common DE genes. The experimental dataset identified 831 genes that were DE in all tumor samples simultaneously. Moreover, 33 genes were up-regulated and 85 genes were down-regulated in all 10 BC samples compared to the 5 normal tissues simultaneously. Hierarchical clustering partitioned tumor groups in accordance with their histology. K-means clustering of all genes and all samples, as well as clustering of tumor groups, presented 49 clusters. K-means clustering of common DE genes in all samples revealed 24 clusters. Genes manifested various differential patterns of expression, based on PCA. YY1 and NFκB were among the most common transcription factors that regulated the expression of the identified DE genes. Chromosome 1 contained 32 DE genes, followed by chromosomes 2 and 11, which contained 25 and 23 DE genes, respectively. Chromosome 21 had the fewest DE genes. GO analysis revealed the prevalence of transport and binding genes in the common down-regulated DE genes; the prevalence of RNA metabolism and processing genes in the up-regulated DE genes; and the prevalence of genes responsible for cell communication and signal transduction in the DE genes that were down-regulated in T1-Grade III tumors and up-regulated in T2/T3-Grade III tumors. Combination of samples from all microarray platforms revealed 17 common DE genes (BMP4, CRYGD, DBH, GJB1, KRT83, MPZ, NHLH1, TACR3, ACTC1, MFAP4, SPARCL1, TAGLN, TPM2, CDC20, LHCGR, TM9SF1 and HCCS), 4 of which participate in numerous pathways. Conclusions/significance: The identification of the common DE genes among BC samples of different histology can provide further insight into the discovery of new putative markers.
