Project description:A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, the coefficient of intrinsic dependence (CID), is model-free and can be applied to both continuous and categorical distributions. Given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs) (X) and their downstream genes (Y) based on clinical data. More specifically, we use estrogen receptor alpha (ERalpha) as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A). RESULTS: The analytical utility of CID was evaluated in comparison with four commonly used statistical methods: Galton-Pearson's correlation coefficient (GPCC), Student's t-test (STT), coefficient of determination (CoD), and mutual information (MI). Compared with GPCC, CoD, and MI, CID is better able to discover regulatory associations where the distribution of the mRNA expression levels of X and Y does not fit linear models. On the other hand, when CID is used to measure the association of a continuous variable (Y) with a discrete variable (X), it performs similarly to STT and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the use of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays. CONCLUSION: CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays.
When coupled with the relationships detected by GPCC, the associations predicted by CID are applicable to the construction of transcriptional regulatory networks. This study shows how information from different data sources and learning algorithms can be integrated to investigate whether relevant regulatory mechanisms identified in cell models can also be partially re-identified in clinical samples of breast cancers. AVAILABILITY: The implementation of CID in R code can be freely downloaded from http://homepage.ntu.edu.tw/~lyliu/BC/. Experiment Overall Design: The 48 clinical arrays (48A) used in this study can be found in GSE9309. We designed the experiments using a breast cancer population with clear estrogen receptor alpha (ER) status, confirmed by immunochemical staining: if ≥10% immunopositive stain is found in the tumor section, the sample is designated ER(+); otherwise, it is ER(-). The 48A consist of 36A with positive ER status and 12A with negative ER status.
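The ER status rule in the experiment design above can be sketched in a few lines of Python. This is only an illustration of the stated 10% cut-off: the function name `er_status`, the `threshold` parameter, and the example stain percentages are hypothetical and are not taken from the study or its R implementation.

```python
def er_status(immunopositive_percent: float, threshold: float = 10.0) -> str:
    """Label a tumor section from its percentage of immunopositive stain.

    Assumption: the cut-off is inclusive (10% or more counts as positive),
    as stated in the experiment design.
    """
    if not 0.0 <= immunopositive_percent <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    return "ER(+)" if immunopositive_percent >= threshold else "ER(-)"


# Hypothetical stain percentages, for illustration only.
example_stains = [0.0, 5.0, 10.0, 45.0, 90.0]
example_labels = [er_status(p) for p in example_stains]
```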
Project description:A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, the coefficient of intrinsic dependence (CID), is model-free and can be applied to both continuous and categorical distributions. Given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs) (X) and their downstream genes (Y) based on clinical data. More specifically, we use estrogen receptor alpha (ERalpha) as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A). RESULTS: The analytical utility of CID was evaluated in comparison with four commonly used statistical methods: Galton-Pearson's correlation coefficient (GPCC), Student's t-test (STT), coefficient of determination (CoD), and mutual information (MI). Compared with GPCC, CoD, and MI, CID is better able to discover regulatory associations where the distribution of the mRNA expression levels of X and Y does not fit linear models. On the other hand, when CID is used to measure the association of a continuous variable (Y) with a discrete variable (X), it performs similarly to STT and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the use of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays. CONCLUSION: CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays.
When coupled with the relationships detected by GPCC, the associations predicted by CID are applicable to the construction of transcriptional regulatory networks. This study shows how information from different data sources and learning algorithms can be integrated to investigate whether relevant regulatory mechanisms identified in cell models can also be partially re-identified in clinical samples of breast cancers. AVAILABILITY: The implementation of CID in R code can be freely downloaded from http://homepage.ntu.edu.tw/~lyliu/BC/.
Project description:We report the ER alpha regulatory network of a tamoxifen-resistant MCF7 cell line using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq). By integrating previously reported gene expression data with the ChIP-seq data, we generated an ER alpha regulatory network and pathways. For the ER alpha regulatory network, hub TFs with enriched motifs were identified from ER alpha peaks together with PolII peaks. We then scanned the ER alpha peak regions of certain genes with position weight matrices (PWMs) to determine the regulatory relationships between hub TFs and normal TFs. For the regulatory pathways, genes were grouped based on their expression values at 4 different time points. Then the hub TF that plays an important role at each time point in each group was identified. This study provides a framework for applying ChIP-seq and gene expression data to the construction of an ER alpha regulatory network. 4 different ChIP-seq datasets in a tamoxifen-resistant MCF7 cell line
Project description:We report the ER alpha regulatory network of a tamoxifen-resistant MCF7 cell line using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq). By integrating previously reported gene expression data with the ChIP-seq data, we generated an ER alpha regulatory network and pathways. For the ER alpha regulatory network, hub TFs with enriched motifs were identified from ER alpha peaks together with PolII peaks. We then scanned the ER alpha peak regions of certain genes with position weight matrices (PWMs) to determine the regulatory relationships between hub TFs and normal TFs. For the regulatory pathways, genes were grouped based on their expression values at 4 different time points. Then the hub TF that plays an important role at each time point in each group was identified. This study provides a framework for applying ChIP-seq and gene expression data to the construction of an ER alpha regulatory network.
Project description:Analyses of QTLs for gene expression levels (eQTLs) reveal the genetic relationship between expression variation and its regulators, thus unlocking information for identifying the regulatory network. In this study, we used the Affymetrix GeneChip Rice Genome Array to analyze eQTLs in rice flag leaf at heading date from 210 recombinant inbred lines (RILs) derived from a cross between Zhenshan 97 and Minghui 63. We attempted to construct the regulatory network by identifying putative regulators and their respective targets using an eQTL-guided co-expression analysis with a recombinant inbred line population of rice. The ability to reveal the regulatory architecture of genes at the whole-genome level by constructing the regulatory network is critical for understanding the biological processes and developmental programs of the organism. Here we conducted an eQTL-guided, function-related co-expression analysis for identifying the putative regulators and constructing a gene regulatory network. The Affymetrix GeneChip Rice Genome Array was used to investigate dynamic transcript levels. One replicate was sampled from each RIL and three replicates from each parent, resulting in a dataset of 216 microarrays.
Project description:To better characterize group IE like human breast cancer based on the gene profiles of estrogen actions through estrogen receptor alpha (ER alpha), we identified an ER alpha transcriptional regulatory network for cell cycle in silico. We used two datasets from cell line (Data 1) and clinical samples (Data 2), respectively. Analyses on Data 1 via trajectory clustering and Pathway-Express confirmed the significant estrogen effect on up-regulating cell cycle activities. The gene expression relationships between ER alpha and cell cycle genes were re-identified in Data 2 by three statistical methods – Galton-Pearson’s correlation coefficient, Student’s t-test and the coefficient of intrinsic dependence. They were mostly (56.09%)(46/82) re-confirmed by literature search. E2F1 was found to be the major ER alpha target in regulating cell cycle gene expressions (83.72%)(36/43) via suppressive mode. However, enhanced cell cycle progression via up-regulating some cell cycle genes was predicted in silico possibly involving E2F2, in part. Both tumorigenic and tumor suppressing activities indicated by this network were predicted. This network clearly provides a robust way for uncovering estrogen actions in an ER(+) subtype specific manner. Experiment Overall Design: Two clinical datasets were used in this study. The first, 37 clinical arrays (abbreviated as 37A), consists of 26A for patients positive for estrogen receptor alpha (ER) and progesterone receptor (PR) by immunohistochemical (IHC) stain and 11A for patients negative for ER by IHC. This dataset was designated as Data 2. The second, 31 clinical arrays (31A), consists of 20A for patients positive in ER status but negative in PR status and the same 11A as in 37A. This dataset was used for data comparison. All signals from the mRNA profile of each sample were normalized against the internal control RNA, Stratagene's human common reference RNA, using the statistical method 'rank consistent lowess'.
Finally, the resulting ratios were log2-transformed.
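The final transformation step described above can be illustrated with a minimal Python sketch. It assumes each ratio is a sample signal divided by the common-reference signal, and it does not reproduce the 'rank consistent lowess' normalization; the function name `log2_ratio` and the example intensities are hypothetical.

```python
import math


def log2_ratio(sample_signal: float, reference_signal: float) -> float:
    """Return log2(sample / reference) for one probe.

    Assumption: both signals are positive, already-normalized intensities.
    """
    if sample_signal <= 0 or reference_signal <= 0:
        raise ValueError("signals must be positive")
    return math.log2(sample_signal / reference_signal)


# Hypothetical intensities: a four-fold increase and a four-fold decrease
# relative to the reference RNA.
up = log2_ratio(800.0, 200.0)
down = log2_ratio(100.0, 400.0)
```

On this scale, equal fold changes up and down are symmetric around zero, which is the usual reason for working with log2 ratios rather than raw ratios.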
Project description:To better characterize group IE like human breast cancer based on the gene profiles of estrogen actions through estrogen receptor alpha (ER alpha), we identified an ER alpha transcriptional regulatory network for cell cycle in silico. We used two datasets from cell line (Data 1) and clinical samples (Data 2), respectively. Analyses on Data 1 via trajectory clustering and Pathway-Express confirmed the significant estrogen effect on up-regulating cell cycle activities. The gene expression relationships between ER alpha and cell cycle genes were re-identified in Data 2 by three statistical methods – Galton-Pearson’s correlation coefficient, Student’s t-test and the coefficient of intrinsic dependence. They were mostly (56.09%)(46/82) re-confirmed by literature search. E2F1 was found to be the major ER alpha target in regulating cell cycle gene expressions (83.72%)(36/43) via suppressive mode. However, enhanced cell cycle progression via up-regulating some cell cycle genes was predicted in silico possibly involving E2F2, in part. Both tumorigenic and tumor suppressing activities indicated by this network were predicted. This network clearly provides a robust way for uncovering estrogen actions in an ER(+) subtype specific manner.
Project description:LIPG has an important role in the maintenance of lipid homeostasis. To reveal the potential molecular mechanisms by which LIPG regulates lipid deposition and proliferation in goat intramuscular preadipocytes, we constructed a transcriptional profile of LIPG-knockdown cells. This work extends the regulatory network of IMF deposition.