Project description:Several recent studies have suggested that genes that are longer than 100 kb are more likely to be misregulated in neurological diseases associated with synaptic dysfunction, such as autism and Rett syndrome. These length-dependent transcriptional changes are modest in Mecp2 mutant samples, but, given the low sensitivity of high-throughput transcriptome profiling technology, the statistical significance of these results needs to be re-evaluated. Here, we show that the apparent length-dependent trends previously observed in MeCP2 microarray and RNA-Sequencing datasets, particularly in genes with low fold changes, disappeared when compared to randomized control samples. As we found no similar bias with NanoString technology, this bias seems to be particular to PCR amplification-based platforms. Transcriptional alterations with large fold-change values, however, can reveal an authentic long gene bias. Discriminating authentic from artefactual length-dependent trends requires establishing a baseline from randomized control samples.
Project description:Several recent studies have suggested that genes that are over 100 kb in length are particularly likely to be misregulated in neurological diseases associated with synaptic dysfunction, such as autism, Fragile X syndrome, and Rett syndrome. These length-dependent transcriptional changes seem to be modest, but, given the low sensitivity of high-throughput transcriptome profiling technology, the statistical significance of these results needs to be reevaluated. Here we show that transcriptional changes reflected in microarray and RNA-Sequencing benchmark datasets from the SEQC Consortium show a bias toward genes of greater length, even in the comparison of technical replicates. We hypothesized that PCR amplification, which is used in both microarray and RNA-Seq technologies, could be introducing this bias. We found that, when the fold-change values are small, PCR amplification in microarray and RNA-Seq technologies does produce a bias toward longer genes; we found no similar bias with nCounter technology, which is not based on PCR amplification. We provide an approach to more rigorously assess length-dependent changes that begins with comparing randomized control samples to estimate baseline gene length dependency and evaluate the statistical significance of gene length regulation.
2018-06-05 | GSE94073 | GEO
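The randomized-baseline idea in the description above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a log2 expression matrix (genes by samples), uses the slope of log2 fold change against log10 gene length as the length trend, and builds the baseline by shuffling sample labels, which is a stand-in for comparing randomized control samples. The names `length_trend` and `randomized_baseline`, and all parameters and synthetic data, are hypothetical.

```python
import numpy as np


def length_trend(expr, labels, log_len):
    """Slope of per-gene log2 fold change regressed on log10 gene length.

    expr    : (n_genes, n_samples) array of log2 expression values
    labels  : (n_samples,) array of 0 (control) / 1 (mutant)
    log_len : (n_genes,) array of log10 gene lengths
    """
    lfc = expr[:, labels == 1].mean(axis=1) - expr[:, labels == 0].mean(axis=1)
    # np.polyfit returns coefficients highest degree first: [slope, intercept]
    return np.polyfit(log_len, lfc, 1)[0]


def randomized_baseline(expr, labels, log_len, n_perm=200, seed=0):
    """Compare the observed length trend with trends from shuffled labels.

    Returns the observed slope, the array of baseline slopes, and an
    empirical two-sided p-value.
    """
    rng = np.random.default_rng(seed)
    observed = length_trend(expr, labels, log_len)
    null = np.array([
        length_trend(expr, rng.permutation(labels), log_len)
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the empirical p-value strictly above zero.
    p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (1 + n_perm)
    return observed, null, p_value


if __name__ == "__main__":
    # Synthetic data only: 8 controls and 8 mutants, no real measurements.
    rng = np.random.default_rng(1)
    n_genes = 500
    log_len = rng.uniform(3, 6, n_genes)          # 1 kb to 1 Mb
    labels = np.array([0] * 8 + [1] * 8)
    expr = rng.normal(8.0, 0.3, (n_genes, labels.size))
    # Inject a length-dependent shift into the mutant samples.
    expr[:, labels == 1] += 0.5 * (log_len - log_len.mean())[:, None]

    observed, null, p_value = randomized_baseline(expr, labels, log_len)
    print(f"observed slope: {observed:.3f}")
    print(f"baseline slope range: {null.min():.3f} to {null.max():.3f}")
    print(f"empirical p-value: {p_value:.4f}")
```

A trend counts as length-dependent in this sketch only if the observed slope lies outside the range of slopes produced by the randomized comparisons. With few samples per group the number of distinct label shuffles is small, so the smallest attainable p-value is correspondingly limited.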
Project description:Low bias amplification with confinement effect based on agarose gel
Project description:Considerable variation in gene expression data from different DNA microarray platforms has been demonstrated. However, no characterization of the source of variation arising from labeling protocols has been performed. To analyze the variation associated with T7-based RNA amplification/labeling methods, aliquots of the Stratagene Human Universal Reference RNA were labeled using 3 eukaryotic target preparation methods and hybridized to a single array type (Affymetrix U95Av2). Variability was measured in yield and size distribution of labeled products, as well as in the gene expression results. All methods showed a shift in cRNA size distribution, when compared to un-amplified mRNA, with a significant increase in short transcripts for methods with long IVT reactions. Intra-method reproducibility showed correlation coefficients >0.99, while inter-method comparisons showed coefficients ranging from 0.94 to 0.98 and a nearly two-fold increase in coefficient of variation. Fold amplification for each method was positively correlated with the number of present genes. Two factors that introduced significant bias in gene expression data were observed: a) number of labeled nucleotides that introduces sequence dependent bias, and b) the length of the IVT reaction that introduces a transcript size dependent bias. This study provides evidence of amplification method dependent biases in gene expression data. Keywords: method validation study
Project description:This experiment highlights the extreme sequence bias generated by standard PCR amplification of sequencing libraries and describes an adapted T7 polymerase-based amplification method, which results in unbiased, representative libraries for Illumina sequencing.
Project description:ADNP syndrome, involving the ADNP transcription factor in the SWI/SNF chromatin-remodeling complex, is characterized by developmental delay, intellectual disability, and autism spectrum disorders (ASD). In ASD, ADNP is a highly penetrant risk gene, accounting for ~0.17% of cases. Although Adnp-haploinsufficient mice display various phenotypic deficits, the underlying synaptic mechanisms are poorly understood. Here we report age-differential synaptic plasticity deficits associated with cognitive inflexibility and CaMKIIα hyperactivity in Adnp-mutant mice. As adults, these mice show impaired and inflexible contextual learning and memory, in addition to social and anxiety-related deficits, long after a marked decrease in ADNP protein levels to ~10% of newborn levels in juveniles. In addition, Adnp-mutant mice show abnormally enhanced long-term potentiation in adults but not in juveniles. This is accompanied by CaMKIIα hyperactivity involving increased baseline autophosphorylation, as revealed by unbiased proteomic analyses. Therefore, ADNP haploinsufficiency in mice leads to cognitive inflexibility involving altered synaptic plasticity and signaling in adults long after a marked decrease in Adnp expression in juveniles.
Project description:Genetic variants associated with autism spectrum disorders (ASDs) are enriched for genes encoding synaptic proteins and chromatin regulators. Although the role of synaptic proteins in ASDs is well studied, the mechanism by which disruptions of chromatin regulators promote the development of ASDs remains unclear. Here we identify 184 neuronal long genes containing super-enhancer-like chromatin modifications across their gene bodies, which we term SE long genes. We show that SE long genes exhibit reduced transcriptional RNA Polymerase II pausing, higher transcription initiation frequency, and higher expression levels. The gene body regions of SE long genes display active epigenomic status and are organized into a highly connected chromatin unit. Notably, SE long genes are enriched in known ASD-associated genes related to the synapse and signaling pathways. These characteristics present a molecular link between chromatin regulators and synaptic function, and suggest a mechanism by which disruptions in these chromatin regulators compromise the synapse. Indeed, we show that the expression of SE long genes is more sensitive to the disruptions of autism-risk chromatin regulators, including Kmt2c, Kdm5c, Kdm6b, and Mecp2, than non-SE long genes. We propose that the transcriptional impairment of SE long genes by dysfunctional chromatin regulators is a general molecular mechanism for ASD pathogenesis.