Project description: Most of the human genome is thought to be non-functional, and includes large segments often referred to as “dark matter” DNA. The genome also encodes hundreds of putative and poorly characterized transcription factors (TFs). We determined genomic binding locations of 166 uncharacterized human TFs in living cells. Nearly half of them associated strongly with known regulatory regions such as promoters and enhancers, often at conserved motif matches and co-localizing with each other. Surprisingly, the other half often associated with genomic dark matter, at largely unique sites, via intrinsic sequence recognition. Dozens of these, which we term “Dark TFs”, mainly bind within regions of closed chromatin. Dark TF binding sites are rarely under purifying selection, and are enriched for transposable elements. Many Dark TFs are KZNFs, which contain the repressive KRAB domain, but many are not, and may represent potential pioneer TFs. Based on compiled literature information, the Dark TFs exert diverse functions ranging from early development to tumor suppression. Thus, a large fraction of previously uncharacterized human TFs may have unappreciated activities within the dark matter genome.
Project description: Numerous studies have applied molecular techniques to understand the diversity, evolution, and ecological function of Antarctic bacteria and archaea. One common technique is sequencing of the 16S rRNA gene, which produces a nearly quantitative profile of community membership. However, the utility of this and similar approaches is limited by what is known about the evolution, physiology, and ecology of surveyed taxa. When representative genomes are available in public databases, some of this information can be gleaned from genomic studies, and automated pipelines exist to carry out this task. Here the paprica metabolic inference pipeline was used to assess how well Antarctic microbial communities are represented by the available completed genomes. The NCBI's Sequence Read Archive (SRA) was searched for Antarctic datasets that used one of the Illumina platforms to sequence the 16S rRNA gene. These data were quality controlled and denoised to identify unique reads, then analyzed with paprica to determine the degree of overlap with the closest phylogenetic neighbor with a completely sequenced genome. While some unique reads had perfect mapping to 16S rRNA genes from completed genomes, the mean percent overlap for all mapped reads was 86.6%. When samples were grouped by environment, some environments appeared more or less well represented by the available genomes. For the domain Bacteria, seawater was particularly poorly represented with a mean overlap of 80.2%, while for the domain Archaea, glacial ice was particularly poorly represented with an overlap of only 48.0% for a single sample. These findings suggest that a considerable effort is needed to improve the representation of Antarctic microbes in genome sequence databases.
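The per-environment summary described above (mean percent overlap, grouped by domain and environment) could be computed along the following lines. This is only a minimal sketch: the function name `mean_overlap_by_group`, the record layout, and the numbers in `toy_records` are hypothetical placeholders for illustration, not paprica output and not data from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_overlap_by_group(records):
    """Average percent overlap per (domain, environment) pair.

    `records` is an iterable of (domain, environment, overlap_percent)
    tuples; this layout is an assumption made for the example.
    """
    grouped = defaultdict(list)
    for domain, environment, overlap in records:
        grouped[(domain, environment)].append(overlap)
    return {key: mean(values) for key, values in grouped.items()}


# Placeholder values for illustration only; NOT results from the study.
toy_records = [
    ("Bacteria", "seawater", 80.0),
    ("Bacteria", "seawater", 90.0),
    ("Bacteria", "soil", 95.0),
    ("Archaea", "glacial ice", 50.0),
]

summary = mean_overlap_by_group(toy_records)
for (domain, environment), value in sorted(summary.items()):
    print(f"{domain:8s} {environment:12s} {value:.1f}")
```

Note that a group containing a single sample, as with the one Archaea glacial ice sample mentioned above, yields a mean that is simply that sample's value, so it should be read with corresponding caution.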
Project description: Multiple sclerosis (MS) is a chronic inflammatory demyelinating disease of the brain. Characteristic features of MS pathology include cortical grey matter abnormalities, which have been linked to clinical signs such as cognitive impairment. To understand MS cortical grey matter lesion pathogenesis, we performed differential gene expression analysis of MS cortical normal-appearing grey matter (NAGM) and grey matter lesions. HLA-DRB1 is the transcript with the highest expression in MS NAGM, with a bimodal distribution among the examined cases. Genotyping revealed that every case with the MS-associated HLA-DR15 haplotype also shows high HLA-DRB1 expression. Quantitative immunohistochemical analysis confirmed the higher expression of HLA-DRB1 in HLA-DRB1*15:01 cases at the protein level. Analysis of grey matter lesion size revealed a significant increase in cortical lesion size in cases with high HLA-DRB1 expression. Our data indicate that increased HLA-DRB1 expression in the brain of MS patients may be an important factor in how the HLA-DR15 haplotype contributes to MS risk in the target organ.