Project description:Accurate estimates of genome-wide rates and fitness effects of new mutations are essential for an improved understanding of molecular evolutionary processes. Although eukaryotic genomes generally contain a large noncoding fraction, functional noncoding regions and fitness effects of mutations in such regions are still incompletely characterized. A promising approach to characterize functional noncoding regions relies on identifying accessible chromatin regions (ACRs) tightly associated with regulatory DNA. Here, we applied this approach to identify and estimate selection on ACRs in Capsella grandiflora, a crucifer species ideal for population genomic quantification of selection due to its favorable population demography. We describe a population-wide ACR distribution based on ATAC-seq data for leaf samples of 16 individuals from a natural population. We use population genomic methods to estimate fitness effects and proportions of positively selected fixations (α) in ACRs and find that intergenic ACRs harbor a considerable fraction of weakly deleterious new mutations, as well as a significantly higher proportion of strongly deleterious mutations than comparable inaccessible intergenic regions. ACRs are enriched for expression quantitative trait loci (eQTL) and depleted of transposable element insertions, as expected if intergenic ACRs are under selection because they harbor regulatory regions. By integrating empirical identification of intergenic ACRs with analyses of eQTL and population genomic analyses of selection, we demonstrate that intergenic regulatory regions are an important source of nearly neutral mutations. These results improve our understanding of selection on noncoding regions and the role of nearly neutral mutations for evolutionary processes in outcrossing Brassicaceae species.
Project description:Despite great advances in sequencing capacity, generating functional information for non-model organisms remains a challenge. One solution lies in an improved ability to predict genetic circuits based on primary DNA sequence combined with the characterization of regulatory molecules from model species. Here, we focus on the LEAFY (LFY) transcription factor, a conserved master regulator of floral development. Starting with biochemical and structural information, we built a biophysical model describing LFY DNA binding specificity in vitro that accurately predicts in vivo LFY binding sites in the Arabidopsis thaliana genome. Extending the model to other species, we show that it can correctly identify functional homologs of known LFY targets from Arabidopsis thaliana in other angiosperms, even if a functional shift between orthologs and paralogs has occurred. Moreover, this model demonstrates the evolutionary fluidity of the link between LFY and one of its target genes, underlining how this regulatory interaction can be conserved despite changes in position, sequence and affinity of the LFY binding sites. Our study shows that the cis-element fluidity recently illustrated in animals also exists in plants, and that it can be detected without any experimental work in each individual species, using a biophysical transcription factor model. A. thaliana LEAFY ChIP-seq with control, 2 replicates
Project description:Despite great advances in sequencing capacity, generating functional information for non-model organisms remains a challenge. One solution lies in an improved ability to predict genetic circuits based on primary DNA sequence combined with the characterization of regulatory molecules from model species. Here, we focus on the LEAFY (LFY) transcription factor, a conserved master regulator of floral development. Starting with biochemical and structural information, we built a biophysical model describing LFY DNA binding specificity in vitro that accurately predicts in vivo LFY binding sites in the Arabidopsis thaliana genome. Extending the model to other species, we show that it can correctly identify functional homologs of known LFY targets from Arabidopsis thaliana in other angiosperms, even if a functional shift between orthologs and paralogs has occurred. Moreover, this model demonstrates the evolutionary fluidity of the link between LFY and one of its target genes, underlining how this regulatory interaction can be conserved despite changes in position, sequence and affinity of the LFY binding sites. Our study shows that the cis-element fluidity recently illustrated in animals also exists in plants, and that it can be detected without any experimental work in each individual species, using a biophysical transcription factor model.
Project description:Flowering plants often prevent selfing through mechanisms of self-incompatibility (S.I.). The loss of S.I. has occurred many times independently, because it provides short-term advantages in situations where pollinators or mates are rare. The genus Capsella, which is closely related to Arabidopsis, contains a pair of closely related diploid species, the self-incompatible Capsella grandiflora and the self-compatible Capsella rubella. To elucidate the transition to selfing and its relationship to speciation of C. rubella, we have made use of comparative sequence information. Our analyses indicate that C. rubella separated from C. grandiflora recently (approximately 30,000-50,000 years ago) and that breakdown of S.I. occurred at approximately the same time. Contrasting the nucleotide diversity patterns of the 2 species, we found that C. rubella has only 1 or 2 alleles at most loci, suggesting that it originated through an extreme population bottleneck. Our data are consistent with diploid speciation by a single, selfing individual, most likely living in Greece. The new species subsequently colonized the Mediterranean by Northern and Southern routes, at a time that also saw the spread of agriculture. The presence of phenotypic diversity within modern C. rubella suggests that this species will be an interesting model to understand divergence and adaptation, starting from very limited standing genetic variation.
Project description:Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants, and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation (i) are under weaker purifying selection and (ii) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby transposable elements (TEs) and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage sensitivity of body-methylated genes. Given the extensive conservation of gbM in flowering plants, this suggests that gbM could be an important predictor of cis-regulatory variation in a wide range of plant species.