Project description: The genus Flaveria has been used extensively as a model for studying the evolution of C4 photosynthesis because it contains both C3 and C4 species, as well as a number of species that exhibit intermediate types of photosynthesis. The current phylogenetic tree of the genus contains 21 of the 23 known Flaveria species and was constructed from a combination of morphological data and three non-coding DNA sequences (the nuclear-encoded ETS and ITS and the chloroplast-encoded trnL-F). However, recent studies have suggested that phylogenetic trees inferred from a small number of molecular sequences may often be incorrect. Moreover, studies in other genera have often shown substantial differences between trees inferred from morphological data and those inferred from molecular sequences. To provide new insight into the phylogeny of Flaveria, we use RNA-Seq data to construct a multi-gene concatenated phylogenetic tree of 17 Flaveria species. Furthermore, we use these new data to identify 14 C4-specific non-synonymous mutation sites, 12 of which (86%) can be independently verified with public sequence data. We propose that the data collection method provided in this study can serve as a generic method for facilitating phylogenetic tree reconstruction in the absence of reference genomes for the target species. Eighteen Flaveria samples covering 11 species were sequenced, and three additional samples were sequenced as an outgroup, for 21 samples in total.
Project description: Technology for crosslinking and immunoprecipitation followed by sequencing (CLIP-seq) has identified the transcriptomic targets of hundreds of RNA-binding proteins in cells. To improve the power of existing and future CLIP-seq datasets, we introduce Skipper, an end-to-end workflow that converts unprocessed reads into annotated binding sites using an improved statistical framework. Compared to existing methods, Skipper on average calls 3.1-4.2 times more transcriptomic binding sites and sometimes >10 times more sites, providing deeper insight into post-transcriptional gene regulation. Skipper also calls binding to annotated repetitive elements and identifies bound elements for 99% of enhanced CLIP experiments. We perform nine translation factor enhanced CLIPs and apply Skipper to learn determinants of translation factor occupancy including transcript region, sequence, and subcellular localization. Furthermore, we observe depletion of genetic variation in occupied sites and nominate transcripts subject to selective constraint because of translation factor occupancy. Skipper offers fast, easy, customizable analysis of CLIP-seq data.