Project description:Background: The sumac genus Rhus (sensu stricto) includes two subgenera, Lobadium (ca. 25 spp.) and Rhus (ca. 10 spp.). Two of its members, R. glabra and R. typhina (Rosanae: Sapindales: Anacardiaceae), are economically important species. Chloroplast genome information is of great significance for the study of plant phylogeny and taxonomy. Results: Three complete chloroplast genomes, from two R. glabra accessions and one R. typhina accession, were obtained. Each was about 159 kb in length and comprised a large single-copy region (LSC, about 88 kb), a small single-copy region (SSC, about 19 kb) and a pair of inverted repeat regions (IRa/IRb, about 26 kb each), forming a canonical quadripartite structure. Each genome contained 88 protein-coding genes, 37 transfer RNA genes, eight ribosomal RNA genes and two pseudogenes. The overall GC content was the same in all three genomes (37.8%), and relative synonymous codon usage (RSCU) values showed that they shared the same codon preference, i.e., a preference for codons ending in A/U (93%), excluding termination codons. Three variable hotspots, i.e., ycf4-cemA, ndhF-rpl32-trnL and ccsA-ndhD, and a total of 152-156 simple sequence repeats (SSRs) were identified. The nonsynonymous (Ka)/synonymous (Ks) substitution ratio was calculated, and the cemA and ycf2 genes are important indicators of gene evolution. The phylogenetic analyses of the family Anacardiaceae showed that the eight genera were grouped into three clusters, and supported the monophyly of the subfamilies and all the genera. The accessions of five Rhus species formed four clusters; however, one individual of R. typhina grouped with the R. glabra accessions instead of clustering with the two other individuals of R. typhina in the subgenus Rhus, which indicated a paraphyletic relationship. Conclusions: Comparison of the complete chloroplast genomes of the Rhus species showed that most SSRs were A/T rich and located in intergenic spacers, and that nucleotide divergence was higher in non-coding regions than in coding regions. The Ka/Ks ratio of the cemA gene was > 1 for species collected in America, while it was < 1 for the other species, from China, which indicated that the Rhus species from North America and East Asia are under different evolutionary pressures. The phylogenetic analysis of the complete chloroplast genomes clarified the placement and relationships of Rhus. The results obtained in this study are expected to provide valuable genetic resources for species identification, molecular breeding, and studies of intraspecific diversity in Rhus species.
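As an illustration of two of the summary statistics reported in this description, the following minimal Python sketch computes overall GC content and scans a sequence for perfect simple sequence repeats. The function names and the minimum-repeat thresholds (10, 5 and 4 repeats for mono-, di- and trinucleotide motifs) are assumptions chosen for illustration; the description does not state which software or thresholds were used.

```python
import re


def gc_content(seq):
    """Fraction of G and C among the unambiguous A/C/G/T bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (seq.count("G") + seq.count("C")) / acgt


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    min_repeats maps motif length to the minimum number of repeats.
    The defaults are illustrative assumptions, not values from the study.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 5, 3: 4}
    seq = seq.upper()
    hits = []
    for unit, n in min_repeats.items():
        # A motif of `unit` bases followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA"),
            # since those runs are already reported under the shorter motif.
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)
```

For a real chloroplast genome the sequence would first be read from a FASTA file; that step is omitted here to keep the sketch self-contained.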
Project description:A fundamental challenge in the post-genome era is to understand and annotate the consequences of genetic variation, particularly within the context of human tissues. We describe a set of integrated experiments designed to investigate the effects of common genetic variability on DNA methylation and mRNA expression in distinct human brain regions. We show that brain tissues may be readily distinguished based on methylation status or expression profile. We find an abundance of genetic cis regulation of mRNA expression and show for the first time abundant quantitative trait loci for DNA CpG methylation. We observe that the largest magnitude effects occur across distinct brain regions. We believe these data, which we have made publicly available, will be useful in understanding the biological effects of genetic variation. Authorized Access data: Mapping of GEO sample accessions to dbGaP subject/sample IDs is available through dbGaP Authorized Access, see http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000249
Project description:Background: As it becomes increasingly possible to obtain DNA sequences of orthologous genes from diverse sets of taxa, species trees are frequently being inferred from multilocus data. However, the behavior of many methods for performing this inference has remained largely unexplored. Some methods have been proven to be consistent given certain evolutionary models, whereas others rely on criteria that, although appropriate for many parameter values, have peculiar zones of the parameter space in which they fail to converge on the correct estimate as data sets increase in size. Results: Here, using North American pines, we empirically evaluate the behavior of 24 strategies for species tree inference using three alternative outgroups (72 strategies total). The data consist of 120 individuals sampled in eight ingroup species from subsection Strobus and three outgroup species from subsection Gerardianae, spanning ~47 kilobases of sequence at 121 loci. Each "strategy" for inferring species trees consists of three features: a species tree construction method, a gene tree inference method, and a choice of outgroup. We use multivariate analysis techniques such as principal components analysis and hierarchical clustering to identify tree characteristics that are robustly observed across strategies, as well as to identify groups of strategies that produce trees with similar features. We find that strategies that construct species trees using only topological information cluster together and that strategies that use additional non-topological information (e.g., branch lengths) also cluster together. Strategies that utilize more than one individual within a species to infer gene trees tend to produce estimates of species trees that contain clades present in trees estimated by other strategies. Strategies that use the minimize-deep-coalescences criterion to construct species trees tend to produce species tree estimates that contain clades that are not present in trees estimated by the Concatenation, RTC, SMRT, STAR, and STEAC methods, and that in general are more balanced than those inferred by these other strategies. Conclusions: When constructing a species tree from a multilocus set of sequences, our observations provide a basis for interpreting differences in species tree estimates obtained via different approaches that have a two-stage structure in common, one step for gene tree estimation and a second step for species tree estimation. The methods explored here employ a number of distinct features of the data, and our analysis suggests that recovery of the same results from multiple methods that tend to differ in their patterns of inference can be a valuable tool for obtaining reliable estimates.
Project description:A fundamental challenge in the post-genome era is to understand and annotate the consequences of genetic variation, particularly within the context of human tissues. We describe a set of integrated experiments designed to investigate the effects of common genetic variability on DNA methylation and mRNA expression in distinct human brain regions. We show that brain tissues may be readily distinguished based on methylation status or expression profile. We find an abundance of genetic cis regulation of mRNA expression and show for the first time abundant quantitative trait loci for DNA CpG methylation. We observe that the largest magnitude effects occur across distinct brain regions. We believe these data, which we have made publicly available, will be useful in understanding the biological effects of genetic variation. Authorized Access data: Mapping of GEO sample accessions to dbGaP subject/sample IDs is available through dbGaP Authorized Access, see http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000249 Because of our interest in genomic regulation of expression and neurological disorders, we embarked upon a series of experiments to provide a brain region-specific contextual framework for genetic and epigenetic regulation of gene expression. We obtained frozen brain tissue from the cerebellum and frontal cortex from 318 subjects (total 724 tissue samples).
Project description:Molt is critical for birds as it replaces damaged feathers and worn plumage, enhancing flight performance, thermoregulation, and communication. In passerines, molt generally occurs on the breeding grounds during the postbreeding period once a year. However, some species of migrant passerines that breed in the Nearctic and Western Palearctic regions have evolved different molting strategies that involve molting on the overwintering grounds. Some species forego molt on the breeding grounds and instead complete their prebasic molt on the overwintering grounds. Other species molt some or all feathers a second time (prealternate molt) during the overwintering period. Using phylogenetic analyses, we explored the potential drivers of the evolution of winter molts in Nearctic and Western Palearctic breeding passerines. Our results indicate an association between longer photoperiods and the presence of prebasic and prealternate molts on the overwintering grounds for both Nearctic and Western Palearctic species. We also found a relationship between prealternate molt and generalist and water habitats for Western Palearctic species. Finally, the complete prealternate molt in Western Palearctic passerines was linked to longer days on the overwintering grounds and longer migration distance. Longer days may favor the evolution of winter prebasic molt by increasing the time window when birds can absorb essential nutrients for molt. Alternatively, for birds undertaking a prealternate molt at the end of the overwintering period, longer days may increase exposure to feather-degrading ultra-violet radiation, necessitating the replacement of feathers. Our study underlines the importance of the overwintering grounds in the critical process of molt for many passerines that breed in the Nearctic and Western Palearctic regions.
Project description:Due to climate change, the ranges of many North American tree species are expected to shift northward. Sugar maple (Acer saccharum Marshall) reaches its northern continuous distributional limit in northeastern North America at the transition between boreal mixed-wood and temperate deciduous forests. We hypothesized that marginal fragmented northern populations from the boreal mixed wood would have a distinct pattern of genetic structure and diversity. We analyzed variation at 18 microsatellite loci from 23 populations distributed along three latitudinal transects (west, central, and east) that encompass the continuous-discontinuous species range. Each transect was divided into two zones, continuous (temperate deciduous) and discontinuous (boreal mixed wood), based on sugar maple stand abundance. The distance of each population to the northern limit (D_north) was positively related to allelic richness (AR) and negatively related to population differentiation (FST). These relationships were tested for each transect separately; the pattern (discontinuous-continuous) remained significant only for the western transect. STRUCTURE analysis revealed the presence of four clusters. The northernmost populations of each transect were assigned to a distinct group. Asymmetrical gene flow occurred from the southern into the four northernmost populations. Southern populations in Québec may have originated from two different postglacial migration routes. No evidence was found to support the hypothesis that northern populations were remnants of a larger population that had migrated further north of the species range after the retreat of the ice sheet. The northernmost sugar maple populations possibly originated from long-distance dispersal.
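Population differentiation (FST) at a single locus can be illustrated with a minimal Python sketch based on Nei's formulation, FST = (HT - HS) / HT, where HS is the mean expected heterozygosity within populations and HT is the expected heterozygosity of the pooled allele frequencies. The estimator actually used in the study is not stated in the description; the function names are hypothetical, populations are weighted equally, and no sample-size correction is applied.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst_nei(pop_freqs):
    """Single-locus FST = (HT - HS) / HT from per-population allele frequencies.

    pop_freqs is a list with one frequency list per population; every list
    uses the same allele order. Populations are weighted equally (assumption).
    """
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # HS: mean within-population expected heterozygosity.
    h_s = sum(expected_heterozygosity(f) for f in pop_freqs) / n_pops
    # HT: expected heterozygosity of the mean allele frequencies.
    mean_freqs = [sum(f[i] for f in pop_freqs) / n_pops for i in range(n_alleles)]
    h_t = expected_heterozygosity(mean_freqs)
    if h_t == 0:
        return 0.0  # locus is monomorphic overall, so there is no differentiation
    return (h_t - h_s) / h_t
```

Two populations fixed for different alleles give an FST of 1, and populations with identical frequencies give 0; a multilocus value would typically combine HS and HT across loci before taking the ratio.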
Project description:Background: Francisella tularensis is an intracellular pathogen that causes tularemia in humans, and the public health importance of this bacterium has been well documented in recent history. Francisella philomiragia, a distant relative of F. tularensis, is thought to constitute an environmental lineage along with Francisella novicida. Nevertheless, both F. philomiragia and F. novicida have been associated with human disease, primarily in immune-compromised individuals. To understand the genetic relationships and evolutionary contexts among different lineages within the genus Francisella, the genome of Francisella spp. strain TX07-7308 was sequenced and compared to the genomes of F. philomiragia strains ATCC 25017 and 25015, F. novicida strain U112, and F. tularensis strain Schu S4. Results: The chromosome of strain ATCC 25017 was 2,045,775 bp in size and contained 1,983 protein-coding genes. The chromosome of strain TX07-7308 was 2,035,931 bp in size and contained 1,980 protein-coding genes. Pairwise BLAST comparisons indicated that strains TX07-7308 and ATCC 25017 contained 1,700 protein-coding genes in common. NUCmer analyses revealed that the chromosomes of strains TX07-7308 and ATCC 25017 were mostly collinear except for a few gaps, translocations, and/or inversions. Using the genome sequence data and comparative analyses with other members of the genus Francisella (e.g., F. novicida strain U112 and F. tularensis strain Schu S4), several strain-specific genes were identified. Strains TX07-7308 and ATCC 25017 contained an operon with six open reading frames encoding proteins related to enzymes involved in thiamine biosynthesis that was absent in F. novicida strain U112 and F. tularensis strain Schu S4. Strain ATCC 25017 contained an operon putatively involved in lactose metabolism that was absent in strain TX07-7308, F. novicida strain U112, and F. tularensis strain Schu S4. In contrast, strain TX07-7308 contained an operon putatively involved in glucuronate metabolism that was absent in the genomes of strain ATCC 25017, F. novicida strain U112, and F. tularensis strain Schu S4. The polymorphic nature of polysaccharide biosynthesis/modification gene clusters among different Francisella strains was also evident from genome analyses. Conclusions: From genome comparisons, it appeared that genes encoding novel functions have contributed to the metabolic enrichment of the environmental lineages within the genus Francisella. The inability to acquire new genes coupled with the loss of ancestral traits and the consequent reductive evolution may be a cause for, as well as an effect of, niche selection of F. tularensis. Sequencing and comparison of the genomes of more isolates are required to obtain further insights into the ecology and evolution of different species within the genus Francisella.
Project description:Few clades of plants have proven as difficult to classify as cacti. One explanation may be an unusually high level of convergent and parallel evolution (homoplasy). To evaluate support for this phylogenetic hypothesis at the molecular level, we sequenced the genomes of four cacti in the especially problematic tribe Pachycereeae, which contains most of the large columnar cacti of Mexico and adjacent areas, including the iconic saguaro cactus (Carnegiea gigantea) of the Sonoran Desert. We assembled a high-coverage draft genome for saguaro and lower coverage genomes for three other genera of tribe Pachycereeae (Pachycereus, Lophocereus, and Stenocereus) and a more distant outgroup cactus, Pereskia. We used these to construct 4,436 orthologous gene alignments. Species tree inference consistently returned the same phylogeny, but gene tree discordance was high: 37% of gene trees having at least 90% bootstrap support conflicted with the species tree. Evidently, discordance is a product of long generation times and moderately large effective population sizes, leading to extensive incomplete lineage sorting (ILS). In the best supported gene trees, 58% of apparent homoplasy at amino acid sites in the species tree is due to gene tree-species tree discordance rather than parallel substitutions in the gene trees themselves, a phenomenon termed "hemiplasy." The high rate of genomic hemiplasy may contribute to apparent parallelisms in phenotypic traits, which could confound understanding of species relationships and character evolution in cacti.
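The idea of counting gene trees that conflict with a species tree can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: trees are assumed to be rooted, to share the same taxa, and to be given as nested tuples of leaf names, and any filtering by bootstrap support is assumed to have been done beforehand. All function names are hypothetical.

```python
def clades(tree):
    """Return all clades (frozensets of leaf names) of a rooted nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        merged = frozenset().union(*(leaves(child) for child in node))
        found.add(merged)
        return merged

    leaves(tree)
    return found


def conflicts(gene_tree, species_tree):
    """True if some gene-tree clade is incompatible with a species-tree clade.

    Two clades are incompatible when they overlap but neither contains the other.
    """
    species_clades = clades(species_tree)
    for g in clades(gene_tree):
        for s in species_clades:
            shared = g & s
            if shared and shared != g and shared != s:
                return True
    return False


def discordance_fraction(gene_trees, species_tree):
    """Fraction of gene trees that conflict with the species tree."""
    return sum(conflicts(g, species_tree) for g in gene_trees) / len(gene_trees)
```

With a species tree `((("A", "B"), "C"), "D")`, a gene tree `((("A", "C"), "B"), "D")` counts as discordant because its clade {A, C} overlaps the species-tree clade {A, B} without either containing the other.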