Project description:Downy mildew caused by Plasmopara viticola is one of the most devastating diseases of grapevines worldwide. So far, the genetic diversity and origin of the Chinese P. viticola population are unclear. In the present study, 103 P. viticola isolates were sequenced at four gene regions: internal transcribed spacer one (ITS), large subunit of ribosomal RNA (LSU), actin gene (ACT) and beta-tubulin (TUB). The sequences were analyzed to obtain polymorphism and diversity information of the Chinese population as well as to infer the relationships between Chinese and American isolates. High genetic diversity was observed for the Chinese population, with evidence of sub-structuring based on climate. Phylogenetic analysis and haplotype networks showed evidence of close relationships between some American and Chinese isolates, consistent with recent introduction from America to China via planting materials. However, there is also evidence for endemic Chinese P. viticola isolates. Our results suggest that the current Chinese Plasmopara viticola population is an admixture of endemic and introduced isolates.
Project description:In this paper, we present the academic genealogy of presidents of the Psychometric Society by constructing a genealogical tree, in which Ph.D. students are encoded as descendants of their advisors. Results show that most of the presidents belong to five distinct lineages that can be traced to Wilhelm Wundt, James Angell, William James, Albert Michotte or Carl Friedrich Gauss. Important psychometricians Lee Cronbach and Charles Spearman play only a marginal role. The genealogy systematizes important historical knowledge that can be used to inform studies on the history of psychometrics and exposes the rich and multidisciplinary background of the Psychometric Society.
Project description:Genome-wide association studies (GWASs) have identified numerous loci that influence risk for psychiatric diseases. Genetically engineered mice are often used to characterize genes implicated by GWASs. These studies are based on the assumption that observed genotype-phenotype relationships will generalize to humans, implying that the results would at least generalize to other inbred mouse strains. Given current concerns about reproducibility, we sought to directly test this assumption. We produced F1 crosses between male C57BL/6J mice heterozygous for null alleles of Cacna1c and Tcf7l2 and wild-type females from 30 inbred laboratory strains. We found extremely strong interactions with genetic background that sometimes supported diametrically opposing conclusions. These results do not negate the invaluable contributions of mouse genetics to biomedical science, but they do show that genotype-phenotype relationships cannot be reliably inferred by studying a single genetic background, and thus constitute a major challenge to the status quo.
Project description:The majority of diseases that are a significant challenge for public and individual health are caused by a combination of hereditary and environmental factors. In this paper we introduce Lineage, a novel visual analysis tool designed to support domain experts who study such multifactorial diseases in the context of genealogies. Incorporating familial relationships between cases with other data can provide insights into shared genomic variants and shared environmental exposures that may be implicated in such diseases. We introduce a data and task abstraction, and argue that the problem of analyzing such diseases based on genealogical, clinical, and genetic data can be mapped to a multivariate graph visualization problem. The main contribution of our design study is a novel visual representation for tree-like, multivariate graphs, which we apply to genealogies and clinical data about the individuals in these families. We introduce data-driven aggregation methods to scale to multiple families. By designing the genealogy graph layout to align with a tabular view, we are able to incorporate extensive, multivariate attributes in the analysis of the genealogy without cluttering the graph. We validate our designs by conducting case studies with our domain collaborators.
Project description:As an effort to help contain the COVID-19 pandemic, large numbers of SARS-CoV-2 genomes have been sequenced from most regions in the world. More than one million viral sequences are publicly available as of April 2021. Many studies estimate viral genealogies from these sequences, as these can provide valuable information about the spread of the pandemic across time and space. Additionally, such data are a rich source of information about molecular evolutionary processes including natural selection, for example allowing the identification of new variants with transmissibility and immunity evasion advantages, and allowing the investigation of viral spread. To validate new methods and to verify results obtained from these vast datasets, one needs an efficient simulator able to simulate the pandemic to approximate world-scale scenarios and generate viral genealogies of millions of samples. Here, we introduce a new fast simulator, VGsim, which addresses this problem. The simulation process is split into two phases. During the forward run the algorithm generates a chain of events reflecting the dynamics of the pandemic using a hierarchical version of the Gillespie algorithm. During the backward run a coalescent-like approach generates a tree genealogy of samples conditioning on the event chain generated during the forward run. Our software can model complex population structure, epistasis and immunity escape. The code is freely available at https://github.com/Genomics-HSE/VGsim.
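The forward phase described above builds on the Gillespie algorithm. As a rough illustration only, the following sketch runs the plain (non-hierarchical) Gillespie direct method on a single-population SIR model and records the resulting chain of events. It does not use VGsim or reproduce its API; the function name `gillespie_sir`, the SIR model, and the parameters `beta` and `gamma` are assumptions chosen for the example.

```python
import random


def gillespie_sir(beta, gamma, s0, i0, r0=0, t_max=100.0, seed=0):
    """Simulate one SIR trajectory with the Gillespie direct method.

    Hypothetical helper for illustration; not part of VGsim.
    Returns a list of (time, event, S, I, R) tuples, one per event.
    """
    rng = random.Random(seed)
    s, i, r = s0, i0, r0
    n = s + i + r
    t = 0.0
    events = []
    while i > 0:
        # Propensities of the two possible events.
        rate_infect = beta * s * i / n
        rate_recover = gamma * i
        total = rate_infect + rate_recover
        if total <= 0:
            break
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_max:
            break
        # Pick the event with probability proportional to its propensity.
        if rng.random() * total < rate_infect:
            s, i = s - 1, i + 1
            event = "transmission"
        else:
            i, r = i - 1, r + 1
            event = "recovery"
        events.append((t, event, s, i, r))
    return events
```

A call such as `gillespie_sir(beta=0.3, gamma=0.1, s0=990, i0=10, t_max=50.0)` returns the event chain in time order. A backward, coalescent-like pass of the kind the abstract mentions would then walk such a chain in reverse, which this sketch does not attempt.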
Project description:The partitioning and subsequent inheritance of cellular factors like proteins and RNAs is a ubiquitous feature of cell division. However, direct quantitative measures of how such nongenetic inheritance affects subsequent changes in gene expression have been lacking. We tracked families of the yeast Saccharomyces cerevisiae as they switch between two semi-stable epigenetic states. We found that long after two cells had divided, they continued to switch in a synchronized manner, whereas individual cells have exponentially distributed switching times. By comparing these results to a Poisson process, we show that the time evolution of an epigenetic state depends initially on inherited factors, with stochastic processes requiring several generations to decorrelate closely related cells. Finally, a simple stochastic model demonstrates that a single fluctuating regulatory protein that is synthesized in large bursts can explain the bulk of our results.
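The comparison with a Poisson process can be made concrete with a toy simulation. The sketch below is not the authors' model: it assumes that each cell switches at a constant rate (giving exponentially distributed switching times) and that, in the "inherited" case, both sister cells share a rate multiplier drawn at division. The function names, the two-valued multiplier, and all parameter values are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def sister_switch_times(n_pairs, base_rate, inherited, seed=0):
    """Draw switching times for pairs of sister cells.

    With inherited=False, each cell is an independent Poisson switcher.
    With inherited=True, both sisters share one rate multiplier
    (an assumed stand-in for a partitioned cellular factor).
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_pairs):
        factor = rng.choice([0.2, 5.0]) if inherited else 1.0
        rate = base_rate * factor
        first.append(rng.expovariate(rate))
        second.append(rng.expovariate(rate))
    return first, second
```

Under these assumptions, `pearson(*sister_switch_times(5000, 1.0, inherited=False))` stays near zero, while the `inherited=True` case yields a clearly positive correlation between sisters, which is the qualitative signature of synchronized switching.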
Project description:To clarify the chronologic genetic diversity of coxsackievirus A16 (CV-A16) strains associated with hand, foot, and mouth disease (HFMD) epidemics in a restricted area and their genetic relation with those isolated in other areas, we investigated the genetic diversity of the 129 CV-A16 strains associated with HFMD epidemics in Fukushima, Japan, from 1983 to 2003, and compared their genetic relation to 49 CV-A16 strains isolated in other areas of Japan and in China by using phylogenetic analysis based on the VP4 sequences. Phylogenetic reconstruction of the CV-A16 strains isolated in Fukushima from 1983 to 2003 demonstrated three distinct genetically divergent clusters related to HFMD epidemics that occurred from 1984 to 1994 (including the 1985 and 1991 outbreaks), HFMD epidemics from 1987 to 1998 (including the 1988 and 1998 outbreaks), and HFMD epidemics from 1995 to 2003 (including the 1995 and 2002 outbreaks). CV-A16 strains isolated during each period in Fukushima formed a single cluster with those isolated during essentially the same time period in other areas of Japan and in China. Our results demonstrated that prevalent CV-A16 strains causing HFMD in Fukushima, Japan, genetically changed twice during 21 epidemics, and changes were also observed in the CV-A16 strains causing HFMD epidemics in other areas. We concluded that repeated outbreaks of CV-A16-related HFMD in Japan were caused, in part, by the introduction of genetically changed CV-A16 strains, which might be transmitted overseas.
Project description:The human population displays wide variety in demographic history, ancestry, content of DNA derived from hominins or ancient populations, adaptation, traits, copy number variation, drug response, and more. These polymorphisms are of broad interest to population geneticists, forensics investigators, and medical professionals. Historically, much of that knowledge was gained from population survey projects. Although many commercial arrays exist for genome-wide single-nucleotide polymorphism genotyping, their design specifications are limited and they do not allow a full exploration of biodiversity. We therefore aimed to design the Diversity of REcent and Ancient huMan (DREAM), an all-inclusive microarray that would allow both identification of known associations and exploration of standing questions in genetic anthropology, forensics, and personalized medicine. DREAM includes probes to interrogate ancestry informative markers obtained from over 450 human populations, over 200 ancient genomes, and 10 archaic hominins. DREAM can identify 94% and 61% of all known Y and mitochondrial haplogroups, respectively, and was vetted to avoid interrogation of clinically relevant markers. To demonstrate its capabilities, we compared its FST distributions with those of the 1000 Genomes Project and commercial arrays. Although all arrays yielded similarly shaped (inverse J) FST distributions, DREAM's autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. DREAM's performance is further illustrated in biogeographical, identical by descent, and copy number variation analyses. In summary, with approximately 800,000 markers spanning nearly 2,000 genes, DREAM is a useful tool for genetic anthropology, forensic, and personalized medicine studies.
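For readers unfamiliar with FST, the following sketch computes Wright's fixation index for a single biallelic marker as (HT - HS) / HT, where HT is the expected heterozygosity of the pooled population and HS is the mean expected heterozygosity within subpopulations. It assumes equally sized subpopulations and is not necessarily the estimator used in the study; the function name `fst` is hypothetical.

```python
def fst(allele_freqs):
    """Wright's FST for one biallelic marker.

    allele_freqs: frequency of the same allele in each subpopulation.
    Assumes equally sized subpopulations. Returns 0.0 for a marker
    that is monomorphic in the pooled population (a convention chosen
    here to avoid dividing by zero).
    """
    if not allele_freqs:
        raise ValueError("need at least one subpopulation")
    if any(p < 0.0 or p > 1.0 for p in allele_freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    k = len(allele_freqs)
    p_bar = sum(allele_freqs) / k
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within subpopulations.
    h_sub = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / k
    if h_total == 0.0:
        return 0.0
    return (h_total - h_sub) / h_total
```

For example, `fst([0.2, 0.8])` is about 0.36, whereas identical frequencies in every subpopulation give 0. A marker set with a higher mean FST therefore separates subpopulations more sharply, which is the sense in which the abstract compares arrays.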
Project description:Background: Whether or not a mutant allele in a population is under selection is an important issue in population genetics, and various neutrality tests have been invented so far to detect selection. However, detection of negative selection has been notoriously difficult, partly because negatively selected alleles are usually rare in the population and have little impact on either population dynamics or the shape of the gene genealogy. Recently, through studies of genetic disorders and genome-wide analyses, many structural variations were shown to occur recurrently in the population. Such "recurrent mutations" might be revealed as deleterious by exploiting the signal of negative selection in the gene genealogy enhanced by their recurrence. Results: Motivated by the above idea, we devised two new test statistics. One is the total number of mutants at a recurrently mutating locus among sampled sequences, which is tested conditionally on the number of forward mutations mapped on the sequence genealogy. The other is the size of the most common class of identical-by-descent mutants in the sample, again tested conditionally on the number of forward mutations mapped on the sequence genealogy. To examine the performance of these two tests, we simulated recurrently mutated loci each flanked by sites with neutral single nucleotide polymorphisms (SNPs), with no recombination. Using neutral recurrent mutations as null models, we attempted to detect deleterious recurrent mutations. Our analyses demonstrated high power of our new tests under constant population size, as well as their moderate power to detect selection in expanding populations. We also devised a new maximum parsimony algorithm that, given the states of the sampled sequences at a recurrently mutating locus and an incompletely resolved genealogy, enumerates mutation histories with a minimum number of mutations while partially resolving genealogical relationships when necessary. Conclusions: With their considerably high power to detect negative selection, our new neutrality tests may open new avenues for dealing with the population genetics of recurrent mutations as well as help identify some types of genetic disorders that may have escaped identification by currently existing methods.
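To illustrate what a minimum number of mutations on a genealogy means, the sketch below applies the classical Fitch parsimony algorithm to a fully resolved, rooted binary tree. This is a standard textbook method swapped in for illustration, not the new algorithm described above, which additionally handles incompletely resolved genealogies and enumerates mutation histories. The tree encoding (nested 2-tuples with string leaf names) and the function name `fitch` are assumptions made for the example.

```python
def fitch(tree, states):
    """Fitch parsimony on a rooted binary tree.

    tree:   a leaf name (str) or a 2-tuple of subtrees.
    states: dict mapping each leaf name to its observed state.
    Returns (candidate_states_at_this_node, minimum_mutation_count).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra mutation needed.
        return common, left_cost + right_cost
    # Children disagree: one mutation is required on a branch below.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical sample: state 1 marks a mutant, state 0 the wild type.
example_tree = ((("a", "b"), "c"), ("d", "e"))
example_states = {"a": 1, "b": 0, "c": 1, "d": 0, "e": 0}
root_states, min_mutations = fitch(example_tree, example_states)
```

Here `min_mutations` is 2: the mutants `a` and `c` are not sister lineages, so explaining them requires either two forward mutations or one forward mutation followed by a reversion in `b`.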