Project description:DNA polymerase (dpol) β has served as a model for structural, kinetic, and computational characterization of the DNA synthesis reaction. The laboratory directed by Samuel H. Wilson has utilized a multifunctional approach to analyze the function of this enzyme at the biological, chemical, and molecular levels for nearly 50 years. Over this time, it has become evident that correlating static crystallographic structures of dpol β with solution kinetic measurements is a daunting task. However, aided by computational and spectroscopic approaches, novel and unexpected insights have emerged. While dpols generally insert wrong nucleotides with similarly poor efficiencies, their capacity to insert the right nucleotide depends on the identity of the dpol. Accordingly, the ability to choose right from wrong depends on the efficiency of right, rather than wrong, nucleotide insertion. Structures of dpol β in various liganded forms published by the Wilson laboratory, and others, have provided insights into the molecular attributes that hasten correct nucleotide insertion and deter incorrect nucleotide insertion. Computational approaches have bridged the gap between structures of intermediate complexes and provided insights into this basic and essential chemical reaction.
Project description:Total-evidence dating (TED) allows evolutionary biologists to incorporate a wide range of dating information into a unified statistical analysis. One might expect this to improve the agreement between rocks and clocks but this is not necessarily the case. We explore the reasons for such discordance using a mammalian dataset with rich molecular, morphological and fossil information. There is strong conflict in this dataset between morphology and molecules under standard stochastic models. This causes TED to push divergence events back in time when using inadequate models or vague priors, a phenomenon we term 'deep root attraction' (DRA). We identify several causes of DRA. Failure to account for diversified sampling results in dramatic DRA, but this can be addressed using existing techniques. Inadequate morphological models also appear to be a major contributor to DRA. The major reason seems to be that current models do not account for dependencies among morphological characters, causing distorted topology and branch length estimates. This is particularly problematic for huge morphological datasets, which may contain large numbers of correlated characters. Finally, diversification and fossil sampling priors that do not incorporate all the available background information can contribute to DRA, but these priors can also be used to compensate for DRA. Specifically, we show that DRA in the mammalian dataset can be addressed by introducing a modest extra penalty for ghost lineages that are unobserved in the fossil record, for instance by assuming rapid diversification, rare extinction or high fossil sampling rate; any of these assumptions produces highly congruent divergence time estimates with a minimal gap between rocks and clocks. 
Under these conditions, fossils have a stabilizing influence on divergence time estimates and significantly increase the precision of those estimates, which are generally close to the dates suggested by palaeontologists. This article is part of the themed issue 'Dating species divergences using rocks and clocks'.
Project description:Nucleosome structure and stability affect genetic accessibility by altering the local chromatin morphology. Recent FRET experiments on nucleosomes have given valuable insight into the structural transformations they can adopt. Yet, even if performed under seemingly identical conditions, experiments performed in bulk and at the single molecule level have given mixed answers due to the limitations of each technique. To compare such experiments, however, they must be performed under identical conditions. Here we develop an experimental framework that overcomes the conventional limitations of each method: single molecule FRET experiments are carried out at bulk concentrations by adding unlabeled nucleosomes, while bulk FRET experiments are performed in microplates at concentrations near those used for single molecule detection. Additionally, the microplate can probe many conditions simultaneously before expending valuable instrument time for single molecule experiments. We highlight this experimental strategy by exploring the role of selective acetylation of histone H3 on nucleosome structure and stability; in bulk, H3-acetylated nucleosomes were significantly less stable than non-acetylated nucleosomes. Single molecule FRET analysis further revealed that acetylation of histone H3 promoted the formation of an additional conformational state, which is suppressed at higher nucleosome concentrations and which could be an important structural intermediate in nucleosome regulation.
Project description:Measuring the pace at which speciation and extinction occur is fundamental to understanding the origin and evolution of biodiversity. Both the fossil record and molecular phylogenies of living species can provide independent estimates of speciation and extinction rates, but often produce strikingly divergent results. Despite its implications, the theoretical reasons for this discrepancy remain unknown. Here, we reveal a conceptual and methodological basis able to reconcile palaeontological and molecular evidence: discrepancies are driven by different implicit assumptions about the processes of speciation and species evolution in palaeontological and neontological analyses. We present the "birth-death chronospecies" model that clarifies the definition of speciation and extinction processes allowing for a coherent joint analysis of fossil and phylogenetic data. Using simulations and empirical analyses we demonstrate not only that this model explains much of the apparent incongruence between fossils and phylogenies, but that differences in rate estimates are actually informative about the prevalence of different speciation modes.
Project description:The understanding of complex biological networks often relies on both a dedicated layout and a topology. Currently, there are three major competing layout-aware systems biology formats, but there are no software tools or software libraries supporting all of them. This complicates the management of molecular network layouts and hinders their reuse and extension. In this paper, we present a high-level overview of the layout formats in systems biology, focusing on their commonalities and differences, review their support in existing software tools, libraries and repositories and finally introduce a new conversion module within the MINERVA platform. The module is available via a REST API and offers, besides the ability to convert between layout-aware systems biology formats, the possibility to export layouts into several graphical formats. The module enables conversion of very large networks with thousands of elements, such as disease maps or metabolic reconstructions, rendering it widely applicable in systems biology.
Project description:Despite continuous updates of the human reference genome, there are still hundreds of unresolved gaps which account for about 5% of the total sequence length. Given the availability of whole genome de novo assemblies, especially those derived from long-read sequencing data, gap-closing sequences can be determined. By comparing 17 de novo long-read sequencing assemblies with the human reference genome, we identified a total of 1,125 gap-closing sequences for 132 (16.9% of 783) gaps and added up to 2.2 Mb of novel sequence to the human reference genome. More than 90% of the non-redundant sequences could be verified by unmapped reads from the Simons Genome Diversity Project dataset. In addition, 15.6% of the non-reference sequences were found in at least one of four non-human primate genomes. We further demonstrated that the non-redundant sequences had a high content of simple repeats and satellite sequences. Moreover, 43 (32.6%) of the 132 closed gaps were shown to be polymorphic; such sequences may play an important biological role and can be useful in the investigation of human genetic diversity.
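The two percentages quoted in this abstract can be recomputed from the counts it reports. The short Python sketch below does only that arithmetic; the variable and function names are illustrative and do not come from the study.

```python
# Counts as quoted in the abstract above.
total_gaps = 783               # gaps considered in the reference genome
closed_gaps = 132              # gaps with at least one gap-closing sequence
polymorphic_closed_gaps = 43   # closed gaps reported as polymorphic


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


closed_pct = percent(closed_gaps, total_gaps)
polymorphic_pct = percent(polymorphic_closed_gaps, closed_gaps)

print(f"closed gaps: {closed_pct}% of {total_gaps}")
print(f"polymorphic among closed gaps: {polymorphic_pct}% of {closed_gaps}")
```

Both values agree with the figures given in the abstract (16.9% and 32.6%).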
Project description:BACKGROUND: The rapid fall in DNA sequencing prices has allowed rapid accumulation of genome data. However, the process of obtaining complete genome sequences is still very time-consuming and labor-intensive. In addition, data produced from various sequencing technologies or alternative assemblies remain underexplored as a means of improving the assembly of incomplete genome sequences. FINDINGS: We have developed FGAP, a tool for closing gaps in draft genome sequences that takes advantage of different datasets. FGAP uses BLAST to align multiple contigs against a draft genome assembly, aiming to find sequences that overlap gaps. The algorithm selects the best sequence to fill and eliminate the gap. CONCLUSIONS: FGAP reduced the number of gaps by 78% in an E. coli draft genome assembly using two different sequencing technologies, Illumina and 454. Using PacBio long reads, 98% of gaps were closed. In human chromosome 14 assemblies, FGAP reduced the number of gaps by 35%. All the inserted sequences were validated with a reference genome using QUAST. The source code and a web tool are available at http://www.bioinfo.ufpr.br/fgap/.
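The gap-closing idea described in this abstract (find sequences that overlap a gap, then pick one to fill it) can be illustrated with a toy Python sketch. This is not FGAP's implementation: exact string matching of the flanking bases stands in for the BLAST alignment, "shortest candidate" stands in for FGAP's selection of the best sequence, and all function names, the `flank` parameter, and the example sequences are hypothetical.

```python
import re


def find_gaps(draft):
    """Return (start, end) spans of runs of 'N' in the draft sequence."""
    return [m.span() for m in re.finditer(r"N+", draft)]


def fill_gap(draft, span, contigs, flank=5):
    """Return a filler for one gap, or None if no contig spans it.

    A contig is a candidate if it contains the bases just left of the gap
    and, further along, the bases just right of it (exact matches only,
    a simplification of the BLAST alignment named in the abstract).
    """
    start, end = span
    left = draft[max(0, start - flank):start]
    right = draft[end:end + flank]
    if not left or not right:
        return None  # gap touches a sequence end; no anchor on that side
    candidates = []
    for contig in contigs:
        i = contig.find(left)
        if i < 0:
            continue
        j = contig.find(right, i + len(left))
        if j < 0:
            continue
        candidates.append(contig[i + len(left):j])
    if not candidates:
        return None
    return min(candidates, key=len)  # toy selection rule, assumed here


def close_gaps(draft, contigs, flank=5):
    """Fill every gap that some contig spans, leaving the others as 'N'."""
    # Work right to left so earlier spans stay valid after each insertion.
    for span in reversed(find_gaps(draft)):
        filler = fill_gap(draft, span, contigs, flank)
        if filler is not None:
            draft = draft[:span[0]] + filler + draft[span[1]:]
    return draft


draft = "ACGTACGTTANNNNGGCATCGA"          # made-up draft with one gap
contigs = ["TACGTTATTGACGGCATCG"]         # made-up contig spanning the gap
closed = close_gaps(draft, contigs)
print(find_gaps(draft), closed)
```

A real tool has to tolerate mismatches, reverse-complement hits, and repeats near the gap edges, which is why an aligner such as BLAST is used rather than exact matching.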
Project description:A topological phase transition is accompanied by a change of topological numbers. According to the bulk-edge correspondence, gap closing and the breakdown of adiabaticity are necessary at the phase transition point so that the topological number becomes ill-defined. However, gap closing is not always needed. In this paper, we show that two topologically distinct phases can be continuously connected without gap closing, provided the symmetry of the system changes during the process. We propose generic principles for how this is possible and demonstrate them with various examples, such as 1D polyacetylene with charge-density-wave order, 2D silicene with antiferromagnetic order, 2D silicene or a quantum well made of HgTe with superconducting proximity effects, and the 3D superconductor Cu-doped Bi2Se3. It is argued that such an unusual phenomenon can occur when we detour around the gap closing point, provided the connection of the topological numbers is lost along the detour path.
Project description:The COVID-19 pandemic has disproportionately impacted minority communities, yet little data exists regarding whether disparities have improved at a health system level. This study examined whether sociodemographic disparities in hospitalization and clinical outcomes changed between two temporal waves of hospitalized COVID-19 patients. This is a retrospective cohort study of primary care patients at Mass General Brigham (a large northeastern health system serving 1.27 million primary care patients) hospitalized in-system with COVID-19 between March 1, 2020, and March 1, 2021, categorized into two 6-month "wave" periods. We used chi-square tests to compare demographics between waves, and regression analysis to characterize the association of race/ethnicity and language with in-hospital severe outcomes (death, hospice discharge, need for intensive care unit care). Hispanic/Latino, Black, and non-English-speaking patients constituted 30.3%, 12.5%, and 29.7% of COVID-19 admissions in wave 1 (N = 5844) and 22.2%, 9.0%, and 22.7% in wave 2 (N = 4007), compared to 2019 general admission proportions of 8.8%, 6.3%, and 7.7%, respectively. Admissions from highly socially vulnerable census tracts decreased between waves. Non-English speakers had significantly higher odds of severe illness during wave 1 (OR 1.35; 95% CI: 1.10, 1.66) compared to English speakers; this association was non-significant during wave 2 (OR 1.01; 95% CI: 0.76, 1.36). Comparing two COVID-19 temporal waves, significant sociodemographic disparities in COVID-19 admissions improved between waves but persisted over a year, demonstrating the need for ongoing interventions to truly close equity gaps. Non-English-speaking language status independently predicted worse hospitalization outcomes in wave 1, underscoring the importance of targeted and effective in-hospital supports for non-English speakers.
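The odds ratios quoted in this abstract are read as significant or non-significant according to whether the 95% confidence interval excludes 1. The Python sketch below shows that reading, together with a generic unadjusted odds ratio and Wald interval from a 2x2 table. It is an illustration only: the study used regression analysis, not this unadjusted formula, and the 2x2 counts are made up, not taken from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def ci_excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1.0 or upper < 1.0


# Intervals as reported in the abstract for non-English speakers.
wave1_significant = ci_excludes_one(1.10, 1.66)
wave2_significant = ci_excludes_one(0.76, 1.36)
print("wave 1 significant:", wave1_significant)
print("wave 2 significant:", wave2_significant)

# Hypothetical 2x2 counts, for illustration only (not study data).
example_or, example_lower, example_upper = odds_ratio_wald_ci(30, 70, 20, 80)
print(round(example_or, 2), round(example_lower, 2), round(example_upper, 2))
```

The wave 1 interval (1.10, 1.66) lies entirely above 1, while the wave 2 interval (0.76, 1.36) contains 1, which matches the abstract's description of the two associations.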
Project description:Despite early predictions and rapid progress in research, the introduction of personal genomics into clinical practice has been slow. Several factors contribute to this translational gap between knowledge and clinical application. The evidence available to support genetic test use is often limited, and implementation of new testing programs can be challenging. In addition, the heterogeneity of genomic risk information points to the need for strategies to select and deliver the information most appropriate for particular clinical needs. Accomplishing these tasks also requires recognition that some expectations for personal genomics are unrealistic, notably expectations concerning the clinical utility of genomic risk assessment for common complex diseases. Efforts are needed to improve the body of evidence addressing clinical outcomes for genomics, apply implementation science to personal genomics, and develop realistic goals for genomic risk assessment. In addition, translational research should emphasize the broader benefits of genomic knowledge, including applications of genomic research that provide clinical benefit outside the context of personal genomic risk.