Project description: Despite widespread advances in DNA sequencing in the past decade, the functional consequences of most rare genetic variants remain poorly understood, severely limiting our ability to connect variants to their consequences on protein function, identify biochemical mechanisms by which variation causes disease, and interpret variant pathogenicity. Multiplexed Assays of Variant Effect (MAVEs), which can measure the function of tens of thousands of variants, are beginning to address this problem. However, existing MAVEs cannot be applied to the approximately 10% of human genes encoding secreted proteins, about a quarter of which are associated with disease. We developed a flexible and scalable human cell surface display method, Multiplexed Surface Tethering of Extracellular Proteins (MultiSTEP), that can simultaneously measure the functional effects of tens of thousands of variants in secreted proteins. We used MultiSTEP to study the consequences of missense variation in coagulation factor IX (FIX), a vitamin K-dependent plasma serine protease where variation can cause FIX deficiency and the bleeding disorder hemophilia B. We used a panel of antibodies to detect FIX secretion or FIX post-translational modification, measuring a total of 45,024 effects for 9,007 variants. We find that 43.8% of all possible F9 missense variants impact FIX secretion, post-translational modification, or both. We also identify new signals of functional constraint on secretion, including within the signal peptide, within folded domains, and for nearly all variants that cause gain or loss of a cysteine. FIX secretion scores correlate strongly with FIX levels in patient plasma and also reveal that most F9 missense variants causing severe hemophilia do so by profoundly impacting secretion. We integrate the secretion and post-translational modification data to develop an F9 variant classifier that can identify loss-of-function variants with high specificity.
We use the resulting classifications to reinterpret and upgrade 62 of 97 F9 variants of uncertain significance (VUS) in the MyLifeOurFuture hemophilia genotyping project to likely pathogenic. Lastly, we show that MultiSTEP can be applied to a wide variety of secreted proteins, ranging from small signaling proteins like insulin to large proteins like factor VIII. Thus, we establish a multiplexed, multimodal, and generalizable method for systematically assessing variant effects for secreted proteins at scale, paving the way for improved understanding of biochemical mechanisms of disease and clinical variant interpretation.
Project description: The mitochondrial m.3243A>G variant is known to cause retinal dystrophy and vision loss. We used single-cell multimodal sequencing to understand how the presence of this mutation affects cellular phenotype in a cell type-specific manner.
Project description: The simultaneous measurement of multiple modalities, known as multimodal analysis, represents an exciting frontier for single-cell genomics and necessitates new computational methods that can define cellular states based on multiple data types. Here, we introduce ‘weighted-nearest neighbor’ analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of hundreds of thousands of human white blood cells alongside a panel of 228 antibodies to construct a multimodal reference atlas of the circulating immune system. We demonstrate that integrative analysis substantially improves our ability to resolve cell states and validate the presence of previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets, and to interpret immune responses to vaccination and COVID-19. Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets, including paired measurements of RNA and chromatin state, and to look beyond the transcriptome towards a unified and multimodal definition of cellular identity.
Project description: The use of CRISPR/Cas proteins for multiplex genome engineering represents an important avenue for crop improvement, and further improvements in the creation of knock-in plant lines via CRISPR-based technologies may enable the high-throughput creation of designer alleles. To circumvent limitations of the commonly used CRISPR/Cas9 system for multiplex genome engineering, we explored the use of Moraxella bovoculi 3 Cas12a (Mb3Cas12a) for multiplex genome editing in Arabidopsis thaliana. We identified optimized promoter sequences for driving expression of single-transcript multiplex crRNA arrays in A. thaliana, resulting in stable germline transmission of Mb3Cas12a-edited alleles at multiple target sites. Using this system, we demonstrate single-transcript multiplexed genome engineering with up to 13 crRNA targets. We further show high target specificity of Mb3Cas12a-based genome editing via whole-genome sequencing. Taken together, our method provides a simplified platform for efficient multiplex genome engineering in plant-based systems.