Project description: Although Mendelian disorders are overwhelmingly attributed to protein-coding pathogenic variants, a majority of unsolved cases do not harbor obvious causal pathogenic variants in the coding sequence, suggesting a potential non-coding etiology. However, classification of pathogenicity in non-coding sequence remains prohibitive due to a vastly increased search space and the lack of a standardized rubric for interpretation. Here, we present an integrated single-cell multiomic framework to nominate pathogenic non-coding variants for the congenital cranial dysinnervation disorders (CCDDs). The CCDDs are Mendelian neurodevelopmental disorders that result from aberrant development of cranial motor neurons in the embryonic brainstem. We created a non-coding reference atlas of single-cell chromatin accessibility profiles for 86,089 embryonic mouse cranial motor neurons (cMNs). We found that high-quality single-cell ATAC-seq (scATAC-seq) profiles alone were a strong predictor of enhancer activity (64% in vivo validation rate). To further aid in interpretation, we integrated single-cell histone modification and gene expression information to distinguish individual enhancers and their cognate genes. Relatively subtle differences in the cellular composition of input data often led to substantial differences in predicted enhancer strength, cognate gene, and tissue of activity. Next, we mapped candidate non-coding variants from 899 whole genome sequences from 270 CCDD pedigrees to the murine cMN-specific regulatory elements and trained a machine learning classifier to accurately predict the functional effects of patient variants within these elements. We then performed high-coverage scATAC-seq and site-specific footprinting analysis on an allelic series of CRISPR-humanized mice to validate our machine learning predictions and provide important clues to the mode of pathogenicity.
Finally, we performed peak- and gene-centric allelic aggregation to nominate non-coding variants, including those regulating MN1 and EBF3, respectively. Altogether, this work extends non-coding variant analysis to Mendelian disease and presents a generalizable framework for nominating novel non-coding variants in other rare disorders.