Project description:This study is part of the 'First 1,000 Days of Life and Beyond' study at the Inova Translational Medicine Institute. Whole-genome sequencing data from 1,291 parent-offspring trios were used to study the properties of clustered de novo mutations. The maternal clusters were found to be enriched in regions with an accelerated maternal mutation rate and to show distinct mutational signatures. For additional details, please refer to: "Germline de novo mutation clusters arise during oocyte aging in genomic regions with increased double-strand break incidence". Jakob M. Goldmann, Vladimir B. Seplyarskiy, Wendy S.W. Wong, Thierry Vilboux, Pieter B. Neerincx, Dale L. Bodian, Benjamin D. Solomon, Joris A. Veltman, John F. Deeken, Christian Gilissen, John E. Niederhuber. Nature Genetics (https://www.ncbi.nlm.nih.gov/pubmed/29507425).
Project description:Comparison of whole-genome exome array CGH with a commercial SNP array for the detection of de novo and homozygous copy number variants in 99 autism simplex trios. This description will be updated once the manuscript is prepared.
Project description:Motivation: Whole-genome and -exome sequencing on parent-offspring trios is a powerful approach to identifying disease-associated genes by detecting de novo mutations in patients. Accurate detection of de novo mutations from sequencing data is a critical step in trio-based genetic studies. Existing bioinformatic approaches usually yield high error rates due to sequencing artifacts and alignment issues, which may either miss true de novo mutations or call too many false ones, making downstream validation and analysis difficult. In particular, current approaches have much worse specificity than sensitivity, and developing effective filters to discriminate genuine from spurious de novo mutations remains an unsolved challenge. Results: In this article, we curated 59 sequence features from the whole-genome and exome alignment context that are considered relevant to discriminating true de novo mutations from artifacts, and then employed a machine-learning approach to classify candidates as true or false de novo mutations. Specifically, we built a classifier, named De Novo Mutation Filter (DNMFilter), using gradient boosting as the classification algorithm. We built the training set using experimentally validated true and false de novo mutations as well as false de novo mutations collected from an in-house large-scale exome-sequencing project. We evaluated DNMFilter's theoretical performance and investigated the relative importance of different sequence features for classification accuracy. Finally, we applied DNMFilter to our in-house whole-exome trios and one CEU trio from the 1000 Genomes Project and found that DNMFilter could be coupled with commonly used de novo mutation detection approaches as an effective filter to significantly reduce the false discovery rate without sacrificing sensitivity. Availability: The software DNMFilter, implemented using a combination of Java and R, is freely available from the website at http://humangenome.duke.edu/software.
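The two quantities the abstract uses to judge a filter, sensitivity and false discovery rate, can be made concrete with a short sketch. The function name `filter_metrics` and the counts in the usage lines are hypothetical illustrations, not figures from the DNMFilter study; the definitions themselves are the standard ones (sensitivity = TP / (TP + FN), FDR = FP / (TP + FP)).

```python
def filter_metrics(tp, fp, fn):
    """Return (sensitivity, false_discovery_rate) for a set of calls.

    tp: true de novo mutations that were called
    fp: spurious calls reported as de novo
    fn: true de novo mutations that were missed
    """
    called = tp + fp          # everything the caller reported
    truth = tp + fn           # everything that is really de novo
    sensitivity = tp / truth if truth else 0.0
    fdr = fp / called if called else 0.0
    return sensitivity, fdr


# Hypothetical counts, chosen only to illustrate the arithmetic:
# the same caller before and after a filter removes most false calls.
before = filter_metrics(tp=90, fp=910, fn=10)
after = filter_metrics(tp=90, fp=10, fn=10)
print(before, after)
```

In this made-up example the filter leaves sensitivity unchanged while the false discovery rate drops, which is the kind of outcome the abstract describes.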
Project description:To study parent-of-origin effects on gene expression, we performed RNA-seq analysis (100 bp single-end reads) of 165 children who formed part of mother/father/child trios for which genotype data were available from the HapMap and/or 1000 Genomes Projects. Based on phased genotypes at heterozygous SNP positions, we generated allelic counts for expression of the maternal and paternal alleles in each individual. This analysis reveals significant bias in the expression of the parental alleles for dozens of genes, including both previously known and novel imprinted transcripts.
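One common way to ask whether maternal and paternal allelic counts at a heterozygous SNP depart from balanced expression is an exact two-sided binomial test against a 50:50 expectation. The description does not state which statistical test the study used, so the sketch below is an assumption for illustration only; the function name `allelic_bias_pvalue` is hypothetical.

```python
from math import comb


def allelic_bias_pvalue(maternal, paternal):
    """Exact two-sided binomial p-value for maternal vs paternal read counts.

    Null hypothesis: each read is equally likely to come from either
    parental allele (p = 0.5). Because the null is symmetric, the
    two-sided p-value is twice the smaller tail, capped at 1.
    """
    n = maternal + paternal
    if n == 0:
        return 1.0  # no informative reads, no evidence of bias
    k = min(maternal, paternal)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Balanced counts give no evidence of bias; fully one-sided counts do.
print(allelic_bias_pvalue(5, 5))
print(allelic_bias_pvalue(10, 0))
```

A real analysis would also need to correct for multiple testing across SNPs and genes and to account for reference-allele mapping bias, neither of which this sketch attempts.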
Project description:We describe a multiple de novo CNV (MdnCNV) phenomenon in which individuals with genomic disorders carry five to ten constitutional de novo CNVs. Five such families are studied, consisting of four trios and one singleton. Various array platforms are used to interrogate these families to identify de novo CNVs.
Project description:This is the validation data for candidate de novo CNV calls made in the asthma trios by Itsara et al., Genome Research 2010. In this study, de novo CNV calls in the asthma data set were initially made with Illumina 550K SNP arrays. Validation was performed with custom Nimblegen array CGH for the samples for which DNA was available. A de novo CNV is expected to validate in the child of each trio tested and not to be detected in either parent. We attempted to validate 9 de novo CNVs in the same number of trios. In 3 cases, paternal DNA was not available, leaving a total of 24 distinct samples for hybridization. All samples were hybridized against a previously well-characterized reference (NA15510; see Tuzun et al., Nat Genet 2005).
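The validation rule and the sample count stated above can be sketched in a few lines. The function names are hypothetical, and treating a missing parent as "not contradicting" a de novo call is an assumption made for this illustration; the description does not say how trios without paternal DNA were scored.

```python
def validates_as_de_novo(child_has_cnv, mother_has_cnv, father_has_cnv=None):
    """Apply the rule: present in the child, absent from every tested parent.

    Pass None for a parent whose DNA was not available. Such a parent is
    simply skipped (an assumption of this sketch), so the result is then
    only a partial confirmation.
    """
    if not child_has_cnv:
        return False
    tested_parents = [p for p in (mother_has_cnv, father_has_cnv) if p is not None]
    return not any(tested_parents)


def samples_to_hybridize(n_trios, n_missing_parents):
    """Three samples per trio, minus parents with no DNA available."""
    return 3 * n_trios - n_missing_parents


# 9 trios with paternal DNA missing in 3 of them: 27 - 3 = 24 samples.
print(samples_to_hybridize(9, 3))
print(validates_as_de_novo(True, False, False))
```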