Project description:The genealogy process is typically the most time-consuming part of forensic genetic genealogy, a new approach to solving violent crimes and identifying human remains, and a limiting factor in its success. We formulate a stochastic dynamic program that, given the list of matches and their genetic distances to the unknown target, chooses the best decision at each point in time: which match to investigate (i.e., find its ancestors and look for most recent common ancestors between the match and the target), which set of potential most recent common ancestors to descend from (i.e., find its descendants, with the goal of identifying a marriage between the maternal and paternal sides of the target's family tree), or whether to terminate the investigation. The objective is to maximize the probability of finding the target minus a cost associated with the expected size of the final family tree. We estimate the parameters of our model using data from 17 cases (eight solved, nine unsolved) from the DNA Doe Project. We assess the Proposed Strategy using simulated versions of the 17 DNA Doe Project cases, and compare it to a Benchmark Strategy that ranks matches by their genetic distance to the target and only descends from known common ancestors between a pair of matches. The Proposed Strategy solves cases approximately 10-fold faster than the Benchmark Strategy, and does so by aggressively descending from a set of potential most recent common ancestors between the target and a match even when this set has a low probability of containing the correct most recent common ancestor. Our analysis provides a mathematical foundation for improving the genealogy process in forensic genetic genealogy.
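The objective described above (probability of finding the target minus a cost tied to the expected size of the final family tree) can be illustrated with a minimal one-step sketch. It is not the stochastic dynamic program itself. The `Action` class, the linear cost term, `cost_per_person`, and every number below are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A candidate decision with assumed (made-up) outcome estimates."""
    name: str
    p_find_target: float       # assumed probability of eventually finding the target
    expected_tree_size: float  # assumed expected size of the final family tree


def objective(action: Action, cost_per_person: float) -> float:
    """Success probability minus a cost assumed linear in expected tree size."""
    return action.p_find_target - cost_per_person * action.expected_tree_size


def best_action(actions: list[Action], cost_per_person: float) -> Action:
    """Greedy one-step choice; a full dynamic program would look ahead over future states."""
    return max(actions, key=lambda a: objective(a, cost_per_person))


# Hypothetical numbers, not taken from the DNA Doe Project cases.
candidates = [
    Action("investigate_match", 0.30, 200.0),
    Action("descend_from_potential_mrcas", 0.45, 400.0),
    Action("terminate", 0.10, 50.0),
]
print(best_action(candidates, cost_per_person=0.0005).name)
print(best_action(candidates, cost_per_person=0.002).name)
```

With these placeholder values, a low cost per person favors descending from potential most recent common ancestors, while a high cost favors terminating, which mirrors the trade-off the objective encodes.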
Project description:Cerebral cavernous malformations (CCM) are vascular malformations consisting of collections of enlarged capillaries occurring in the brain or spinal cord. These malformations can occur sporadically, or susceptibility to developing them can be inherited as an autosomal dominant trait due to a mutation in one of three genes. Over a decade ago, we described a 77.6 Kb germline deletion spanning exons 2-10 of the CCM2 gene, found in multiple affected individuals from seemingly unrelated families. Segregation analysis using linked microsatellite markers indicated that this deletion may have arisen at least twice independently. In the years since, many more CCM patients have been identified with this deletion. In the present study, we examined 27 reportedly unrelated affected individuals with this deletion. To investigate the origin of the deletion at base-pair resolution, we sequenced approximately 10 Kb upstream and downstream from the recombination junction on the deleted allele. All patients showed the identical SNP haplotype across this combined 20 Kb interval. In parallel, genealogical records have traced 11 of these individuals to five separate pedigrees dating as far back as the 1600s-1700s. These haplotype and genealogical data suggest that these families and the remaining "unrelated" samples converge on a common ancestor due to a founder mutation occurring centuries ago on the North American continent. We also note that another gene, NACAD, is included in this deletion. Although patient self-reporting does not indicate an apparent phenotypic consequence of heterozygous deletion of NACAD, further investigation is warranted for these patients.
Project description:Background: Yunnan is located in Southwest China and has great cultural, linguistic, and genetic diversity. However, the genomic diversity of ethnic minorities in Yunnan is largely under-investigated. To gain insights into the population history and local adaptation of Yunnan minorities, we analyzed 242 high-coverage (~100-150×) whole-exome sequences of Yunnan minorities representing Achang, Jingpo, Dai, and Deang, who were linguistically assumed to be derived from three ancient lineages (the tri-genealogy hypothesis), i.e., Di-Qiang, Bai-Yue, and Bai-Pu. Results: Yunnan minorities show considerable genetic differences. Di-Qiang populations likely migrated from the Tibetan area about 6700 years ago. Genetic divergence between Bai-Yue and Di-Qiang was estimated at 7000 years ago, and that between Bai-Yue and Bai-Pu at 5500 years ago. Bai-Pu is relatively isolated, but gene flow from surrounding Di-Qiang and Bai-Yue populations was also found. Furthermore, we identified genetic variants that are differentiated within Yunnan minorities, possibly due to living circumstances and habits. Notably, we found that adaptive variants related to malaria and glucose metabolism suggest adaptation to thalassemia and G6PD deficiency resulting from malaria resistance in the Dai population. Conclusions: We provided genetic evidence for the tri-genealogy hypothesis as well as new insights into the genetic history and local adaptation of the Yunnan minorities.
Project description:In 2022, the National Technology Validation and Implementation Collaborative (NTVIC) was established. Its mission is to collaborate across the US on validation, method development, and implementation. The NTVIC is comprised of 13 federal, state and local government crime laboratory leaders, joined by university researchers, and private technology and research companies. One of the NTVIC's first initiatives was to generate this draft policy document. This document provides guidelines and considerations for crime laboratories and investigative agencies exploring the establishment of a forensic investigative genetic genealogy (FIGG) program. While each jurisdiction is responsible for its own program policy, sharing minimum standards and best practices to optimize resources, promote technology implementation and elevate quality is a goal of the NTVIC.
Project description:Downy mildew caused by Plasmopara viticola is one of the most devastating diseases of grapevines worldwide. So far, the genetic diversity and origin of the Chinese P. viticola population are unclear. In the present study, 103 P. viticola isolates were sequenced at four gene regions: internal transcribed spacer one (ITS), large subunit of ribosomal RNA (LSU), actin gene (ACT) and beta-tubulin (TUB). The sequences were analyzed to obtain polymorphism and diversity information of the Chinese population as well as to infer the relationships between Chinese and American isolates. High genetic diversity was observed for the Chinese population, with evidence of sub-structuring based on climate. Phylogenetic analysis and haplotype networks showed evidence of close relationships between some American and Chinese isolates, consistent with recent introduction from America to China via planting materials. However, there is also evidence for endemic Chinese P. viticola isolates. Our results suggest that the current Chinese Plasmopara viticola population is an admixture of endemic and introduced isolates.
Project description:Background: Forensic investigative genetic genealogy (FIGG) has developed rapidly in recent years and is considered a novel tool for crime investigation. However, crime scene samples are often of low quality and quantity and are challenging to analyze. Deciding which approach should be used for kinship inference in forensic practice remains a difficult problem for investigators. Methods: In this study, we selected four popular approaches (KING, IBS, TRUFFLE, and GERMLINE), comprising one method-of-moments (MoM) estimator and three identical-by-descent (IBD) segment-based tools, and compared their performance at varying numbers of SNPs and levels of genotyping error using both simulated and real family data. We also explored the possibility of making robust kinship inferences for samples with ultra-high genotyping errors by integrating the MoM and IBD segment-based methods. Results: Decreasing the number of SNPs had little effect on kinship inference for all four approaches when no fewer than 164 K SNPs were used. However, as the number decreased further, decreased efficiency was observed for the three IBD segment-based methods. Genotyping errors also had a significant effect on their kinship inference, especially when the error rate exceeded 1%. In contrast, MoM was much more robust to genotyping errors. Furthermore, the combination of the MoM and the IBD segment-based methods showed a higher overall accuracy, indicating its potential to improve the tolerance to genotyping errors. Conclusions: This study shows that different approaches have unique characteristics and should be selected for different scenarios. More importantly, the integration of the MoM and the IBD segment-based methods can improve the robustness of kinship inference and has great potential for applications in forensic practice.
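To make the idea of a method-of-moments kinship estimator concrete, here is a minimal sketch that assumes an estimator of the KING-robust type, computed from counts of shared heterozygous and opposite-homozygous SNPs. The function name, the 0/1/2 genotype coding, and the use of `None` for missing calls are assumptions made for illustration; the settings actually used in the study are not given here.

```python
def king_robust_kinship(g1, g2):
    """Kinship estimate from genotypes coded 0/1/2 (alt-allele counts), None = missing.

    Assumed form (KING-robust style):
      (N_AaAa - 2 * N_AAaa) / (2 * m) + 1/2 - (N_Aa_1 + N_Aa_2) / (4 * m),
    where m is the smaller of the two heterozygote counts.
    """
    n_het_both = 0  # SNPs where both samples are heterozygous
    n_opp_hom = 0   # SNPs where the samples are opposite homozygotes
    n_het_1 = 0     # heterozygous SNPs in sample 1
    n_het_2 = 0     # heterozygous SNPs in sample 2
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs with a missing call in either sample
        if a == 1:
            n_het_1 += 1
        if b == 1:
            n_het_2 += 1
        if a == 1 and b == 1:
            n_het_both += 1
        if {a, b} == {0, 2}:
            n_opp_hom += 1
    smaller = min(n_het_1, n_het_2)
    if smaller == 0:
        raise ValueError("no heterozygous SNPs in at least one sample")
    return ((n_het_both - 2 * n_opp_hom) / (2 * smaller)
            + 0.5
            - (n_het_1 + n_het_2) / (4 * smaller))


# Toy genotype vectors (made up): a sample compared with itself.
sample = [0, 1, 2, 1, 1, 0]
print(king_robust_kinship(sample, sample))
```

Because the estimate depends only on genome-wide counts rather than on contiguous shared segments, isolated genotyping errors shift it gradually, which is one intuition for why a moment-based estimator can tolerate errors that break up IBD segments.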
Project description:In this paper, we present the academic genealogy of presidents of the Psychometric Society by constructing a genealogical tree, in which Ph.D. students are encoded as descendants of their advisors. Results show that most of the presidents belong to five distinct lineages that can be traced to Wilhelm Wundt, James Angell, William James, Albert Michotte or Carl Friedrich Gauss. Important psychometricians Lee Cronbach and Charles Spearman play only a marginal role. The genealogy systematizes important historical knowledge that can be used to inform studies on the history of psychometrics and exposes the rich and multidisciplinary background of the Psychometric Society.
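The encoding described above, with Ph.D. students as descendants of their advisors, can be sketched as a simple parent map that is walked upward to find the root of each lineage. The names below are placeholders rather than real advisor relationships, and the single-advisor assumption is a simplification for illustration.

```python
def trace_lineage(scholar, advisor_of):
    """Return the chain [scholar, advisor, advisor's advisor, ...] up to the root."""
    lineage = [scholar]
    seen = {scholar}
    while lineage[-1] in advisor_of:
        advisor = advisor_of[lineage[-1]]
        if advisor in seen:
            raise ValueError("cycle detected in advisor records")
        lineage.append(advisor)
        seen.add(advisor)
    return lineage


def group_by_founder(scholars, advisor_of):
    """Group scholars by the root ancestor of their academic lineage."""
    groups = {}
    for scholar in scholars:
        founder = trace_lineage(scholar, advisor_of)[-1]
        groups.setdefault(founder, []).append(scholar)
    return groups


# Placeholder records (hypothetical): student -> advisor.
advisor_of = {
    "president_a": "scholar_x",
    "scholar_x": "founder_1",
    "president_b": "founder_1",
    "president_c": "founder_2",
}
print(trace_lineage("president_a", advisor_of))
print(group_by_founder(["president_a", "president_b", "president_c"], advisor_of))
```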
Project description:Genome-wide association studies (GWASs) have identified numerous loci that influence risk for psychiatric diseases. Genetically engineered mice are often used to characterize genes implicated by GWASs. These studies are based on the assumption that observed genotype-phenotype relationships will generalize to humans, implying that the results would at least generalize to other inbred mouse strains. Given current concerns about reproducibility, we sought to directly test this assumption. We produced F1 crosses between male C57BL/6J mice heterozygous for null alleles of Cacna1c and Tcf7l2 and wild-type females from 30 inbred laboratory strains. We found extremely strong interactions with genetic background that sometimes supported diametrically opposing conclusions. These results do not negate the invaluable contributions of mouse genetics to biomedical science, but they do show that genotype-phenotype relationships cannot be reliably inferred by studying a single genetic background, and thus constitute a major challenge to the status quo.
Project description:Next-generation sequencing (NGS), also known as massively parallel sequencing, enables the large, dense SNP panel analyses that generate the genetic component of forensic investigative genetic genealogy (FIGG). While the costs of implementing large SNP panel analyses in the laboratory system may seem high and daunting, the benefits of the technology may more than justify the investment. To determine whether an infrastructural investment in public laboratories and the use of large SNP panel analyses would reap substantial benefits for society, a cost-benefit analysis (CBA) was performed. This CBA applied the following logic: more DNA profile uploads to a DNA database (due to the sheer increase in the number of markers and the greater sensitivity of detection afforded by NGS), together with a higher hit/association rate (due to large SNP/kinship resolution and genealogy), will increase investigative leads, will be more effective for identifying recidivists, which in turn reduces future victims of crime, and will bring greater safety and security to communities. Analyses were performed for worst-case and best-case scenarios, as well as by simulation, sampling the range spaces with multiple input values simultaneously to generate best-estimate summary statistics. This study shows that the benefits, both tangible and intangible, over the lifetime of an advanced database system would be huge: an investment of less than $1 billion per year (over a 10-year period) is projected to reap, on average, > $4.8 billion in tangible and intangible benefits per year. More importantly, on average > 50,000 individuals need not become victims if FIGG were employed, assuming the investigative associations generated were acted upon. The benefit to society is immense, making the laboratory investment a nominal cost. The benefits are likely underestimated herein. There is latitude in the estimated costs, and even if they were doubled or tripled, there would still be substantial benefits gained with a FIGG-based approach. While the data used in this CBA are US-centric (primarily because data were readily accessible), the model is generalizable and could be used by other jurisdictions to perform relevant and representative CBAs.
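The combination of worst-case/best-case bounds with simulation over input ranges can be sketched as follows. This is only an illustration of the general technique: the uniform distributions, the function name `simulate_net_benefit`, and the input ranges are hypothetical placeholders, not the study's inputs or results.

```python
import random
import statistics


def simulate_net_benefit(n_draws, cost_range, benefit_range, seed=0):
    """Sample cost and benefit independently and uniformly; return net benefits."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = []
    for _ in range(n_draws):
        cost = rng.uniform(*cost_range)
        benefit = rng.uniform(*benefit_range)
        draws.append(benefit - cost)
    return draws


# Hypothetical ranges in billions of USD per year (placeholders only).
COST_RANGE = (0.5, 1.0)
BENEFIT_RANGE = (3.0, 6.0)

net = simulate_net_benefit(10_000, COST_RANGE, BENEFIT_RANGE)
summary = {
    "worst_case": BENEFIT_RANGE[0] - COST_RANGE[1],  # lowest benefit, highest cost
    "best_case": BENEFIT_RANGE[1] - COST_RANGE[0],   # highest benefit, lowest cost
    "mean": statistics.mean(net),
    "median": statistics.median(net),
}
print(summary)
```

Doubling or tripling `COST_RANGE` and rerunning is a quick way to probe how sensitive the net benefit is to cost assumptions, in the spirit of the latitude discussed above.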