Project description:Mendelian randomization (MR) is a method for estimating the causal relationship between an exposure and an outcome using a genetic factor as an instrumental variable (IV) for the exposure. In the traditional MR setting, data on the IV, exposure, and outcome are available for all participants. However, obtaining complete exposure data may be difficult in some settings, due to high measurement costs or lack of appropriate biospecimens. We used simulated data sets to assess statistical power and bias for MR when exposure data are available for a subset (or an independent set) of participants. We show that obtaining exposure data for a subset of participants is a cost-efficient strategy, often having negligible effects on power in comparison with a traditional complete-data analysis. The size of the subset needed to achieve maximum power depends on IV strength, and maximum power is approximately equal to the power of traditional IV estimators. Weak IVs are shown to lead to bias towards the null when the subsample is small and towards the confounded association when the subset is relatively large. Various approaches for confidence interval calculation are considered. These results have important implications for reducing the costs and increasing the feasibility of MR studies.
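The subset design described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' simulation code: the function names, sample sizes, allele frequency, and effect sizes are all assumptions chosen for illustration. The gene-exposure association is estimated in the subset with exposure data, the gene-outcome association in all participants, and their ratio gives the causal estimate.

```python
import random


def ols_slope(a, b):
    """Slope from a simple least-squares regression of b on a."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    return sab / saa


def simulate_subsample_mr(n=20000, n_subset=5000, beta=0.3, maf=0.3, seed=1):
    """Ratio (Wald) MR estimate with exposure measured only in a subset.

    All parameter values are illustrative assumptions.
    Returns (iv_estimate, confounded_ols_estimate).
    """
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        gi = (rng.random() < maf) + (rng.random() < maf)  # 0, 1 or 2 alleles
        u = rng.gauss(0, 1)                    # unmeasured confounder
        xi = 1.0 * gi + u + rng.gauss(0, 1)    # exposure
        yi = beta * xi + u + rng.gauss(0, 1)   # outcome, true effect = beta
        g.append(gi)
        x.append(xi)
        y.append(yi)

    # Exposure data exist only for the first n_subset participants.
    gene_exposure = ols_slope(g[:n_subset], x[:n_subset])
    # Genotype and outcome are available for everyone.
    gene_outcome = ols_slope(g, y)
    iv_estimate = gene_outcome / gene_exposure

    # Naive regression of outcome on exposure, biased by the confounder.
    confounded = ols_slope(x[:n_subset], y[:n_subset])
    return iv_estimate, confounded


iv_est, ols_est = simulate_subsample_mr()
print(f"IV estimate: {iv_est:.3f}, confounded OLS estimate: {ols_est:.3f}")
```

With a strong instrument such as the one assumed here, the ratio estimate should lie near the true effect while the naive regression is pulled towards the confounded association. Shrinking `n_subset` or weakening the instrument is one way to explore the weak-IV behaviour the abstract describes.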
Project description:Background: With genome-wide association data for many exposures and outcomes now available from large biobanks, one-sample Mendelian randomization (MR) is increasingly used to investigate causal relationships. Many robust MR methods are available to address pleiotropy, but these assume independence between the gene-exposure and gene-outcome association estimates. Unlike in two-sample MR, in one-sample MR the two estimates are obtained from the same individuals, and the assumption of independence does not hold in the presence of confounding. Methods: With simulations mimicking a typical study in UK Biobank, we assessed the performance, in terms of bias and precision of the MR estimate, of the fixed-effect and (multiplicative) random-effects meta-analysis methods, the weighted median estimator, the weighted mode estimator and MR-Egger regression, used in both one-sample and two-sample data. We considered scenarios differing by the presence or absence of a true causal effect; the amount of confounding; and the presence and type of pleiotropy (none, balanced or directional). Results: Even in the presence of substantial correlation due to confounding, all two-sample methods used in one-sample MR performed similarly to when used in two-sample MR, except for MR-Egger, which resulted in bias reflecting the direction and magnitude of the confounding. Such bias was much reduced in the presence of very high variability in instrument strength across variants (I²_GX of 97%). Conclusions: Two-sample MR methods can be safely used for one-sample MR performed within large biobanks, except for MR-Egger. MR-Egger is not recommended for one-sample MR unless the correlation between the gene-exposure and gene-outcome estimates due to confounding can be kept low, or the variability in instrument strength is very high.
Project description:Large-scale plasma proteomics studies have been transformed by the multiplexing and automation of sample preparation workflows. However, these workflows can suffer from reproducibility issues, a lack of standardized quality control (QC) metrics, and limited assessment of variation before liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis. Incorporating robust QC metrics into sample preparation workflows supports better reproducibility, lower assay variation, and better-informed troubleshooting decisions. Our laboratory conducted a plasma proteomics study of a cohort of patient samples (N = 808) using tandem mass tag (TMT) 16-plex batches (N = 58). The proteomic workflow consisted of protein depletion, protein digestion, TMT labeling, and fractionation. Five QC sample types (QCstd, QCdig, QCpool, QCTMT, and QCBSA) were created to measure the performance of sample preparation prior to the final LC-MS/MS analysis. Based on data from these QC samples, we measured a coefficient of variation (CV) of <10% for individual sample preparation steps in the proteomic workflow. Establishing robust QC measures for the sample preparation steps allowed for greater confidence in the prepared samples for subsequent LC-MS/MS analysis. This study also provides recommendations for standardized QC metrics that can assist with future large-scale cohort sample preparation workflows.
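The <10% CV criterion mentioned above can be checked with a few lines of code. This is a minimal sketch: the function name and the measurement values are hypothetical and are not data from the study.

```python
import statistics


def percent_cv(values):
    """Coefficient of variation in percent: sample SD / mean * 100."""
    return statistics.stdev(values) / statistics.fmean(values) * 100.0


# Hypothetical readouts (arbitrary units) for one QC sample type across batches.
qc_readouts = [98.0, 102.0, 100.0, 101.0, 99.0]

cv = percent_cv(qc_readouts)
passes = cv < 10.0  # threshold taken from the <10% CV figure in the text
print(f"CV = {cv:.2f}%, within threshold: {passes}")
```

The sample standard deviation (n - 1 denominator) is used here. Whether the study used the sample or population form is not stated, so that choice is an assumption.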
Project description:Although numerous observational studies have reported an association between alcohol consumption and cancer, few studies have assessed causality. Our study evaluated the causal relationship between alcohol consumption, measured as drinking frequency and amount of alcohol consumed, and various types of cancer. The research data were obtained from the publicly available MR-Base platform. The frequency and amount of drinking were selected as the exposures, and 16 cancer types were selected as the outcomes. Two-sample summary data Mendelian randomization (2SMR) was conducted to examine the causality between alcohol consumption and cancer type. Additionally, for cancers with suspected pleiotropy, outliers were removed and the data were re-analyzed using radial MR. The MR results using the inverse variance weighted (IVW) method differed before and after removing outliers. The biggest differences were found for esophageal cancer and biliary tract cancer. For esophageal cancer, after removing outliers (rs13102973, rs540606, rs650558), the OR (95% CI) was 3.44 (1.19-9.89), which was statistically significant (p = 0.02172). For biliary tract cancer, after removing outliers (rs13231886, rs58905411), the OR (95% CI) was 3.86 (0.89-16.859), which was of borderline statistical significance (p = 0.07223). The strongest association was found for esophageal cancer. For other cancers, the evidence was not sufficient to draw conclusions. More research is needed to understand the causality between drinking and cancer.
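As a rough consistency check on the figures above, an approximate p-value can be recovered from a reported OR and its 95% CI. The sketch below assumes the interval is symmetric on the log scale and based on a normal approximation. The function name is hypothetical and this is not the authors' computation.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided p-value from an OR and its 95% CI.

    Assumes a Wald-type interval: log(OR) +/- z_crit * SE.
    Returns (standard_error, z_statistic, p_value).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return se, z, p


# Values quoted in the abstract after outlier removal.
for label, (or_, lo, hi) in {
    "esophageal cancer": (3.44, 1.19, 9.89),
    "biliary tract cancer": (3.86, 0.89, 16.859),
}.items():
    se, z, p = p_from_or_ci(or_, lo, hi)
    print(f"{label}: SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the recovered p-values come out at roughly 0.02 and 0.07, in line with the reported values. Small differences are expected from rounding of the published OR and CI.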
Project description:The incidences of periodontitis and osteoporosis are rising worldwide. Observational studies have shown that periodontitis is associated with an increased risk of osteoporosis. We performed a Mendelian randomization (MR) study to investigate the causal effect of periodontitis on osteoporosis. A total of 9 single nucleotide polymorphisms (SNPs) related to periodontitis were included. The primary approach in this MR analysis was the inverse variance-weighted (IVW) method. Simple median, weighted median, and penalized weighted median estimators were used as sensitivity analyses. The fixed-effect IVW model and random-effect IVW model showed no significant causal effect of genetically predicted periodontitis on the risk of osteoporosis (OR=1.032; 95%CI: 0.923-1.153; P=0.574; OR=1.032; 95%CI: 0.920-1.158; P=0.588, respectively). Similar results were observed with the simple mode (OR=1.031; 95%CI: 0.780-1.361, P=0.835), weighted mode (OR=1.120; 95%CI: 0.944-1.328, P=0.229), simple median (OR=1.003; 95%CI: 0.839-1.197, P=0.977), weighted median (OR=1.078; 95%CI: 0.921-1.262, P=0.346), penalized weighted median (OR=1.078; 95%CI: 0.919-1.264, P=0.351), and MR-Egger methods (OR=1.360; 95%CI: 0.998-1.853, P=0.092). There was no significant heterogeneity in the IVW and MR-Egger analyses (Q=7.454, P=0.489 and Q=3.901, P=0.791, respectively). MR-Egger regression revealed no evidence of a pleiotropic influence through genetic variants (intercept: -0.004; P=0.101). The leave-one-out sensitivity analysis indicated that no individual SNP drove the association between periodontitis and osteoporosis. The Mendelian randomization analysis did not show a significant detrimental effect of periodontitis on the risk of osteoporosis.
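The fixed-effect IVW estimate and Cochran's Q reported above follow a standard recipe that can be sketched from per-SNP summary statistics. This is a minimal illustration using first-order weights. The function name and the summary statistics are hypothetical and are not the 9 SNPs used in the study.

```python
import math


def ivw_fixed(beta_x, beta_y, se_y):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    beta_x: SNP-exposure effects; beta_y: SNP-outcome effects (log odds);
    se_y: standard errors of beta_y. Uses first-order weights bx^2 / se_y^2.
    Returns (estimate, standard_error, cochran_q).
    """
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    weights = [(bx / sy) ** 2 for bx, sy in zip(beta_x, se_y)]
    total_w = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_w
    se = math.sqrt(1.0 / total_w)
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return estimate, se, q


# Hypothetical summary statistics for three SNPs (illustration only).
bx = [0.10, 0.08, 0.12]
by = [0.004, 0.002, 0.005]
sy = [0.010, 0.010, 0.010]

est, se, q = ivw_fixed(bx, by, sy)
odds_ratio = math.exp(est)
ci = (math.exp(est - 1.96 * se), math.exp(est + 1.96 * se))
print(f"OR = {odds_ratio:.3f}, 95% CI = {ci[0]:.3f}-{ci[1]:.3f}, Q = {q:.3f}")
```

Q is compared against a chi-squared distribution with (number of SNPs - 1) degrees of freedom. A multiplicative random-effects model keeps the same point estimate and inflates the standard error when Q exceeds its degrees of freedom.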
Project description:Quality control (QC) is a critical step in large-scale studies of genetic variation. While, on average, high-throughput single nucleotide polymorphism (SNP) genotyping assays are now very accurate, the errors that remain tend to cluster into a small percentage of "problem" SNPs, which exhibit unusually high error rates. Because most large-scale studies of genetic variation are searching for phenomena that are rare (e.g., SNPs associated with a phenotype), even this small percentage of problem SNPs can cause important practical problems. Here we describe and illustrate how patterns of linkage disequilibrium (LD) can be used to improve QC in large-scale, population-based studies. This approach has the advantage over existing filters (e.g., HWE or call rate) that it can actually reduce genotyping error rates by automatically correcting some genotyping errors. Applying this LD-based QC procedure to data from The International HapMap Project, we identify over 1,500 SNPs that likely have high error rates in the CHB and JPT samples and estimate corrected genotypes. Our method is implemented in the software package fastPHASE, available from the Stephens Lab website (http://stephenslab.uchicago.edu/software.html).
Project description:Background: To evaluate the causal relationship between lipoprotein(a) [Lp(a)] and stroke risk. Method: Using two large-scale genome-wide association study (GWAS) databases, instrumental variables were selected on the basis that the genetic loci were independent of each other and closely related to Lp(a). Summary-level data for the outcomes, ischemic stroke and its subtypes, were acquired from the UK Biobank and MEGASTROKE consortium databases. Two-sample MR analyses were performed using inverse variance-weighted (IVW) meta-analysis (primary analysis), weighted median analysis, and the MR-Egger regression method. Multivariable-adjusted Cox regression models were also used for observational analysis. Result: Genetically predicted Lp(a) was marginally associated with higher odds of total stroke (odds ratio (OR) [95% confidence interval (CI)]: 1.003 [1.001-1.006], p = 0.010), ischemic stroke (OR [95% CI]: 1.004 [1.001-1.007], p = 0.004), and large-artery atherosclerotic stroke (OR [95% CI]: 1.012 [1.004-1.019], p = 0.002) when the IVW estimator was used on the MEGASTROKE data. The associations of Lp(a) with stroke and ischemic stroke were also notable in the primary analysis using the UK Biobank data. Higher Lp(a) levels were also associated with increased total stroke and ischemic stroke risk in the observational analysis of the UK Biobank data. Conclusion: Genetically predicted higher Lp(a) may increase the risk of total stroke, ischemic stroke, and large-artery atherosclerotic stroke.
Project description:Background: Thyroid hormones (THs) play a crucial role in regulating various biological processes, particularly the normal development and functioning of the central nervous system (CNS). Epilepsy is a prevalent neurological disorder with multiple etiologies. Further in-depth research on the role of thyroid hormones in epilepsy is warranted. Methods: Genome-wide association study (GWAS) data for thyroid function and epilepsy were obtained from the ThyroidOmics Consortium and the International League Against Epilepsy (ILAE) Consortium cohort, respectively. A total of five indicators of thyroid function and ten types of epilepsy were included in the analysis. Two-sample Mendelian randomization (MR) analyses were conducted to investigate potential causal relations between thyroid function and the various epilepsies. Multiple testing was addressed using Bonferroni correction. Heterogeneity was assessed with Cochran's Q test. Horizontal pleiotropy was evaluated by the MR-Egger regression intercept. Sensitivity was also examined with a leave-one-out strategy. Results: The findings indicated the absence of any causal relationship between abnormalities in thyroid hormone and various types of epilepsy. Five associations reached nominal significance before correction: free thyroxine (FT4) on focal epilepsy with hippocampal sclerosis (IVW, OR = 0.9838, p = 0.02223), hyperthyroidism on juvenile absence epilepsy (IVW, OR = 0.9952, p = 0.03777), hypothyroidism on focal epilepsy with hippocampal sclerosis (IVW, OR = 1.0075, p = 0.01951), and autoimmune thyroid diseases (AITDs) on generalized epilepsy in all documented cases (weighted mode, OR = 1.0846, p = 0.0346) and on childhood absence epilepsy (IVW, OR = 1.0050, p = 0.04555). After Bonferroni correction, none of these results remained statistically significant. Conclusion: This study indicates that there is no causal relationship between thyroid-related disorders and various types of epilepsy. Future research should aim to avoid potential confounding factors that might impact the study.
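The effect of the Bonferroni correction on the five nominal findings can be worked through directly. This sketch assumes one test per pairing of the five thyroid indicators with the ten epilepsy types (50 tests) and a family-wise alpha of 0.05. The abstract does not state the exact number of tests or the threshold used, so both are assumptions.

```python
# Assumed test count: 5 thyroid indicators x 10 epilepsy types.
n_tests = 5 * 10
alpha = 0.05
threshold = alpha / n_tests  # Bonferroni-adjusted significance level

# Nominal p-values quoted in the abstract.
nominal_p = {
    "FT4 -> focal epilepsy with hippocampal sclerosis": 0.02223,
    "hyperthyroidism -> juvenile absence epilepsy": 0.03777,
    "hypothyroidism -> focal epilepsy with hippocampal sclerosis": 0.01951,
    "AITDs -> generalized epilepsy": 0.0346,
    "AITDs -> childhood absence epilepsy": 0.04555,
}

survivors = {name: p for name, p in nominal_p.items() if p < threshold}
print(f"Adjusted threshold: {threshold:.4f}")
print(f"Associations surviving correction: {len(survivors)}")
```

Under this assumed threshold of 0.001, none of the quoted p-values survives correction, which is consistent with the conclusion reported in the abstract.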
Project description:Mendelian randomization analyses are often performed using summarized data. The causal estimate from a one-sample analysis (in which data are taken from a single data source) with weak instrumental variables is biased in the direction of the observational association between the risk factor and outcome, whereas the estimate from a two-sample analysis (in which data on the risk factor and outcome are taken from non-overlapping datasets) is less biased and any bias is in the direction of the null. When using genetic consortia that have partially overlapping sets of participants, the direction and extent of bias are uncertain. In this paper, we perform simulation studies to investigate the magnitude of bias and Type 1 error rate inflation arising from sample overlap. We consider both a continuous outcome and a case-control setting with a binary outcome. For a continuous outcome, bias due to sample overlap is a linear function of the proportion of overlap between the samples. So, in the case of a null causal effect, if the relative bias of the one-sample instrumental variable estimate is 10% (corresponding to an F parameter of 10), then the relative bias with 50% sample overlap is 5%, and with 30% sample overlap is 3%. In a case-control setting, if risk factor measurements are only included for the control participants, unbiased estimates are obtained even in a one-sample setting. However, if risk factor data on both control and case participants are used, then bias is similar with a binary outcome as with a continuous outcome. Consortia releasing publicly available data on the associations of genetic variants with continuous risk factors should provide estimates that exclude case participants from case-control samples.
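The linear relationship between bias and sample overlap described above for a continuous outcome can be written as a one-line calculation. This is a sketch of the approximation implied by the abstract's own example, in which the one-sample relative bias is taken as 1/F. The function name is hypothetical.

```python
def relative_bias(f_parameter, overlap):
    """Approximate relative bias of the IV estimate under sample overlap.

    Takes the one-sample relative bias as 1 / F and scales it linearly by
    the proportion of overlap (0 = two-sample, 1 = one-sample).
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be a proportion between 0 and 1")
    if f_parameter <= 0:
        raise ValueError("F parameter must be positive")
    return overlap / f_parameter


# The example from the text: F = 10 with full, 50% and 30% overlap.
for overlap in (1.0, 0.5, 0.3):
    print(f"overlap {overlap:.0%}: relative bias {relative_bias(10, overlap):.0%}")
```

This reproduces the 10%, 5% and 3% figures quoted in the abstract. It applies to the continuous-outcome setting only and does not cover the case-control scenario in which risk factor measurements are restricted to controls.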
Project description:Background: Phosphodiesterases (PDEs) have been associated with psychiatric disorders in observational studies; however, the causality of these associations remains unestablished. Methods: Data on cyclic nucleotide PDEs were collected from genome-wide association studies (GWASs), including PDEs that hydrolyze both cyclic adenosine monophosphate (cAMP) and cyclic guanosine monophosphate (cGMP) (PDE1A, PDE2A, and PDE3A), PDEs specific to cGMP (PDE5A, PDE6D, and PDE9A), and PDEs specific to cAMP (PDE4D and PDE7A). We performed a bidirectional two-sample Mendelian randomization (MR) analysis to investigate the relationship between PDEs and nine psychiatric disorders. The inverse-variance-weighted (IVW) method, MR-Egger, and weighted median were used to estimate causal effects. Cochran's Q test, the MR-Egger intercept test, the MR Steiger test, leave-one-out analyses, funnel plots, and MR pleiotropy residual sum and outlier (MR-PRESSO) were used for sensitivity analyses. Results: The PDEs specific to cAMP were associated with higher odds of psychiatric disorders. For example, PDE4D was associated with schizophrenia (SCZ) (odds ratio (OR) = 1.0531, P_IVW = 0.0414) and with major depressive disorder (MDD) (OR = 1.0329, P_IVW = 0.0011). Similarly, PDE7A was associated with higher odds of attention-deficit/hyperactivity disorder (ADHD) (OR = 1.0861, P_IVW = 0.0038). Exploring specific PDE subtypes and increased intracellular cAMP levels can inform the development of targeted interventions. We also observed that PDEs that hydrolyze both cAMP and cGMP were associated with psychiatric disorders [OR of PDE1A was 1.0836 for autism spectrum disorder; OR of PDE2A was 0.8968 for Tourette syndrome (TS) and 0.9449 for SCZ; and OR of PDE3A was 0.9796 for MDD; P < 0.05]. Furthermore, psychiatric disorders also had some causal effects on PDEs [obsessive-compulsive disorder on increased PDE6D and decreased PDE2A and PDE4D; anorexia nervosa on decreased PDE9A]. The MR results were robust across multiple sensitivity analyses. Conclusions: In this study, potential causal relationships between plasma PDE proteins and psychiatric disorders were established. Exploring other PDE subtypes not included in this study could provide a more comprehensive understanding of the role of PDEs in psychiatric disorders. The development of specific medications targeting PDE subtypes may be a promising therapeutic approach for treating psychiatric disorders.