Project description: Imputed HLA alleles and variation. Imputation was carried out using the Multi-Ethnic HLA reference panel (version 1.0, 2021) available on the Michigan Imputation Server.
Project description: Low-pass sequencing (sequencing a genome to an average depth less than 1× coverage) combined with genotype imputation has been proposed as an alternative to genotyping arrays for trait mapping and calculation of polygenic scores. To empirically assess the relative performance of these technologies for different applications, we performed low-pass sequencing (targeting coverage levels of 0.5× and 1×) and array genotyping (using the Illumina Global Screening Array (GSA)) on 120 DNA samples derived from African- and European-ancestry individuals that are part of the 1000 Genomes Project. We then imputed both the sequencing data and the genotyping array data to the 1000 Genomes Phase 3 haplotype reference panel using a leave-one-out design. We evaluated overall imputation accuracy from these different assays as well as overall power for GWAS from imputed data, and computed polygenic risk scores for coronary artery disease and breast cancer using previously derived weights. We conclude that low-pass sequencing plus imputation, in addition to providing a substantial increase in statistical power for genome-wide association studies, provides increased accuracy for polygenic risk prediction at effective coverages of ∼0.5× and higher compared to the Illumina GSA.
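As a rough illustration of the polygenic score step mentioned above, the sketch below computes a score as a weighted sum of imputed effect-allele dosages. It is not the authors' pipeline: the function name `polygenic_score`, the variant IDs, and all dosage and weight values are hypothetical, and variants missing from either input are simply skipped.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages.

    dosages: variant ID -> imputed dosage in [0, 2] (hypothetical values)
    weights: variant ID -> previously derived effect weight (hypothetical values)
    Only variants present in both mappings contribute to the score.
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


# Toy example with made-up variants; rs3 has no weight and rs4 has no dosage.
dosages = {"rs1": 1.0, "rs2": 2.0, "rs3": 0.5}
weights = {"rs1": 0.5, "rs2": 0.25, "rs4": 0.75}
score = polygenic_score(dosages, weights)  # 0.5 * 1.0 + 0.25 * 2.0
```

Because imputation yields fractional dosages rather than hard genotype calls, the same function applies to array-imputed and sequence-imputed data alike.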
Project description: Missing values in proteomic data sets have real consequences on downstream data analysis and reproducibility. Although several imputation methods exist to handle missing values, no single imputation method is best suited for a diverse range of data sets, and no clear strategy exists for evaluating imputation methods for large-scale DIA-MS data sets, especially at different levels of protein quantification. To navigate through the different imputation strategies available in the literature, we have established a workflow to assess imputation methods on large-scale label-free DIA-MS data sets. We used three DIA-MS data sets with real missing values to evaluate eight different imputation methods with multiple parameters at different levels of protein quantification: a dilution series data set, a small pilot data set, and a larger proteomic data set of clinical ovarian cancer patient samples.
Project description: Genes & Health is a cohort of British Bangladeshi and British Pakistani individuals. In the Feb2020 data freeze, 28,022 individuals were genotyped on the Illumina Infinium Global Screening Array v3 chip (with the additional Multi-Disease variants). Imputation was performed using the Michigan Imputation Server with the GenomeAsia pilot reference panel, which contains 1,739 genomes, of which 642 are from South Asian groups. This dataset contains imputed SNPs with imputation accuracy >0.3 and MAF >0.1%.
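The two thresholds stated above (imputation accuracy >0.3 and MAF >0.1%) can be sketched as a simple per-variant filter. This is only an illustration under assumptions: the record layout, the field names `r2` and `maf`, and the example values are hypothetical, and the accuracy metric is assumed to be an R²-style score in [0, 1].

```python
def passes_filters(r2, maf, min_r2=0.3, min_maf=0.001):
    """Keep a variant only if both thresholds are strictly exceeded.

    r2:  imputation accuracy (assumed to be an R2-style metric in [0, 1])
    maf: minor allele frequency as a fraction (0.1% == 0.001)
    """
    return r2 > min_r2 and maf > min_maf


# Hypothetical variant records, not taken from the dataset.
variants = [
    {"id": "var1", "r2": 0.95, "maf": 0.20},    # passes both filters
    {"id": "var2", "r2": 0.25, "maf": 0.05},    # fails the accuracy filter
    {"id": "var3", "r2": 0.80, "maf": 0.0005},  # fails the MAF filter
]
kept = [v["id"] for v in variants if passes_filters(v["r2"], v["maf"])]
```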
Project description: This dataset contains the imputed genotypes for the GenCord samples.
Genotyping was done using Illumina OMNI2.5M.
Imputation was done using SHAPEIT2/IMPUTE2 with the 1000 Genomes Project Phase 3 reference panel.
Project description: Missing values in proteomic data sets have real consequences on downstream data analysis and reproducibility. Although several imputation methods exist to handle missing values, no single imputation method is best suited for a diverse range of data sets, and no clear strategy exists for evaluating imputation methods for large-scale DIA-MS data sets, especially at different levels of protein quantification. To navigate through the different imputation strategies available in the literature, we have established a workflow to assess imputation methods on large-scale label-free DIA-MS data sets. We used three DIA-MS data sets with real missing values to evaluate eight different imputation methods with multiple parameters at different levels of protein quantification: a dilution series data set, a small pilot data set, and a larger proteomic data set.
Project description: This data set includes the following summary-level data file for the imputed data: imputation.sv.assoc.txt, which contains results from single-variant association analysis in imputed samples.
Project description: Imputation of Rice Diversity Panels 1 and 2 using the 3000 Rice Genomes dataset; assembly of the Global Oryza sativa Reference Panel via reciprocal imputation of the HDRA Panel (RDP1+RDP2) and the 3000 Rice Genomes Panel.
Project description: Missing values in proteomic data sets have real consequences on downstream data analysis and reproducibility. Although several imputation methods exist to handle missing values, no single imputation method is best suited for a diverse range of data sets, and no clear strategy exists for evaluating imputation methods for large-scale DIA-MS data sets, especially at different levels of protein quantification. To navigate through the different imputation strategies available in the literature, we have established a workflow to assess imputation methods on large-scale label-free DIA-MS data sets. We used three DIA-MS data sets with real missing values to evaluate eight different imputation methods with multiple parameters at different levels of protein quantification: a dilution series data set, a small pilot data set, and a larger proteomic data set.