Dataset Information

Bridging ImmunoGenomic Data Analysis Workflow Gaps (BIGDAWG): An integrated case-control analysis pipeline.

ABSTRACT: Bridging ImmunoGenomic Data-Analysis Workflow Gaps (BIGDAWG) is an integrated data-analysis pipeline designed for the standardized analysis of highly polymorphic genetic data, specifically for the HLA and KIR genetic systems. Most modern genetic analysis programs are designed for the analysis of single nucleotide polymorphisms, but the highly polymorphic nature of HLA and KIR data requires specialized methods of data analysis. BIGDAWG performs case-control data analyses of highly polymorphic genotype data characteristic of the HLA and KIR loci. BIGDAWG performs tests for Hardy-Weinberg equilibrium, calculates allele frequencies and bins low-frequency alleles for k×2 and 2×2 chi-squared tests, and calculates odds ratios, confidence intervals and p-values for each allele. When multi-locus genotype data are available, BIGDAWG estimates user-specified haplotypes and performs the same binning and statistical calculations for each haplotype. For the HLA loci, BIGDAWG performs the same analyses at the individual amino-acid level. Finally, BIGDAWG generates figures and tables for each of these comparisons. BIGDAWG obviates the error-prone reformatting needed to traffic data between multiple programs, and streamlines and standardizes the data-analysis process for case-control studies of highly polymorphic data. BIGDAWG has been implemented as the bigdawg R package and as a free web application at bigdawg.immunogenomics.org.
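
As a rough illustration of the per-allele statistics described in the abstract, the following Python sketch bins low-frequency alleles, computes a k×2 chi-squared statistic, and derives an odds ratio with a Wald-type 95% confidence interval for a single allele. It is not the BIGDAWG implementation (BIGDAWG is an R package). The function names, the allele counts, and the binning rule (pool alleles whose expected count in cases or controls falls below 5) are assumptions made for this example.

```python
import math


def bin_rare_alleles(case_counts, control_counts, min_expected=5.0):
    """Build a k x 2 table (allele -> (case, control) counts), pooling alleles
    whose expected count in either group is below min_expected (assumed rule)."""
    n_case = sum(case_counts.values())
    n_ctrl = sum(control_counts.values())
    total = n_case + n_ctrl
    table = {}
    binned = [0, 0]
    for allele in sorted(set(case_counts) | set(control_counts)):
        row = (case_counts.get(allele, 0), control_counts.get(allele, 0))
        row_total = row[0] + row[1]
        expected = (row_total * n_case / total, row_total * n_ctrl / total)
        if min(expected) < min_expected:
            binned[0] += row[0]
            binned[1] += row[1]
        else:
            table[allele] = row
    if binned != [0, 0]:
        table["binned"] = tuple(binned)
    return table


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a k x 2 table."""
    rows = list(table.values())
    col_totals = [sum(r[j] for r in rows) for j in (0, 1)]
    total = sum(col_totals)
    stat = 0.0
    for r in rows:
        row_total = r[0] + r[1]
        for j in (0, 1):
            expected = row_total * col_totals[j] / total
            stat += (r[j] - expected) ** 2 / expected
    return stat, len(rows) - 1


def p_value_df1(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2 x 2 table.
    a/b: cases/controls carrying the allele; c/d: cases/controls without it.
    No zero-cell correction is applied, so all four cells must be non-zero."""
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Hypothetical allele counts for one locus (invented for illustration only).
cases = {"A*01": 30, "A*02": 50, "A*03": 15, "A*11": 3, "A*24": 2}
controls = {"A*01": 20, "A*02": 55, "A*03": 20, "A*11": 4, "A*24": 1}

# Locus-level k x 2 test after binning the rare alleles.
table = bin_rare_alleles(cases, controls)
stat, df = chi_squared(table)
print("k x 2 table:", table)
print("chi-squared = %.3f, df = %d" % (stat, df))

# Allele-level 2 x 2 test: A*01 versus all other alleles.
a, b = table["A*01"]
c = sum(cases.values()) - a
d = sum(controls.values()) - b
stat_2x2, _ = chi_squared({"A*01": (a, b), "other": (c, d)})
print("A*01 p-value (df = 1): %.3f" % p_value_df1(stat_2x2))
print("A*01 OR (95%% CI): %.2f (%.2f, %.2f)" % odds_ratio(a, b, c, d))
```

Only the standard library is used, so a general p-value for the k×2 test (which needs the chi-squared distribution with more than one degree of freedom) is left out; a statistics library would normally supply it.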

SUBMITTER: Pappas DJ

PROVIDER: S-EPMC4828284 | biostudies-literature | 2016 Mar

REPOSITORIES: biostudies-literature

Publications

Bridging ImmunoGenomic Data Analysis Workflow Gaps (BIGDAWG): An integrated case-control analysis pipeline.

Pappas DJ, Marin W, Hollenbach JA, Mack SJ

Human Immunology, 2015-12-18, issue 3

PMID: 26708359

Similar Datasets

Project description: The economic trend and the health care landscape are rapidly evolving across Asia. Effective real-world data (RWD) for regulatory and clinical decision-making is a crucial milestone associated with this evolution. This necessitates a critical evaluation of RWD generation within distinct nations for the use of various RWD warehouses in the generation of real-world evidence (RWE). In this article, we outline the RWD generation trends for two contrasting nation archetypes: "Solo Scholars" (nations with relatively self-sufficient RWD research systems) and "Global Collaborators" (countries largely reliant on international infrastructures for RWD generation). The key trends and patterns in RWD generation, country-specific insights into the predominant databases used in each country to produce RWE, and insights into the broader landscape of RWD database use across these countries are discussed. In conclusion, the data highlight the heterogeneous nature of RWD generation practices across 10 different Asian nations and advocate for strategic enhancements in data harmonization. The evidence highlights the imperative for improved database integration and the establishment of standardized protocols and infrastructure for leveraging electronic medical records (EMR) in streamlining RWD acquisition. The clinical data analysis and reporting system of Hong Kong is an excellent example of a successful EMR system that showcases the capacity of integrated robust EMR platforms to consolidate and produce diverse RWE. This, in turn, can potentially reduce the necessity for reliance on numerous condition-specific local and global registries or limited and largely unavailable medical insurance or claims databases in most Asian nations. Linking health technology assessment processes with open data initiatives such as the Observational Medical Outcomes Partnership Common Data Model and the Observational Health Data Sciences and Informatics could enable the leveraging of global data resources to inform local decision-making. Advancing such initiatives is crucial for reinforcing health care frameworks in resource-limited settings and advancing toward cohesive, evidence-driven health care policy and improved patient outcomes in the region.

Project description: BACKGROUND: One of the major challenges facing investigators in the microbiome field is turning large numbers of reads generated by next-generation sequencing (NGS) platforms into biological knowledge. Effective analytical workflows that guarantee reproducibility, repeatability, and result provenance are essential requirements of modern microbiome research. For nearly a decade, several state-of-the-art bioinformatics tools have been developed for understanding microbial communities living in a given sample. However, most of these tools are built with many functions that require an in-depth understanding of their implementation and the choice of additional tools for visualizing the final output. Furthermore, microbiome analysis can be time-consuming and may even require more advanced programming skills, which some investigators may lack. RESULTS: We have developed a wrapper named iMAP (Integrated Microbiome Analysis Pipeline) to provide the microbiome research community with a user-friendly and portable tool that integrates bioinformatics analysis and data visualization. The iMAP tool wraps functionalities for metadata profiling, quality control of reads, sequence processing and classification, and diversity analysis of operational taxonomic units. This pipeline is also capable of generating web-based progress reports for enhancing an approach referred to as review-as-you-go (RAYG). For the most part, the profiling of microbial communities is done using functionalities implemented in the Mothur or QIIME2 platform. It also uses different R packages for graphics and R Markdown for generating progress reports. We have used a case study to demonstrate the application of the iMAP pipeline. CONCLUSIONS: The iMAP pipeline integrates several functionalities for better identification of microbial communities present in a given sample. The pipeline performs in-depth quality control that guarantees high-quality results and accurate conclusions. The vibrant visuals produced by the pipeline facilitate a better understanding of the complex and multidimensional microbiome data. The integrated RAYG approach enables the generation of web-based reports, which provide investigators with intermediate output that can be reviewed progressively. The intensively analyzed case study sets a model for microbiome data analysis.
