Dataset Information

R-Gada: a fast and flexible pipeline for copy number analysis in association studies.


ABSTRACT:

Background

Genome-wide association studies (GWAS) using Copy Number Variation (CNV) are becoming a central focus of genetic research. CNVs have successfully provided target genome regions for some disease conditions where simple genetic variation (i.e., SNPs) has previously failed to provide a clear association.

Results

Here we present a new R package that integrates: (i) data import from the most common formats of Affymetrix, Illumina and aCGH arrays; (ii) a fast and accurate segmentation algorithm to call CNVs based on Genome Alteration Detection Analysis (GADA); and (iii) functions for displaying and exporting the copy number calls, identifying recurrent CNVs, performing multivariate analysis of population structure, and carrying out association studies. Using a large dataset containing 270 HapMap individuals (Affymetrix Human SNP Array 6.0 Sample Dataset), we demonstrate a flexible pipeline implemented with the package. It requires less than one minute per sample (3 million probe arrays) on a single-core computer and provides flexible parallelization for very large datasets. Case-control data were generated from the HapMap dataset to demonstrate a GWAS analysis.
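
To make the final association step concrete, the sketch below tests whether carrying a CNV at a single region is associated with case status. It is not the package's API or its statistical method: the function names `fisher_exact_two_sided` and `cnv_association`, the coding of calls (0 for no alteration, nonzero for a gain or loss), and the choice of Fisher's exact test are all assumptions made for illustration, written in Python rather than R.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases,
        # with all table margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed
    # one (small tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def cnv_association(calls, is_case):
    """Build the carrier-by-status table for one region and test it.

    calls   -- hypothetical per-sample copy number state (0 = no alteration)
    is_case -- per-sample booleans, True for cases and False for controls
    """
    a = sum(1 for s, k in zip(calls, is_case) if k and s != 0)      # case carriers
    b = sum(1 for s, k in zip(calls, is_case) if k and s == 0)      # case non-carriers
    c = sum(1 for s, k in zip(calls, is_case) if not k and s != 0)  # control carriers
    d = sum(1 for s, k in zip(calls, is_case) if not k and s == 0)  # control non-carriers
    return (a, b, c, d), fisher_exact_two_sided(a, b, c, d)


if __name__ == "__main__":
    # Toy data only: three cases carrying an alteration, three unaltered controls.
    table, p_value = cnv_association([1, -1, 1, 0, 0, 0],
                                     [True, True, True, False, False, False])
    print(table, p_value)
```

In a genome-wide setting a test like this would be repeated for each candidate region, so the resulting p-values would need correction for multiple testing.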

Conclusions

The package provides the tools for creating a complete integrated pipeline from data normalization to statistical association. It can efficiently handle a massive volume of data consisting of millions of genetic markers and hundreds or thousands of samples with very accurate results.

SUBMITTER: Pique-Regi R 

PROVIDER: S-EPMC2915992 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC2732310 | biostudies-literature
| S-EPMC3153341 | biostudies-literature
2018-06-22 | GSE99822 | GEO
| S-EPMC7487704 | biostudies-literature
| S-EPMC5960125 | biostudies-other
2023-10-18 | GSE215252 | GEO
| S-EPMC7224564 | biostudies-literature
| S-EPMC9995309 | biostudies-literature
| S-EPMC2603547 | biostudies-literature
| S-EPMC6248831 | biostudies-literature