Dataset Information

BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations.


ABSTRACT: Accurate genotype calling is a prerequisite for a successful Genome-Wide Association Study (GWAS). Although most genotyping algorithms can achieve an accuracy rate greater than 99% for DNA samples without copy number alterations (CNAs), almost none of them is designed for genotyping tumor samples, which are known to have large regions of CNAs. This study aims to develop a statistical method that can accurately genotype tumor samples with CNAs. The proposed method adds a Bayesian layer to a cluster regression model and is termed the Bayesian Cluster Regression-based genotyping algorithm (BCRgt). We demonstrate that, when CNAs do not exist, high concordance rates with HapMap calls can be achieved without using reference/training samples. By adding a training step, we obtained higher genotyping concordance rates without requiring large sample sizes. When CNAs exist in the samples, accuracy can be dramatically improved in regions with DNA copy loss and slightly improved in regions with copy number gain, compared with the Bayesian Robust Linear Model with Mahalanobis distance classifier (BRLMM). In conclusion, we have demonstrated that BCRgt can provide accurate genotyping calls for tumor samples with CNAs.

SUBMITTER: Yang S 

PROVIDER: S-EPMC4003822 | biostudies-literature | 2014 Mar

REPOSITORIES: biostudies-literature

Publications

BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations.

Yang Shengping, Cui Xiangqin, Fang Zhide

BMC Bioinformatics, 2014-03-15


Similar Datasets

| S-EPMC3023756 | biostudies-literature
2013-02-01 | GSE43933 | GEO
| S-EPMC2939611 | biostudies-other
| S-EPMC3834792 | biostudies-literature
| S-EPMC2704547 | biostudies-literature
| S-EPMC3668900 | biostudies-other
| S-EPMC2674052 | biostudies-literature
| S-EPMC3472297 | biostudies-literature
| S-EPMC2527508 | biostudies-literature
| S-EPMC4196891 | biostudies-literature