Dataset Information

Modified screening and ranking algorithm for copy number variation detection.


ABSTRACT:

Motivation

Copy number variation (CNV) is a type of structural variation, usually defined as genomic segments of 1 kb or larger that present variable copy numbers when compared with a reference genome. The screening and ranking algorithm (SaRa) was recently proposed as an efficient approach for multiple change-point detection, which can be applied to CNV detection. However, some practical issues arise when SaRa is applied to single nucleotide polymorphism data.
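As a rough illustration of the screening-and-ranking idea, the sketch below computes a local diagnostic at each position (mean of the next h observations minus mean of the previous h) and keeps positions where its absolute value is both a local maximum within a window of half-width h and above a threshold. This is a minimal sketch of the general technique, not the modSaRa implementation; the function names, the bandwidth `h`, and the fixed `threshold` are assumptions made for the example.

```python
import numpy as np

def local_diagnostic(y, h):
    """D[i] = mean(y[i:i+h]) - mean(y[i-h:i]) for h <= i <= n-h, else 0."""
    y = np.asarray(y, dtype=float)
    n = y.size
    c = np.concatenate(([0.0], np.cumsum(y)))  # prefix sums, c[k] = sum(y[:k])
    d = np.zeros(n)
    idx = np.arange(h, n - h + 1)
    right = (c[idx + h] - c[idx]) / h   # mean of the h points starting at i
    left = (c[idx] - c[idx - h]) / h    # mean of the h points ending before i
    d[idx] = right - left
    return d

def sara_candidates(y, h, threshold):
    """Screening: keep local maximizers of |D| that exceed the threshold.

    The threshold is a hypothetical fixed cutoff chosen for illustration.
    """
    d = np.abs(local_diagnostic(y, h))
    n = d.size
    picks = []
    for i in range(h, n - h + 1):
        lo, hi = max(0, i - h), min(n, i + h + 1)
        if d[i] >= threshold and d[i] == d[lo:hi].max():
            picks.append(i)
    return picks

# Noiseless toy signal with a single jump at index 50.
signal = np.concatenate((np.zeros(50), 2.0 * np.ones(50)))
print(sara_candidates(signal, h=10, threshold=1.0))  # -> [50]
```

Each returned index marks the first position of a new segment. With noisy data, the choice of `h` and `threshold` governs the trade-off between missed and spurious change-points.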

Results

In this study, we propose a modified SaRa for CNV detection to address these issues. First, we apply quantile normalization to the original intensities so that the normal mean model-based SaRa is robust. Second, we propose a novel normal mixture model coupled with a modified Bayesian information criterion for selecting candidate change-points and for clustering the potential CNV segments into copy number states. In simulations, the modified SaRa identified change-points robustly and outperformed the circular binary segmentation (CBS) method. By applying the modified SaRa to real data from the HapMap project, we illustrated its performance in detecting CNV segments. In conclusion, our modified SaRa method improves SaRa theoretically and numerically for identifying CNVs with high-throughput genotyping data.
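To make the normalization step concrete, the sketch below shows one common form of quantile normalization: a rank-based transform that maps intensities onto standard normal quantiles. The abstract does not specify the exact procedure, so the function name, the (rank - 0.5)/n plotting position, and the handling of ties (broken by input order) are assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist

def quantile_normalize_to_normal(x):
    """Map values to standard normal quantiles by rank (illustrative only)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    ranks = np.empty(n, dtype=float)
    # Rank 1 goes to the smallest value; ties are broken by input order.
    ranks[np.argsort(x, kind="stable")] = np.arange(1, n + 1)
    probs = (ranks - 0.5) / n  # strictly inside (0, 1)
    nd = NormalDist()          # standard normal, mean 0 and sd 1
    return np.array([nd.inv_cdf(p) for p in probs])

# Heavy-tailed toy intensities; the outlier 250.0 is pulled in to a normal quantile.
z = quantile_normalize_to_normal([0.1, -0.3, 250.0, 0.05, -0.2])
print(np.round(z, 3))
```

The transform preserves the ordering of the intensities while giving them an approximately normal marginal distribution, which is the setting a normal mean model assumes.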

Availability and implementation

The modSaRa package is implemented in R and is freely available at http://c2s2.yale.edu/software/modSaRa.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Xiao F 

PROVIDER: S-EPMC4410664 | biostudies-literature | 2015 May

REPOSITORIES: biostudies-literature

Publications

Modified screening and ranking algorithm for copy number variation detection.

Feifei Xiao, Xiaoyi Min, Heping Zhang

Bioinformatics (Oxford, England), 2014-12-25, issue 9


Similar Datasets

| S-EPMC3779928 | biostudies-literature
| S-EPMC7278034 | biostudies-literature
| S-EPMC3245619 | biostudies-literature
| S-EPMC4254366 | biostudies-literature
| S-EPMC5922522 | biostudies-literature
| S-EPMC5233178 | biostudies-literature
| S-EPMC4591043 | biostudies-literature
| S-EPMC4165854 | biostudies-literature
| S-EPMC4510559 | biostudies-literature
| S-EPMC4411081 | biostudies-literature