Dataset Information

Penalized weighted low-rank approximation for robust recovery of recurrent copy number variations.

ABSTRACT:

Background

Copy number variation (CNV) analysis has become one of the most important research areas for understanding complex disease. With the increasing resolution of array-based comparative genomic hybridization (aCGH), more and more raw copy number data are being collected across multiple arrays. Recurrent and individual-specific CNVs typically co-exist in such data, and the data may also be contaminated during the data generation process. There is therefore a great need for an efficient and robust statistical model that recovers recurrent and individual-specific CNVs simultaneously.

Results

We develop a penalized weighted low-rank approximation method (WPLA) for robust recovery of recurrent CNVs. In particular, we model the data from multiple aCGH arrays as a realization of a hidden low-rank matrix plus random noise, and let an additional weight matrix account for individual-specific effects. As a result, we do not require the random noise to be normally distributed, or even homogeneous. We demonstrate the method's performance on three real datasets and twelve synthetic datasets covering different types of recurrent CNV regions, with either normal random errors or heavily contaminated errors.
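
The low-rank-plus-weights idea described above can be illustrated with a minimal Python sketch. This is not the WPLA algorithm itself: the abstract does not specify the penalty or how the weight matrix is updated, so the sketch assumes a rank-1 signal, omits the penalty term, and uses Huber-type weights refit by alternating weighted least squares. The function names (simulate, huber_weights, robust_rank1), the toy data, and all tuning constants are hypothetical choices made for illustration only.

```python
import random
from statistics import median


def simulate(n=10, p=40, start=10, stop=20, gain=1.0, sd=0.1, seed=0):
    """Toy log-ratio matrix (assumed setup): one recurrent gain shared by all
    samples, Gaussian noise, and a few gross outliers as contamination."""
    rng = random.Random(seed)
    X = [[(gain if start <= j < stop else 0.0) + rng.gauss(0.0, sd)
          for j in range(p)] for _ in range(n)]
    for i, j in [(0, 5), (3, 30), (5, 15), (7, 35)]:  # contaminated entries
        X[i][j] += 8.0
    return X


def huber_weights(X, u, v):
    """Weight 1 for small residuals and c/|r| for large ones (Huber-type)."""
    R = [[X[i][j] - u[i] * v[j] for j in range(len(v))] for i in range(len(u))]
    # Robust scale estimate from the median absolute residual.
    scale = median(abs(r) for row in R for r in row) / 0.6745
    c = max(1.345 * scale, 1e-12)
    return [[1.0 if abs(r) <= c else c / abs(r) for r in row] for row in R]


def robust_rank1(X, n_outer=20, n_inner=5):
    """Approximate X[i][j] by u[i] * v[j], minimising the weighted squared
    error and re-estimating the weights from the residuals in each outer pass."""
    n, p = len(X), len(X[0])
    u = [1.0] * n
    v = [median(col) for col in zip(*X)]  # robust start: per-probe median
    for _ in range(n_outer):
        W = huber_weights(X, u, v)
        for _ in range(n_inner):
            # Update the shared (recurrent) profile v with u held fixed.
            for j in range(p):
                den = sum(W[i][j] * u[i] ** 2 for i in range(n))
                if den > 0:
                    v[j] = sum(W[i][j] * u[i] * X[i][j] for i in range(n)) / den
            # Update the per-sample scaling u with v held fixed.
            for i in range(n):
                den = sum(W[i][j] * v[j] ** 2 for j in range(p))
                if den > 0:
                    u[i] = sum(W[i][j] * v[j] * X[i][j] for j in range(p)) / den
    return u, v, huber_weights(X, u, v)
```

Calling robust_rank1(simulate()) returns row and column factors whose product u[i] * v[j] approximates the shared gain, together with a weight matrix whose entries are small at the contaminated positions. In this sketch the small weights are what keep the outliers from distorting the recovered profile; because no distributional form is imposed on the residuals, the noise need not be normal or homogeneous.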

Conclusion

Our numerical experiments demonstrate that WPLA can successfully recover recurrent CNV patterns from raw data under different scenarios. Compared with two other recent methods, it performs best at simultaneously detecting both recurrent and individual-specific CNVs under normal random errors. More importantly, WPLA is the only method that can effectively recover recurrent CNV regions when the data are heavily contaminated.

SUBMITTER: Gao X

PROVIDER: S-EPMC4676147 | biostudies-literature | 2015 Dec

REPOSITORIES: biostudies-literature

Publications

Penalized weighted low-rank approximation for robust recovery of recurrent copy number variations.

Gao Xiaoli

BMC Bioinformatics, 2015-12-10

PMID: 26652207

Similar Datasets

Identifying disease-associated copy number variations by a doubly penalized regression model.
| S-EPMC6663092 | biostudies-literature

Estimates of penetrance for recurrent pathogenic copy-number variations.
| S-EPMC3664238 | biostudies-literature

A Novel Graph-based Algorithm to Infer Recurrent Copy Number Variations in Cancer.
| S-EPMC5063805 | biostudies-literature

Characterization of Copy-Number Variations and Possible Candidate Genes in Recurrent Pregnancy Losses.
| S-EPMC7911754 | biostudies-literature

Copy number variations among silkworms.
| S-EPMC3997817 | biostudies-literature

Copy number variations and stroke.
| S-EPMC5110597 | biostudies-literature

Human subtelomeric copy number variations.
| S-EPMC2731494 | biostudies-literature

Next-generation sequencing identifies recurrent copy number variations in invasive breast carcinomas from Ghana.
| S-EPMC7390688 | biostudies-literature

Penalized weighted proportional hazards model for robust variable selection and outlier detection.
| S-EPMC9283382 | biostudies-literature

Copy-number variations and human disease.
| S-EPMC1950804 | biostudies-literature