Dataset Information

BEclear: Batch Effect Detection and Adjustment in DNA Methylation Data.


ABSTRACT: Batch effects describe non-natural variations in, for example, large-scale genomic data sets. If not corrected by suitable numerical algorithms, batch effects may seriously affect the analysis of these data sets. The novel, array-platform-independent software tool BEclear enables researchers to identify those portions of the data that deviate statistically significantly from the remaining data and to replace these portions with typical values reconstructed from neighboring data entries based on latent factor models. In contrast to comparable methods, which often apply some form of global normalization to the data, BEclear avoids changing the apparently unaffected parts of the data. We tested the performance of this approach on DNA methylation data for various tumor data sets taken from The Cancer Genome Atlas and compared the results to those obtained with the existing algorithms ComBat, Surrogate Variable Analysis, RUVm and Functional normalization. BEclear consistently performed on par with or better than these methods. BEclear is available as an R package at the Bioconductor project http://bioconductor.org/packages/release/bioc/html/BEclear.html.
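The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the BEclear implementation or its API: the function name `impute_flagged_entries`, its parameters, and the plain stochastic-gradient matrix factorization are assumptions chosen for illustration. The detection step is not shown; the sketch assumes a mask of flagged entries is already available. It fits a low-rank latent factor model to the unflagged entries only, then overwrites just the flagged entries, leaving all other values untouched.

```python
import random


def impute_flagged_entries(values, flagged, rank=2, lr=0.05, reg=0.001,
                           epochs=2000, seed=0):
    """Hypothetical helper: replace flagged entries with latent-factor predictions.

    values  : list of rows of methylation beta values in [0, 1]
    flagged : same shape, True where an entry was judged batch-affected
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(values), len(values[0])

    # One latent vector per row and per column, small positive start values.
    row_f = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    col_f = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_cols)]

    # Train only on entries that were NOT flagged.
    observed = [(i, j) for i in range(n_rows) for j in range(n_cols)
                if not flagged[i][j]]

    def predict(i, j):
        return sum(a * b for a, b in zip(row_f[i], col_f[j]))

    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j in observed:
            err = values[i][j] - predict(i, j)
            for k in range(rank):
                r, c = row_f[i][k], col_f[j][k]
                # Gradient step on squared error with L2 regularization.
                row_f[i][k] += lr * (err * c - reg * r)
                col_f[j][k] += lr * (err * r - reg * c)

    # Copy the input and overwrite flagged entries only.
    result = [row[:] for row in values]
    for i in range(n_rows):
        for j in range(n_cols):
            if flagged[i][j]:
                # Beta values are bounded, so clip the prediction to [0, 1].
                result[i][j] = min(1.0, max(0.0, predict(i, j)))
    return result


if __name__ == "__main__":
    data = [[0.10, 0.20, 0.14], [0.25, 0.90, 0.35], [0.40, 0.80, 0.56]]
    mask = [[False, False, False], [False, True, False], [False, False, False]]
    print(impute_flagged_entries(data, mask, rank=1))
```

Because only flagged positions are overwritten, the unflagged data pass through unchanged, which mirrors the contrast with global normalization drawn in the abstract.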

SUBMITTER: Akulenko R 

PROVIDER: S-EPMC4999208 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature

Publications

BEclear: Batch Effect Detection and Adjustment in DNA Methylation Data.

Ruslan Akulenko, Markus Merl, Volkhard Helms

PLoS One, 2016-08-25, issue 8


Similar Datasets

| S-EPMC4065794 | biostudies-literature
| S-EPMC4710051 | biostudies-literature
| S-EPMC7518324 | biostudies-literature
| S-EPMC7214039 | biostudies-literature
| S-EPMC5864890 | biostudies-literature
| S-EPMC3265417 | biostudies-literature
| S-EPMC3046121 | biostudies-literature
| S-EPMC6454417 | biostudies-literature
| S-EPMC6129283 | biostudies-literature
| S-EPMC8324985 | biostudies-literature