
Dataset Information


DBNorm: normalizing high-density oligonucleotide microarray data based on distributions.


ABSTRACT: BACKGROUND: Data from patients with rare diseases is often produced using different platforms and probe sets because patients are widely distributed in space and time. Aggregating such data requires a method of normalization that makes patient records comparable. RESULTS: This paper proposes DBNorm, an algorithm implemented as an R package that normalizes arbitrarily distributed data to a common, comparable form. Specifically, DBNorm merges data distributions by fitting a function to each of them and using the probability of each element under its fitted distribution to merge it into a global distribution. DBNorm contains state-of-the-art fitting functions including Polynomial, Fourier and Gaussian distributions, and also allows users to define their own fitting functions if required. CONCLUSIONS: The performance of DBNorm is compared with z-score, average difference, quantile normalization and ComBat on a number of self-generated and publicly available microarray datasets. The comparison uses statistics, visualization, and, when class labels are known, classification. The experimental results show that DBNorm achieves better normalization results than conventional methods. Finally, the approach has the potential to be applicable outside bioinformatics analysis.
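The merging step described in the abstract can be illustrated with a minimal sketch. This is not DBNorm's code or API (DBNorm is an R package); it is one plausible reading of the idea, written in Python with the standard library only. It assumes a Gaussian fit, takes "the probability of each element" to mean the cumulative probability under the fitted distribution, and maps that probability onto a shared target distribution. The names `fit_gaussian`, `normalize_to_target`, and the toy intensity values are all hypothetical.

```python
from statistics import NormalDist, fmean, pstdev


def fit_gaussian(values):
    """Fit a Gaussian to one dataset (hypothetical helper, not DBNorm's API)."""
    return NormalDist(mu=fmean(values), sigma=pstdev(values))


def normalize_to_target(values, target, eps=1e-9):
    """Map each value onto `target` via its probability under the fitted Gaussian.

    Assumption: "probability" is read as the cumulative probability (CDF value).
    """
    fitted = fit_gaussian(values)
    normalized = []
    for v in values:
        p = fitted.cdf(v)
        # inv_cdf is only defined for 0 < p < 1, so clamp extreme tails.
        p = min(max(p, eps), 1.0 - eps)
        normalized.append(target.inv_cdf(p))
    return normalized


# Toy data: the same five samples measured on two scales (made-up numbers).
platform_a = [2.1, 2.5, 3.0, 3.4, 4.0]
platform_b = [210.0, 250.0, 300.0, 340.0, 400.0]

# The shared "global" distribution is assumed here to be a standard normal.
target = NormalDist(mu=0.0, sigma=1.0)

norm_a = normalize_to_target(platform_a, target)
norm_b = normalize_to_target(platform_b, target)
```

With a Gaussian fit and a standard normal target, this mapping reduces to the ordinary z-score. The approach only differs from z-score when the fitted function is non-Gaussian, which is where the polynomial, Fourier, or user-defined fits mentioned in the abstract would come in.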

SUBMITTER: Meng Q 

PROVIDER: S-EPMC5706403 | biostudies-literature | 2017 Nov

REPOSITORIES: biostudies-literature


Publications

DBNorm: normalizing high-density oligonucleotide microarray data based on distributions.

Qinxue Meng, Daniel Catchpoole, David Skillicorn, Paul J Kennedy

BMC Bioinformatics, 29 November 2017, issue 1



Similar Datasets

S-EPMC1459208 | biostudies-literature
S-EPMC1189082 | biostudies-literature
S-EPMC55837 | biostudies-literature
S-EPMC2082023 | biostudies-literature
S-EPMC2231405 | biostudies-literature
S-EPMC1409679 | biostudies-literature
S-EPMC1501020 | biostudies-literature
S-EPMC1832147 | biostudies-literature
S-EPMC3734108 | biostudies-literature
S-EPMC1508160 | biostudies-literature