Dataset Information

Identifying and Reducing Systematic Errors in Chromosome Conformation Capture Data.


ABSTRACT: Chromosome conformation capture (3C)-based techniques have recently been used to uncover the mystic genomic architecture in the nucleus. These techniques yield indirect data on the distances between genomic loci in the form of contact frequencies that must be normalized to remove various errors. This normalization process determines the quality of data analysis. In this study, we describe two systematic errors that result from the heterogeneous local density of restriction sites and different local chromatin states, methods to identify and remove those artifacts, and three previously described sources of systematic errors in 3C-based data: fragment length, mappability, and local DNA composition. To explain the effect of systematic errors on the results, we used three different published data sets to show the dependence of the results on restriction enzymes and experimental methods. Comparison of the results from different restriction enzymes shows a higher correlation after removing systematic errors. In contrast, using different methods with the same restriction enzymes shows a lower correlation after removing systematic errors. Notably, the improved correlation of the latter case caused by systematic errors indicates that a higher correlation between results does not ensure the validity of the normalization methods. Finally, we suggest a method to analyze random error and provide guidance for the maximum reproducibility of contact frequency maps.

SUBMITTER: Hahn S 

PROVIDER: S-EPMC4696798 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Identifying and Reducing Systematic Errors in Chromosome Conformation Capture Data.

Hahn Seungsoo S, Kim Dongsup D

PLoS ONE, 2015-12-30, issue 12


Similar Datasets

| S-EPMC8446342 | biostudies-literature
2021-03-08 | GSE163666 | GEO
2021-03-08 | GSE165895 | GEO
| S-EPMC4324155 | biostudies-literature
2021-03-08 | GSE165894 | GEO
| S-EPMC4831724 | biostudies-literature
| PRJNA698572 | ENA
| S-EPMC3797596 | biostudies-literature
| S-EPMC4639323 | biostudies-literature
| S-EPMC4818521 | biostudies-literature