Methylation profiling

Dataset Information

Synthetic spike-in controls enable sensitive and reproducible cell-free methylome interrogation


ABSTRACT: Background. The cell-free methylated DNA immunoprecipitation-sequencing (cfMeDIP-seq) method is adapted to work with low-input DNA and with circulating cell-free DNA (cfDNA). This method allows for epigenetic profiling from liquid biopsy samples, providing potential information about tissue of origin. As with classical immunoprecipitation-based enrichment protocols, interpretation requires a reference or control to draw inference against a composite experimental baseline and against designed standards that allow for cross-experiment comparisons.

Methods. To meet the need for a reference control in cfMeDIP-seq experiments, we designed spike-in controls and integrated the use of unique molecular indices (UMIs) to adjust for polymerase chain reaction (PCR) bias and for immunoprecipitation bias caused by the fragment length, G+C content, and CpG density of the DNA fragments. This enables absolute quantification of methylated DNA in picomoles while retaining epigenomic information that allows for sensitive, tissue-specific detection as well as comparable results between different experiments. We designed 54 DNA fragments with combinations of methylation status (methylated and unmethylated), fragment length in base pairs (80 bp, 160 bp, 320 bp), G+C content (35%, 50%, 65%), and fraction of CpGs within a fragment (1/80 bp, 1/40 bp, 1/20 bp). We checked the spike-in control DNA sequences to ensure they had no cross-alignment to the human genome, and we minimized formation of secondary structures to avoid issues with amplification. We carried out cfMeDIP-seq on solely spike-in DNA fragments, on spike-in DNA added to sheared HCT116 genomic DNA, and on spike-in DNA added to cfDNA from acute myeloid leukemia (AML) samples, in order to assess technical and biological biases, to determine the optimal amount of spike-in DNA required for an experiment, and to assess batch effects, respectively.

Results. We show that cfMeDIP-seq enriches for highly methylated regions, with less than 0.01% non-specific binding and a preference for DNA fragments with high G+C content and CpG fraction. The use of 0.01 ng of spike-in control DNA results in sufficient sequencing reads to adjust for variance due to fragment length, G+C content, and CpG fraction without negatively impacting the number of sequencing reads generated for each sample. With a known amount of each spike-in control, we generated a generalized linear model that can absolutely quantify molar amount from read counts while adjusting for fragment length, G+C content, and CpG fraction. Using our spike-in controls, we show that we can greatly mitigate batch effects, reducing batch-associated variance in the data to ≤5% of the total variance.

Conclusions. The incorporation of spike-in controls allows for easier interpretation of data generated from cfMeDIP-seq and MeDIP-seq experiments when compared to relative read count. Through the use of a generalized linear model tailored to each experiment, the molar amount for each genomic region can be calculated, greatly mitigating both biological and technical biases in the data. We have created an R package, spiky, to convert read counts to DNA picomoles while adjusting for fragment length, G+C content, and CpG fraction.
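The spike-in design and the read-count-to-picomole idea described in the abstract can be illustrated with a short Python sketch. This is not the spiky package and does not reproduce its model. The factor levels come from the abstract; everything else is an assumption made for illustration: the function names (`build_design`, `design_matrix`, `fit_linear_model`), the choice of an identity-link Gaussian model fitted by ordinary least squares, the coefficient values, and the synthetic read counts and picomole amounts.

```python
from itertools import product

import numpy as np

# Factor levels as listed in the abstract.
METHYLATION = ("methylated", "unmethylated")
LENGTHS_BP = (80, 160, 320)
GC_PERCENT = (35, 50, 65)
CPG_SPACING_BP = (80, 40, 20)  # one CpG per this many base pairs


def build_design():
    """Enumerate the full factorial design: 2 x 3 x 3 x 3 = 54 fragments."""
    return [
        {
            "methylation": meth,
            "length_bp": length,
            "gc_percent": gc,
            "cpg_count": length // spacing,  # CpGs implied by the spacing
        }
        for meth, length, gc, spacing in product(
            METHYLATION, LENGTHS_BP, GC_PERCENT, CPG_SPACING_BP
        )
    ]


def design_matrix(read_counts, rows):
    """Columns: intercept, read count, fragment length, G+C %, CpG count."""
    return np.column_stack(
        [
            np.ones(len(rows)),
            np.asarray(read_counts, dtype=float),
            [r["length_bp"] for r in rows],
            [r["gc_percent"] for r in rows],
            [r["cpg_count"] for r in rows],
        ]
    )


def fit_linear_model(read_counts, rows, picomoles):
    """Hypothetical stand-in for the GLM: least-squares fit, identity link."""
    X = design_matrix(read_counts, rows)
    coef, *_ = np.linalg.lstsq(X, np.asarray(picomoles, dtype=float), rcond=None)
    return coef


design = build_design()
methylated = [r for r in design if r["methylation"] == "methylated"]

# Synthetic, noise-free data (NOT from the study) to exercise the fit.
rng = np.random.default_rng(0)
read_counts = rng.integers(100, 10_000, size=len(methylated)).astype(float)
assumed_coef = np.array([1e-3, 2e-6, -1e-6, 1e-5, -2e-5])  # arbitrary values
picomoles = design_matrix(read_counts, methylated) @ assumed_coef

coef = fit_linear_model(read_counts, methylated, picomoles)
predicted = design_matrix(read_counts, methylated) @ coef

print(len(design))  # 54
```

In practice the model would be fitted on the observed read counts of the spike-in fragments, whose input amounts are known, and then applied to read counts from genomic windows of the same sample. The exact model terms used by spiky should be taken from its documentation.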

ORGANISM(S): synthetic construct, Homo sapiens

PROVIDER: GSE166259 | GEO | 2021/02/16

REPOSITORIES: GEO

Similar Datasets

| phs000608 | dbGaP
2022-08-18 | GSE211508 | GEO
| phs001255 | dbGaP
| phs000846 | dbGaP
| PRJNA326160 | ENA
2016-12-26 | GSE81806 | GEO
2016-12-26 | GSE81736 | GEO
2009-09-19 | E-MTAB-144 | biostudies-arrayexpress
2023-12-10 | PXD042368 | Pride
2020-02-21 | GSE144781 | GEO