
Dataset Information

METHimpute: imputation-guided construction of complete methylomes from WGBS data.


ABSTRACT:
BACKGROUND: Whole-genome bisulfite sequencing (WGBS) has become the standard method for interrogating plant methylomes at base resolution. However, deep WGBS measurements remain cost-prohibitive for large, complex genomes and for population-level studies. As a result, most published plant methylomes are sequenced far below saturation, with a large proportion of cytosines having either missing data or insufficient coverage.
RESULTS: Here we present METHimpute, a Hidden Markov Model (HMM)-based imputation algorithm for the analysis of WGBS data. Unlike existing methods, METHimpute enables the construction of complete methylomes by inferring the methylation status and level of all cytosines in the genome regardless of coverage. Application of METHimpute to maize, rice and Arabidopsis shows that the algorithm infers cytosine-resolution methylomes with high accuracy from data with coverage as low as 6X, compared to data with 60X, making it a cost-effective solution for large-scale studies.
CONCLUSIONS: METHimpute provides methylation status calls and levels for all cytosines in the genome regardless of coverage, yielding complete methylomes even from low-coverage WGBS datasets. The method has been extensively tested in plants, but should also be applicable to other species. An implementation is available on Bioconductor.

SUBMITTER: Taudt A 

PROVIDER: S-EPMC5992726 | biostudies-literature | 2018 Jun

REPOSITORIES: biostudies-literature

Publications

METHimpute: imputation-guided construction of complete methylomes from WGBS data.

Taudt A, Roquis D, Vidalis A, Wardenaar R, Johannes F, Colomé-Tatché M

BMC Genomics, 2018 Jun 7, issue 1



Similar Datasets

| S-EPMC8756163 | biostudies-literature
| S-EPMC6427844 | biostudies-literature
| S-EPMC7567826 | biostudies-literature
| S-EPMC5862356 | biostudies-literature
| S-EPMC8297389 | biostudies-literature
| S-EPMC5504746 | biostudies-literature
| S-EPMC8493217 | biostudies-literature
| S-EPMC5056679 | biostudies-literature
| S-EPMC7407276 | biostudies-literature
| S-EPMC2795971 | biostudies-literature