
Dataset Information


Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data.


ABSTRACT: As reference genome assemblies are updated, there is a need to convert epigenome sequence data from older genome assemblies to newer versions, to facilitate data integration and visualization on the same coordinate system. Conversion can be done by re-alignment of the original sequence data to the new assembly or by converting the coordinates of the data between assemblies using a mapping file, an approach referred to as 'liftover'. Compared to re-alignment approaches, liftover is a more rapid and cost-effective solution. Here, we benchmark six liftover tools commonly used for conversion between genome assemblies by coordinates, including UCSC liftOver, rtracklayer::liftOver, CrossMap, NCBI Remap, flo and segment_liftover, to determine how they performed for whole genome bisulphite sequencing (WGBS) and ChIP-seq data. Our results show high correlation between the six tools for conversion of 43 WGBS paired samples. For the chromatin sequencing data, interval conversion of 366 ChIP-seq datasets showed that segment_liftover generates more reliable results than UCSC liftOver. However, we found that some regions do not always remain the same after liftover. To further increase the accuracy of liftover and avoid misleading results, we developed a three-step guideline that removes aberrant regions to ensure more robust genome conversion between reference assemblies.
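
The abstract describes liftover as converting coordinates between assemblies using a mapping file. As a rough illustration of that idea only, the sketch below lifts an interval through a toy table of aligned blocks. It is not the algorithm of any of the six benchmarked tools: the `Block` type, the `TOY_BLOCKS` values and the `lift_interval` function are hypothetical names and invented data, and the sketch assumes forward-strand, ungapped blocks with 0-based half-open coordinates.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One ungapped aligned block between two assemblies (toy model, forward strand only)."""
    chrom: str
    old_start: int  # 0-based, inclusive, on the old assembly
    old_end: int    # 0-based, exclusive, on the old assembly
    new_chrom: str
    new_start: int  # 0-based start of the same block on the new assembly


# Hypothetical mapping: these values are invented for illustration and are
# not taken from any real chain or mapping file.
TOY_BLOCKS: List[Block] = [
    Block("chr1", 0, 1000, "chr1", 0),
    Block("chr1", 1000, 2000, "chr1", 1500),  # shifted by 500 bp in the new assembly
]


def lift_interval(
    chrom: str, start: int, end: int, blocks: List[Block]
) -> Optional[Tuple[str, int, int]]:
    """Lift [start, end) to the new assembly.

    Returns None when the interval is not fully contained in a single block,
    which is a deliberately strict rule chosen for this sketch.
    """
    for b in blocks:
        if b.chrom == chrom and b.old_start <= start and end <= b.old_end:
            offset = b.new_start - b.old_start
            return (b.new_chrom, start + offset, end + offset)
    return None


if __name__ == "__main__":
    print(lift_interval("chr1", 1200, 1300, TOY_BLOCKS))
    print(lift_interval("chr1", 900, 1100, TOY_BLOCKS))
```

In this toy model an interval that straddles two blocks is reported as unmapped rather than being split or stretched. Real tools make different choices for such intervals, which is one way a region can fail to remain the same after liftover.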

SUBMITTER: Luu PL 

PROVIDER: S-EPMC7671393 | biostudies-literature | 2020 Sep

REPOSITORIES: biostudies-literature


Publications

Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data.

Luu Phuc-Loi PL, Ong Phuc-Thinh PT, Dinh Thanh-Phuoc TP, Clark Susan J SJ

NAR Genomics and Bioinformatics, 2020-08-06, issue 3



Similar Datasets

| S-EPMC7736078 | biostudies-literature
| S-EPMC6937713 | biostudies-literature
| S-EPMC8617278 | biostudies-literature
| S-EPMC3673212 | biostudies-literature
| S-EPMC3956068 | biostudies-literature
| S-EPMC3620462 | biostudies-literature
| S-EPMC8578599 | biostudies-literature
| S-EPMC5799157 | biostudies-literature