Dataset Information

HiTea: a computational pipeline to identify non-reference transposable element insertions in Hi-C data.


ABSTRACT: Hi-C is a common technique for assessing 3D chromatin conformation. Recent studies have shown that long-range interaction information in Hi-C data can be used to generate chromosome-length genome assemblies and to identify large-scale structural variations. Here, we demonstrate the use of Hi-C data in detecting mobile transposable element (TE) insertions genome-wide. Our pipeline, Hi-C-based TE analyzer (HiTea), capitalizes on clipped Hi-C reads and is aided by the high proportion of discordant read pairs in Hi-C data to detect insertions of three major families of active human TEs. Despite the uneven genome coverage in Hi-C data, HiTea is competitive with existing callers based on whole-genome sequencing (WGS) data and can supplement the WGS-based characterization of the TE-insertion landscape. We employ the pipeline to identify TE insertions from human cell-line Hi-C samples.
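
The two read-level signals named in the abstract, clipped reads and discordant read pairs, can be illustrated with a minimal sketch. This is not HiTea's implementation: the function names `soft_clips` and `is_discordant` and the `max_insert` threshold are hypothetical, chosen only for illustration. The sketch assumes alignments carry standard SAM CIGAR strings.

```python
import re

# CIGAR operation codes as defined in the SAM format.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (left, right) soft-clipped base counts for a CIGAR string.

    Hypothetical helper for illustration; not part of HiTea.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # Hard clips (H) can only flank the alignment, so dropping them
    # leaves any soft clips (S) at the ends of the list.
    ops = [(n, op) for n, op in ops if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


def is_discordant(chrom1, pos1, chrom2, pos2, max_insert=1000):
    """Flag a pair whose mates map to different chromosomes or lie
    farther apart than max_insert bases.

    The 1000-base default is an assumed value, not one taken from HiTea.
    """
    return chrom1 != chrom2 or abs(pos1 - pos2) > max_insert


if __name__ == "__main__":
    print(soft_clips("30S70M"))
    print(is_discordant("chr1", 100, "chr1", 5000))
```

A read with a long soft clip at one end is a candidate for spanning an insertion breakpoint, and a cluster of such reads at the same position strengthens the call. In Hi-C data many pairs are discordant by design, so a real caller needs further filtering beyond this simple distance test.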

Availability and implementation

HiTea is available at https://github.com/parklab/HiTea and as a Docker image.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Jain D 

PROVIDER: S-EPMC8599941 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC5555480 | biostudies-literature
| S-EPMC8440170 | biostudies-literature
| S-EPMC8906366 | biostudies-literature
| S-EPMC4460805 | biostudies-other
| S-EPMC10381067 | biostudies-literature
2023-08-24 | GSE210585 | GEO
| S-EPMC7061991 | biostudies-literature
| S-EPMC8219666 | biostudies-literature
| S-EPMC6853696 | biostudies-literature
| S-EPMC3914624 | biostudies-literature