Dataset Information

HiTea: a computational pipeline to identify non-reference transposable element insertions in Hi-C data.

ABSTRACT: Hi-C is a common technique for assessing 3D chromatin conformation. Recent studies have shown that long-range interaction information in Hi-C data can be used to generate chromosome-length genome assemblies and identify large-scale structural variations. Here, we demonstrate the use of Hi-C data in detecting mobile transposable element (TE) insertions genome-wide. Our pipeline Hi-C-based TE analyzer (HiTea) capitalizes on clipped Hi-C reads and is aided by a high proportion of discordant read pairs in Hi-C data to detect insertions of three major families of active human TEs. Despite the uneven genome coverage in Hi-C data, HiTea is competitive with the existing callers based on whole-genome sequencing (WGS) data and can supplement the WGS-based characterization of the TE-insertion landscape. We employ the pipeline to identify TE-insertions from human cell-line Hi-C samples.
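As a rough illustration of what "clipped reads" means in the abstract above, the following minimal sketch reads the soft-clip lengths off a SAM CIGAR string. It is not HiTea's implementation: the function name `clipped_ends` and the `min_clip` threshold of 20 bases are assumptions made for this example only. In Hi-C data a clip can also come from an ordinary ligation junction, so a clip on its own is only a candidate signal.

```python
import re

# CIGAR operation codes as defined in the SAM format specification.
_CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_ends(cigar, min_clip=20):
    """Return (left, right) soft-clip lengths; a side below min_clip is reported as 0.

    Hypothetical helper for illustration; min_clip=20 is an assumed threshold.
    """
    ops = [(int(length), op) for length, op in _CIGAR_RE.findall(cigar)]
    # Hard clips (H) can sit outside soft clips, so drop them first.
    core = [(length, op) for length, op in ops if op != "H"]
    if not core:
        # Covers unaligned reads, whose CIGAR is "*".
        return (0, 0)
    left = core[0][0] if core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return (left if left >= min_clip else 0,
            right if right >= min_clip else 0)


if __name__ == "__main__":
    for cigar in ("30S70M", "70M30S", "5S90M25S", "100M"):
        print(cigar, clipped_ends(cigar))
```

Reads with a non-zero value on either side would be the ones carried forward for closer inspection, for example by comparing the clipped sequence against TE consensus sequences.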

Availability and implementation

HiTea is available at https://github.com/parklab/HiTea and as a Docker image.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Jain D

PROVIDER: S-EPMC8599941 | biostudies-literature | 2021 May

REPOSITORIES: biostudies-literature

Publications

HiTea: a computational pipeline to identify non-reference transposable element insertions in Hi-C data.

Dhawal Jain, Chong Chu, Burak Han Alver, Soohyun Lee, Eunjung Alice Lee, Peter J Park

Bioinformatics (Oxford, England), 2021 May 1, issue 8

PMID: 33136153

Similar Datasets

McClintock: An Integrated Pipeline for Detecting Transposable Element Insertions in Whole-Genome Shotgun Sequencing Data.
S-EPMC5555480 | biostudies-literature

SnapHiC: a computational pipeline to identify chromatin loops from single-cell Hi-C data.
S-EPMC8440170 | biostudies-literature

Identification and Genotyping of Transposable Element Insertions From Genome Sequencing Data.
S-EPMC8906366 | biostudies-literature

Transposable Element Insertions in Long Intergenic Non-Coding RNA Genes.
S-EPMC4460805 | biostudies-other

Transposable element insertions in 1000 Swedish individuals.
S-EPMC10381067 | biostudies-literature

SnapHiC-D: a computational pipeline to identify differential chromatin contacts from single cell Hi-C data
2023-08-24 | GSE210585 | GEO

Contribution of unfixed transposable element insertions to human regulatory variation.
S-EPMC7061991 | biostudies-literature

Comprehensive identification of transposable element insertions using multiple sequencing technologies.
S-EPMC8219666 | biostudies-literature

Chicdiff: a computational pipeline for detecting differential chromosomal interactions in Capture Hi-C data.
S-EPMC6853696 | biostudies-literature

An age-of-allele test of neutrality for transposable element insertions.
S-EPMC3914624 | biostudies-literature