Dataset Information

Annotating regulatory elements by heterogeneous network embedding.


ABSTRACT:

Motivation

Regulatory elements (REs), such as enhancers and promoters, are regulatory sequences that function within a heterogeneous regulatory network to control gene expression by recruiting transcription regulators and carrying genetic variants in a context-specific way. Annotating these REs relies on costly and labor-intensive next-generation sequencing and RNA-guided editing technologies across many cellular contexts.

Results

We propose a systematic Gene Ontology Annotation method for Regulatory Elements (RE-GOA) that leverages word embedding techniques from natural language processing. We first assemble a heterogeneous network by integrating context-specific regulations, protein-protein interactions and gene ontology (GO) terms. We then perform network embedding and associate regulatory elements with GO terms by assessing their similarity in a low-dimensional vector space. In three applications, we show that RE-GOA outperforms existing methods in annotating transcription factor (TF) binding sites from ChIP-seq data, in functional enrichment analysis of differentially accessible peaks from ATAC-seq data, and in revealing genetic correlation among phenotypes from their GWAS summary statistics.
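The three steps described above (assemble a heterogeneous network, represent each node as a vector, rank GO terms by similarity to a regulatory element) can be illustrated with a minimal sketch. This is not the RE-GOA implementation. The node names, edges and function names are hypothetical, and a deterministic multi-step random-walk profile stands in for the trained low-dimensional embedding, so that the example runs with the Python standard library alone.

```python
from math import sqrt

# Hypothetical toy edges (node, node, edge type); all names are illustrative only.
EDGES = [
    ("RE1", "GeneA", "regulation"),
    ("RE2", "GeneB", "regulation"),
    ("GeneA", "GeneC", "ppi"),
    ("GeneB", "GeneD", "ppi"),
    ("GeneA", "GO:X", "annotation"),
    ("GeneC", "GO:X", "annotation"),
    ("GeneB", "GO:Y", "annotation"),
    ("GeneD", "GO:Y", "annotation"),
]


def build_adjacency(edges):
    """Merge all edge types into one undirected heterogeneous network."""
    adj = {}
    for u, v, _kind in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def walk_profile(adj, start, steps=4):
    """Sum of the 1..steps-step random-walk visit probabilities from `start`.

    Assumption: this sparse vector is a stand-in for a learned embedding.
    """
    current = {start: 1.0}
    profile = {}
    for _ in range(steps):
        nxt = {}
        for node, prob in current.items():
            share = prob / len(adj[node])
            for neighbor in adj[node]:
                nxt[neighbor] = nxt.get(neighbor, 0.0) + share
        for node, prob in nxt.items():
            profile[node] = profile.get(node, 0.0) + prob
        current = nxt
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(val * b.get(key, 0.0) for key, val in a.items())
    norm_a = sqrt(sum(val * val for val in a.values()))
    norm_b = sqrt(sum(val * val for val in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_go_terms(adj, element, steps=4):
    """Rank every GO-term node by similarity to one regulatory element."""
    vec = walk_profile(adj, element, steps)
    go_terms = [node for node in adj if node.startswith("GO:")]
    scored = [(term, cosine(vec, walk_profile(adj, term, steps))) for term in go_terms]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


adj = build_adjacency(EDGES)
ranking = rank_go_terms(adj, "RE1")
print(ranking)
```

In this toy network RE1 reaches GO:X through GeneA and GeneC and has no path to GO:Y, so GO:X ranks first and GO:Y scores zero. A real pipeline would replace `walk_profile` with a trained embedding and would need to handle edge types and weights, which this sketch ignores.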

Availability and implementation

The source code and the systematic RE annotation for human and mouse are available at https://github.com/AMSSwanglab/RE-GOA.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Lu Y 

PROVIDER: S-EPMC9326849 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC7646926 | biostudies-literature
S-EPMC6927100 | biostudies-literature
S-EPMC8098024 | biostudies-literature
S-EPMC10130187 | biostudies-literature
S-EPMC7017286 | biostudies-literature
S-EPMC8223753 | biostudies-literature
S-EPMC8465678 | biostudies-literature
S-EPMC8414716 | biostudies-literature
S-EPMC1309705 | biostudies-literature
S-EPMC10998536 | biostudies-literature