Dataset Information

Ranking of non-coding pathogenic variants and putative essential regions of the human genome.


ABSTRACT: A gene is considered essential if loss of function results in loss of viability, fitness or in disease. This concept is well established for coding genes; however, non-coding regions are thought less likely to be determinants of critical functions. Here we train a machine learning model using functional, mutational and structural features, including new genome essentiality metrics, 3D genome organization and enhancer reporter data to identify deleterious variants in non-coding regions. We assess the model for functional correlates by using data from tiling-deletion-based and CRISPR interference screens of activity of cis-regulatory elements in over 3 Mb of genome sequence. Finally, we explore two use cases that involve indels and the disruption of enhancers associated with a developmental disease. We rank variants in the non-coding genome according to their predicted deleteriousness. The model prioritizes non-coding regions associated with regulation of important genes and with cell viability, an in vitro surrogate of essentiality.

SUBMITTER: Wells A 

PROVIDER: S-EPMC6868241 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC5550444 | biostudies-literature
S-EPMC8754628 | biostudies-literature
S-EPMC3349421 | biostudies-other
S-EPMC4572001 | biostudies-literature
S-EPMC6817816 | biostudies-literature
S-EPMC7027195 | biostudies-literature
S-EPMC5929547 | biostudies-literature
S-EPMC7158377 | biostudies-literature
S-EPMC9896476 | biostudies-literature
S-EPMC10583284 | biostudies-literature