Transmicron: Accurate prediction of insertion probabilities improves detection of cancer driver genes from transposon mutagenesis screens
ABSTRACT: Transposon screens are powerful in vivo assays used to identify loci driving carcinogenesis. These loci are identified as Common Insertion Sites (CIS), i.e., regions with more transposon insertions than expected by chance. However, the identification of CIS is strongly affected by biases in the insertion behaviour of transposon systems. Here, we introduce Transmicron, a novel method that differs from previous methods in two ways: (i) it models neutral insertion rates based on chromatin accessibility, transcriptional activity, and sequence context, and (ii) it estimates oncogenic selection for each genomic region using Poisson regression to model insertion counts while controlling for neutral insertion rates. To assess the benefits of our approach, we generated a dataset by applying two different transposon systems under comparable conditions. Benchmarking for enrichment of known cancer genes showed improved performance of Transmicron compared with state-of-the-art methods. Modelling neutral insertion rates allowed for better control of false positives and stronger agreement of the results between transposon systems. Moreover, using Poisson regression to consider intra-sample and inter-sample information proved beneficial in small and moderately sized datasets. Transmicron is open-source and freely available. Overall, this study contributes to the understanding of transposon biology and introduces a novel approach for using this knowledge to discover cancer driver genes.
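The idea of modelling insertion counts with Poisson regression while controlling for neutral insertion rates can be illustrated with a minimal sketch. This is not Transmicron's implementation: the function name `poisson_offset_fit`, the intercept-only model, and the likelihood-ratio test are assumptions made for illustration, and the neutral expected counts are assumed to come from a separately fitted neutral model. For one genomic region, observed counts are modelled as Poisson with mean `neutral_expected * exp(beta)`, so `beta > 0` suggests more insertions than the neutral rate predicts.

```python
import math


def poisson_offset_fit(counts, neutral_expected):
    """Intercept-only Poisson regression with offset log(neutral_expected).

    Hypothetical helper, not part of Transmicron.
    Model: counts[i] ~ Poisson(neutral_expected[i] * exp(beta)).
    Returns (beta_hat, lrt_statistic, p_value) for the null beta = 0.
    """
    if len(counts) != len(neutral_expected):
        raise ValueError("counts and neutral_expected must have equal length")
    total_obs = sum(counts)
    total_exp = sum(neutral_expected)
    if total_exp <= 0:
        raise ValueError("neutral expected counts must sum to a positive value")

    if total_obs == 0:
        # No insertions observed: the MLE diverges to minus infinity.
        beta_hat = float("-inf")
        stat = 2.0 * total_exp
    else:
        # Closed-form MLE for a single coefficient with an offset.
        beta_hat = math.log(total_obs / total_exp)
        # Twice the log-likelihood difference between beta_hat and beta = 0.
        stat = 2.0 * (total_obs * beta_hat - (total_obs - total_exp))
        stat = max(stat, 0.0)  # guard against tiny negative rounding error

    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return beta_hat, stat, p_value


# Illustrative values only: insertion counts for one region across three
# samples, and the counts a neutral model would expect for the same samples.
beta, stat, p = poisson_offset_fit([3, 5, 4], [1.0, 1.0, 1.0])
print(f"beta={beta:.3f} LRT={stat:.2f} p={p:.2e}")
```

Pooling counts across samples in this way is one simple reading of combining intra-sample and inter-sample information; a fuller model would include per-sample covariates and correct for testing many regions.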
ORGANISM(S): Mus musculus
PROVIDER: GSE214379 | GEO | 2022/10/01
REPOSITORIES: GEO