Dataset Information

IsoMiRmap: fast, deterministic and exhaustive mining of isomiRs from short RNA-seq datasets.

ABSTRACT:

Motivation

MicroRNA (miRNA) precursor arms simultaneously give rise to multiple isoforms, called 'isomiRs.' IsomiRs from the same arm typically differ by a few nucleotides at their 5' terminus, their 3' terminus, or both. In humans, the identities and abundances of isomiRs depend on a person's sex and genetic ancestry as well as on tissue type, tissue state and disease type/subtype. Moreover, nearly half of the time the most abundant isomiR differs from the miRNA sequence found in public databases. Accurate mining of isomiRs from deep sequencing data is thus important.

Results

We developed isoMiRmap, a fast, standalone, user-friendly mining tool that identifies and quantifies all isomiRs by directly processing short RNA-seq datasets. IsoMiRmap is a portable 'plug-and-play' tool, requires minimal setup, has modest computing and storage requirements, and can process an RNA-seq dataset with 50 million reads in just a few minutes on an average laptop. IsoMiRmap deterministically and exhaustively reports all isomiRs in a given deep sequencing dataset and quantifies them accurately (no double-counting). IsoMiRmap comprehensively reports all miRNA precursor locations from which an isomiR may be transcribed, tags as 'ambiguous' isomiRs whose sequences exist both inside and outside of the space of known miRNA sequences and reports the public identifiers of common single-nucleotide polymorphisms and documented somatic mutations that may be present in an isomiR. IsoMiRmap also identifies isomiRs with 3' non-templated post-transcriptional additions. Compared to similar tools, isoMiRmap is the fastest, reports more bona fide isomiRs, and provides the most comprehensive information related to an isomiR's transcriptional origin.
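To make the terminology concrete, the following is a minimal sketch of how a read could be described relative to a reference miRNA: by its 5' and 3' offsets within the precursor and by any 3' non-templated tail. It is not isoMiRmap's implementation. The function name `label_isomir`, the offset sign convention (negative means upstream of the reference end), the `max_tail` limit, and the toy precursor sequence are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class IsomirLabel(NamedTuple):
    offset5: int  # read start minus reference start (negative = upstream)
    offset3: int  # templated read end minus reference end
    tail: str     # 3' nucleotides not matching the precursor ("" if none)


def label_isomir(read: str, precursor: str, ref_start: int, ref_end: int,
                 max_tail: int = 3) -> Optional[IsomirLabel]:
    """Describe `read` relative to the reference miRNA precursor[ref_start:ref_end].

    Tries the full read first, then trims up to `max_tail` nucleotides from
    the 3' end, treating the trimmed part as a non-templated addition.
    Only the first exact match in the precursor is considered (an
    illustrative simplification).
    """
    for tail_len in range(max_tail + 1):
        templated = read[:len(read) - tail_len]
        if not templated:
            break
        start = precursor.find(templated)
        if start != -1:
            end = start + len(templated)
            return IsomirLabel(start - ref_start, end - ref_end,
                               read[len(templated):])
    return None


# Toy data (hypothetical sequence, not a real miRNA precursor).
PRECURSOR = "ACGTTGCATGGCTAAGCTTCGGATCCAGTCA"
REF_START, REF_END = 5, 27           # reference miRNA = PRECURSOR[5:27]
REFERENCE = PRECURSOR[REF_START:REF_END]

print(label_isomir(REFERENCE, PRECURSOR, REF_START, REF_END))
print(label_isomir(PRECURSOR[4:26], PRECURSOR, REF_START, REF_END))
print(label_isomir(REFERENCE + "TT", PRECURSOR, REF_START, REF_END))
```

In this sketch the reference itself is labeled (0, 0, ""), a read shifted one nucleotide upstream at both ends is labeled (-1, -1, ""), and a reference read carrying two extra 3' thymines is labeled (0, 0, "TT"). A tail whose first nucleotide happens to match the precursor cannot be told apart from a templated extension, which is one reason a real tool needs more careful bookkeeping than this.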

Availability and implementation

The code for isoMiRmap is freely available at https://cm.jefferson.edu/isoMiRmap/ and https://github.com/TJU-CMC-Org/isoMiRmap/. IsomiR profiles for the datasets of the 1000 Genomes Project, spanning five population groups, and The Cancer Genome Atlas (TCGA), spanning 33 cancer studies, are also available at https://cm.jefferson.edu/isoMiRmap/.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Loher P

PROVIDER: S-EPMC8317110 | biostudies-literature | 2021 Jul

REPOSITORIES: biostudies-literature

Publications

IsoMiRmap: fast, deterministic and exhaustive mining of isomiRs from short RNA-seq datasets.

Phillipe Loher, Nestoras Karathanasis, Eric Londin, Paul F. Bray, Venetia Pliatsika, Aristeidis G. Telonis, Isidore Rigoutsos

Bioinformatics (Oxford, England) 20210701 13

PMID: 33471076

Similar Datasets

MINTmap: fast and exhaustive profiling of nuclear and mitochondrial tRNA fragments from short RNA-seq data.
| S-EPMC5318995 | biostudies-literature

Threshold-seq: a tool for determining the threshold in short RNA-seq datasets.
| S-EPMC5870860 | biostudies-literature

BATMAN: Fast and Accurate Integration of Single-Cell RNA-Seq Datasets via Minimum-Weight Matching.
| S-EPMC7276436 | biostudies-literature

Deterministic column subset selection for single-cell RNA-Seq.
| S-EPMC6347249 | biostudies-literature

VODKA2: A fast and accurate method to detect non-standard viral genomes from large RNA-seq datasets.
| S-EPMC10168208 | biostudies-literature

RNA splicing analysis using heterogeneous and large RNA-seq datasets.
| S-EPMC9984406 | biostudies-literature

Processing single-cell RNA-seq datasets using SingCellaR.
| S-EPMC8980964 | biostudies-literature

HRT Atlas v1.0 database: redefining human and mouse housekeeping genes and candidate reference transcripts by mining massive RNA-seq datasets.
| S-EPMC7778946 | biostudies-literature

Mining RNA-seq data for infections and contaminations.
| S-EPMC3760913 | biostudies-literature

Single-cell RNA-seq clustering: datasets, models, and algorithms.
| S-EPMC7549635 | biostudies-literature