Dataset Information

Identifying centromeric satellites with dna-brnn.


ABSTRACT: SUMMARY: Human alpha satellite and satellite 2/3 together make up several percent of the human genome. However, identifying these sequences with traditional algorithms is computationally intensive. Here we develop dna-brnn, a recurrent neural network that learns the sequences of the two classes of centromeric repeats. Its annotations are highly similar to those of RepeatMasker, and it is many times faster. Dna-brnn explores a novel application of deep learning and may accelerate the study of the evolution of the two repeat classes. AVAILABILITY AND IMPLEMENTATION: https://github.com/lh3/dna-nn.

SUBMITTER: Li H 

PROVIDER: S-EPMC6821349 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature

Similar Datasets

2018-01-08 | GSE105100 | GEO
| S-EPMC5844345 | biostudies-literature
| PRJNA16922 | ENA
| S-EPMC7756202 | biostudies-literature
| S-EPMC2725476 | biostudies-literature
| PRJNA414639 | ENA
| S-EPMC2725230 | biostudies-literature
| S-EPMC6657561 | biostudies-literature
| S-EPMC9295943 | biostudies-literature
| PRJEB15858 | ENA