
Dataset Information


Deep6mA: A deep learning framework for exploring similar patterns in DNA N6-methyladenine sites across different species.


ABSTRACT: N6-methyladenine (6mA) is an important DNA modification associated with a wide range of biological processes. Accurately identifying 6mA sites on a genomic scale is crucial for understanding 6mA's biological functions. However, the existing experimental techniques for detecting 6mA sites are not cost-effective, which creates a strong need for new computational methods for this problem. In this paper, we developed Deep6mA, a deep learning framework that identifies DNA 6mA sites without requiring any prior knowledge of 6mA or manually crafted sequence features; its performance is superior to that of other DNA 6mA prediction tools. Specifically, 5-fold cross-validation on a benchmark dataset of rice gives a sensitivity and specificity for Deep6mA of 92.96% and 95.06%, respectively, and an overall prediction accuracy of 94%. Importantly, we find that sequences with 6mA sites share similar patterns across different species. The model trained with rice data predicts well the 6mA sites of three other species, Arabidopsis thaliana, Fragaria vesca and Rosa chinensis, with a prediction accuracy over 90%. In addition, we find that (1) 6mA tends to occur at GAGG motifs, which means the sequence near the 6mA site may be conserved; (2) 6mA is enriched in the TATA box of the promoter, which may be the main way it regulates downstream gene expression.
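
The three figures quoted in the abstract are related through the confusion matrix. The following minimal Python sketch shows how sensitivity, specificity and accuracy are computed from true/false positive and negative counts. The function name and the counts are hypothetical and are not taken from the paper; the counts assume equally sized positive and negative classes (an assumption, not stated in this record) and were chosen only to match the reported rates. Under that assumption the accuracy is the mean of sensitivity and specificity, which is consistent with the reported 94%.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                  # fraction of true 6mA sites recovered
    specificity = tn / (tn + fp)                  # fraction of non-6mA sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)    # fraction of all sites classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts: 10,000 positives and 10,000 negatives (balanced classes
# are an assumption), chosen to reproduce the rates quoted in the abstract.
sens, spec, acc = classification_metrics(tp=9296, fn=704, tn=9506, fp=494)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

If the classes were not balanced, accuracy would instead be a weighted average of the two rates, weighted by the class sizes.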

SUBMITTER: Li Z 

PROVIDER: S-EPMC7924747 | biostudies-literature | 2021 Feb

REPOSITORIES: biostudies-literature


Publications

Deep6mA: A deep learning framework for exploring similar patterns in DNA N6-methyladenine sites across different species.

Li Zutan, Jiang Hangjin, Kong Lingpeng, Chen Yuanyuan, Lang Kun, Fan Xiaodan, Zhang Liangyun, Pian Cong

PLoS Computational Biology, 2021-02-18, issue 2



Similar Datasets

| S-EPMC6797597 | biostudies-literature
| S-EPMC7398112 | biostudies-literature
| S-EPMC7038560 | biostudies-literature
| S-EPMC6746913 | biostudies-literature
| S-EPMC8017269 | biostudies-literature
| S-EPMC7214014 | biostudies-literature
| S-EPMC7185115 | biostudies-literature
| S-EPMC7509169 | biostudies-literature
| S-EPMC6379983 | biostudies-literature
| S-EPMC8383060 | biostudies-literature