
Dataset Information


An Information Entropy-Based Approach for Computationally Identifying Histone Lysine Butyrylation.


ABSTRACT: Butyrylation plays a crucial role in cellular processes. Owing to the limitations of current techniques, identifying histone butyrylation sites on a large scale remains challenging. To fill this gap, we propose an approach based on information entropy and machine learning for computationally identifying histone butyrylation sites. The proposed method achieves an area under the receiver operating characteristic (ROC) curve of 0.92 on the training set under 3-fold cross-validation and 0.80 on an independent test set. Feature analysis suggests that amino acid residues upstream and downstream of butyrylation sites exhibit specific sequence motifs to a certain extent. Functional analysis suggests that histone butyrylation is most likely associated with four pathways (systemic lupus erythematosus, alcoholism, viral carcinogenesis, and transcriptional misregulation in cancer); is involved in binding to other molecules and in processes of biosynthesis, assembly, arrangement, or disassembly; and is located in complexes consisting of DNA, RNA, protein, and other components. The proposed method is useful for predicting histone butyrylation sites. The feature and functional analyses improve understanding of histone butyrylation and of the functions of butyrylated histones.
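
The abstract names information entropy as the basis of the features but does not give the encoding. As a rough illustration only, the sketch below computes the Shannon entropy of the residue distribution at each position of a set of equal-length peptide windows. The function name `positional_entropy` and the example windows are hypothetical and are not taken from the study; the authors' actual feature construction may differ.

```python
import math
from collections import Counter


def positional_entropy(windows):
    """Shannon entropy (in bits) of the residue distribution at each position.

    `windows` is a list of equal-length peptide strings. This is a generic
    illustration, not the feature encoding used in the paper.
    """
    if not windows:
        return []
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")

    entropies = []
    for pos in range(length):
        # Count how often each residue occurs at this position.
        counts = Counter(w[pos] for w in windows)
        total = sum(counts.values())
        # H = sum over residues of -p * log2(p)
        h = sum(-(c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


# Hypothetical 5-residue windows centred on a lysine (K); not data from the study.
example_windows = ["AGKQL", "AGKRL", "SGKQL", "TGKRL"]
print(positional_entropy(example_windows))
```

A position where every window carries the same residue has entropy 0, while positions with more varied residues score higher, which is one simple way to quantify how conserved the surroundings of a candidate site are.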

SUBMITTER: Huang G 

PROVIDER: S-EPMC7033570 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature


Publications

An Information Entropy-Based Approach for Computationally Identifying Histone Lysine Butyrylation.

Huang G, Zheng Y, Wu YQ, Han GS, Yu ZG

Frontiers in Genetics, 2020-02-14



Similar Datasets

| S-EPMC6154804 | biostudies-literature
| S-EPMC7021909 | biostudies-literature
| S-EPMC7712588 | biostudies-literature
| S-EPMC3715501 | biostudies-literature
| S-EPMC2911958 | biostudies-literature
| S-EPMC7495898 | biostudies-literature
| S-EPMC2921183 | biostudies-literature
| S-EPMC7064923 | biostudies-literature
| S-EPMC5741255 | biostudies-literature
2018-09-08 | GSE111308 | GEO