
Dataset Information


Predicting Apoptosis Protein Subcellular Locations based on the Protein Overlapping Property Matrix and Tri-Gram Encoding.


ABSTRACT: To reveal the working pattern of programmed cell death, knowledge of the subcellular location of apoptosis proteins is essential. Alongside costly and time-consuming experimental determination, research into computational prediction schemes, focusing mainly on new representations of protein sequences and on the selection of classification algorithms, has become popular in recent decades. In this study, a novel tri-gram encoding model based on the protein overlapping property matrix (POPM) is proposed for predicting apoptosis protein subcellular location. With this model, a 1000-dimensional feature vector is built to represent each protein. Finally, support vector machine-recursive feature elimination (SVM-RFE) is used to select the optimal features, which are fed into a support vector machine (SVM) classifier for prediction. The results of jackknife tests on two benchmark datasets demonstrate that the proposed method can achieve satisfactory prediction performance while requiring less computing capacity, and that it could serve as a promising tool for predicting the subcellular locations of apoptosis proteins.
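The abstract does not spell out how the 1000-dimensional vector is formed. The sketch below shows one plausible reading, in which ten overlapping residue properties give 10 x 10 x 10 = 1000 ordered property triples over consecutive residues. It is an illustration under stated assumptions, not the authors' implementation: the ten property groups, the normalisation by window count, and the names `PROPERTY_GROUPS`, `property_matrix` and `trigram_features` are all hypothetical and do not come from this record.

```python
from itertools import product

# ASSUMPTION: ten illustrative, overlapping residue property groups.
# These are placeholders, not the POPM definition used in the paper.
PROPERTY_GROUPS = [
    set("AVLIMFWC"),    # hydrophobic-like
    set("STNQYHKRDE"),  # polar
    set("KRH"),         # positively charged
    set("DE"),          # negatively charged
    set("FWYH"),        # aromatic
    set("ILV"),         # aliphatic
    set("AGSCTDNPV"),   # small
    set("AGS"),         # tiny
    set("P"),           # proline
    set("KRHDE"),       # charged
]


def property_matrix(seq):
    """Return a 10 x L binary matrix; entry [p][k] is 1 if residue k has property p."""
    return [[1 if aa in group else 0 for aa in seq] for group in PROPERTY_GROUPS]


def trigram_features(seq):
    """Build a 1000-dimensional vector of property tri-gram frequencies.

    For each ordered triple (p, q, r), count the windows where residues
    k, k+1, k+2 carry properties p, q, r, then divide by the number of
    windows (ASSUMPTION: this normalisation is chosen for illustration).
    """
    n = len(PROPERTY_GROUPS)
    windows = len(seq) - 2
    if windows <= 0:
        return [0.0] * n ** 3
    m = property_matrix(seq)
    feats = []
    for p, q, r in product(range(n), repeat=3):
        total = sum(m[p][k] * m[q][k + 1] * m[r][k + 2] for k in range(windows))
        feats.append(total / windows)
    return feats


vec = trigram_features("MKVLAAGDE")
print(len(vec))
```

Because the groups overlap, a single three-residue window can contribute to several of the 1000 features at once, which is what distinguishes this from tri-gram counting over a plain reduced alphabet. A vector of this kind would then be the input to feature selection and classification.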

SUBMITTER: Yang Y 

PROVIDER: S-EPMC6539631 | biostudies-literature | 2019 May

REPOSITORIES: biostudies-literature


Publications

Predicting Apoptosis Protein Subcellular Locations based on the Protein Overlapping Property Matrix and Tri-Gram Encoding.

Yang Yang, Zheng Huiwen, Wang Chunhua, Xiao Wanyue, Liu Taigang

International Journal of Molecular Sciences, 2019-05-11, issue 9



Similar Datasets

| S-EPMC7421015 | biostudies-literature
| S-EPMC8603309 | biostudies-literature
| S-EPMC1525000 | biostudies-literature
| S-EPMC3900678 | biostudies-literature
| S-EPMC5210537 | biostudies-literature
| S-EPMC8687432 | biostudies-literature
| S-EPMC2396522 | biostudies-literature
| S-EPMC8660898 | biostudies-literature
| S-EPMC3050600 | biostudies-literature
| S-EPMC524420 | biostudies-literature