Dataset Information

BS-KNN: An Effective Algorithm for Predicting Protein Subchloroplast Localization.


ABSTRACT: Chloroplasts are organelles found in cells of green plants and eukaryotic algae that conduct photosynthesis. Knowing a protein's subchloroplast location provides in-depth insights about the protein's function and the microenvironment where it interacts with other molecules. In this paper, we present BS-KNN, a bit-score weighted K-nearest neighbor method for predicting proteins' subchloroplast locations. The method makes predictions based on the bit-score weighted Euclidean distance calculated from the composition of selected pseudo-amino acids. Our method achieved 76.4% overall accuracy in assigning proteins to 4 subchloroplast locations in cross-validation. When tested on an independent set that was not seen by the method during the training and feature selection, the method achieved a consistent overall accuracy of 76.0%. The method was also applied to predict subchloroplast locations of proteins in the chloroplast proteome and validated against proteins in Arabidopsis thaliana. The software and datasets of the proposed method are available at https://edisk.fandm.edu/jing.hu/bsknn/bsknn.html.
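The abstract describes a K-nearest neighbor classifier whose Euclidean distance is weighted by a bit score. The record does not give the exact formula, so the following is only a minimal sketch under stated assumptions: each training protein carries a positive weight (a stand-in for a bit score), the Euclidean distance to that protein is divided by its weight, and the predicted location is the majority label among the k nearest neighbors. The function name, the weighting rule, the feature vectors, and the labels "A" and "B" are all hypothetical and are not taken from the BS-KNN software.

```python
import math
from collections import Counter


def weighted_knn_predict(query, training, k=3):
    """Predict a label for `query` with a weight-scaled k-NN vote.

    `training` is a list of (features, label, weight) tuples, weight > 0.
    ASSUMPTION: dividing the distance by the weight is an illustrative
    choice only; the published method may combine them differently.
    """
    scored = []
    for features, label, weight in training:
        if weight <= 0:
            raise ValueError("weights must be positive")
        # Plain Euclidean distance between the two feature vectors.
        distance = math.dist(query, features)
        # A larger weight pulls the neighbor closer to the query.
        scored.append((distance / weight, label))

    # Keep the k neighbors with the smallest scaled distance.
    scored.sort(key=lambda item: item[0])
    nearest_labels = [label for _, label in scored[:k]]

    # Majority vote; ties go to the label seen first (the nearest one).
    return Counter(nearest_labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy composition-like feature vectors with placeholder labels.
    toy_training = [
        ((0.0, 0.0), "A", 1.0),
        ((0.1, 0.0), "A", 1.0),
        ((1.0, 1.0), "B", 1.0),
        ((1.1, 1.0), "B", 1.0),
    ]
    print(weighted_knn_predict((0.05, 0.0), toy_training, k=3))
```

With equal weights the sketch reduces to ordinary K-nearest neighbor classification; unequal weights let a more strongly matching training protein outrank one that is merely closer in feature space.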

SUBMITTER: Hu J 

PROVIDER: S-EPMC3256996 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

BS-KNN: An Effective Algorithm for Predicting Protein Subchloroplast Localization.

Hu, Jing; Yan, Xianghe

Evolutionary Bioinformatics Online, 2012-01-05


Similar Datasets

| S-EPMC4860209 | biostudies-literature
| S-EPMC7506799 | biostudies-literature
| S-EPMC5896989 | biostudies-literature
| S-EPMC3050600 | biostudies-literature
| S-EPMC524420 | biostudies-literature
| S-EPMC3267700 | biostudies-literature
| S-EPMC186639 | biostudies-literature
| S-EPMC5001230 | biostudies-literature
| S-EPMC3584913 | biostudies-literature
| S-EPMC5489166 | biostudies-literature