
Dataset Information


PCA-HPR: a principle component analysis model for human promoter recognition.


ABSTRACT: We describe a promoter recognition method named PCA-HPR to locate eukaryotic promoter regions and predict transcription start sites (TSSs). We computed codon (3-mer) and pentamer (5-mer) frequencies and created codon and pentamer frequency feature matrices to extract informative and discriminative features for effective classification. Principal component analysis (PCA) is applied to the feature matrices, and a subset of principal components (PCs) is selected for classification. Our system uses three neural network classifiers to distinguish promoters from exons, promoters from introns, and promoters from 3' untranslated regions (3'UTRs). We compared PCA-HPR with three well-known existing promoter prediction systems: DragonGSF, Eponine, and FirstEF. Validation shows that PCA-HPR achieves the best performance of the four predictive systems on all three test sets.
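The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmer_frequencies` and `feature_vector` are hypothetical, and the use of overlapping windows, relative (normalized) frequencies, and skipping of windows that contain ambiguous bases are all assumptions, since the abstract does not specify them. The PCA and neural network stages are not shown.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"

def kmer_frequencies(seq, k):
    """Relative frequency of each of the 4**k possible k-mers in seq.

    Assumption: overlapping windows with a step of 1; the abstract does
    not say whether windows overlap.
    """
    seq = seq.upper()
    # Fixed lexicographic ordering so every sequence maps to the same columns.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made of A/C/G/T are counted; windows with N etc. are skipped.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]

def feature_vector(seq):
    """Concatenate 3-mer (64 values) and 5-mer (1024 values) frequencies."""
    return kmer_frequencies(seq, 3) + kmer_frequencies(seq, 5)
```

Stacking one such vector per sequence gives a feature matrix with 64 codon columns and 1024 pentamer columns, which is the kind of input PCA would then reduce to a smaller set of components.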

SUBMITTER: Li X 

PROVIDER: S-EPMC2533055 | biostudies-literature | 2008 Jun

REPOSITORIES: biostudies-literature


Publications

PCA-HPR: a principle component analysis model for human promoter recognition.

Li Xiaomeng X   Zeng Jia J   Yan Hong H  

Bioinformation 20080619 9



Similar Datasets

| S-EPMC1386710 | biostudies-literature
| S-EPMC4223091 | biostudies-literature
| S-EPMC7999099 | biostudies-literature
| S-EPMC3203156 | biostudies-literature
| S-EPMC5690092 | biostudies-literature
| S-EPMC10569523 | biostudies-literature
| S-EPMC5320558 | biostudies-literature
| S-EPMC4086984 | biostudies-literature
| S-EPMC5643796 | biostudies-literature
| S-EPMC5525078 | biostudies-other