Dataset Information

A prostate cancer model build by a novel SVM-ID3 hybrid feature selection method using both genotyping and phenotype data from dbGaP.


ABSTRACT: Genome Wide Association Studies (GWAS) make it possible to investigate many relations between Single Nucleotide Polymorphisms (SNPs) and complex diseases. GWAS output can be both large in volume and high dimensional, and the relations between SNPs, phenotypes and diseases are most likely nonlinear. To handle high-volume, high-dimensional data and to find these nonlinear relations, we used data mining approaches and designed a hybrid feature selection model that combines a support vector machine and a decision tree. The model was tested on prostate cancer data, and for the first time combined genotype and phenotype information was used to increase diagnostic performance. We were able to select phenotypic features such as ethnicity and body mass index, as well as SNPs that map to specific genes such as CRR9 and TERT. On the prostate cancer dataset, the proposed hybrid model achieved a sensitivity of 90.92% and an area under the ROC curve of 0.91, which shows the potential of the approach for prediction and early detection of prostate cancer.
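
The abstract reports two performance measures, sensitivity and area under the ROC curve. As a reading aid, the sketch below shows how these two quantities are commonly computed from binary labels and classifier scores. It is not the authors' code: the function names `sensitivity` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and have no connection to the dbGaP data.

```python
def sensitivity(y_true, y_pred):
    """Fraction of actual positives (label 1) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive cases")
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """Area under the ROC curve via the rank interpretation:
    the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = case, 0 = control.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("sensitivity:", sensitivity(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

Sensitivity depends on the chosen threshold, while the AUC summarizes ranking quality across all thresholds, which is one reason the two are often reported together.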

SUBMITTER: Yucebas SC 

PROVIDER: S-EPMC3961262 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

A prostate cancer model build by a novel SVM-ID3 hybrid feature selection method using both genotyping and phenotype data from dbGaP.

Sait Can Yücebaş, Yeşim Aydın Son

PLoS One, 2014-03-20, issue 3

Similar Datasets

| S-EPMC8669608 | biostudies-literature
| S-EPMC6466481 | biostudies-literature
| S-EPMC7085772 | biostudies-literature
| S-EPMC8147911 | biostudies-literature
| S-EPMC9044222 | biostudies-literature
| S-EPMC10496003 | biostudies-literature
| S-EPMC5525094 | biostudies-other
| S-EPMC9985192 | biostudies-literature
| S-EPMC3796884 | biostudies-literature
2010-08-27 | GSE23816 | GEO