Dataset Information

Support Vector Machines with Disease-gene-centric Network Penalty for High Dimensional Microarray Data.


ABSTRACT: With the availability of genetic pathways or networks and accumulating knowledge on genes with variants predisposing to diseases (disease genes), we propose a disease-gene-centric support vector machine (DGC-SVM) that directly incorporates these two sources of prior information into building microarray-based classifiers for binary classification problems. DGC-SVM aims to detect genes that cluster together and around key disease genes in a gene network. To achieve this goal, we propose a penalty over suitably defined groups of genes. A hierarchy is imposed on an undirected gene network to facilitate the definition of such gene groups. DGC-SVM uses the hinge loss penalized by a sum of L∞-norms, one applied to each group. Simulation studies show that DGC-SVM not only detects more disease genes along pathways than the standard SVM and the SVM with an L1-penalty (L1-SVM), but also captures disease genes that may affect the outcome only weakly. Two real data applications demonstrate that DGC-SVM improves gene selection while achieving predictive performance comparable to the standard SVM and L1-SVM. The proposed method has the potential to be an effective classification tool for microarray data that encourages gene selection along paths to, or clustering around, known disease genes.
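
The objective described in the abstract (hinge loss plus a sum of per-group L∞-norms) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single tuning parameter `lam`, the unweighted sum over groups, and the toy gene groups are all assumptions made for illustration, and the abstract does not specify how groups are built from the network hierarchy or whether groups carry individual weights.

```python
from typing import Sequence


def hinge_loss(w: Sequence[float], b: float,
               X: Sequence[Sequence[float]], y: Sequence[int]) -> float:
    """Sum of hinge losses max(0, 1 - y_i * (w . x_i + b)) over all samples."""
    total = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        total += max(0.0, 1.0 - yi * score)
    return total


def group_linf_penalty(w: Sequence[float],
                       groups: Sequence[Sequence[int]]) -> float:
    """Sum over groups of the L-infinity norm (largest |w_j|) within each group."""
    return sum(max(abs(w[j]) for j in g) for g in groups)


def penalized_objective(w, b, X, y, groups, lam: float) -> float:
    """Hinge loss plus lam times the grouped L-infinity penalty (assumed form)."""
    return hinge_loss(w, b, X, y) + lam * group_linf_penalty(w, groups)


# Toy example: 3 genes, 2 samples, labels in {+1, -1}.
# The groups are hypothetical; gene 1 belongs to both, so groups may overlap.
w = [1.0, -0.5, 0.0]
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
y = [1, -1]
groups = [[0, 1], [1, 2]]

value = penalized_objective(w, 0.0, X, y, groups, lam=2.0)
```

Because the L∞-norm of a group depends only on its largest coefficient in absolute value, the penalty charges a group the same amount whether one or all of its genes are active, which is one way to read the abstract's claim that the method encourages selecting genes together within a group.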

SUBMITTER: Zhu Y 

PROVIDER: S-EPMC2854644 | biostudies-literature | 2009

REPOSITORIES: biostudies-literature

Publications

Support Vector Machines with Disease-gene-centric Network Penalty for High Dimensional Microarray Data.

Yanni Zhu, Wei Pan, Xiaotong Shen

Statistics and Its Interface, 2009-01-01, 3

Similar Datasets

| S-EPMC6553498 | biostudies-literature
| S-EPMC4709852 | biostudies-literature
| S-EPMC5008053 | biostudies-literature
| S-EPMC3668975 | biostudies-literature
| S-EPMC2492881 | biostudies-other
| S-EPMC5120762 | biostudies-literature
| S-EPMC5266326 | biostudies-literature
| S-EPMC5056733 | biostudies-literature
| S-EPMC2700806 | biostudies-literature
| S-EPMC2741455 | biostudies-literature