
Dataset Information


Feature selection and molecular classification of cancer using genetic programming.


ABSTRACT: Despite important advances in microarray-based molecular classification of tumors, its application in clinical settings remains a formidable challenge. This is in part due to the limitations of current analysis programs in discovering robust biomarkers and developing classifiers with a practical set of genes. Genetic programming (GP) is a machine learning technique that uses an evolutionary algorithm to simulate natural selection as well as population dynamics, leading to simple and comprehensible classifiers. Here we applied GP to cancer expression profiling data to select feature genes and build molecular classifiers by mathematical integration of these genes. Analysis of thousands of GP classifiers generated for a prostate cancer data set revealed repeated use of a set of highly discriminative feature genes, many of which are known to be disease associated. GP classifiers often comprise five or fewer genes and successfully predict cancer types and subtypes. More importantly, GP classifiers generated in one study are able to predict samples from an independent study, which may have used a different microarray platform. In addition, GP yielded classification accuracy better than or similar to that of conventional classification methods. Furthermore, the mathematical expression of GP classifiers provides insights into relationships between classifier genes. Taken together, our results demonstrate that GP may be valuable for generating effective classifiers containing a practical set of genes for diagnostic/prognostic cancer classification.
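
To make the idea of a classifier built by "mathematical integration" of a few genes concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a classifier is a small arithmetic expression tree over gene expression values that is thresholded to give a class label, and it counts how often each gene appears across a set of classifiers as a simple proxy for feature selection. The gene names (GENE_A, GENE_B, GENE_C), the class labels, the threshold, and all function names are hypothetical.

```python
import operator
from collections import Counter

# Arithmetic operators allowed inside a classifier expression (an assumption).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(node, sample):
    """Evaluate an expression tree on one sample.

    A node is a tuple (op, left, right), a gene name (str),
    or a numeric constant.
    """
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, sample), evaluate(right, sample))
    if isinstance(node, str):
        return sample[node]  # look up the gene's expression value
    return node  # numeric constant


def classify(tree, sample, threshold=0.0):
    """Label a sample by thresholding the expression value (labels are hypothetical)."""
    return "tumor" if evaluate(tree, sample) > threshold else "benign"


def used_genes(tree):
    """Return the set of gene names that appear in an expression tree."""
    if isinstance(tree, tuple):
        _, left, right = tree
        return used_genes(left) | used_genes(right)
    return {tree} if isinstance(tree, str) else set()


def gene_usage(trees):
    """Count in how many classifiers each gene appears."""
    counts = Counter()
    for tree in trees:
        counts.update(used_genes(tree))
    return counts


# Two hypothetical classifiers, each using only a few genes.
tree_1 = ("-", ("*", "GENE_A", "GENE_B"), "GENE_C")
tree_2 = ("+", "GENE_A", 0.5)

sample = {"GENE_A": 2.0, "GENE_B": 1.5, "GENE_C": 1.0}
label = classify(tree_1, sample)
usage = gene_usage([tree_1, tree_2])
```

In an actual GP run, an evolutionary search would generate and refine such trees against training data; the sketch covers only how a finished expression could be evaluated and how recurring genes could be tallied across many classifiers.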

SUBMITTER: Yu J 

PROVIDER: S-EPMC1854845 | biostudies-other | 2007 Apr

REPOSITORIES: biostudies-other


Publications

Feature selection and molecular classification of cancer using genetic programming.

Yu Jianjun, Yu Jindan, Almal Arpit A, Dhanasekaran Saravana M, Ghosh Debashis, Worzel William P, Chinnaiyan Arul M

Neoplasia (New York, N.Y.), 2007 Apr 1, issue 4



Similar Datasets

| S-EPMC5050509 | biostudies-literature
| S-EPMC9408964 | biostudies-literature
| S-EPMC8691854 | biostudies-literature
| S-EPMC6751684 | biostudies-literature
| S-EPMC6206917 | biostudies-literature
| S-EPMC6986087 | biostudies-literature
| S-EPMC10368662 | biostudies-literature
| S-EPMC9146727 | biostudies-literature
| S-EPMC6755645 | biostudies-literature
| S-EPMC6156846 | biostudies-other