
Dataset Information


A novel missense-mutation-related feature extraction scheme for 'driver' mutation identification.


ABSTRACT:

Motivation

It is now widely accepted that human cancer is a disease involving dynamic changes in the genome and that missense mutations constitute the bulk of human genetic variation. A multitude of computational algorithms, especially machine learning-based ones, have consequently been proposed to distinguish missense changes that contribute to cancer progression ('driver' mutations) from those that do not ('passenger' mutations). However, the existing methods have shortcomings: they either adopt an incomplete feature space or depend on protein structural databases, which are usually far from integrated.

Results

In this article, we investigated multiple aspects of a missense mutation and identified a novel feature space that distinguishes cancer-associated driver mutations from passenger ones well. An index (the DX score) was proposed to evaluate the discriminating capability of each feature, and the top-ranked subset of these features was selected to build an SVM classifier. Cross-validation showed that the classifier trained on our selected features significantly outperforms existing ones in both precision and robustness. We applied our method to several datasets of missense mutations culled from published databases and the literature and obtained more reasonable results than previous studies.
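
The rank-then-select step described above can be illustrated with a minimal Python sketch. The abstract does not define the DX score, so the sketch substitutes a hypothetical stand-in (the gap between class means divided by the summed class standard deviations); the function names, the toy data, and the choice of k are likewise assumptions made only for illustration. Training the SVM and running cross-validation on the reduced features are not shown.

```python
from statistics import mean, pstdev


def separation_score(driver_values, passenger_values):
    """Stand-in discriminating score (NOT the paper's DX score).

    Absolute difference of the class means divided by the sum of the
    class standard deviations; larger means better separation.
    """
    gap = abs(mean(driver_values) - mean(passenger_values))
    spread = pstdev(driver_values) + pstdev(passenger_values)
    if spread == 0:
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_features(rows, labels):
    """Return feature indices ordered from most to least discriminating.

    rows   -- list of equal-length numeric feature vectors
    labels -- 1 for 'driver', 0 for 'passenger' (assumed encoding)
    """
    scored = []
    for j in range(len(rows[0])):
        drivers = [r[j] for r, y in zip(rows, labels) if y == 1]
        passengers = [r[j] for r, y in zip(rows, labels) if y == 0]
        scored.append((separation_score(drivers, passengers), j))
    # Highest score first; ties broken by the lower feature index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scored]


def select_top(rows, ranked, k):
    """Keep only the k top-ranked feature columns, in ranked order."""
    keep = ranked[:k]
    return [[r[j] for j in keep] for r in rows]


# Toy data: three hypothetical features for four mutations.
rows = [
    [9.0, 5.0, 1.0],
    [8.0, 4.0, 3.0],
    [2.0, 5.0, 1.0],
    [1.0, 4.0, 2.0],
]
labels = [1, 1, 0, 0]

ranked = rank_features(rows, labels)
reduced = select_top(rows, ranked, k=2)
```

The reduced feature vectors would then be passed to a classifier; the abstract names an SVM evaluated by cross-validation for that stage.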

Availability

The software is available online at http://www.methodisthealth.com/software and https://sites.google.com/site/drivermutationidentification/.

Contact

xzhou@tmhs.org.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Tan H 

PROVIDER: S-EPMC3496432 | biostudies-literature | 2012 Nov

REPOSITORIES: biostudies-literature


Publications

A novel missense-mutation-related feature extraction scheme for 'driver' mutation identification.

Tan Hua, Bao Jiguang, Zhou Xiaobo

Bioinformatics (Oxford, England) 20121007 22



Similar Datasets

S-EPMC4163459 | biostudies-literature
S-EPMC4139871 | biostudies-other
S-EPMC3813554 | biostudies-literature
S-EPMC7678044 | biostudies-literature
S-EPMC8764372 | biostudies-literature
S-EPMC5660011 | biostudies-other
S-EPMC5104389 | biostudies-literature
S-EPMC6102629 | biostudies-literature
S-EPMC7940930 | biostudies-literature
S-EPMC5454518 | biostudies-literature