Dataset Information

POIMs: positional oligomer importance matrices--understanding support vector machine-based signal detectors.

ABSTRACT:

Motivation

At the heart of many important bioinformatics problems, such as gene finding and function prediction, is the classification of biological sequences. Frequently the most accurate classifiers are obtained by training support vector machines (SVMs) with complex sequence kernels. However, a significant shortcoming of SVMs is that their learned decision rules are very hard for humans to understand and cannot easily be related to biological facts.

Results

To make SVM-based sequence classifiers more accessible and useful, we introduce the concept of positional oligomer importance matrices (POIMs) and propose an efficient algorithm for their computation. In contrast to the raw SVM feature weighting, POIMs take into account the underlying correlation structure of k-mer features that is induced by overlaps of related k-mers. POIMs can be seen as a powerful generalization of sequence logos: they make it possible to capture and visualize sequence patterns that are relevant to the investigated biological phenomena.
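The difference between raw feature weights and overlap-aware importances can be illustrated with a small brute-force sketch. It is not the efficient algorithm the authors propose. It assumes a definition not stated in this abstract: the importance of k-mer z at position j is the expected classifier score given that z occurs at j, minus the overall expected score, with sequences drawn uniformly at random. The names `score`, `poim_bruteforce`, and `toy_weights` are hypothetical and exist only for this illustration.

```python
from itertools import product

ALPHABET = "ACGT"

def score(seq, weights, k):
    """Linear scoring function: sum the weights of all (k-mer, position) pairs in seq."""
    return sum(weights.get((seq[i:i + k], i), 0.0) for i in range(len(seq) - k + 1))

def poim_bruteforce(weights, k, length):
    """Assumed definition: Q[(z, j)] = E[s(X) | X[j:j+k] == z] - E[s(X)].

    Expectations are taken over all sequences of the given length with
    uniform probability. Exhaustive enumeration is feasible only for toy sizes.
    """
    seqs = ["".join(p) for p in product(ALPHABET, repeat=length)]
    scores = {s: score(s, weights, k) for s in seqs}
    mean = sum(scores.values()) / len(seqs)
    poim = {}
    for j in range(length - k + 1):
        for z in ("".join(p) for p in product(ALPHABET, repeat=k)):
            matching = [scores[s] for s in seqs if s[j:j + k] == z]
            poim[(z, j)] = sum(matching) / len(matching) - mean
    return poim

# Hypothetical toy model: only the 2-mer "GA" at position 1 carries a raw weight.
toy_weights = {("GA", 1): 1.0}
Q = poim_bruteforce(toy_weights, k=2, length=4)

print(Q[("GA", 1)])  # the weighted k-mer itself
print(Q[("AG", 0)])  # overlapping k-mer with zero raw weight
print(Q[("AC", 0)])  # k-mer that excludes "GA" at position 1
```

In this toy setting, "AG" at position 0 has a raw weight of zero but still receives a positive importance, because it fixes the "G" that "GA" at position 1 requires. "AC" at position 0 rules that k-mer out and receives a negative importance.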

Availability

All source code, datasets, tables and figures are available at http://www.fml.tuebingen.mpg.de/raetsch/projects/POIM.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Sonnenburg S

PROVIDER: S-EPMC2718648 | biostudies-literature | 2008 Jul

REPOSITORIES: biostudies-literature

Publications

POIMs: positional oligomer importance matrices--understanding support vector machine-based signal detectors.

Sören Sonnenburg, Alexander Zien, Petra Philips, Gunnar Rätsch

Bioinformatics (Oxford, England), 2008-07-01, issue 13

PMID: 18586746

Similar Datasets

Support vector machine classification of streptavidin-binding aptamers. | S-EPMC4057401 | biostudies-literature

Analysis of Asperger Syndrome Using Genetic-Evolutionary Random Support Vector Machine Cluster. | S-EPMC6262410 | biostudies-other

Segmentation of Doppler optical coherence tomography signatures using a support-vector machine. | S-EPMC3087589 | biostudies-other

Targeted Local Support Vector Machine for Age-Dependent Classification. | S-EPMC4183366 | biostudies-literature

Variational quantum approximate support vector machine with inference transfer. | S-EPMC9968349 | biostudies-literature

Ecological footprint model using the support vector machine technique. | S-EPMC3264588 | biostudies-literature

A novel machine learning strategy for model selections - Stepwise Support Vector Machine (StepSVM). | S-EPMC7451646 | biostudies-literature

Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values. | S-EPMC4873242 | biostudies-other

PSO-based support vector machine with cuckoo search technique for clinical disease diagnoses. | S-EPMC4058169 | biostudies-other

Seismic Discrimination between Earthquakes and Explosions Using Support Vector Machine. | S-EPMC7180981 | biostudies-literature