
Dataset Information


Unmasking Clever Hans predictors and assessing what machines really learn.


ABSTRACT: Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly intelligent behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem-solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.

SUBMITTER: Lapuschkin S 

PROVIDER: S-EPMC6411769 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature


Publications

Unmasking Clever Hans predictors and assessing what machines really learn.

Sebastian Lapuschkin, Stephan Wäldchen, Alexander Binder, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller

Nature Communications, 2019-03-11, issue 1



Similar Datasets

| S-EPMC6941139 | biostudies-literature
| S-EPMC7176709 | biostudies-literature
| S-EPMC4809766 | biostudies-literature
| S-EPMC4446652 | biostudies-other
| S-EPMC7050998 | biostudies-literature
| S-EPMC7807764 | biostudies-literature
| S-EPMC5035636 | biostudies-literature
| S-EPMC4777392 | biostudies-literature