Dataset Information

Archetypal landscapes for deep neural networks.

ABSTRACT: The predictive capabilities of deep neural networks (DNNs) continue to evolve to increasingly impressive levels. However, it is still unclear how training procedures for DNNs succeed in finding parameters that produce good results for such high-dimensional and nonconvex loss functions. In particular, we wish to understand why simple optimization schemes, such as stochastic gradient descent, do not end up trapped in local minima with high loss values that would not yield useful predictions. We explain the optimizability of DNNs by characterizing the local minima and transition states of the loss-function landscape (LFL) along with their connectivity. We show that the LFL of a DNN in the shallow network or data-abundant limit is funneled, and thus easy to optimize. Crucially, in the opposite low-data/deep limit, although the number of minima increases, the landscape is characterized by many minima with similar loss values separated by low barriers. This organization is different from the hierarchical landscapes of structural glass formers and explains why minimization procedures commonly employed by the machine-learning community can navigate the LFL successfully and reach low-lying solutions.

SUBMITTER: Verpoort PC

PROVIDER: S-EPMC7486703 | biostudies-literature | 2020 Sep

REPOSITORIES: biostudies-literature

Publications

Archetypal landscapes for deep neural networks.

Verpoort, Philipp C.; Lee, Alpha A.; Wales, David J.

Proceedings of the National Academy of Sciences of the United States of America, 2020 Aug 25, issue 36

PMID: 32843349

Similar Datasets

Training deep quantum neural networks. | S-EPMC7010779 | biostudies-literature

Learning Extremal Representations with Deep Archetypal Analysis | S-EPMC8550171 | biostudies-literature

Classification of crystallization outcomes using deep convolutional neural networks. | S-EPMC6010233 | biostudies-other

Deep biomarkers of human aging: Application of deep neural networks to biomarker development. | S-EPMC4931851 | biostudies-other

Predicting enhancers with deep convolutional neural networks. | S-EPMC5773911 | biostudies-literature

Deep Neural Networks for Multicomponent Molecular Systems. | S-EPMC7450624 | biostudies-literature

Tiller estimation method using deep neural networks. | S-EPMC9880423 | biostudies-literature

Density estimation using deep generative neural networks. | S-EPMC8054014 | biostudies-literature

Learning hidden elasticity with deep neural networks. | S-EPMC8346903 | biostudies-literature

Face detection in untrained deep neural networks. | S-EPMC8677765 | biostudies-literature