Dataset Information

Unreasonable effectiveness of learning neural networks: From accessible states and robust ensembles to basic algorithmic schemes.

ABSTRACT: In artificial neural networks, learning from data is a computationally demanding task in which a large number of connection weights are iteratively tuned through stochastic-gradient-based heuristic processes over a cost function. It is not well understood how learning occurs in these systems, in particular how they avoid getting trapped in configurations with poor computational performance. Here, we study the difficult case of networks with discrete weights, where the optimization landscape is very rough even for simple architectures, and provide theoretical and numerical evidence of the existence of rare, but extremely dense and accessible, regions of configurations in the network weight space. We define a measure, the robust ensemble (RE), which suppresses trapping by isolated configurations and amplifies the role of these dense regions. We analytically compute the RE in some exactly solvable models and also provide a general algorithmic scheme that is straightforward to implement: define a cost function given by a sum of a finite number of replicas of the original cost function, with a constraint centering the replicas around a driving assignment. To illustrate this, we derive several powerful algorithms, ranging from Markov Chains to message passing to gradient descent processes, where the algorithms target the robust dense states, resulting in substantial improvements in performance. The weak dependence on the number of precision bits of the weights leads us to conjecture that very similar reasoning applies to more conventional neural networks. Analogous algorithmic schemes can also be applied to other optimization problems.
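
The replicated cost function described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the perceptron error count used as the original cost, the squared-distance form of the centering term, and the names `perceptron_errors`, `replicated_cost`, and `gamma` are all hypothetical choices introduced here.

```python
import numpy as np

def perceptron_errors(w, X, y):
    """Original cost (assumed for illustration): number of patterns that a
    perceptron with binary weights w in {-1, +1} misclassifies."""
    return int(np.sum(np.sign(X @ w) != y))

def replicated_cost(replicas, center, X, y, gamma):
    """Sum of the original cost over all replicas, plus a coupling term that
    penalizes each replica's squared distance from the driving assignment
    `center`. `gamma` is a hypothetical coupling strength."""
    data_term = sum(perceptron_errors(w, X, y) for w in replicas)
    coupling = sum(float(np.sum((w - center) ** 2)) for w in replicas)
    return data_term + gamma * coupling

# Toy problem: labels come from a random "teacher" weight vector.
# An odd number of weights guarantees X @ w is never zero for +/-1 entries.
rng = np.random.default_rng(0)
n_weights, n_patterns, n_replicas = 21, 10, 3
X = rng.choice([-1, 1], size=(n_patterns, n_weights))
teacher = rng.choice([-1, 1], size=n_weights)
y = np.sign(X @ teacher)

center = rng.choice([-1, 1], size=n_weights)
replicas = [rng.choice([-1, 1], size=n_weights) for _ in range(n_replicas)]

print(replicated_cost(replicas, center, X, y, gamma=0.5))
```

Any of the search procedures named in the abstract (Markov Chains, message passing, gradient descent) would then operate on this combined cost rather than on the single-network cost; the sketch only evaluates it.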

SUBMITTER: Baldassi C

PROVIDER: S-EPMC5137727 | biostudies-literature | 2016 Nov

REPOSITORIES: biostudies-literature

Publications

Unreasonable effectiveness of learning neural networks: From accessible states and robust ensembles to basic algorithmic schemes.

Carlo Baldassi, Christian Borgs, Jennifer T. Chayes, Alessandro Ingrosso, Carlo Lucibello, Luca Saglietti, Riccardo Zecchina

Proceedings of the National Academy of Sciences of the United States of America, 2016 Nov 15, issue 48

PMID: 27856745

Similar Datasets

Project description:

Highlights
• The COVID-19 outbreak belongs to the simple universality class of the SIR model and extensions thereof.
• The unpredictable non-stationarity of the testing frames behind the figures reported by national authorities sets a fundamental limitation to any theoretical approach.
• The time evolution of the reporting rates controls the occurrence of the apparent epidemic peak, which typically follows the true one in countries that were not vigorous enough in their testing at the onset of the outbreak.

When the novel coronavirus disease SARS-CoV2 (COVID-19) was officially declared a pandemic by the WHO in March 2020, the scientific community was already working to make sense of the fast-growing wealth of data gathered by national authorities all over the world. However, despite the diversity of novel theoretical approaches and the comprehensiveness of many widely established models, the official figures that recount the course of the outbreak still sketch a largely elusive and intimidating picture. Here we show unambiguously that the dynamics of the COVID-19 outbreak belongs to the simple universality class of the SIR model and extensions thereof. Our analysis naturally leads us to establish that there exists a fundamental limitation to any theoretical approach, namely the unpredictable non-stationarity of the testing frames behind the reported figures. However, we show how such bias can be quantified self-consistently and employed to mine useful and accurate information from the data. In particular, we describe how the time evolution of the reporting rates controls the occurrence of the apparent epidemic peak, which typically follows the true one in countries that were not vigorous enough in their testing at the onset of the outbreak. The importance of testing early and resolutely appears as a natural corollary of our analysis, as countries that tested massively at the start clearly had their true peak earlier and fewer deaths overall.
