Dataset Information

Development of a Bayesian Estimator for Audio-Visual Integration: A Neurocomputational Study.

ABSTRACT: The brain integrates information from different sensory modalities to generate a coherent and accurate percept of external events. Several experimental studies suggest that this integration follows the principle of Bayesian estimate. However, the neural mechanisms responsible for this behavior, and its development in a multisensory environment, are still insufficiently understood. We recently presented a neural network model of audio-visual integration (Neural Computation, 2017) to investigate how a Bayesian estimator can spontaneously develop from the statistics of external stimuli. Model assumes the presence of two unimodal areas (auditory and visual) topologically organized. Neurons in each area receive an input from the external environment, computed as the inner product of the sensory-specific stimulus and the receptive field synapses, and a cross-modal input from neurons of the other modality. Based on sensory experience, synapses were trained via Hebbian potentiation and a decay term. Aim of this work is to improve the previous model, including a more realistic distribution of visual stimuli: visual stimuli have a higher spatial accuracy at the central azimuthal coordinate and a lower accuracy at the periphery. Moreover, their prior probability is higher at the center, and decreases toward the periphery. Simulations show that, after training, the receptive fields of visual and auditory neurons shrink to reproduce the accuracy of the input (both at the center and at the periphery in the visual case), thus realizing the likelihood estimate of unimodal spatial position. Moreover, the preferred positions of visual neurons contract toward the center, thus encoding the prior probability of the visual input. Finally, a prior probability of the co-occurrence of audio-visual stimuli is encoded in the cross-modal synapses. The model is able to simulate the main properties of a Bayesian estimator and to reproduce behavioral data in all conditions examined. In particular, in unisensory conditions the visual estimates exhibit a bias toward the fovea, which increases with the level of noise. In cross modal conditions, the SD of the estimates decreases when using congruent audio-visual stimuli, and a ventriloquism effect becomes evident in case of spatially disparate stimuli. Moreover, the ventriloquism decreases with the eccentricity.

SUBMITTER: Ursino M

PROVIDER: S-EPMC5633019 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

Development of a Bayesian Estimator for Audio-Visual Integration: A Neurocomputational Study.

Ursino Mauro M Crisafulli Andrea A di Pellegrino Giuseppe G Magosso Elisa E Cuppini Cristiano C

Frontiers in computational neuroscience 20171004

The brain integrates information from different sensory modalities to generate a coherent and accurate percept of external events. Several experimental studies suggest that this integration follows the principle of Bayesian estimate. However, the neural mechanisms responsible for this behavior, and its development in a multisensory environment, are still insufficiently understood. We recently presented a neural network model of audio-visual integration (Neural Computation, 2017) to investigate h ...[more]

PMID: 29046631

Similar Datasets

Project description:Adults combine information from different sensory modalities to estimate object properties such as size or location. This process is optimal in that (i) sensory information is weighted according to relative reliability: more reliable estimates have more influence on the combined estimate and (ii) the combined estimate is more reliable than the component uni-modal estimates. Previous studies suggest that optimal sensory integration does not emerge until around 10 years of age. Younger children rely on a single modality or combine information using inappropriate sensory weights. Children aged 4-11 and adults completed a simple audio-visual task in which they reported either the number of beeps or the number of flashes in uni-modal and bi-modal conditions. In bi-modal trials, beeps and flashes differed in number by 0, 1 or 2. Mutual interactions between the sensory signals were evident at all ages: the reported number of flashes was influenced by the number of simultaneously presented beeps and vice versa. Furthermore, for all ages, the relative strength of these interactions was predicted by the relative reliabilities of the two modalities, in other words, all observers weighted the signals appropriately. The degree of cross-modal interaction decreased with age: the youngest observers could not ignore the task-irrelevant modality-they fully combined vision and audition such that they perceived equal numbers of flashes and beeps for bi-modal stimuli. Older observers showed much smaller effects of the task-irrelevant modality. Do these interactions reflect optimal integration? Full or partial cross-modal integration predicts improved reliability in bi-modal conditions. In contrast, switching between modalities reduces reliability. Model comparison suggests that older observers employed partial integration, whereas younger observers (up to around 8 years) did not integrate, but followed a sub-optimal switching strategy, responding according to either visual or auditory information on each trial.

Project description:In order to parse the world around us, we must constantly determine which sensory inputs arise from the same physical source and should therefore be perceptually integrated. Temporal coherence between auditory and visual stimuli drives audio-visual (AV) integration, but the role played by AV spatial alignment is less well understood. Here, we manipulated AV spatial alignment and collected electroencephalography (EEG) data while human subjects performed a free-field variant of the "pip and pop" AV search task. In this paradigm, visual search is aided by a spatially uninformative auditory tone, the onsets of which are synchronized to changes in the visual target. In Experiment 1, tones were either spatially aligned or spatially misaligned with the visual display. Regardless of AV spatial alignment, we replicated the key pip and pop result of improved AV search times. Mirroring the behavioral results, we found an enhancement of early event-related potentials (ERPs), particularly the auditory N1 component, in both AV conditions. We demonstrate that both top-down and bottom-up attention contribute to these N1 enhancements. In Experiment 2, we tested whether spatial alignment influences AV integration in a more challenging context with competing multisensory stimuli. An AV foil was added that visually resembled the target and was synchronized to its own stream of synchronous tones. The visual components of the AV target and AV foil occurred in opposite hemifields; the two auditory components were also in opposite hemifields and were either spatially aligned or spatially misaligned with the visual components to which they were synchronized. Search was fastest when the auditory and visual components of the AV target (and the foil) were spatially aligned. Attention modulated ERPs in both spatial conditions, but importantly, the scalp topography of early evoked responses shifted only when stimulus components were spatially aligned, signaling the recruitment of different neural generators likely related to multisensory integration. These results suggest that AV integration depends on AV spatial alignment when stimuli in both modalities compete for selective integration, a common scenario in real-world perception.

Dataset Information

Development of a Bayesian Estimator for Audio-Visual Integration: A Neurocomputational Study.

Publications

Development of a Bayesian Estimator for Audio-Visual Integration: A Neurocomputational Study.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets