Dataset Information

Joint Optimization of Deep Neural Network-Based Dereverberation and Beamforming for Sound Event Detection in Multi-Channel Environments.

ABSTRACT: In this paper, we propose joint optimization of deep neural network (DNN)-supported dereverberation and beamforming for convolutional recurrent neural network (CRNN)-based sound event detection (SED) in multi-channel environments. First, short-time Fourier transform (STFT) coefficients are computed from multi-channel audio signals in noisy and reverberant environments and are then enhanced by DNN-supported weighted prediction error (WPE) dereverberation using the estimated masks. Next, the STFT coefficients of the dereverberated multi-channel signals are passed to the DNN-supported minimum variance distortionless response (MVDR) beamformer, which performs beamforming with the source and noise masks estimated by the DNN. The resulting single-channel enhanced STFT coefficients are fed to the CRNN-based SED system, and the three modules are jointly trained with a single loss function designed for SED. Furthermore, to ease the difficulty of training a deep learning model for SED caused by the imbalance in the amount of data for each class, focal loss is used as the loss function. Experimental results show that joint training of DNN-supported dereverberation and beamforming with the SED model under the supervision of focal loss significantly improves performance in noisy and reverberant environments.
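
The abstract names focal loss as the training objective used to counter class imbalance. As a rough illustration only, the sketch below implements the commonly used binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), averaged over frame-level predictions. The function name `binary_focal_loss` and the default values `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration; this record does not state the exact variant or hyperparameters the authors used.

```python
import math

def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over predicted probabilities (illustrative sketch).

    probs   -- predicted event probabilities in [0, 1]
    targets -- ground-truth labels, 1 (event active) or 0 (inactive)
    gamma   -- focusing parameter (assumed value); 0 disables the focusing term
    alpha   -- weight for the positive class (assumed value)
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)           # clip to avoid log(0)
        p_t = p if y == 1 else 1.0 - p            # probability given to the true label
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the contribution of well-classified examples
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)
```

With `gamma=0` the focusing term vanishes and the loss reduces to an alpha-weighted binary cross-entropy; larger `gamma` values put relatively more weight on hard, misclassified frames, which is the usual motivation for using it on imbalanced classes.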

SUBMITTER: Noh K

PROVIDER: S-EPMC7180550 | biostudies-literature | 2020 Mar

REPOSITORIES: biostudies-literature

Publications

Joint Optimization of Deep Neural Network-Based Dereverberation and Beamforming for Sound Event Detection in Multi-Channel Environments.

Noh Kyoungjin, Chang Joon-Hyuk

Sensors (Basel, Switzerland), 2020-03-28, issue 7

PMID: 32231161

Similar Datasets

Deep Multi-Task Multi-Channel Learning for Joint Classification and Regression of Brain Status. | S-EPMC5942232 | biostudies-literature

Deep neural network models of sound localization reveal how perception is adapted to real-world environments. | S-EPMC8830739 | biostudies-literature

Joint Classification and Regression via Deep Multi-Task Multi-Channel Learning for Alzheimer's Disease Diagnosis. | S-EPMC6764421 | biostudies-literature

Characterization inference based on joint-optimization of multi-layer semantics and deep fusion matching network. | S-EPMC9044352 | biostudies-literature

Continuous robust sound event classification using time-frequency features and deep learning. | S-EPMC5593179 | biostudies-other

Hidden hearing loss selectively impairs neural adaptation to loud sound environments. | S-EPMC6191434 | biostudies-literature

Empowering deep neural quantum states through efficient optimization. | S-EPMC11392813 | biostudies-literature

Deep Learning for Ultrasound Beamforming in Flexible Array Transducer. | S-EPMC8609563 | biostudies-literature

Tracheal Sound Analysis Using a Deep Neural Network to Detect Sleep Apnea. | S-EPMC6707047 | biostudies-literature

Bayesian prior uncertainty and surprisal elicit distinct neural patterns during sound localization in dynamic environments. | S-EPMC11885517 | biostudies-literature