Dataset Information

Rethinking glottal midline detection.

ABSTRACT: A healthy voice is crucial for verbal communication and hence for daily as well as professional life. The basis for a healthy voice is the sound-producing vocal folds in the larynx. A hallmark of healthy vocal fold oscillation is the symmetric motion of the left and right vocal folds. Clinically, videoendoscopy is applied to assess the symmetry of the oscillation, which is evaluated subjectively. High-speed videoendoscopy, an emerging method that allows quantification of the vocal fold oscillation, is more commonly employed in research due to the amount of data and the complex, semi-automatic analysis. In this study, we provide a comprehensive evaluation of methods that detect the glottal midline fully automatically. We used a biophysical model to simulate different vocal fold oscillations, extended the openly available BAGLS dataset using manual annotations, utilized both simulations and annotated endoscopic images to train deep neural networks at different stages of the analysis workflow, and compared these to established computer vision algorithms. We found that classical computer vision algorithms perform well on detecting the glottal midline in glottis segmentation data, but are outperformed by deep neural networks on this task. We further suggest GlottisNet, a multi-task neural architecture that simultaneously predicts both the opening between the vocal folds and the symmetry axis. By fully automating segmentation and midline detection, it is a major step towards the clinical applicability of quantitative, deep learning-assisted laryngeal endoscopy.

SUBMITTER: Kist AM

PROVIDER: S-EPMC7693305 | biostudies-literature | 2020 Nov

REPOSITORIES: biostudies-literature


Publications

Rethinking glottal midline detection.

Andreas M. Kist, Julian Zilker, Pablo Gómez, Anne Schützenberger, Michael Döllinger

Scientific Reports, 2020 Nov 26 (1)


PMID: 33244031

Similar Datasets

Project description:

Objective. Motor-evoked potentials (MEPs) are among the most prominent responses to brain stimulation, such as supra-threshold transcranial magnetic stimulation and electrical stimulation. Understanding the neurophysiology and determining the lowest stimulation strength that evokes responses require the detection of even smaller responses, e.g. from single motor units. However, available detection and quantization methods suffer from a large noise floor. This paper develops a detection method that extracts MEPs hidden below the noise floor. With this method, we aim to estimate excitatory activations of the corticospinal pathways well below the conventional detection level.

Approach. The presented MEP detection method uses a self-learning matched-filter approach for improved robustness against noise. The filter is adaptively generated per subject through iterative learning. For responses that are reliably detected by conventional detection, the new approach is fully compatible with established peak-to-peak readings and provides the same results but extends the dynamic range below the conventional noise floor.

Main results. In contrast to the conventional peak-to-peak measure, the proposed method increases the signal-to-noise ratio by more than a factor of 5. The first detectable responses appear to be substantially lower than the conventional threshold definition of 50 µV median peak-to-peak amplitude.

Significance. The proposed method shows that stimuli well below the conventional 50 µV threshold definition can consistently and repeatably evoke muscular responses and thus activate excitable neuron populations in the brain. As a consequence, the input-output (IO) curve is extended at the lower end, and the noise cut-off is shifted. Importantly, the IO curve extends so far that the 50 µV point turns out to be closer to the center of the logarithmic sigmoid curve than to the first detectable responses. The underlying method is applicable to a wide range of evoked potentials and other biosignals, such as in electroencephalography.

