Dataset Information

Uncovering the important acoustic features for detecting vocal fold paralysis with explainable machine learning.

ABSTRACT:

Objective

To detect unilateral vocal fold paralysis (UVFP) from voice recordings using an explainable machine learning model.

Study design

Case series - retrospective with a control group.

Setting

Tertiary care laryngology practice between 2009 and 2019.

Methods

Patients with confirmed UVFP through endoscopic examination (N=77) and controls with normal voices matched for age and sex (N=77) were included. Two tasks were used to elicit voice samples: reading the Rainbow Passage and sustaining phonation of the vowel "a". The 88 extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS) features were extracted as inputs for four machine learning models of differing complexity. SHAP was used to identify important features.

Results

The median bootstrapped Area Under the Receiver Operating Characteristic Curve (ROC AUC) score ranged from 0.79 to 0.87 depending on model and task. After removing redundant features for explainability, the highest median ROC AUC score was 0.84 using only 13 features for the vowel task and 0.87 using 39 features for the reading task. The most important features included intensity measures, mean MFCC1, mean F1 amplitude and frequency, and shimmer variability depending on model and task.
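As a rough illustration of what a "median bootstrapped ROC AUC" can mean, the sketch below resamples the predictions of one classifier with replacement and reports the median AUC over the resamples. It is not the authors' procedure: the function names (`roc_auc`, `median_bootstrap_auc`), the parameters `n_boot` and `seed`, and the choice to resample predictions rather than refit models are all assumptions made for this example.

```python
import random
import statistics


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def median_bootstrap_auc(labels, scores, n_boot=1000, seed=0):
    """Median ROC AUC over n_boot resamples drawn with replacement.

    Hypothetical helper: resamples (label, score) pairs from a single set
    of predictions; resamples containing only one class are redrawn.
    """
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined without both classes; draw again
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    return statistics.median(aucs)


# Toy usage with made-up labels (1 = UVFP, 0 = control) and model scores.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_scores = [0.2, 0.4, 0.6, 0.5, 0.7, 0.9]
toy_median_auc = median_bootstrap_auc(toy_labels, toy_scores, n_boot=200)
```

Reporting the median over resamples, rather than a single AUC, gives a sense of how stable the score is for a sample of this size.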

Conclusion

Using the largest dataset studying UVFP to date, we achieve high performance from just a few seconds of voice recordings. Notably, we demonstrate that while similar categories of features related to vocal fold physiology were conserved across models, the models used different combinations of features and still achieved similar effect sizes. Machine learning thus provides a mechanism to detect UVFP and contextualize the accuracy relative to both model architecture and pathophysiology.

SUBMITTER: Low DM

PROVIDER: S-EPMC7836138 | biostudies-literature | 2021 Jun

REPOSITORIES: biostudies-literature


Publications

Identifying bias in models that detect vocal fold paralysis from audio recordings using explainable machine learning and clinician ratings.

Low DM, Rao V, Randolph G, Song PC, Ghosh SS

medRxiv: the preprint server for health sciences, 2024 Mar 20

Introduction: Detecting voice disorders from voice recordings could allow for frequent, remote, and low-cost screening before costly clinical visits and a more invasive laryngoscopy examination. Our goals were to detect unilateral vocal fold paralysis (UVFP) from voice recordings using machine learning, to identify which acoustic variables were important for prediction to increase trust, and to determine model performance relative to clinician performance. Methods: Patients with con ...

PMID: 33501466

Similar Datasets

Project description: Background and objective: The aim of this study was to evaluate the utility of two items in vocal fold paresis and paralysis screening after thyroid and parathyroid surgery: patient self-assessment of voice using the Voice Handicap Index and computer-based acoustic voice analysis using the Multi-Dimensional Voice Program. Methods: This was a prospective study of 181 patients who underwent thyroid or parathyroid surgery over a 1-year study period (2017). Preoperatively, all patients underwent laryngoscopic vocal fold inspection and acoustic voice analysis, and they completed the Voice Handicap Index questionnaire. Postoperatively, all patients underwent laryngoscopy prior to hospital discharge; 2 weeks after the surgery, they completed the Voice Handicap Index questionnaire a second time. Two weeks postoperatively, patients with vocal fold paresis or paralysis and 20 randomly selected controls without vocal fold paresis or paralysis underwent a follow-up acoustic voice analysis. Results: Fourteen patients had a new postoperative vocal fold paresis or paralysis. Postoperatively, the total Voice Handicap Index score was significantly higher (p = 0.040) and the change between preoperative and postoperative scores was greater (p = 0.028) in vocal fold paresis or paralysis patients. A total postoperative Voice Handicap Index score > 30 had 55% sensitivity and 90% specificity for vocal fold paresis or paralysis. In the postoperative Multi-Dimensional Voice Program analysis, vocal fold paresis or paralysis patients had significantly more jitter (p = 0.044). Postoperative jitter > 1.33 corresponded to 55% sensitivity and 95% specificity for vocal fold paresis or paralysis. Conclusions: In identifying postoperative vocal fold paresis or paralysis, patient self-assessment and jitter in acoustic voice analysis have high specificity but poor sensitivity.
Without routine laryngoscopy, approximately half of the patients with postoperative vocal fold paresis or paralysis could be overlooked. However, if the patient has no complaints of voice disturbance 2 weeks after thyroid or parathyroid surgery, the likelihood of vocal fold paresis or paralysis is low.

Project description: Objectives: Vocal fold medialization surgery is generally considered a phonosurgical procedure for improvement of vocal function in patients with glottic insufficiency. However, the literature describing this procedure for the management of dysphagia is limited. This study aims to assess the effects of medialization surgery on swallowing function in patients with unilateral vocal fold paralysis (UVFP). Methods: We enrolled 32 patients with UVFP undergoing vocal fold medialization surgery (medialization laryngoplasty combined with arytenoid adduction [ML + AA], 12 cases; injection laryngoplasty [IL], 20 cases). We assessed the aerodynamic vocal function including maximum phonation time and mean flow rate to evaluate glottal closure status. The Hyodo score determined by flexible endoscopic evaluation and Functional Oral Intake Scale (FOIS) were evaluated pre‐ and postoperatively. Results: Almost 60% of patients with UVFP had dysphagia, and one‐third were at high risk for aspiration. Aerodynamic parameters effectively improved after IL and ML + AA. With regard to swallowing, both the FOIS and total Hyodo score were significantly improved postoperatively. We found a particularly significant improvement in pharyngeal clearance. However, patients with high vagal nerve paralysis and postoperative insufficient glottal closure showed poor swallowing benefits after the interventions. In patients with recurrent laryngeal nerve palsy, there were no significant differences in postoperative swallowing function between the ML + AA and IL groups. Conclusion: Vocal fold medialization surgery was effective in improving swallowing function in most cases with UVFP, except for those with high vagal paralysis and insufficient postoperative glottal closure. Both IL and ML + AA showed an equivalent effect on swallowing improvement. Level of evidence: 3b.
Almost 60% of patients with UVFP had dysphagia, and vocal fold medialization surgery improved swallowing in most cases, except for those with high vagal paralysis and postoperative glottic insufficiency. Both IL and ML + AA showed an equivalent effect on swallowing improvement.

Project description: Objectives: To determine the decannulation rate (DR) and revision surgery rate after surgery for bilateral vocal fold paralysis (BVFP). Data sources: Five databases (MEDLINE, PubMed, Embase, Web of Science, Scopus) were searched for the period 1908-2020. Methods: The systematic literature review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Data were pooled using a random-mixed-effects model. Randomized controlled trials and non-randomized studies (case-control, cohort, and case series) were included to assess DR and revision surgery rate after different surgical techniques for treatment of BVFP. Results: The search yielded 857 publications, of which 102 with 2802 patients were included. DR after different types of surgery was: arytenoid abduction (DR 0.93, 95%-confidence interval [CI], 0.86-0.97), endolaryngeal arytenoidectomy (DR 0.92, 95%-CI, 0.86-0.96), external arytenoidectomy (DR 0.94; 95%-CI, 0.71-0.99), external arytenoidectomy and lateralisation (DR 0.87; 95%-CI, 0.73-0.94), laterofixation (DR 0.95; 95%-CI, 0.91-0.97), posterior cordectomy (DR 0.97, 95%-CI, 0.94-0.99), posterior cordectomy and arytenoidectomy (DR 0.98, 95%-CI, 0.93-0.99), posterior cordectomy and subtotal arytenoidectomy (DR 0.98, 95%-CI, 0.88-1.00), posterior cordotomy (DR 0.96, 95%-CI, 0.84-0.99), reinnervation (DR 0.69, 95%-CI, 0.12-0.97), subtotal arytenoidectomy (DR 1.00, 95%-CI, 0.00-1.00) and transverse cordotomy (DR 1.0, 95%-CI, 0.00-1.00). No significant difference between subgroups for DR could be found (Q = 15.67, df = 11, p = 0.1540). The between-study heterogeneity was low (τ² = 2.2627; τ = 1.5042; I² = 0.0%). Studies were at high risk of bias. Conclusion: BVFP is a rare disease and the study quality is insufficient. The existing studies suggest a publication bias and the literature review revealed that there is a lack of prospective controlled studies.
There is a lack of standardized measures that take into account both speech quality and respiratory function and allow adequate comparison of surgical methods.