Project description:Listening effort is a valuable construct to measure because it is among the primary complaints of people with hearing loss. It is tempting and intuitive to accept speech intelligibility scores as a proxy for listening effort, but this link is likely oversimplified and lacks actionable explanatory power. This study was conducted to explain the mechanisms of listening effort that are not captured by intelligibility scores, using sentence-repetition tasks in which specific kinds of mistakes were either prospectively planned or retrospectively analyzed. Effort was measured as changes in pupil size among 20 listeners with normal hearing and 19 listeners with cochlear implants. Experiment 1 demonstrates that mental correction of misperceived words increases effort even when responses are correct. Experiment 2 shows that for incorrect responses, listening effort is not a function of the proportion of words correct but is instead driven by the types of errors, the position of errors within a sentence, and the need to resolve ambiguity, reflecting how easily the listener can make sense of a perception. A simple taxonomy of error types is provided that is both intuitive and consistent with the data from these two experiments. The diversity of errors in these experiments implies that speech perception tasks can be designed prospectively to elicit the mistakes that are most closely linked with effort. Although mental corrective action and number of mistakes can scale together in many experiments, it is possible to dissociate them in order to advance toward a more explanatory (rather than correlational) account of listening effort.
Project description:The talking face affords multiple types of information. To isolate cortical sites responsible for integrating linguistically relevant visual speech cues, speech and nonspeech face gestures were presented in natural video and point-light displays during fMRI scanning at 3.0 T. Participants with normal hearing viewed the stimuli and also viewed localizers for the fusiform face area (FFA), the lateral occipital complex (LOC), and the visual motion (V5/MT) regions of interest (ROIs). The FFA, the LOC, and V5/MT were significantly less activated for speech relative to nonspeech and control stimuli. Distinct activation of the posterior superior temporal sulcus and the adjacent middle temporal gyrus to speech, independent of the display medium, was obtained in group analyses. Individual analyses showed that speech and nonspeech stimuli were associated with adjacent but different activations, with the speech activations more anterior. We suggest that the speech activation area is the temporal visual speech area (TVSA), and that it can be localized with the combination of stimuli used in this study.
Project description:Recently, the measurement of the pupil dilation response has been applied in many studies to assess listening effort, yet the mechanisms underlying this response are still largely unknown. We present the results of a method that separates the influences of the parasympathetic and sympathetic branches of the autonomic nervous system on the pupil response during speech perception. This is achieved by changing the background illumination level: in darkness, the influence of the parasympathetic nervous system on the pupil response is minimal, whereas in light, the parasympathetic system contributes an additional component. Nineteen hearing-impaired and 27 age-matched normal-hearing listeners performed speech reception threshold tests targeting a 50% correct performance level while pupil responses were recorded. The target speech was masked with a competing talker. The test was conducted twice, once in a dark and once in a light condition. The Need for Recovery and Checklist Individual Strength questionnaires were administered as indices of daily-life fatigue. In the dark condition, the peak pupil dilation (PPD) did not differ between the two groups, but in the light condition, the normal-hearing group showed a larger PPD than the hearing-impaired group. Listeners with better hearing acuity showed larger differences in dilation between dark and light. These results indicate a larger effect of parasympathetic inhibition on the pupil dilation response of listeners with better hearing acuity, and relatively high parasympathetic activity in those with worse hearing. Previously observed differences in PPD between normal-hearing and hearing-impaired listeners are probably not solely due to differences in listening effort.
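For illustration only, the sketch below shows one common way a peak pupil dilation (PPD) value of the kind reported above can be computed: a single pupil trace is corrected against a pre-stimulus baseline and the post-onset maximum is taken. The function name, sampling rate, and traces are assumptions made for this example and are not taken from the study.

```python
import numpy as np

def peak_pupil_dilation(trace, fs, baseline_s=1.0):
    """Peak pupil dilation (PPD) relative to a pre-stimulus baseline.

    trace      : 1-D array of pupil-size samples; stimulus onset is assumed
                 to fall at sample index baseline_s * fs (hypothetical layout).
    fs         : sampling rate in Hz.
    baseline_s : length of the pre-stimulus baseline window in seconds.
    """
    onset = int(baseline_s * fs)
    baseline = np.mean(trace[:onset])                 # mean pupil size before stimulus onset
    dilation = np.asarray(trace)[onset:] - baseline   # baseline-corrected response
    return float(np.max(dilation))                    # PPD = largest post-onset dilation

# Synthetic example: one "dark" and one "light" trace sampled at 60 Hz.
fs = 60
t = np.arange(0, 5, 1 / fs)
dark_trace = 4.0 + 0.40 * np.exp(-(t - 2.5) ** 2)
light_trace = 3.2 + 0.25 * np.exp(-(t - 2.5) ** 2)
print(peak_pupil_dilation(dark_trace, fs), peak_pupil_dilation(light_trace, fs))
```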
Project description:Daily-life conversation relies on speech perception in quiet and in noise. Because of the COVID-19 pandemic, face masks have become mandatory in many situations. Acoustic attenuation of sound pressure by the mask tissue reduces speech perception ability, especially in noisy situations. Masks can also impede speech comprehension by concealing the movements of the mouth, interfering with lip reading. In this prospective observational, cross-sectional study including 17 participants with normal hearing, we measured the influence of acoustic attenuation caused by medical face masks (mouth and nose protection) according to EN 14683 and by N95 masks according to EN 149 on the speech recognition threshold and listening effort in various types of background noise. Averaged over all noise signals, a surgical mask significantly worsened the speech recognition threshold in noise by 1.6 dB (95% confidence interval [CI], 1.0, 2.1) and an N95 mask worsened it significantly by 2.7 dB (95% CI, 2.2, 3.2). Use of a surgical mask did not significantly increase the 50% listening effort signal-to-noise ratio (increase of 0.58 dB; 95% CI, 0.4, 1.5), but use of an N95 mask did so significantly, by 2.2 dB (95% CI, 1.2, 3.1). In acoustic measurements, the mask tissue reduced amplitudes by up to 8 dB at frequencies above 1 kHz, whereas no reduction was observed below 1 kHz. We conclude that face masks reduce speech perception and increase listening effort in different noise signals. Together with the additional interference caused by impeded lip reading, the compound effect of face masks could have a relevant impact on daily-life communication, even for those with normal hearing.
Project description:Facial emotion recognition occupies a prominent place in emotion psychology. How perceivers recognize messages conveyed by faces can be studied in either an explicit or an implicit way, and using different kinds of facial stimuli. In the present study, we explored for the first time how facial point-light displays (PLDs) (i.e., biological motion with minimal perceptual properties) can elicit both explicit and implicit mechanisms of facial emotion recognition. Participants completed tasks of explicit or implicit facial emotion recognition from PLDs. Results showed that point-light stimuli are sufficient to allow facial emotion recognition, whether explicit or implicit. We argue that this finding could encourage the use of PLDs in research on the perception of emotional cues from faces.
Project description:Identifying speech requires that listeners make rapid use of fine-grained acoustic cues, a process that is facilitated by being able to see the talker's face. Face masks present a challenge to this process because they can both alter acoustic information and conceal the talker's mouth. Here, we investigated the degree to which different types of face masks and noise levels affect speech intelligibility and subjective listening effort for young (N = 180) and older (N = 180) adult listeners. We found that in quiet, mask type had little influence on speech intelligibility relative to speech produced without a mask for both young and older adults. However, with the addition of moderate (-5 dB SNR) and high (-9 dB SNR) levels of background noise, intelligibility dropped substantially for all types of face masks in both age groups. Across noise levels, transparent face masks and cloth face masks with filters impaired performance the most, whereas surgical face masks had the smallest influence on intelligibility. Participants also rated speech produced with a face mask as more effortful than unmasked speech, particularly in background noise. Although young and older adults were similarly affected by face masks and noise in terms of intelligibility and subjective listening effort, older adults showed poorer intelligibility overall and rated the speech as more effortful to process than young adults did. This research will help individuals make more informed decisions about which types of masks to wear in various communicative settings.
Project description:Listening to speech in noise is effortful for individuals with hearing loss, even if they have received a hearing prosthesis such as a hearing aid or cochlear implant (CI). At present, little is known about the neural functions that support listening effort. One form of neural activity that has been suggested to reflect listening effort is the power of 8-12 Hz (alpha) oscillations measured by electroencephalography (EEG). Alpha power in two cortical regions, the left inferior frontal gyrus (IFG) and parietal cortex, has been associated with effortful listening, but these relationships have not been examined in the same listeners. Further, few studies have investigated neural correlates of effort in individuals with cochlear implants. Here we tested 16 CI users in a novel effort-focused speech-in-noise listening paradigm and confirmed a relationship between alpha power and self-reported effort ratings in parietal regions, but not in left IFG. The parietal relationship was not linear but quadratic, with alpha power comparatively lower when effort ratings were at the top and bottom of the effort scale, and higher when effort ratings were in the middle of the scale. Results are discussed in terms of the cognitive systems that are engaged in difficult listening situations and the implications for clinical translation.
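To make the alpha-band measure concrete, the sketch below shows one common way such a quantity can be computed (mean Welch power between 8 and 12 Hz for one EEG channel), together with a quadratic fit of the kind of inverted-U relationship described above. The function name, channel data, and rating values are illustrative assumptions, not the study's actual analysis pipeline.

```python
import numpy as np
from scipy.signal import welch

def alpha_power(eeg, fs, band=(8.0, 12.0)):
    """Mean power spectral density in the 8-12 Hz (alpha) band of one EEG channel."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))   # 2-second Welch segments
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(np.mean(psd[in_band]))

# Synthetic single-channel signal: a 10 Hz rhythm embedded in noise, sampled at 250 Hz.
fs = 250
t = np.arange(0, 10, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)
print(alpha_power(eeg, fs))

# Quadratic (inverted-U) fit of parietal alpha power against self-reported effort,
# mirroring the shape of the relationship described above; values are invented.
effort = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
parietal_alpha = np.array([2.1, 2.6, 3.0, 3.2, 2.9, 2.5, 2.0])
a, b, c = np.polyfit(effort, parietal_alpha, deg=2)       # a < 0 indicates an inverted U
print(a, b, c)
```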
Project description:Listeners are routinely exposed to many different types of speech, including artificially enhanced and synthetic speech, styles which deviate to a greater or lesser extent from naturally spoken exemplars. While the impact of differing speech types on intelligibility is well studied, it is less clear how such types affect cognitive processing demands, and in particular whether the speech forms with the greatest intelligibility in noise also impose a commensurately lower listening effort. The current study measured intelligibility, self-reported listening effort, and a pupillometry-based measure of cognitive load for four distinct types of speech: (i) plain, i.e., natural unmodified speech; (ii) Lombard speech, a naturally enhanced form which occurs when speaking in the presence of noise; (iii) artificially enhanced speech involving spectral shaping and dynamic range compression; and (iv) speech synthesized from text. In the first experiment, a cohort of 26 native listeners responded to the four speech types in three levels of speech-shaped noise. In a second experiment, 31 non-native listeners underwent the same procedure at more favorable signal-to-noise ratios, chosen because listening in noise in a second language degrades intelligibility more than listening in a first language. For both native and non-native listeners, artificially enhanced speech was the most intelligible and led to the lowest subjective effort ratings, while the reverse was true for synthetic speech. However, pupil data suggested that Lombard speech elicited the lowest processing demands overall. These outcomes indicate that the relationship between intelligibility and cognitive processing demands is not a simple inverse one but is mediated by speech type. The findings of the current study motivate the search for speech modification algorithms that are optimized for both intelligibility and listening effort.
Project description:The benefits of combining a cochlear implant (CI) and a hearing aid (HA) in opposite ears on speech perception were examined in 15 adult unilateral CI recipients who regularly use a contralateral HA. A within-subjects design was used, comprising speech intelligibility testing, listening effort ratings, and a sound quality questionnaire for the conditions CI alone, CIHA together, and, where applicable, HA alone. The primary outcome of bimodal benefit, defined as the difference between CIHA and CI, was statistically significant for speech intelligibility in quiet as well as for intelligibility in noise across the tested spatial conditions. At the highest tested signal-to-noise ratio, a reduction in listening effort was found in addition to the intelligibility benefit. Moreover, the bimodal listening situation was rated as sounding more voluminous, less tinny, and less unpleasant than CI alone. Listening effort and sound quality emerged as feasible and relevant measures to demonstrate bimodal benefit across a clinically representative range of bimodal users. These extended dimensions of speech perception can shed more light on the array of benefits provided by complementing a CI with a contralateral HA.
Project description:Purpose: Tests measuring speech comprehension and listening effort for cochlear implant (CI) users may reflect important aspects of real-world speech communication. In this study, we describe the development of a multiple-talker, English-language sentence verification task (SVT) for use in adult CI users to measure speech comprehension and listening effort. We also examine whether talker differences affect speech comprehension and listening effort. Method: Thirteen experienced adult CI users participated in the study and underwent testing with the newly developed multiple-talker SVT. Participants were sequentially presented with audio recordings of unique sentences spoken in English by six different talkers and classified each sentence as either true or false. Classification accuracy and the response time (RT) for correct responses were used as measures of comprehension and listening effort, respectively. The effect of talker on the results was further analyzed. Results: All 13 participants successfully completed the SVT. The mean verification accuracy across participants was 87.2% ± 8.8%. The mean RT for correct responses across participants was 1,050 ms ± 391 ms. When stratified by talker, verification accuracy ranged from 83.7% to 95.2%, and mean RTs across participants ranged from 786 ms to 1,254 ms. Talker did not have a significant effect on sentence classification accuracy, but it did have a significant effect on RTs (p < .001). Conclusions: The SVT is an easily implemented test that can assess speech comprehension and listening effort simultaneously. CI users may experience increased effort when comprehending certain talkers' speech, despite showing similar levels of comprehension accuracy. Supplemental material: https://doi.org/10.23641/asha.24126630.
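As an illustration of how the two SVT outcome measures described above might be summarized, the sketch below computes per-talker verification accuracy and mean correct-response RT from a toy trial list; the data structure, talker labels, and values are invented for this example and do not reproduce the study's data.

```python
from statistics import mean

# Each trial: (talker, response_correct, reaction_time_ms); all values are invented.
trials = [
    ("talker_A", True, 812), ("talker_A", True, 905), ("talker_A", False, None),
    ("talker_B", True, 1180), ("talker_B", True, 1240), ("talker_B", False, None),
]

def per_talker_summary(trials):
    """Verification accuracy and mean correct-response RT for each talker."""
    summary = {}
    for talker in sorted({t for t, _, _ in trials}):
        rows = [(ok, rt) for t, ok, rt in trials if t == talker]
        accuracy = mean(int(ok) for ok, _ in rows)      # proportion of correct verifications
        correct_rts = [rt for ok, rt in rows if ok]     # RTs analyzed for correct trials only
        summary[talker] = (accuracy, mean(correct_rts))
    return summary

print(per_talker_summary(trials))
```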