Dataset Information

Visual speech discrimination and identification of natural and synthetic consonant stimuli.

ABSTRACT: From phonetic features to connected discourse, every level of psycholinguistic structure including prosody can be perceived through viewing the talking face. Yet a longstanding notion in the literature is that visual speech perceptual categories comprise groups of phonemes (referred to as visemes), such as /p, b, m/ and /f, v/, whose internal structure is not informative to the visual speech perceiver. This conclusion has not to our knowledge been evaluated using a psychophysical discrimination paradigm. We hypothesized that perceivers can discriminate the phonemes within typical viseme groups, and that discrimination measured with d-prime (d') and response latency is related to visual stimulus dissimilarities between consonant segments. In Experiment 1, participants performed speeded discrimination for pairs of consonant-vowel spoken nonsense syllables that were predicted to be same, near, or far in their perceptual distances, and that were presented as natural or synthesized video. Near pairs were within-viseme consonants. Natural within-viseme stimulus pairs were discriminated significantly above chance (except for /k/-/h/). Sensitivity (d') increased and response times decreased with distance. Discrimination and identification were superior with natural stimuli, which comprised more phonetic information. We suggest that the notion of the viseme as a unitary perceptual category is incorrect. Experiment 2 probed the perceptual basis for visual speech discrimination by inverting the stimuli. Overall reductions in d' with inverted stimuli but a persistent pattern of larger d' for far than for near stimulus pairs are interpreted as evidence that visual speech is represented by both its motion and configural attributes. 
The methods and results of this investigation open up avenues for understanding the neural and perceptual bases for visual and audiovisual speech perception and for development of practical applications such as visual lipreading/speechreading speech synthesis.
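
As an illustration of the sensitivity measure named in the abstract, here is a minimal sketch of how d' can be computed from response counts. It uses the textbook yes/no formula, d' = z(hit rate) - z(false-alarm rate), with a log-linear correction. This is an assumption made for illustration: the abstract does not state which d' model the authors applied to their same-different task, and the function name and the counts in the usage note are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Yes/no d' = z(hit rate) - z(false-alarm rate).

    Illustrative only: a same-different design may call for a different
    decision model than this simple yes/no formula.
    """
    # Log-linear correction: add 0.5 to each count and 1 to each total
    # so that neither rate is exactly 0 or 1 (where z is undefined).
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(fa_rate)
```

With hypothetical counts, `d_prime(50, 50, 50, 50)` is 0 (chance performance), and `d_prime(80, 20, 20, 80)` is larger than `d_prime(60, 40, 40, 60)`, mirroring the reported pattern of higher d' for more distant stimulus pairs.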

SUBMITTER: Files BT

PROVIDER: S-EPMC4499841 | biostudies-other | 2015

REPOSITORIES: biostudies-other

Publications

Visual speech discrimination and identification of natural and synthetic consonant stimuli.

Files BT, Tjan BS, Jiang J, Bernstein LE

Frontiers in Psychology, 2015-07-13

PMID: 26217249

Similar Datasets

Project description: When animals have previously been exposed to two different visual stimuli simultaneously, their subsequent learning to discriminate those stimuli is delayed, a phenomenon known as "classifying-together" or the "Bateson effect". However, the consistency of this phenomenon has not been fully established, especially considering the evidence collected in several vertebrates. The current study addressed whether a teleost fish, Xenotoca eiseni, is subject to the Bateson effect. Three experiments were designed by manipulating the visual stimuli (i.e., a full red disk, an amputated red disk, a red cross) and the presence of an exposure phase before a discriminative learning task (Exp. 1: full red disk vs. amputated red disk; Exp. 2: full red disk vs. red cross). In the exposure phase, three conditions per pair of training stimuli were arranged: "congruence", where fish were exposed to and trained to choose the same stimulus; "wide-incongruence", where fish were exposed to one stimulus and trained to choose the other; and "narrow-incongruence", where fish were exposed to both stimuli and trained to choose one of them. In the absence of exposure (Exp. 3), the discrimination learning task was carried out to establish baseline performance for the full red disk vs. amputated red disk and for the full red disk vs. red cross. Results showed that learning was retarded when fish were trained to choose a stimulus that was novel relative to the one experienced during the exposure phase (wide-incongruence condition), as well as after simultaneous exposure to both stimuli (narrow-incongruence condition). Furthermore, congruence produced no facilitation relative to baseline: familiar stimuli did not ease learning. The study provides the first evidence of the consistency of the classifying-together effect in a fish species, further highlighting the impact of visual similarities on discrimination processes.
