Audio recordings dataset of genuine and replayed speech at both ends of a telecommunication channel.
ABSTRACT: The recordings in this database were collected to evaluate how well a copy-detection-based playback attack detector (PAD) can safeguard a remote-access, speaker-verified, passphrase-protected system from playback attacks. The database includes multiple utterances of the same phrase by the same person, along with a variety of distorted versions of many of those utterances. Multiple distortions of an utterance were obtained, in part, by recording it simultaneously at both ends of a telecommunication channel: a digital voice recorder captured the user-end (i.e., in-person) recording, and a telephony board captured the system-end recording. The former suffers little distortion, whereas the latter suffers the "non-stationary" distortion imposed by the channel. Additional distortions of the same utterance were captured at the system end of the channel when the in-person recording was replayed at the user end; these recordings simulate playback attacks and suffer the distortion imposed by both the playback device and the channel. The database may be used to evaluate the vulnerability of a speaker verification system (SVS) to playback attacks; to evaluate the performance of a copy-detection-based or distortion-detection-based PAD; to evaluate the overall security of an SVS in tandem with a playback attack countermeasure; or to investigate the distortion imposed by various telecommunication channels and/or playback speakers.
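As an illustration of the detector-evaluation use case, the sketch below shows one common way to summarise detector performance from two sets of scores: one from genuine system-end recordings and one from replayed recordings. It is a minimal sketch under stated assumptions, not the procedure used by the database authors. The function names, the convention that a higher score means "more likely a playback", and the example scores are all hypothetical and are not taken from the database.

```python
def error_rates(genuine_scores, replay_scores, threshold):
    """Return (false_alarm_rate, miss_rate) at a given threshold.

    Assumption: a higher score means the detector considers the
    recording more likely to be a playback attack.
    """
    # Genuine utterances wrongly flagged as playback attacks.
    false_alarms = sum(s >= threshold for s in genuine_scores)
    # Replayed utterances that slip past the detector.
    misses = sum(s < threshold for s in replay_scores)
    return false_alarms / len(genuine_scores), misses / len(replay_scores)


def approximate_eer(genuine_scores, replay_scores):
    """Approximate the equal error rate by sweeping observed scores.

    Returns (eer, threshold), where the threshold is the candidate at
    which the false-alarm and miss rates are closest to each other.
    """
    best = None
    for t in sorted(set(genuine_scores) | set(replay_scores)):
        fa, miss = error_rates(genuine_scores, replay_scores, t)
        gap = abs(fa - miss)
        if best is None or gap < best[0]:
            best = (gap, (fa + miss) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Made-up illustrative scores, NOT values from the database.
    genuine = [0.1, 0.2, 0.3, 0.6]
    replay = [0.4, 0.7, 0.8, 0.9]
    eer, threshold = approximate_eer(genuine, replay)
    print(f"approximate EER = {eer:.2f} at threshold {threshold}")
```

The threshold sweep only considers scores that actually occur, so the result is an approximation; with few recordings the two error rates may not cross exactly, in which case the function reports their average at the closest point.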
SUBMITTER: Shang W
PROVIDER: S-EPMC7772533 | biostudies-literature | 2021 Feb
REPOSITORIES: biostudies-literature