Measuring the Dual-Task Costs of Audiovisual Speech Processing Across Levels of Background Noise

Successful communication requires that listeners not only identify speech, but do so while maintaining performance on other tasks, like remembering what a conversational partner said or paying attention while driving. This set of four experiments systematically evaluated how audiovisual speech—which reliably improves speech intelligibility—affects dual-task costs during speech perception (i.e., one facet of listening effort). Results indicated that audiovisual speech reduces dual-task costs in difficult listening conditions (those in which visual cues substantially benefit intelligibility), but may actually increase costs in easy conditions—a pattern of results that was internally replicated multiple times. This study also presents a novel dual-task paradigm specifically designed to facilitate conducting dual-task research remotely. Given the novelty of the task, this study includes psychometric experiments that establish positive and negative controls, assess convergent validity, measure task sensitivity relative to a commonly used dual-task paradigm, and generate performance curves across a range of listening conditions. Thus, in addition to evaluating the effects of audiovisual speech across a wide range of background noise levels, this study enables other researchers to address theoretical questions related to the cognitive mechanisms supporting speech processing beyond the specific issues addressed here, without being limited to in-person research.

Speech and Non-Speech Measures of Audiovisual Integration are not Correlated

Many natural events generate both visual and auditory signals, and humans are remarkably adept at integrating information from these sources. However, individuals appear to differ markedly in their ability or propensity to combine what they hear with what they see. Individual differences in audiovisual integration have been established using a range of materials, including speech stimuli (seeing and hearing a talker) and simpler audiovisual stimuli (seeing flashes of light combined with tones). Although there are multiple tasks in the literature that are referred to as “measures of audiovisual integration,” these tasks differ widely with respect to both the type of stimuli used (speech versus non-speech) and the nature of the tasks (e.g., some tasks use conflicting auditory and visual stimuli whereas others use congruent stimuli). It is not clear whether these varied tasks are actually measuring the same underlying construct: audiovisual integration. This study tested the relationships among four commonly used measures of audiovisual integration, two of which use speech stimuli (susceptibility to the McGurk effect and a measure of audiovisual benefit), and two of which use non-speech stimuli (the sound-induced flash illusion and audiovisual integration capacity). We replicated previous work showing large individual differences in each measure but found no significant correlations among any of the measures. These results suggest that tasks commonly referred to as measures of audiovisual integration may be tapping into different parts of the same process, or into different constructs entirely.