The Dual-Task Costs of Audiovisual Benefit: Effects of Noise and ‘Native’ Speaker Status

Listeners typically understand speech more accurately when they can both see and hear the talker than when they can only hear the talker. However, seeing the talker's face does not necessarily reduce the cognitive costs of speech processing as measured by dual-task costs. In difficult listening conditions, dual-task response times may be faster for audiovisual than for audio-only speech, but when listening conditions are easy, the presence of a talking face may have no effect on dual-task responses, or may even slow them relative to listening alone. The current study expanded upon this work by including samples of both native and nonnative English speakers and by assessing speech intelligibility, subjective listening effort (Experiment 1), and dual-task costs (Experiment 2) for audio-only and audiovisual speech across multiple noise levels. We found that seeing the talker reduces dual-task costs only in difficult listening conditions in which the visual information is necessary to accurately identify the speech. The effects of background noise and speech modality were robust within both native and nonnative listener groups, suggesting that if researchers are interested in studying general phenomena in speech processing (i.e., rather than specifically studying how language background affects results), these effects would have emerged even if the sample had been limited to native speakers of English. However, the magnitude of some effects differed for native and nonnative listeners.

Assessing the Effects of “Native Speaker” Status on Classic Findings in Speech Research

It is common practice in speech research to sample only participants who self-report being "native English speakers." Although there is research on differences in language processing between native and nonnative listeners (see Lecumberri et al., 2010, for a review), the majority of speech research that aims to establish general findings (e.g., testing models of spoken word recognition) includes only native speakers in its samples. Not only is the "native English speaker" criterion poorly defined, but it also excludes historically underrepresented groups from speech perception research, often without attention to whether this exclusion is likely to affect study outcomes. The purpose of this study is to empirically test whether and how different inclusion criteria ("native English speakers" vs. "nonnative English speakers") affect several well-known phenomena in speech perception research. Five hundred participants completed word (N = 200) and sentence (N = 300) identification tasks in quiet and in moderate levels of background noise. Results indicate that multiple classic findings in speech perception research, including the effects of noise level, lexical density, and semantic context on speech intelligibility, persist regardless of "native English" speaking status. However, the magnitude of some of these effects differed across participant groups. Taken together, these results suggest that researchers should carefully consider whether L1/LX status is likely to affect outcomes, and should make decisions about inclusion criteria on a study-by-study basis.