Low-frequency neural tracking of natural speech envelope reflects evoked responses to acoustic edges, not oscillatory entrainment
Abstract
The amplitude envelope of speech is crucial for accurate comprehension, and several studies have shown that the phase of neural activity in the theta-delta bands (1-10 Hz) tracks the phase of the speech amplitude envelope during listening, a process referred to as envelope tracking. However, the mechanisms underlying envelope tracking remain heavily debated. A dominant model posits that envelope tracking reflects continuous entrainment of endogenous low-frequency oscillations to the speech envelope. It has proven challenging, however, to distinguish this from the alternative account, in which envelope tracking reflects evoked responses to acoustic landmarks within the envelope. Here we recorded magnetoencephalography while participants listened to natural and slowed speech to test two critical predictions of the entrainment model: (1) that the frequency range of phase locking reflects the stimulus speech rate, and (2) that an entrained oscillator will resonate for multiple cycles after a landmark-driven phase reset. We found that acoustic edges, defined as peaks in the rate of envelope change, induced evoked responses and theta phase locking. Crucially, this phase locking was transient and its frequency range was independent of the speech rate, in line with the evoked response account. Further comparisons between regular and slowed speech revealed that encoding of acoustic edge magnitudes was invariant to contextual speech rate, demonstrating that it was normalized for speech rate. Taken together, our results show that the evoked response model provides a better account of neural phase locking to the speech envelope than oscillatory entrainment.
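
To make the notion of an acoustic edge concrete, the following minimal Python sketch locates peaks in the rate of change of an amplitude envelope. It is an illustration only, not the analysis pipeline of this study: the envelope is assumed to have been extracted already, and the function names (`synthetic_envelope`, `find_acoustic_edges`), the `min_rate` threshold, the 100 Hz sampling rate, and the two-syllable test envelope are all hypothetical choices made for the example.

```python
import math


def sigmoid(x):
    """Logistic function, used here only to build a smooth synthetic envelope."""
    return 1.0 / (1.0 + math.exp(-x))


def synthetic_envelope(fs, duration_s, syllables):
    """Sum of smooth 'syllables', each given as (onset_s, amplitude).

    This is a toy stand-in for a real speech envelope (an assumption).
    """
    n = int(round(duration_s * fs))
    env = []
    for i in range(n):
        t = i / fs
        value = 0.0
        for onset, amp in syllables:
            rise = sigmoid((t - onset) / 0.02)               # fast onset
            fall = 1.0 - sigmoid((t - onset - 0.3) / 0.05)   # slower offset
            value += amp * rise * fall
        env.append(value)
    return env


def find_acoustic_edges(env, fs, min_rate=0.0):
    """Return (time_s, rate) for each local peak in the rate of envelope change.

    Only positive peaks (rising envelope) above `min_rate` are kept.
    """
    # Forward difference, scaled to envelope units per second.
    rate = [(env[i + 1] - env[i]) * fs for i in range(len(env) - 1)]
    edges = []
    for i in range(1, len(rate) - 1):
        is_peak = rate[i] > rate[i - 1] and rate[i] >= rate[i + 1]
        if is_peak and rate[i] > min_rate:
            # Each difference spans samples i and i + 1, so report the midpoint.
            edges.append(((i + 0.5) / fs, rate[i]))
    return edges


fs = 100  # Hz, envelope sampling rate (assumed for this example)
env = synthetic_envelope(fs, 2.0, [(0.505, 1.0), (1.305, 0.5)])
edges = find_acoustic_edges(env, fs, min_rate=1.0)
for time_s, rate in edges:
    print(f"edge at {time_s:.3f} s, rate of change {rate:.2f} per s")
```

On this synthetic envelope the sketch should report two edges, one near each onset, with the first roughly twice the magnitude of the second because its syllable has twice the amplitude. The edge magnitude here is simply the height of the derivative peak, so slowing the same envelope in time would shrink it; that dependence on rate is what makes the reported invariance of edge-magnitude encoding to contextual speech rate informative.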