Latent neural dynamics encode temporal context in speech

Bibliographic Details
Published in: Hearing Research 2023-09, Vol. 437, p. 108838, Article 108838
Main Authors: Stephen, Emily P, Li, Yuanning, Metzger, Sean, Oganian, Yulia, Chang, Edward F
Format: Article
Language: English
Description
Summary:
• We recorded auditory neural responses to speech using electrocorticography.
• Reduced-rank regression captures responses with low-dimensional latent states.
• Responses to timing cues are more widespread than phonetic feature responses.
• Responses to sentence-level and syllable-level timing cues have cyclical dynamics.
• The structure of these latent dynamics could bind phonetic features across time.

Direct neural recordings from human auditory cortex have demonstrated encoding for acoustic-phonetic features of consonants and vowels. Neural responses also encode distinct acoustic amplitude cues related to timing, such as those that occur at the onset of a sentence after a silent period or at the onset of the vowel in each syllable. Here, we used a group reduced-rank regression model to show that distributed cortical responses support a low-dimensional latent state representation of temporal context in speech. The timing cues each capture more unique variance than all other phonetic features and exhibit rotational or cyclical dynamics in latent space, arising from activity that is widespread over the superior temporal gyrus. We propose that these spatially distributed timing signals could provide temporal context for, and possibly bind across time, the concurrent processing of individual phonetic features, composing higher-order phonological (e.g., word-level) representations.
ISSN: 0378-5955, 1878-5891
DOI: 10.1016/j.heares.2023.108838
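
The summary refers to reduced-rank regression as the way to obtain low-dimensional latent states. As a rough illustration of that general idea only, the sketch below fits a classical reduced-rank regression with NumPy on synthetic data. It is not the group model used in the article: the function name `reduced_rank_regression`, the array shapes, the chosen rank, and the random data are all assumptions made for the example, and details such as time-lagged stimulus features or pooling across participants are left out.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Classical reduced-rank regression (identity weighting).

    X: (n_samples, n_features) stimulus features (hypothetical).
    Y: (n_samples, n_channels) neural responses (hypothetical).
    Returns A (features -> latent) and V (latent -> channels), so that
    the rank-constrained coefficient matrix is A @ V.T.
    """
    # Step 1: ordinary least-squares fit, ignoring the rank constraint.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Step 2: principal directions of the fitted responses.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V = Vt[:rank].T            # (n_channels, rank)
    # Step 3: project the OLS coefficients onto those directions.
    A = B_ols @ V              # (n_features, rank)
    return A, V


# Synthetic data with a known low-rank structure (illustrative only).
rng = np.random.default_rng(0)
n_samples, n_features, n_channels, true_rank = 500, 12, 40, 3
X = rng.standard_normal((n_samples, n_features))
B_true = (rng.standard_normal((n_features, true_rank))
          @ rng.standard_normal((true_rank, n_channels)))
Y = X @ B_true + 0.1 * rng.standard_normal((n_samples, n_channels))

A, V = reduced_rank_regression(X, Y, rank=3)
latent = X @ A                 # low-dimensional latent state over time
Y_pred = latent @ V.T          # reconstruction of all channels
r2 = 1 - np.sum((Y - Y_pred) ** 2) / np.sum((Y - Y.mean(axis=0)) ** 2)
print(f"latent shape: {latent.shape}, variance explained: {r2:.3f}")
```

Here each column of `latent` is one latent dimension shared across all channels, and plotting one column against another over time is one way to look for the kind of rotational trajectories the summary describes.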