The Primitive Representation in Speech Perception: Phoneme or Distinctive Features

2013 ◽  
Vol 5 (4) ◽  
pp. 157-169
Author(s):  
Moon-Jung Bae


2012 ◽  
Vol 55 (3) ◽  
pp. 903-918 ◽  
Author(s):  
Mathias Scharinger ◽  
Philip J. Monahan ◽  
William J. Idsardi

Purpose Speech perception can be described as the transformation of continuous acoustic information into discrete memory representations. Therefore, research on neural representations of speech sounds is particularly important for a better understanding of this transformation. Speech perception models make specific assumptions regarding the representation of mid vowels (e.g., [ɛ]) that are articulated with a neutral position in regard to height. One hypothesis is that their representation is less specific than the representation of vowels with a more specific position (e.g., [æ]). Method In a magnetoencephalography study, we tested the underspecification of mid vowels in American English. Using a mismatch negativity (MMN) paradigm, mid and low lax vowels ([ɛ]/[æ]) and high and low lax vowels ([ɪ]/[æ]) were opposed, and M100/N1 dipole source parameters as well as MMN latency and amplitude were examined. Results Larger MMNs occurred when the mid vowel [ɛ] was a deviant to the standard [æ], a result consistent with less specific representations for mid vowels. MMNs of equal magnitude were elicited in the high–low comparison, consistent with more specific representations for both high and low vowels. M100 dipole locations support early vowel categorization on the basis of linguistically relevant acoustic–phonetic features. Conclusion We take our results to reflect abstract long-term representations of vowels that do not include redundant specifications at very early stages of processing the speech signal. Moreover, the dipole locations indicate extraction of distinctive features and their mapping onto representationally faithful cortical locations (i.e., a feature map).
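
As a rough illustration of how an MMN is quantified in paradigms like this one, the sketch below computes a deviant-minus-standard difference wave from synthetic evoked responses and extracts its peak amplitude and latency in a typical 100–250 ms window. The waveforms, sampling rate, and window are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

fs = 1000                          # sampling rate in Hz (assumed)
t = np.arange(-0.1, 0.5, 1 / fs)   # epoch from -100 to 500 ms

# Synthetic averaged evoked responses (stand-ins for real MEG data):
# the deviant carries an extra negativity peaking near 180 ms.
standard = -2e-15 * np.exp(-((t - 0.10) ** 2) / (2 * 0.02 ** 2))
deviant = standard - 3e-15 * np.exp(-((t - 0.18) ** 2) / (2 * 0.03 ** 2))

# The MMN is conventionally the deviant-minus-standard difference wave.
mmn = deviant - standard

# Search for the MMN peak in a typical 100-250 ms window.
window = (t >= 0.100) & (t <= 0.250)
peak_idx = np.argmin(mmn[window])          # the MMN is a negativity
peak_amplitude = mmn[window][peak_idx]
peak_latency = t[window][peak_idx]

print(f"MMN peak: {peak_amplitude:.2e} T at {peak_latency * 1000:.0f} ms")
```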


1971 ◽  
Vol 1 (2) ◽  
pp. 81-96 ◽  
Author(s):  
Natalie Waterson

Speech perception is of interest to linguists and psychologists alike. Psychologists seek linguistic units to enable them to explain the processes involved in speech; linguists try to establish what these units may be, whether distinctive features, phonemes, syllables, words or even larger units. Although the phoneme was for some time considered to be the most likely candidate, experimental evidence increasingly points to some larger unit, particularly in view of the fact that no one-to-one acoustic correlate of either the phoneme or distinctive features can be found (cf. Reddy, 1967: 336, Ladefoged, 1967: 146, Denes, 1963: 892). Furthermore, if the phoneme were the unit of perception, any processing that involved matching a perceived pattern against one already stored would require far too many operations, given the large size of vocabularies and the large number of sentence types in a language; such processing would have to be too rapid to be feasible, bearing in mind the constraints of memory span. There is now more sympathy for the syllable or a larger stretch as the unit of perception (e.g. Laver, 1970: 68, Maclay and Osgood, 1959, Ladefoged, 1959: 402), and there seems to be good evidence that speech is planned in stretches longer than a word, e.g. Ladefoged's experiments with placing ‘dot’ at different parts of a sentence (Ladefoged, 1959).
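
A rough back-of-envelope calculation makes the feasibility argument concrete; the speech rate and vocabulary size below are illustrative assumptions, not figures from the paper.

```python
# Rough feasibility arithmetic for phoneme-by-phoneme matching
# (all figures are illustrative assumptions, not from Waterson).
phonemes_per_second = 12       # typical conversational speech rate
vocabulary_size = 50_000       # assumed adult receptive vocabulary

# If each incoming phoneme had to be matched against candidate
# continuations drawn from the whole vocabulary, the listener would
# need on the order of this many comparisons every second:
comparisons_per_second = phonemes_per_second * vocabulary_size
print(f"~{comparisons_per_second:,} comparisons/s")   # ~600,000/s
```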


1977 ◽  
Vol 45 (2) ◽  
pp. 459-471
Author(s):  
James R. Lackner ◽  
Betty Tuller ◽  
Louis M. Goldstein

If one listens to a meaningless syllable that is repeated over and over, he will hear it undergo a variety of changes that can be described systematically in terms of reorganizations of the phones constituting the syllable and changes in a restricted set of phonetic distinctive features. When the repeated syllable is followed by a different syllable but in the same voice, the new (test) syllable will be misperceived in a manner related to the perceptual misrepresentation of the repeated syllable. In the present experiment, subjects (N = 24) listened to 72 different experimental sequences of repeated syllables in a male voice followed by test syllables in a female voice. Identification of penultimate and test syllables was independent, and in no instance were the phones constituting the syllables reorganized. These results are interpreted as evidence against both auditory and phonetic feature detector theories of speech perception.
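
To make the independence claim concrete, identifications of the penultimate and test syllables can be cross-tabulated and tested with a chi-square test of independence. The sketch below uses invented counts purely to show the computation; it is not the study's data.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = how the penultimate (repeated) syllable
# was identified, columns = how the following test syllable was
# identified. These numbers are invented for illustration only.
table = np.array([
    [30, 28, 32],
    [29, 31, 30],
    [31, 30, 29],
])

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
# A large p-value is consistent with the two identifications being
# independent, as Lackner, Tuller, and Goldstein report.
```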


2007 ◽  
Vol 363 (1493) ◽  
pp. 1071-1086 ◽  
Author(s):  
David Poeppel ◽  
William J Idsardi ◽  
Virginie van Wassenhove

Speech perception consists of a set of computations that take continuously varying acoustic waveforms as input and generate discrete representations that make contact with the lexical representations stored in long-term memory as output. Because the perceptual objects that are recognized by the speech perception system enter into subsequent linguistic computation, the format that is used for lexical representation and processing fundamentally constrains the speech perceptual processes. Consequently, theories of speech perception must, at some level, be tightly linked to theories of lexical representation. Minimally, speech perception must yield representations that smoothly and rapidly interface with stored lexical items. Adopting the perspective of Marr, we argue and provide neurobiological and psychophysical evidence for the following research programme. First, at the implementational level, speech perception is a multi-time resolution process, with perceptual analyses occurring concurrently on at least two time scales (approx. 20–80 ms, approx. 150–300 ms), commensurate with (sub)segmental and syllabic analyses, respectively. Second, at the algorithmic level, we suggest that perception proceeds on the basis of internal forward models, or uses an ‘analysis-by-synthesis’ approach. Third, at the computational level (in the sense of Marr), the theory of lexical representation that we adopt is principally informed by phonological research and assumes that words are represented in the mental lexicon in terms of sequences of discrete segments composed of distinctive features. One important goal of the research programme is to develop linking hypotheses between putative neurobiological primitives (e.g. temporal primitives) and those primitives derived from linguistic inquiry, to arrive ultimately at a biologically sensible and theoretically satisfying model of representation and computation in speech.
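
The multi-time resolution claim at the implementational level can be illustrated by analyzing the same waveform with two short-time Fourier transforms, one with a ~25 ms window (the (sub)segmental scale) and one with a ~200 ms window (the syllabic scale). The toy signal and window lengths are assumptions for illustration only.

```python
import numpy as np
from scipy.signal import stft

fs = 16_000                        # sampling rate (assumed)
t = np.arange(0, 1.0, 1 / fs)
# Stand-in "speech": a 150 Hz carrier with a 4 Hz syllable-rate envelope.
signal = (1 + np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 150 * t)

# Fast analysis window (~25 ms): resolves segment-scale detail.
f_fast, t_fast, Z_fast = stft(signal, fs=fs, nperseg=int(0.025 * fs))

# Slow analysis window (~200 ms): tracks syllable-scale structure.
f_slow, t_slow, Z_slow = stft(signal, fs=fs, nperseg=int(0.200 * fs))

print(Z_fast.shape, Z_slow.shape)  # two concurrent spectro-temporal views
```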


Author(s):  
Asish C. Nag ◽  
Lee D. Peachey

Cat extraocular muscles consist of two regions: orbital and global. The orbital region contains predominantly small-diameter fibers, while the global region contains a variety of fibers of different diameters. The differences in ultrastructural features among these muscle fibers indicate that the extraocular muscles of cats contain at least five structurally distinguishable types of fibers. Superior rectus muscles were studied by light and electron microscopy, mapping the distribution of each fiber type with its distinctive features. A mixture of 4% paraformaldehyde and 4% glutaraldehyde was perfused through the carotid arteries of anesthetized adult cats and applied locally to exposed superior rectus muscles during the perfusion.


2020 ◽  
Vol 63 (4) ◽  
pp. 1270-1281
Author(s):  
Leah Fostick ◽  
Riki Taitelbaum-Swead ◽  
Shulamith Kreitler ◽  
Shelly Zokraut ◽  
Miriam Billig

Purpose Difficulty in understanding spoken speech is a common complaint among aging adults, even when hearing impairment is absent. Correlational studies point to a relationship between age, auditory temporal processing (ATP), and speech perception but, unlike training studies, cannot demonstrate causality. In the current study, we test (a) the causal relationship between a spatial–temporal ATP task (temporal order judgment [TOJ]) and speech perception among aging adults using a training design and (b) whether improvement in aging adults' speech perception is accompanied by improved self-efficacy. Method Eighty-two participants aged 60–83 years were randomly assigned to a group receiving (a) ATP training (TOJ) over 14 days, (b) non-ATP training (intensity discrimination) over 14 days, or (c) no training. Results The data showed that TOJ training elicited improvement in all speech perception tests, which was accompanied by increased self-efficacy. Neither improvement in speech perception nor self-efficacy was evident following non-ATP training or no training. Conclusions There was no generalization of the improvement resulting from TOJ training to intensity discrimination, nor of the improvement resulting from intensity discrimination training to speech perception. These findings imply that the effect of TOJ training on speech perception is specific and that the improvement is not simply the product of generally improved auditory perception. This supports the idea that the temporal properties of speech are indeed crucial for speech perception. Clinically, the findings suggest that aging adults can be trained to improve their speech perception, specifically through computer-based auditory training, and that this may improve perceived self-efficacy.
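
For concreteness, a TOJ trial of the spatial–temporal kind described here can be sketched as two brief tones, one per ear, separated by a stimulus-onset asynchrony (SOA). The tone frequency, duration, and SOA below are illustrative assumptions, not the study's exact stimuli.

```python
import numpy as np

def toj_trial(soa_ms, fs=44_100, tone_ms=15, freq=1000):
    """Build one dichotic temporal-order-judgment trial: the same brief
    tone in each ear, the lagging ear delayed by soa_ms. All parameters
    are illustrative assumptions, not the study's exact stimuli."""
    n_tone = int(fs * tone_ms / 1000)
    n_soa = int(fs * abs(soa_ms) / 1000)
    tone = np.sin(2 * np.pi * freq * np.arange(n_tone) / fs)
    total = n_tone + n_soa
    left, right = np.zeros(total), np.zeros(total)
    first, second = (left, right) if soa_ms >= 0 else (right, left)
    first[:n_tone] = tone                 # leading ear
    second[n_soa:n_soa + n_tone] = tone   # lagging ear
    return np.stack([left, right], axis=1)  # stereo signal

trial = toj_trial(soa_ms=60)   # 60 ms onset asynchrony, left leads
print(trial.shape)
```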


2020 ◽  
Vol 29 (2) ◽  
pp. 259-264 ◽  
Author(s):  
Hasan K. Saleh ◽  
Paula Folkeard ◽  
Ewan Macpherson ◽  
Susan Scollie

Purpose The original Connected Speech Test (CST; Cox et al., 1987) is a well-regarded and often utilized speech perception test. The aim of this study was to develop a new version of the CST using a neutral North American accent and to assess the use of this updated CST on participants with normal hearing. Method A female English speaker was recruited to read the original CST passages, which were recorded as the new CST stimuli. A study was designed to assess the equivalence of the newly recorded CST passages and to conduct normalization. The study included 19 Western University students (11 females and 8 males) with normal hearing and with English as a first language. Results Raw scores for the 48 tested passages were converted to rationalized arcsine units (RAUs), and passages whose average scores fell more than 1 RAU standard deviation from the mean were excluded. The internal reliability of the 32 remaining passages was assessed, and the two-way random-effects intraclass correlation was .944. Conclusion The aim of our study was to create new CST stimuli with a more general North American accent in order to minimize accent effects on speech perception scores. The study resulted in 32 passages of equivalent difficulty for listeners with normal hearing.
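
For reference, a common published form of the rationalized arcsine unit (RAU) transform (after Studebaker, 1985) is sketched below; the constants should be checked against the original source before use, and the example score is hypothetical.

```python
import math

def rau(correct, n_items):
    """Rationalized arcsine unit transform of a raw score
    (Studebaker, 1985); constants follow the commonly published
    form -- verify against the original before relying on them."""
    theta = (math.asin(math.sqrt(correct / (n_items + 1)))
             + math.asin(math.sqrt((correct + 1) / (n_items + 1))))
    return (146 / math.pi) * theta - 23

# Example: 20 of 25 key words correct on one passage (hypothetical).
print(f"{rau(20, 25):.1f} RAU")   # roughly 78.6 RAU
```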


2020 ◽  
Vol 63 (7) ◽  
pp. 2245-2254 ◽  
Author(s):  
Jianrong Wang ◽  
Yumeng Zhu ◽  
Yu Chen ◽  
Abdilbar Mamat ◽  
Mei Yu ◽  
...  

Purpose The primary purpose of this study was to explore the audiovisual speech perception strategies adopted by normal-hearing and deaf people in processing familiar and unfamiliar languages. Our primary hypothesis was that the two groups would adopt different perception strategies owing to different sensory experiences at an early age, limitations of the physical device, and the developmental gap of language, among other factors. Method Thirty normal-hearing adults and 33 prelingually deaf adults participated in the study. They were asked to perform judgment and listening tasks while watching videos of a Uygur–Mandarin bilingual speaker in a familiar language (Standard Chinese) or an unfamiliar language (Modern Uygur) while their eye movements were recorded by eye-tracking technology. Results Task had a slight influence on the distribution of selective attention, whereas group and language had significant influences. Specifically, the normal-hearing and the deaf participants mainly gazed at the speaker's eyes and mouth, respectively; moreover, while the normal-hearing participants stared longer at the speaker's mouth when confronted with the unfamiliar language Modern Uygur, the deaf participants did not change their attention allocation pattern when perceiving the two languages. Conclusions Normal-hearing and deaf adults adopt different audiovisual speech perception strategies: Normal-hearing adults mainly look at the eyes, and deaf adults mainly look at the mouth. Additionally, language and task can also modulate the speech perception strategy.
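
A minimal sketch of the kind of gaze measure involved: given fixation records labeled by area of interest (AOI), compute each group's share of dwell time on the eyes versus the mouth. The records and numbers below are invented for illustration.

```python
from collections import defaultdict

# Invented fixation records: (group, AOI, dwell time in ms).
fixations = [
    ("hearing", "eyes", 420), ("hearing", "mouth", 180),
    ("hearing", "eyes", 390), ("deaf", "mouth", 510),
    ("deaf", "mouth", 470), ("deaf", "eyes", 120),
]

dwell = defaultdict(float)    # total dwell time per (group, AOI)
totals = defaultdict(float)   # total dwell time per group
for group, aoi, ms in fixations:
    dwell[(group, aoi)] += ms
    totals[group] += ms

for (group, aoi), ms in sorted(dwell.items()):
    print(f"{group:7s} {aoi:5s} {ms / totals[group]:.0%} of dwell time")
```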

