Evaluation of various sets of acoustic cues for the perception of prevocalic stop consonants. II. Modeling and evaluation

1996 ◽  
Vol 100 (6) ◽  
pp. 3865-3881 ◽  
Author(s):  
Roel Smits ◽  
Louis ten Bosch ◽  
René Collier

1983 ◽  
Vol 73 (5) ◽  
pp. 1779-1793 ◽  
Author(s):  
Diane Kewley‐Port ◽  
David B. Pisoni ◽  
Michael Studdert‐Kennedy

2009 ◽  
Vol 126 (4) ◽  
pp. 2181
Author(s):  
Helen M. Hanson ◽  
Stefanie Shattuck-Hufnagel ◽  
Margaret Capotosto

2021 ◽  
Author(s):  
Kubra Bodur ◽  
Sweeney Branje ◽  
Morgane Peirolo ◽  
Ingrid Tiscareno ◽  
James S. German

2005 ◽  
Vol 48 (3) ◽  
pp. 681-701 ◽  
Author(s):  
Laura K. Holden ◽  
Andrew E. Vandali ◽  
Margaret W. Skinner ◽  
Marios S. Fourakis ◽  
Timothy A. Holden

One of the difficulties faced by cochlear implant (CI) recipients is perception of low-intensity speech cues. A. E. Vandali (2001) has developed the transient emphasis spectral maxima (TESM) strategy to amplify short-duration, low-level sounds. The aim of the present study was to determine whether speech scores would be significantly higher with TESM than with the advanced combination encoder (ACE) strategy fitted using procedures that optimize perception of soft speech and other sounds. Eight adult recipients of the Nucleus 24 CI system participated in this study. No significant differences in scores were seen between ACE and TESM for consonant-vowel nucleus-consonant (CNC) words presented at 55 and 65 dB SPL, for sentences in noise presented at 65 dB SPL at 2 different signal-to-noise ratios, or for closed-set vowels and consonants presented at 60 dB SPL. However, perception of stop consonants within CNC words presented at the lower level (55 dB SPL) was significantly higher with TESM than ACE. In addition, percentage of information transmitted for words at 55 dB SPL was significantly higher with TESM than with ACE for manner and voicing features for consonants in the initial word position. Analysis of closed-set consonants presented at 60 dB SPL revealed percentage of information transmitted for manner was significantly higher with TESM than with ACE. These improvements with TESM were small compared with those reported by Vandali for recipients of the Nucleus 22 CI system. It appears that mapping techniques used to program speech processors and improved processing capabilities of the Nucleus 24 system contributed to soft sounds being understood almost as well with ACE as with TESM. However, half of the participants preferred TESM to ACE for use in everyday life, and all but 1 used TESM in specific listening situations. 
Clinically, TESM may be useful to ensure the audibility of low-intensity, short-duration acoustic cues that are important for understanding speech, for recipients who are difficult to map, or if insufficient time precludes the use of mapping techniques to increase audibility of soft sound.


2008 ◽  
Vol 123 (6) ◽  
pp. 4482-4497
Author(s):  
Anne Bonneau ◽  
Yves Laprie

2002 ◽  
Vol 28 ◽  
pp. 1-12 ◽  
Author(s):  
Hansook Choi

In this study, cross-dialectal variation in the use of the acoustic cues of VOT and F0 to mark the laryngeal contrast in Korean stops is examined with Chonnam Korean and Seoul Korean. Prior experimental results (Han & Weitzman, 1970; Hardcastle, 1973; Jun, 1993 & 1998; Kim, C., 1965) show that pitch values in the vowel onset following the target stop consonants play a supplementary role to VOT in designating the three contrastive laryngeal categories. F0 contours are determined in part by the intonational system of a language, which raises the question of how the intonational system interacts with phonological contrasts. Intonational differences might be linked to dissimilar patterns in using the complementary acoustic cues of VOT and F0. This hypothesis is tested with six Korean speakers: three Seoul Korean and three Chonnam Korean. The results show that Chonnam Korean tends toward a 3-way distinction in VOT and a 2-way distinction in F0 distribution, whereas Seoul Korean tends toward a 3-way distinction in F0 distribution and a 2-way distinction in VOT. The two acoustic cues are complementary in that one cue is rather faithful in marking the 3-way contrast, while the other cue marks the contrast less distinctively. It also seems that these variations are not completely arbitrary, but are linked to the phonological characteristics of the dialects. Chonnam Korean, in which the initial tonal realization in the accentual phrase is expected to be more salient, tends to minimize the F0 perturbation effect from the preceding consonants by allowing more overlap in F0 distribution. The compensating 3-way distribution of VOT in Chonnam Korean can also be understood as a durational sensitivity. Without these characteristics, Seoul Korean shows relatively more overlap in VOT distribution and more 3-way separation in F0 distribution.


2019 ◽  
Vol 62 (2) ◽  
pp. 434-441 ◽  
Author(s):  
Shunsuke Tamura ◽  
Kazuhito Ito ◽  
Nobuyuki Hirose ◽  
Shuji Mori

Purpose: The purpose of this study was to investigate whether speech perception would reflect small latency changes in subcortical speech representation. Method: Twelve native Japanese listeners participated in the experiment. The listeners completed a speech identification task and auditory brainstem response (ABR) measurement using /d/–/t/ continuum stimuli varying in voice onset time (VOT), with manipulation of the amplitude of the initial noise (consonant) portion, the duration of which corresponded to VOT. Results: Increasing the noise portion amplitude lengthened the subcortical representation of VOT (VOT_ABR), defined as the latency difference between ABRs synchronized to the onsets of the initial noise and the following periodic (vowel) portions, and made listeners more likely to perceive stimuli with ambiguous VOT as the voiceless stop /t/. In addition, the amount of VOT_ABR lengthening was close to that of the VOT boundary shortening. Conclusion: A few milliseconds of difference in subcortical speech representation are important for the perception of speech sounds with ambiguous acoustic cues. Supplemental Material: https://doi.org/10.23641/asha.7728695


1988 ◽  
Vol 31 (3) ◽  
pp. 449-459 ◽  
Author(s):  
Karen Forrest ◽  
Barbara K. Rockman

Spectrographic measures of voice onset time (VOT) were made for phonologically disordered children in whom a voicing contrast was just beginning to emerge. These temporal measures were related to adult listeners' perception of voicing of the initial stop consonant to determine how well VOT could predict perceived voicing. In general, the predictive utility of VOT was not very high. The relation between VOT as produced by the phonologically disordered children and perceived voicing ranged from 0.31 to 0.43. A finer-grained analysis was conducted to determine what other acoustic cues might have influenced the listeners' judgments of voicing. Although no one acoustic cue could be found to explain all listeners' responses, spectral cues such as fundamental and F1 frequencies at the onset of voicing, as well as the burst and aspiration amplitude relative to the vowel onset amplitude, accounted for the perceived voicing of about half of the tokens that were not differentiated by VOT. Rather than relying solely on the temporal characteristics of the VOT interval, a matrix of acoustic cues may influence how a listener perceives word-initial voicing as produced by phonologically disordered children.

