In Search of Salience: Focus Detection in the Speech of Different Talkers

2021 ◽  
pp. 002383092110460
Author(s):  
Martin Ho Kwan Ip ◽  
Anne Cutler

Many different prosodic cues can help listeners predict upcoming speech. However, no research to date has assessed listeners’ processing of preceding prosody from different speakers. The present experiments examine (1) whether individual speakers (of the same language variety) are likely to vary in their production of preceding prosody; (2) to the extent that there is talker variability, whether listeners are flexible enough to use any prosodic cues signaled by the individual speaker; and (3) whether types of prosodic cues (e.g., F0 versus duration) vary in informativeness. Using a phoneme-detection task, we examined whether listeners can entrain to different combinations of preceding prosodic cues to predict where focus will fall in an utterance. We used unsynthesized sentences recorded by four female native speakers of Australian English who happened to have used different preceding cues to produce sentences with prosodic focus: a combination of pre-focus overall duration cues, F0 and intensity (mean, maximum, range), and longer pre-target interval before the focused word onset (Speaker 1), only mean F0 cues, mean and maximum intensity, and longer pre-target interval (Speaker 2), only pre-target interval duration (Speaker 3), and only pre-focus overall duration and maximum intensity (Speaker 4). Listeners could entrain to almost every speaker’s cues (the exception being Speaker 4’s use of only pre-focus overall duration and maximum intensity), and could use whatever cues were available even when one of the cue sources was rendered uninformative. Our findings demonstrate both speaker variability and listener flexibility in the processing of prosodic focus.

2021 ◽  
pp. 1-19
Author(s):  
Julien MILLASSEAU ◽  
Ivan YUEN ◽  
Laurence BRUGGEMAN ◽  
Katherine DEMUTH

While voicing contrasts in word-onset position are acquired relatively early, much less is known about how and when they are acquired in word-coda position, where accurate production of these contrasts is also critical for distinguishing words (e.g., dog vs. dock). This study examined how the acoustic cues to coda voicing contrasts are realized in the speech of 4-year-old Australian English-speaking children. The results showed that children used similar acoustic cues to those of adults, including longer vowel duration and more frequent voice bar for voiced stops, and longer closure and burst durations for voiceless stops along with more frequent irregular pitch periods. This suggests that 4-year-olds have acquired productive use of the acoustic cues to coda voicing contrasts, though implementations are not yet fully adult-like. The findings have implications for understanding the development of phonological contrasts in populations for whom these may be challenging, such as children with hearing loss.


Author(s):  
V.P. MESHCHERYAKOV ◽  
YU.G. IVANOV ◽  
T.N. PIMKINA ◽  
E.V. ERMOSHINA

The aim of the research is to study whether the latent period of the ejection of the first portion of milk can be used to evaluate the individual milk ejection characteristics of cows under bucket milking and robotic milking. Two experiments were conducted on cows of the Black-Motley breed. In the first experiment, the individual characteristics of milk ejection were shown under bucket milking. In the second experiment, they were determined under robotic milking. The first experiment was conducted on 12 mature cows. They were milked with a serial milking machine. The process of lactation was recorded by means of a bucket counter. The parameters of milk ejection were defined by analyzing the lactation curve and making calculations. The second experiment was conducted on 30 first-calf heifers. The cows were milked on an Astronaut A4 robotic installation from Lely (the Netherlands). Data from the Lely T4C herd management information system were used for the analysis. In both experiments, three groups of cows (I–III) were distinguished according to the latent period of the first milk portion ejection. Milk ejection ability was identified as high in the first group, average in the second group, and low in the third group. Both experiments showed that the value of the latent period of the first milk portion ejection determined the milk ejection ability of cows. The period of the first milk portion ejection was found to increase as the cows' milk ejection ability decreased. Under the currently used milking technology, reduced milk ejection leads to a decrease in the indicators of the average and maximum intensity of milk ejection and of the first two minutes of milking, and also to a longer duration of milking.
With robotic milking, the authors found that first-calf heifers with a short period of the first milk portion ejection are characterized by the shortest duration of teat treatment and of staying in the milking parlor, the shortest average duration of milk ejection from each quarter of the udder, and high values of the average and maximum intensity of milk ejection. First-calf heifers with slow milking capacity are characterized by the longest duration of teat treatment and of staying in the milking parlor, the longest average duration of milk ejection from each quarter of the udder, and the lowest values of the average and maximum intensity of milk ejection. This suggests that selecting first-calf heifers with high milk ejection ability will help to increase the productivity of automatic milking systems during the milking process. It is proposed to use the value of the latent period of the first milk portion ejection in breeding activities.


2000 ◽  
Vol 26 (3) ◽  
pp. 339-373 ◽  
Author(s):  
Andreas Stolcke ◽  
Klaus Ries ◽  
Noah Coccaro ◽  
Elizabeth Shriberg ◽  
Rebecca Bates ◽  
...  

We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialogue acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialogue act sequence. The dialogue model is based on treating the discourse structure of a conversation as a hidden Markov model and the individual dialogue acts as observations emanating from the model states. Constraints on the likely sequence of dialogue acts are modeled via a dialogue act n-gram. The statistical dialogue grammar is combined with word n-grams, decision trees, and neural networks modeling the idiosyncratic lexical and prosodic manifestations of each dialogue act. We develop a probabilistic integration of speech recognition with dialogue modeling, to improve both speech recognition and dialogue act classification accuracy. Models are trained and evaluated using a large hand-labeled database of 1,155 conversations from the Switchboard corpus of spontaneous human-to-human telephone speech. We achieved good dialogue act labeling accuracy (65% based on errorful, automatically recognized words and prosody, and 71% based on word transcripts, compared to a chance baseline accuracy of 35% and human accuracy of 84%) and a small reduction in word recognition error.
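
As a rough illustration of the hidden-Markov-model idea in this abstract, the sketch below decodes the most likely sequence of dialogue act labels with the standard Viterbi algorithm. It is not the authors' system. It assumes a simplified reading in which dialogue act labels are the hidden states, a bigram over labels supplies the transition probabilities, and each utterance is reduced to a single coarse cue. The cue names (`rising`, `falling`, `short`) and every probability are invented toy values.

```python
import math


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {
        key: to_log(value) if isinstance(value, dict) else math.log(value)
        for key, value in table.items()
    }


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence under a first-order HMM."""
    # best[t][s]: best log-probability of any path that ends in state s at step t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s]: predecessor of s on the best path into step t + 1
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy model: all labels, cues, and numbers below are hypothetical.
STATES = ["Statement", "Question", "Backchannel"]
LOG_START = to_log({"Statement": 0.6, "Question": 0.3, "Backchannel": 0.1})
LOG_TRANS = to_log({  # dialogue act bigram: P(next act | previous act)
    "Statement": {"Statement": 0.5, "Question": 0.2, "Backchannel": 0.3},
    "Question": {"Statement": 0.7, "Question": 0.1, "Backchannel": 0.2},
    "Backchannel": {"Statement": 0.6, "Question": 0.3, "Backchannel": 0.1},
})
LOG_EMIT = to_log({  # P(coarse cue | act)
    "Statement": {"falling": 0.7, "rising": 0.2, "short": 0.1},
    "Question": {"falling": 0.2, "rising": 0.7, "short": 0.1},
    "Backchannel": {"falling": 0.1, "rising": 0.1, "short": 0.8},
})

decoded = viterbi(["rising", "falling", "short"], STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(decoded)  # ['Question', 'Statement', 'Backchannel']
```

Working in log-probabilities avoids numerical underflow on long conversations. A fuller model along the lines the abstract describes would replace the toy emission table with likelihoods from word n-grams and prosodic classifiers.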


Languages ◽  
2021 ◽  
Vol 6 (1) ◽  
pp. 44
Author(s):  
Jaydene Elvin ◽  
Daniel Williams ◽  
Jason A. Shaw ◽  
Catherine T. Best ◽  
Paola Escudero

This study tests whether Australian English (AusE) and European Spanish (ES) listeners differ in their categorisation and discrimination of Brazilian Portuguese (BP) vowels. In particular, we investigate two theoretically relevant measures of vowel category overlap (acoustic vs. perceptual categorisation) as predictors of non-native discrimination difficulty. We also investigate whether the individual listener’s own native vowel productions predict non-native vowel perception better than group averages. The results showed comparable performance for AusE and ES participants in their perception of the BP vowels. In particular, discrimination patterns were largely dependent on contrast-specific learning scenarios, which were similar across AusE and ES. We also found that acoustic similarity between individuals’ own native productions and the BP stimuli was largely consistent with the participants’ patterns of non-native categorisation. Furthermore, the results indicated that both acoustic and perceptual overlap successfully predict discrimination performance. However, accuracy in discrimination was better explained by perceptual similarity for ES listeners and by acoustic similarity for AusE listeners. Interestingly, we also found that for ES listeners, the group averages explained discrimination accuracy better than predictions based on individual production data, but that the AusE group showed no difference.


2021 ◽  
Author(s):  
Mengzhu Yan

It is well established that focus plays an important role in facilitating language processing, i.e., focused words are recognised faster and remembered better. In addition, more recent research shows that alternatives to a word (e.g., sailor as an alternative to captain) are more activated when listeners hear the word with contrastive prominence (e.g., ‘The captain put on the raincoat’) (bold indicates contrastive prominence). The mechanism behind these processing advantages is focus. Focus has two broad conceptions in relation to its effect on language processing: focus as updating the common ground and focus as indicating alternatives. Considerable psycholinguistic evidence has been obtained for processing advantages consistent with the first conception, and this evidence comes from studies across a reasonably wide range of languages. But the evidence for the second conception only comes from a handful of closely related languages (i.e., English, Dutch and German). Further, it has largely been confined to contrastive accenting as a marker of focus. Therefore, it is not clear if other types of focus marking (e.g., clefts) have similar processing effects. It is also not known if all this is true in Mandarin, as there is very little research in these areas in Mandarin. Mandarin uses pitch expansion to mark contrastive prominence, rather than the pitch accenting found in Germanic languages. Therefore, the investigation of Mandarin expands our knowledge of these speech processing effects to a different language and language family. It also expands our knowledge of the relative roles of prosody and syntax in marking focus and in speech processing in Mandarin, and in general.
This thesis tested how different types of focus marking affect the perception of focus and two aspects of language processing related to focus: the encoding and activation of discourse information (focused words and focus alternatives). The aim was to see whether there is a link between the relative importance of prosodic and syntactic focus marking in Mandarin and their effectiveness in these aspects of language processing. For focus perception, contrastive prominence and clefting have been claimed to mark focus in Mandarin, but it has not been well tested whether listeners perceive them as focus marking. For the first aspect of processing, it is not yet clear what cues listeners use to encode focused information beyond prominence when processing a discourse. For the second aspect, there has been rapidly growing interest in the role of alternatives in language processing, but little is known regarding the effect of clefting. In addition, it is not clear whether the prosodic and syntactic cues are equally effective, and again little research has been devoted to Mandarin. Therefore, the following experiments were conducted to look at these cues in Mandarin.
Experiment 1, a norming study, was conducted to help select stimuli for the following Experiments 2, 3, 4A and 4B. Experiment 2 investigated the relative weights of prosodic and syntactic focus cues in a question-answer appropriateness rating task. The findings show that in canonical word order sentences, the focus was perceived to be on the word that was marked by contrastive prominence. In clefts where the prominence and syntactic cues were on the same word, that word was perceived as being in focus. However, in ‘mismatch’ cases, e.g., 是[船长]F 穿上的[雨衣]F ‘It was the [captain]F who put on the [raincoat]F’ (F indicates focus), the focus was perceived to be on raincoat, the word that had contrastive prominence. In other words, participants weighted prosodic cues more highly. This suggests that prosodic prominence is a stronger focus cue than syntax in Mandarin.
Experiment 3 looked at the role of prosodic and syntactic cues in listeners’ encoding of discourse information in a speeded ‘false alternative’ rejection task. This experiment shows that false alternatives to a word in a sentence (e.g., sailor to captain in ‘The captain put on the raincoat’) were more easily rejected if captain was marked with prosodic cues than with syntactic cues. This experiment shows congruent results to those of Experiment 2, in that prosodic cues were more effective than syntactic cues in encoding discourse information. It seems that a more important marker of focus provides more effective encoding of discourse information.
Experiments 4A and 4B investigated the role of prosodic and syntactic focus cues in the activation of discourse information in Mandarin, using the cross-modal lexical priming paradigm. Both studies consistently show that prosodic focus marking, but not syntactic focus marking, facilitates the activation of identical targets (e.g., captain after hearing ‘The captain put on the raincoat’). Similarly, prosodic focus marking, but not syntactic focus marking, primes alternatives (e.g., sailor). But focus marking does not prime noncontrastive associates (e.g., deck). These findings, together with previous findings on focus particles (e.g., only), suggest that alternative priming is particularly related to contrastive prominence, at least in languages looked at to date. The relative priming effects of prosodic and syntactic focus cues in Experiments 4A and 4B are in line with their relative weights in Experiments 2 and 3.
This thesis presents a crucial link between the relative weights of prosodic and syntactic cues in marking focus, their degrees of effectiveness in encoding discourse information and their ability to activate discourse information in Mandarin. This research contributes significantly to our cross-linguistic understanding of prosodic and syntactic focus in speech processing, showing the processing advantages of focus may be common across languages, but what cues trigger the effects differ by language.


2012 ◽  
Vol 3 ◽  
pp. 25
Author(s):  
Arunima Choudhury ◽  
Elsi Kaiser

The paper investigates the prosodic distinctions available in Bangla/Bengali to differentiate focus-types. Bangla has canonical SOV order. The immediate preverbal position is the default focus position. We conducted an elicitation study followed by a perception study to investigate whether Bangla speakers distinguish new-information vs. contrastive focus prosodically and whether the syntactic position of the focused constituent matters. We found reliable effects between focus-types only when the focused constituent is an object, in the default focus position. Therefore, Bangla uses prosodic cues to mark focus, but the perceptibility interacts with syntactic position: Differences between focus-types are amplified in default focus position.


2014 ◽  
Vol 27 (1) ◽  
pp. 455-476 ◽  
Author(s):  
John A. Knaff ◽  
Scott P. Longmore ◽  
Debra A. Molenar

Storm-centered infrared (IR) imagery of tropical cyclones (TCs) is related to the 850-hPa mean tangential wind at a radius of 500 km (V500) calculated from 6-hourly global numerical analyses for North Atlantic and eastern North Pacific TCs for 1995–2011. V500 estimates are scaled using the climatological vortex decay rate beyond 500 km to estimate the radius of 5 kt (1 kt = 0.514 m s⁻¹) winds (R5) or TC size. A much larger historical record of TC-centered IR imagery (1978–2011) is then used to estimate TC sizes and form a global TC size climatology. The basin-specific distributions of TC size reveal that, among other things, the eastern North Pacific TC basins have the smallest while western North Pacific have the largest TC size distributions. The life cycle of TC sizes with respect to maximum intensity shows that TC growth characteristics are different among the individual TC basins, with the North Atlantic composites showing continued growth after maximum intensity. Small TCs are generally located at lower latitudes, westward steering, and preferred in seasons when environmental low-level vorticity is suppressed. Large TCs are generally located at higher latitudes, poleward steering, and preferred in enhanced low-level vorticity environments. Postmaximum intensity growth of TCs occurs in regions associated with enhanced baroclinicity and TC recurvature, while those that do not grow much are associated with west movement, erratic storm tracks, and landfall at or near the time of maximum intensity. With respect to climate change, no significant long-term trends are found in the dataset of TC size.


2013 ◽  
Vol 11 (1) ◽  
pp. 41-56
Author(s):  
Celine Horgues

In English, prosodic parameters play a major role at two main levels. First, they indicate the intonation at the level of the utterance by marking the distinction between sentence types (statements vs questions) and they are related, although more or less directly, to the informational and grammatical structures of the utterance. Secondly, prosodic cues also contribute to marking the stress pattern at the level of the word (word stress or lexical stress). Even if it is useful to dissociate these two levels theoretically, when looking at their phonetic implementation in an utterance, it soon appears that the exact same prosodic cues are used (namely fundamental frequency, duration, and intensity). Contrary to what happens in tone languages, there is no pre-set prosodic configuration attached to each word in English. Yet, words in discourse retain a relative accentual independence even though the exact prosodic implementation of word stress depends on the specific intonational context expressed in a given utterance (Pierrehumbert, 1980). In French, stress pertains to the level of the group of words rather than to the individual word, which has no real accentual autonomy. Therefore, it is not surprising that French learners of English are faced with a major challenge: how to ensure the marking of lexical stress while, at the same time, using the same prosodic cues to indicate the intonational structure of the utterance. My hypothesis is that some intonational contexts impose a bigger constraint on French learners of English than others. These particularly challenging contexts are the final position at the boundary of a non-final clause, or the boundary of a rising interrogative. Other contexts, like the quotation form or the final position of a statement, are less challenging for the intonational marking of lexical stress.
To test my hypothesis, I collected passages of read speech by thirteen upper intermediate/advanced French learners of English along with the same passage read by ten native English speakers. Two trisyllabics carrying primary stress on the second syllable (comˈputer, proˈtection) were placed in a series of intonational contexts under observation. The test-words were then extracted and submitted to native English listeners. The perceptual results show that the predicted ‘challenging’ contexts indeed caused substantial instability in the learners’ placement of lexical stress as perceived by native English listeners.


Author(s):  
Yike Yang ◽  
Si Chen

This paper investigated whether and how individual speakers of Mandarin Chinese (Mandarin) mark prosodic focus (broad focus vs verb focus) differently in their production, and tested focus effects on mean F0, duration and intensity. The findings indicated the role of the three acoustic cues in Mandarin focus marking at both the group and individual levels. Meanwhile, the individual data showed great variations among speakers in terms of the extent to which the cues were employed. It is proposed that the dynamics of acoustic cues should be considered in future studies and caution should be taken when selecting stimuli for focus perception studies.

