A call for generic-use large-scale single-speaker speech corpora and an example of their application in concatenative speech synthesis.

1999, Vol 20 (3), pp. 241-246
Author(s): Nick Campbell

2021, Vol 14 (3), pp. 1-26
Author(s): Danielle Bragg, Katharina Reinecke, Richard E. Ladner

As conversational agents and digital assistants become increasingly pervasive, understanding their synthetic speech becomes increasingly important. Simultaneously, speech synthesis is becoming more sophisticated and manipulable, providing the opportunity to optimize speech rate to save users time. However, little is known about people’s abilities to understand fast speech. In this work, we provide an extension of the first large-scale study on human listening rates, enlarging the prior study run with 453 participants to 1,409 participants and adding new analyses on this larger group. Run on LabintheWild, it used volunteer participants, was screen reader accessible, and measured listening rate by accuracy at answering questions spoken by a screen reader at various rates. Our results show that people who are visually impaired, who often rely on audio cues and access text aurally, generally have higher listening rates than sighted people. The findings also suggest a need to expand the range of rates available on personal devices. These results demonstrate the potential for users to learn to listen to faster rates, expanding the possibilities for human-conversational agent interaction.


2010, Vol 13 (2), pp. 85-99
Author(s): Grażyna Demenko, Katarzyna Klessa, Marcin Szymański, Stefan Breuer, Wolfgang Hess

2010, Vol 37 (3), pp. 671-703
Author(s): Heidi R. Waterfall, Ben Sandbank, Luca Onnis, Shimon Edelman

ABSTRACT: This paper reports progress in developing a computer model of language acquisition in the form of (1) a generative grammar that is (2) algorithmically learnable from realistic corpus data, (3) viable in its large-scale quantitative performance and (4) psychologically real. First, we describe new algorithmic methods for unsupervised learning of generative grammars from raw CHILDES data and give an account of the generative performance of the acquired grammars. Next, we summarize findings from recent longitudinal and experimental work that suggests how certain statistically prominent structural properties of child-directed speech may facilitate language acquisition. We then present a series of new analyses of CHILDES data indicating that the desired properties are indeed present in realistic child-directed speech corpora. Finally, we suggest how our computational results, behavioral findings, and corpus-based insights can be integrated into a next-generation model aimed at meeting the four requirements of our modeling framework.

