synthetic speech
Recently Published Documents


TOTAL DOCUMENTS: 467 (last five years: 59)
H-INDEX: 30 (last five years: 3)

eLife, 2021, Vol 10
Author(s): Agnès Landemard, Célian Bimbard, Charlie Demené, Shihab Shamma, Sam Norman-Haignere, ...

Little is known about how neural representations of natural sounds differ across species. For example, speech and music play a unique role in human hearing, yet it is unclear how auditory representations of speech and music differ between humans and other animals. Using functional ultrasound imaging, we measured responses in ferrets to a set of natural and spectrotemporally matched synthetic sounds previously tested in humans. Ferrets showed similar lower-level frequency and modulation tuning to that observed in humans. But while humans showed substantially larger responses to natural vs. synthetic speech and music in non-primary regions, ferret responses to natural and synthetic sounds were closely matched throughout primary and non-primary auditory cortex, even when tested with ferret vocalizations. This finding reveals that auditory representations in humans and ferrets diverge sharply at late stages of cortical processing, potentially driven by higher-order processing demands in speech and music.
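As a rough illustration of the comparison described above, the sketch below shows one way to quantify, per voxel, how much larger responses to natural sounds are than to their spectrotemporally matched synthetic counterparts. This is a hypothetical Python example, not the authors' analysis code; the array shapes and the normalization are assumptions.

```python
import numpy as np

def natural_vs_synthetic_index(resp_natural, resp_synthetic):
    """Per-voxel preference for natural over matched synthetic sounds.

    resp_natural, resp_synthetic : arrays of shape (n_voxels, n_sounds),
    e.g. trial-averaged functional ultrasound responses (hypothetical layout).
    Values near 0 indicate closely matched responses (as reported for ferret
    auditory cortex); larger positive values indicate stronger responses to
    natural sounds (as reported for human non-primary regions).
    """
    diff = resp_natural.mean(axis=1) - resp_synthetic.mean(axis=1)
    # Normalize by overall response magnitude so voxels with different
    # baseline amplitudes remain comparable.
    scale = (np.abs(resp_natural).mean(axis=1)
             + np.abs(resp_synthetic).mean(axis=1) + 1e-8)
    return diff / scale
```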


2021, Vol 11 (21), pp. 10475
Author(s): Xiao Zhou, Zhenhua Ling, Yajun Hu, Lirong Dai

An encoder–decoder with attention has become a popular method for sequence-to-sequence (Seq2Seq) acoustic modeling in speech synthesis. To improve the robustness of the attention mechanism, methods that exploit the monotonic alignment between phone sequences and acoustic feature sequences have been proposed, such as stepwise monotonic attention (SMA). However, the phone sequences derived by grapheme-to-phoneme (G2P) conversion may not contain the pauses at phrase boundaries in utterances, which challenges the assumption of strictly stepwise alignment in SMA. Therefore, this paper proposes inserting hidden states into phone sequences to handle the situation in which pauses are not provided explicitly, and designs a semi-stepwise monotonic attention (SSMA) mechanism to model these inserted hidden states. In this method, hidden states that absorb the pause segments in utterances are introduced in an unsupervised way. Thus, the attention at each decoding frame has three options: moving forward to the next phone, staying at the same phone, or jumping to a hidden state. Experimental results show that SSMA achieves better naturalness of synthetic speech than SMA when phrase boundaries are not available. Moreover, the pause positions derived from the alignment paths of SSMA match the manually labeled phrase boundaries quite well.
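To make the three-way transition concrete, here is a minimal Python sketch of the alignment recursion implied by SSMA. It is not the authors' implementation; the variable names (`move_p`, `pause_p`) and the exact factorization of the transition probabilities are assumptions. States alternate between phones and their inserted hidden (pause) states, and at each decoder frame the alignment mass may stay on the current phone, advance to the next phone, or detour through the pause state.

```python
import numpy as np

def ssma_alignment(move_p, pause_p):
    """Forward recursion over a semi-stepwise monotonic alignment (sketch).

    move_p[t, i]  : probability of leaving phone i at decoder frame t
    pause_p[t, i] : probability that a move from phone i detours through its
                    inserted hidden (pause) state rather than going straight
                    to phone i + 1
    Both are assumed to come from the attention network; shapes (T_dec, N).
    State 2*i is phone i, state 2*i + 1 is the hidden state after phone i.
    """
    T_dec, N = move_p.shape
    alpha = np.zeros((T_dec, 2 * N))
    alpha[0, 0] = 1.0  # alignment starts on the first phone
    for t in range(1, T_dec):
        for i in range(N):
            ph, pa = 2 * i, 2 * i + 1
            stay = alpha[t - 1, ph] * (1.0 - move_p[t, i])   # stay on phone i
            leave = alpha[t - 1, ph] * move_p[t, i]          # leave phone i
            alpha[t, ph] += stay
            # A move either detours via the pause state or skips it.
            alpha[t, pa] += leave * pause_p[t, i]
            # Mass already in the pause state can also stay or move on.
            alpha[t, pa] += alpha[t - 1, pa] * (1.0 - move_p[t, i])
            if i + 1 < N:
                alpha[t, 2 * (i + 1)] += (leave * (1.0 - pause_p[t, i])
                                          + alpha[t - 1, pa] * move_p[t, i])
            # (Mass that would step past the last phone is dropped here.)
    # Even-indexed entries are the attention weights over phones; odd-indexed
    # entries absorb pauses, mirroring the idea described in the abstract.
    return alpha
```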


2021
Author(s): Xinhui Chen, You Zhang, Ge Zhu, Zhiyao Duan

2021, pp. 103256
Author(s): Jialong Li, Hongxia Wang, Peisong He, Sani M. Abdullahi, Bin Li

Author(s): Ben Noah, Arathi Sethumadhavan, Josh Lovejoy, David Mondello

Text-to-Speech (TTS) technologies have provided ways to produce acoustic approximations of human voices. However, recent advancements in machine learning (i.e., neural network TTS) have helped move beyond coarse mimicry and towards more natural-sounding speech. With only a small collection of recorded utterances, it is now possible to generate wholly synthetic voices indistinguishable from those of human speakers. While these new approaches to speech synthesis can help facilitate more seamless experiences with artificial agents, they also lower the barrier to entry for those seeking to perpetrate deception. As such, in the development of these technologies, it is important to anticipate potential harms and devise strategies to help mitigate misuse. This paper presents findings from a 360-person survey that assessed public perceptions of synthetic voices, with a particular focus on how voice type and social scenario affect ratings of trust. The findings have implications for the responsible deployment of synthetic speech technologies.


2021
Author(s): Sai Sirisha Rallabandi, Abhinav Bharadwaj, Babak Naderi, Sebastian Möller

2021
Author(s): Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng

2021
Author(s): Yahya Aldholmi, Rawan Aldhafyan, Asma Alqahtani

2021
Author(s): Mikey Elmers, Raphael Werner, Beeke Muhlack, Bernd Möbius, Jürgen Trouvain
