A Systematic Study of Open Source and Commercial Text-to-Speech (TTS) Engines

Author(s):  
Jordan Hosier ◽  
Jordan Kalfen ◽  
Nikhita Sharma ◽  
Vijay K. Gurbani
2016 ◽  
Author(s):  
John Andersson ◽  
Sebastian Berlin ◽  
André Costa ◽  
Harald Berthelsen ◽  
Hanna Lindgren ◽  
...  

Author(s):  
Saida Mussakhojayeva ◽  
Aigerim Janaliyeva ◽  
Almas Mirzakhmetov ◽  
Yerbolat Khassanov ◽  
Huseyin Atakan Varol

2016 ◽  
Author(s):  
Andrew Wilkinson ◽  
Alok Parlikar ◽  
Sunayana Sitaram ◽  
Tim White ◽  
Alan W. Black ◽  
...  

2010 ◽  
Author(s):  
Igor Couto ◽  
Aldebaro Klautau ◽  
Ranniery Maia ◽  
Nelson Neto ◽  
Vincent Tadaiesky

Author(s):  
Kartik Tiwari

Abstract: This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit called ESPnet-TTS, an open-source extension of the ESPnet speech processing toolkit. ESPnet-TTS supports several models, including Tacotron 2, Transformer TTS, and FastSpeech. It also provides recipes modeled on those of the Kaldi automatic speech recognition (ASR) toolkit. The recipes share a unified design with the ESPnet ASR recipes, which provides high performance. The toolkit also provides pre-trained models and samples for all recipes, which users can take as a baseline. It supports TTS, STT, and translation features for various Indic languages, with a strong focus on English, Marathi, and Hindi. This paper also shows that neural sequence-to-sequence models achieve state-of-the-art or near state-of-the-art results on existing datasets. We also analyze some of the key design challenges in developing a multilingual business translation system, including processing bilingual business datasets and evaluating multiple translation methods. The test results can be obtained using tokens, and they show that our models achieve state-of-the-art performance relative to the latest toolkits on the LJ Speech dataset. Index Terms: open source, end-to-end, text-to-speech


2021 ◽  
pp. 1-5
Author(s):  
Elham Akhlaghi ◽  
Anna Bączkowska ◽  
Harald Berthelsen ◽  
Branislav Bédi ◽  
Cathy Chua ◽  
...  

A popular idea in Computer Assisted Language Learning (CALL) is to use multimodal annotated texts, with annotations typically including embedded audio and translations, to support L2 learning through reading. An important question is how to create the audio, which can be done either through human recording or by a Text-To-Speech (TTS) synthesis engine. We may reasonably expect TTS to be quicker and easier, but humans to be of higher quality. Here, we report a study using the open-source LARA platform and ten languages. Samples of LARA audio totaling about three and a half minutes were provided for each language in both human and TTS form; subjects used a web form to compare different versions of the same item and rate the voices as a whole. Although human voice was more often preferred, TTS achieved higher ratings in some languages and was close in others.


2013 ◽  
Vol 385-386 ◽  
pp. 1790-1796
Author(s):  
Lie Bin Yu

Being open source, stable, simple in structure, integrated into the Linux kernel, and well performing, KVM (Kernel-based Virtual Machine) has attracted wide attention from major IT vendors and academia. However, virtualization technology introduces additional overhead, especially in I/O virtualization. Current research on KVM has mostly focused on performance comparison and analysis between virtualization solutions; there has been no systematic study of KVM I/O performance. In this paper, the I/O architecture of KVM is first studied, and then the disk and network performance of fully virtualized KVM and para-virtualized KVM under all the different configurations is systematically tested and analyzed. The paper establishes both the KVM I/O performance and the influence of the different configuration options on that performance. These results are important and useful for users and designers.


Author(s):  
Gianluigi Botton ◽  
Gilles L'espérance

As interest in parallel EELS spectrum imaging grows in laboratories equipped with commercial spectrometers, a few research groups have in recent years taken different approaches to developing the technique, as reported in the literature. Spectrum images can now be obtained either by controlling both the microscope and the spectrometer with a personal computer, or by using more powerful workstations interfaced to conventional multichannel analysers, with commercially available programs controlling the microscope and the spectrometer. Work on the limits of the technique in terms of quantitative performance has, however, been reported by the present author, in a systematic study of artifacts, detection limits, and statistical errors as a function of the desired spatial resolution and the range of chemical elements to be studied in a map. The aim of the present paper is to show an application of quantitative parallel EELS spectrum imaging in which statistical analysis is performed at each pixel, interpretation is carried out using criteria established from the statistical analysis, and variations in composition are analyzed with the help of information retrieved from t/γ maps so that artifacts are avoided.

