Towards a high quality Arabic speech synthesis system based on neural networks and residual excited vocal tract model

In this paper, we present our first Vietnamese speech synthesis system based on deep neural networks. We propose a cleaning method to improve the training data collected from the Internet. The experimental results indicate that deeper architectures achieve better TTS performance than shallow architectures such as hidden Markov models. We also show the effect of training the TTS systems on different amounts of data. In the VLSP TTS challenge 2018, our proposed DNN-based speech synthesis system won first place in all three categories: naturalness, intelligibility, and MOS.

WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications

IEICE Transactions on Information and Systems ◽

10.1587/transinf.2015edp7457 ◽

2016 ◽

Vol E99.D (7) ◽

pp. 1877-1884 ◽

Cited By ~ 216

Author(s):

Masanori MORISE ◽

Fumiya YOKOMORI ◽

Kenji OZAWA

Keyword(s):

Real Time ◽

Speech Synthesis ◽

Synthesis System ◽

High Quality ◽

Real Time Applications

Fast and High-Quality Singing Voice Synthesis System Based on Convolutional Neural Networks

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9053811 ◽

2020 ◽

Cited By ~ 1

Author(s):

Kazuhiro Nakamura ◽

Shinji Takaki ◽

Kei Hashimoto ◽

Keiichiro Oura ◽

Yoshihiko Nankaku ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Singing Voice ◽

Synthesis System ◽

High Quality ◽

Voice Synthesis

High quality text-to-speech synthesis system with efficient duration models developed using coding schemes based on vowel production characteristics

2013 13th International Conference on Intelligent Systems Design and Applications ◽

10.1109/isda.2013.6920727 ◽

2013 ◽

Author(s):

V. Ramu Reddy ◽

K. Sreenivasa Rao

Keyword(s):

Speech Synthesis ◽

Duration Models ◽

Text To Speech ◽

Vowel Production ◽

Synthesis System ◽

High Quality ◽

Coding Schemes ◽

Text To Speech Synthesis ◽

Production Characteristics

The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate

Speech Communication ◽

10.1016/j.specom.2010.12.002 ◽

2011 ◽

Vol 53 (3) ◽

pp. 442-450 ◽

Cited By ~ 25

Author(s):

Adriana Stan ◽

Junichi Yamagishi ◽

Simon King ◽

Matthew Aylett

Keyword(s):

Speech Synthesis ◽

Sampling Rate ◽

Synthesis System ◽

High Quality ◽

High Sampling ◽

High Sampling Rate

A HMM-based speech synthesis system using a new glottal source and vocal-tract separation method

2010 IEEE International Conference on Acoustics, Speech and Signal Processing ◽

10.1109/icassp.2010.5495550 ◽

2010 ◽

Cited By ~ 8

Author(s):

Pierre Lanchantin ◽

Gilles Degottex ◽

Xavier Rodet

Keyword(s):

Speech Synthesis ◽

Vocal Tract ◽

Separation Method ◽

Synthesis System ◽

Glottal Source

Prediction of Sugar Content in Port Wine Vintage Grapes Using Machine Learning and Hyperspectral Imaging

Processes ◽

10.3390/pr9071241 ◽

2021 ◽

Vol 9 (7) ◽

pp. 1241

Author(s):

Véronique Gomes ◽

Marco S. Reis ◽

Francisco Rovira-Más ◽

Ana Mendes-Ferreira ◽

Pedro Melo-Pinto

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Hyperspectral Imaging ◽

Sugar Content ◽

Hyperspectral Data ◽

Machine Learning Algorithms ◽

Port Wine ◽

Monitoring And Control ◽

High Quality ◽

Harvesting Stage

The high quality of Port wine is the result of a sequence of winemaking operations, such as harvesting, maceration, fermentation, extraction and aging. These stages require proper monitoring and control in order to consistently achieve the desired wine properties. The present work focuses on the harvesting stage, where the sugar content of grapes plays a key role as one of the critical maturity parameters. Our approach uses hyperspectral imaging to rapidly extract information from wine grape berries; the collected spectra are fed to machine learning algorithms that produce estimates of the sugar level. A consistent predictive capability is important for establishing the harvest date and for selecting the best grapes to produce specific high-quality wines. We compared four machine learning methods (including deep learning): ridge regression, partial least squares, neural networks and convolutional neural networks. We assessed their generalization capacity on vintages and varieties not included in the training process. The results show that the estimated models can successfully predict the sugar content from hyperspectral data, with the convolutional neural network outperforming the other methods.
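The evaluation idea in this abstract, fitting a model and then scoring it on a vintage that was held out of training, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a single spectral feature instead of a full hyperspectral spectrum, uses only ridge regression, and all function names, group labels and numbers below are hypothetical toy values.

```python
def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge fit for one feature; the intercept is not penalized."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha > 0 shrinks the slope toward zero
    intercept = y_mean - slope * x_mean
    return slope, intercept


def leave_one_group_out_rmse(samples, alpha):
    """samples: list of (group, x, y). Returns {group: RMSE when that group is held out}."""
    groups = sorted({g for g, _, _ in samples})
    scores = {}
    for held_out in groups:
        train = [(x, y) for g, x, y in samples if g != held_out]
        test = [(x, y) for g, x, y in samples if g == held_out]
        slope, intercept = fit_ridge_1d(
            [x for x, _ in train], [y for _, y in train], alpha
        )
        squared_errors = [(slope * x + intercept - y) ** 2 for x, y in test]
        scores[held_out] = (sum(squared_errors) / len(squared_errors)) ** 0.5
    return scores


# Hypothetical toy data: (vintage label, spectral feature, sugar level), with y = 2x + 1.
samples = [
    ("A", 1.0, 3.0), ("A", 2.0, 5.0),
    ("B", 3.0, 7.0), ("B", 4.0, 9.0),
    ("C", 5.0, 11.0), ("C", 6.0, 13.0),
]
unregularized = leave_one_group_out_rmse(samples, alpha=0.0)
regularized = leave_one_group_out_rmse(samples, alpha=10.0)
```

Holding out whole groups rather than random samples is what makes the score reflect generalization to an unseen vintage; a random split would let samples from the same vintage appear on both sides.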

Generation and Annotation of Simulation-Real Ship Images for Convolutional Neural Networks Training and Testing

Applied Sciences ◽

10.3390/app11135931 ◽

2021 ◽

Vol 11 (13) ◽

pp. 5931

Author(s):

Ji’an You ◽

Zhaozheng Hu ◽

Chao Peng ◽

Zhiqiang Wang

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Image Annotation ◽

Three Dimensional ◽

Image Data ◽

Detection Algorithm ◽

Simulation Software ◽

High Quality ◽

Annotation Method ◽

Detection Of Objects

Large amounts of high-quality image data are a prerequisite for high-accuracy object detection with convolutional neural networks (CNN). Collecting varied, high-quality ship image data in the marine environment is challenging. To address this, a novel method is proposed to generate a large number of high-quality ship images. We obtained ship images with different perspectives and sizes by adjusting the ships’ postures and sizes in three-dimensional (3D) simulation software; the 3D ship data were then transformed into 2D ship images according to the principle of pinhole imaging. We selected specific experimental scenes as background images, and the target ships of the 2D ship images were superimposed onto the background images to generate “Simulation–Real” ship images (named SRS images hereafter). Additionally, an image annotation method based on SRS images was designed. Finally, a CNN-based target detection algorithm was trained and tested on the generated SRS images. The proposed method can quickly generate a large number of high-quality ship image samples together with the corresponding annotation data, significantly improving the accuracy of ship detection. For labeling SRS images, the proposed annotation method is superior to labeling with the image annotation software Label-me and Label-img.
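The pinhole imaging step mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard pinhole model, not the authors' code: it assumes the 3D points are already expressed in the camera frame, and the function name, focal length and principal point values are hypothetical.

```python
def project_pinhole(points_3d, focal_length, cx, cy):
    """Project camera-frame 3D points (X, Y, Z) to pixel coordinates (u, v).

    Uses u = f * X / Z + cx and v = f * Y / Z + cy, where (cx, cy) is the
    principal point and f is the focal length in pixels.
    """
    pixels = []
    for x, y, z in points_3d:
        if z <= 0:
            # Points at or behind the camera plane have no valid projection.
            continue
        u = focal_length * x / z + cx
        v = focal_length * y / z + cy
        pixels.append((u, v))
    return pixels


# Hypothetical points on a ship hull, in metres, in the camera frame.
corners = [(-1.0, -0.5, 10.0), (1.0, -0.5, 10.0), (1.0, 0.5, 20.0)]
pixels = project_pinhole(corners, focal_length=800.0, cx=320.0, cy=240.0)
# pixels == [(240.0, 200.0), (400.0, 200.0), (360.0, 260.0)]
```

Because the image coordinates scale with 1/Z, moving the simulated ship farther from the camera shrinks it in the 2D image, which is one way to obtain ship images of different sizes from a single 3D model.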
