Performance Comparison and Duration Model Improvement of Speaker Adaptation Methods in HMM-based Korean Speech Synthesis

2012 ◽  
Vol 4 (3) ◽  
pp. 111-117
Author(s):  
Hea-Min Lee ◽  
Hyung-Soon Kim
Author(s):  
Aakshi Mittal ◽  
Mohit Dua

Abstract: Spoof detection is essential for improving the performance of current Automatic Speaker Verification (ASV) systems, and strengthening both the frontend and the backend makes such systems more robust. First, this paper compares static and static–dynamic Constant Q Cepstral Coefficients (CQCC) frontend features, using a Long Short Term Memory (LSTM) with Time Distributed Wrappers model at the backend. Second, it compares ASV systems built with three deep learning backends (LSTM with Time Distributed Wrappers, LSTM, and Convolutional Neural Network (CNN)), all using static–dynamic CQCC features at the frontend. Third, it describes the implementation of two spoof detection systems for ASV that share the same static–dynamic CQCC frontend features but use different combinations of deep learning models at the backend. The first is a voting-protocol-based two-level spoof detection system that uses CNN and LSTM models at the first level and an LSTM with Time Distributed Wrappers model at the second level. The second is a two-level spoof detection system with a user identification and verification protocol, which uses an LSTM model for user identification at the first level and an LSTM with Time Distributed Wrappers model for verification at the second level. To implement the proposed work, a variation of the ASVspoof 2019 dataset is used that brings all types of spoofing attacks, namely Speech Synthesis (SS), Voice Conversion (VC), and replay, into a single dataset. The results show that, at the frontend, static–dynamic CQCC features outperform static CQCC features and that, at the backend, a hybrid combination of deep learning models increases the accuracy of spoof detection systems.
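The abstract does not say how the dynamic part of the static–dynamic features is computed. As an illustration only, the sketch below assumes the regression-based delta formula commonly used in speech processing and appends delta and delta-delta coefficients to each static frame. The function names `delta` and `static_dynamic`, the window `width`, and the toy input are hypothetical and not taken from the paper.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based delta over +/- `width` frames.

    Assumption: edge frames are padded by repeating the first/last frame.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out: List[Frame] = []
    for t in range(n):
        row: Frame = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, width + 1):
                nxt = frames[min(t + k, n - 1)][d]  # clamp at the last frame
                prv = frames[max(t - k, 0)][d]      # clamp at the first frame
                num += k * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out


def static_dynamic(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(frames, width)
    d2 = delta(d1, width)
    return [s + a + b for s, a, b in zip(frames, d1, d2)]


# Toy one-dimensional "cepstral" track: a linear ramp over five frames.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
features = static_dynamic(ramp)
```

Each output frame is three times as wide as the static frame, which is the usual shape of a static-plus-dynamic feature vector under this assumption.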


2007 ◽  
Vol 21 (2) ◽  
pp. 325-349 ◽  
Author(s):  
Ọdẹ´túnjí A. Ọdẹ´jọbí ◽  
Shun Ha Sylvia Wong ◽  
Anthony J. Beaumont

2009 ◽  
Author(s):  
Robert E. Remez ◽  
Kathryn R. Dubowski ◽  
Morgana L. Davids ◽  
Emily F. Thomas ◽  
Nina Paddu ◽  
...  

2020 ◽  
pp. 1-12
Author(s):  
Li Dongmei

English text-to-speech conversion is a key topic in modern computing research. Its difficulty lies in the large errors that arise during feature recognition in the conversion process, which make English text-to-speech algorithms hard to apply in working systems. To improve the efficiency of English text-to-speech conversion, this article takes a machine learning approach: after the original voice waveform is labeled with pitch marks, the prosody is modified through PSOLA, and the C4.5 algorithm is used to train a decision tree that judges the pronunciation of polyphones. To evaluate the pronunciation discrimination method based on part-of-speech rules and the HMM-based prosody hierarchy prediction in speech synthesis systems, the study constructs a system model. Waveform concatenation and PSOLA are used to synthesize the sound. For words whose main stress cannot be determined from morphological structure, the labels are learned with machine learning methods. Finally, the study evaluates the algorithm through controlled experiments. The results show that the proposed algorithm performs well and has practical value.
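C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below illustrates only that criterion on a made-up polyphone example. The attribute names (`pos`, `prev`), the pronunciation labels, and the data are hypothetical and do not come from the article.

```python
import math
from collections import Counter
from typing import Dict, List


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_ratio(rows: List[Dict[str, str]], labels: List[str], attr: str) -> float:
    """Gain ratio of splitting on a categorical attribute `attr`."""
    total = len(labels)
    groups: Dict[str, List[str]] = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Expected entropy after the split, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Split information penalizes attributes with many small partitions.
    split_info = -sum((len(g) / total) * math.log2(len(g) / total)
                      for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical contexts of one polyphonic word and its two pronunciations.
rows = [
    {"pos": "noun", "prev": "x"},
    {"pos": "noun", "prev": "y"},
    {"pos": "verb", "prev": "x"},
    {"pos": "verb", "prev": "y"},
]
labels = ["p1", "p1", "p2", "p2"]

best = max(["pos", "prev"], key=lambda a: gain_ratio(rows, labels, a))
```

In this toy data the part-of-speech attribute separates the two pronunciations completely, so it is the attribute a gain-ratio criterion would pick for the root split.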


Author(s):  
Molla Asmare ◽  
Mustafa Ilbas

The most pressing challenges we face today are producing clean energy for equitable and sustainable access to modern energy, and combating climate change. Carbon-rich fuels are the main source of the energy that sustains human society, and they will remain so in the near future. In this context, electrochemical technologies are an emerging and leading tool for efficiently converting scarce fossil fuels and renewable energy sources into electric power with minimal environmental impact. Compared with conventional power generation technologies, solid oxide fuel cells (SOFCs), which operate at high temperature, are emerging as a frontrunner for converting the chemical energy of fuels into electric power, and they allow a variety of fuels to be used with negligible ecological damage. According to this critical review, direct ammonia is a leading candidate and a cost-effective green fuel for tubular SOFCs (T-SOFCs). This is because T-SOFCs have a higher volumetric power density, are mechanically stable, and have high thermal shock resistance. They also avoid the sealing problem that is a chronic issue of the planar design. As a result, the risk posed by the toxicity of ammonia as a fuel is reduced should a leak occur during operation. The system is portable and manageable and can operate wherever there is energy demand. In addition, the manufacturing, onboard storage, and transportation infrastructure problems associated with hydrogen can be addressed by using ammonia. Ammonia is a low-priced, carbon-neutral energy source and stores more energy per unit volume than hydrogen. However, to use direct NH3 in practice as a hydrogen carrier and an alternative green fuel in T-SOFCs, the optimum operating temperatures, reactant flow rates, electrode porosities, pressure, anode position, and tube thickness and diameter still need to be determined.
Therefore, mathematical modeling should be developed to determine these parameters before experimental work is planned. In addition, the performance of AS, ES, and CS T-SOFCs powered with direct NH3 will be compared, the best-performing support will be selected for practical implementation, and an experimental study will be conducted for verification using the optimum parameter values obtained from numerical modeling.

