Design of English text-to-speech conversion algorithm based on machine learning

2020 ◽  
pp. 1-12
Author(s):  
Li Dongmei

English text-to-speech conversion is a key topic in modern computing research. Its difficulty is that feature recognition during conversion produces large errors, which makes the conversion algorithm hard to apply in a working system. To improve the efficiency of English text-to-speech conversion, this article takes a machine learning approach: after the original voice waveform is labeled with the pitch, the rhythm is modified through PSOLA, and the C4.5 algorithm is used to train a decision tree that judges the pronunciation of polyphones. To evaluate the performance of pronunciation discrimination based on part-of-speech rules and of HMM-based prosody hierarchy prediction in speech synthesis systems, this study constructed a system model. In addition, the waveform stitching method and PSOLA are used to synthesize the sound. For words whose main stress cannot be determined from morphological structure, the stress labels can be learned by machine learning methods. Finally, this study evaluates and analyzes the performance of the algorithm through control experiments. The results show that the proposed algorithm performs well and has some practical value.
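
The abstract above names C4.5 as the learner for polyphone pronunciation. C4.5 chooses each split by gain ratio (information gain divided by split information). The following is a minimal sketch of that criterion only, not the author's implementation: the feature names (`pos`, `prev_is_det`), the example word and the six training rows are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """Gain ratio of splitting on `feature`, the criterion C4.5 uses to pick a split."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises features with many distinct values.
    split_info = -sum((len(g) / total) * math.log2(len(g) / total) for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: stress placement for the word "record".
rows = [
    {"pos": "noun", "prev_is_det": True},
    {"pos": "noun", "prev_is_det": True},
    {"pos": "verb", "prev_is_det": False},
    {"pos": "verb", "prev_is_det": False},
    {"pos": "noun", "prev_is_det": False},
    {"pos": "verb", "prev_is_det": True},
]
labels = ["REC-ord", "REC-ord", "re-CORD", "re-CORD", "REC-ord", "re-CORD"]

# The root split is the feature with the highest gain ratio.
best_feature = max(["pos", "prev_is_det"], key=lambda f: gain_ratio(rows, labels, f))
print(best_feature)
```

In this toy set the part-of-speech feature separates the two pronunciations completely, so it is selected as the root split; a full C4.5 tree would repeat the same selection recursively on each branch and then prune.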

2020 ◽  
Vol 10 (19) ◽  
pp. 6882
Author(s):  
Kostadin Mishev ◽  
Aleksandra Karovska Ristovska ◽  
Dimitar Trajanov ◽  
Tome Eftimov ◽  
Monika Simjanoska

This paper presents MAKEDONKA, the first open-source Macedonian language synthesizer based on the deep learning approach. The paper provides an overview of the numerous attempts to achieve human-like, reproducible speech, which have unfortunately proven unsuccessful because the work was not visible and lacked examples of integration with real software tools. Recent advances in machine learning, in particular deep learning-based methodologies, provide novel methods for feature engineering that allow smooth transitions in the synthesized speech, making it sound natural and human-like. This paper presents a methodology for end-to-end speech synthesis based on Deep Voice 3, a fully-convolutional sequence-to-sequence acoustic model with a position-augmented attention mechanism. Our model synthesizes Macedonian speech directly from characters. We created a dataset that contains approximately 20 h of speech from a native Macedonian female speaker, and we use it to train the text-to-speech (TTS) model. The achieved mean opinion score (MOS) of 3.93 makes our model appropriate for any kind of software that needs a text-to-speech service in the Macedonian language. Our TTS platform is publicly available for use and ready for integration.
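
The MOS figure quoted above is a mean opinion score: the average of listener ratings on a 1 to 5 scale. As a rough sketch of how such a score is typically computed, the function below returns the mean together with a normal-approximation 95% interval half-width. The function name and the ratings are hypothetical and are not the paper's data; the paper's exact evaluation protocol is not described in the abstract.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, 95% half-width) for listener ratings on a 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    mos = statistics.mean(ratings)
    if len(ratings) < 2:
        return mos, 0.0
    # Normal approximation: 1.96 standard errors on either side of the mean.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings from eight listeners for one synthesized utterance.
example_ratings = [4, 4, 3, 5, 4, 4, 5, 3]
mos, half_width = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```

Reporting the interval alongside the mean helps a reader judge whether two systems with similar scores actually differ.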


2020 ◽  
Author(s):  
Alan Mejia Maza ◽  
Seth Jarvis ◽  
Weaverly Colleen Lee ◽  
Thomas J. Cunningham ◽  
Giampietro Schiavo ◽  
...  

Abstract: The neuromuscular junction (NMJ) is the peripheral synapse formed between a motor neuron axon terminal and a muscle fibre. NMJs are thought to be the primary site of peripheral pathology in many neuromuscular diseases, but innervation/denervation status is often assessed qualitatively with poor systematic criteria across studies, and separately from 3D morphological structure. Here, we describe the development of ‘NMJ-Analyser’, to comprehensively screen the morphology of NMJs and their corresponding innervation status automatically. NMJ-Analyser generates 29 biologically relevant features to quantitatively define healthy and aberrant neuromuscular synapses and applies machine learning to diagnose NMJ degeneration. We validated this framework in longitudinal analyses of wildtype mice, as well as in four different neuromuscular disease models: three for amyotrophic lateral sclerosis (ALS) and one for peripheral neuropathy. We showed that structural changes at the NMJ initially occur in the nerve terminal of mutant TDP43 and FUS ALS models. Using a machine learning algorithm, healthy and aberrant neuromuscular synapses are identified with 95% accuracy, with 88% sensitivity and 97% specificity. Our results validate NMJ-Analyser as a robust platform for systematic and structural screening of NMJs, and pave the way for transferrable, and cross-comparison and high-throughput studies in neuromuscular diseases.
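
The accuracy, sensitivity and specificity quoted above are all derived from a binary confusion matrix. The sketch below shows the standard definitions, treating an aberrant synapse as the positive class. The counts are hypothetical, chosen only so the arithmetic lands near the reported percentages; they are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        # Fraction of all synapses classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        # Fraction of truly aberrant synapses that were flagged as aberrant.
        "sensitivity": tp / (tp + fn),
        # Fraction of truly healthy synapses that were recognised as healthy.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts: 25 aberrant and 75 healthy synapses.
metrics = binary_metrics(tp=22, fn=3, tn=73, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With an imbalanced sample like this one, overall accuracy sits closer to the specificity than to the sensitivity, which is why the three figures are reported separately.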


2020 ◽  
pp. 1-12
Author(s):  
Ao Qi ◽  
Liu Narengerile

At present, recognition methods based on character segmentation are not effective at recognizing English text, and the traditional methods rely on the structural features and statistical characteristics of strokes. To improve the recognition of English text from a machine learning perspective, this study introduces multiple features to compensate for the lack of information caused by the small Chinese data set. Moreover, this study decomposes the character recognition problem into a text matching problem between question and answer and a textual entailment problem between answer and standard answer, and continues training on a short-text scoring data set. The final result shows some improvement, which demonstrates the usability of the mechanism designed in this paper. To study the performance of the proposed model, it is compared with a neural network recognition model in terms of recognition accuracy and recognition speed. The results show that the proposed algorithm is effective to some degree.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Alan Mejia Maza ◽  
Seth Jarvis ◽  
Weaverly Colleen Lee ◽  
Thomas J. Cunningham ◽  
Giampietro Schiavo ◽  
...  

Abstract: The neuromuscular junction (NMJ) is the peripheral synapse formed between a motor neuron axon terminal and a muscle fibre. NMJs are thought to be the primary site of peripheral pathology in many neuromuscular diseases, but innervation/denervation status is often assessed qualitatively with poor systematic criteria across studies, and separately from 3D morphological structure. Here, we describe the development of ‘NMJ-Analyser’, to comprehensively screen the morphology of NMJs and their corresponding innervation status automatically. NMJ-Analyser generates 29 biologically relevant features to quantitatively define healthy and aberrant neuromuscular synapses and applies machine learning to diagnose NMJ degeneration. We validated this framework in longitudinal analyses of wildtype mice, as well as in four different neuromuscular disease models: three for amyotrophic lateral sclerosis (ALS) and one for peripheral neuropathy. We showed that structural changes at the NMJ initially occur in the nerve terminal of mutant TDP43 and FUS ALS models. Using a machine learning algorithm, healthy and aberrant neuromuscular synapses are identified with 95% accuracy, with 88% sensitivity and 97% specificity. Our results validate NMJ-Analyser as a robust platform for systematic and structural screening of NMJs, and pave the way for transferrable, and cross-comparison and high-throughput studies in neuromuscular diseases.


Author(s):  
Bhushan Hemant Dhimate ◽  
Manjiri Vitthal Khopade ◽  
Avadhoot Yogesh Dhere ◽  
Supriya Dhanaraj Dhumale ◽  
...  

Text-to-speech conversion is one of the applications of machine learning. It is widely used in search engines, standalone applications, web applications, chatbots and Android applications. There is still a need to upgrade text-to-speech systems so that applications become more interactive and user-friendly. A traditional text-to-speech application produces a monotonous voice that carries no emotion and sounds mechanical, so the existing system needs to be improved by embedding emotion in its output. Existing text-to-speech systems cannot be used in storytelling applications, and they do not communicate effectively. Most text-to-speech systems are developed using algorithms such as Support Vector Machine (SVM) and Naïve Bayes. An emotion-based text-to-speech system will help to improve the existing text-to-speech system. Machine learning and deep learning algorithms such as the Recurrent Neural Network can be used to perform sentiment analysis and semantic analysis on the input text. We are going to use a neural network, which is more effective and helps to maintain the relation between the previous word and the next word. The emotion-based text-to-speech system will be able to identify four emotions: ‘happy’, ‘sad’, ‘angry’ and ‘neutral’. It will be beneficial for educational purposes, such as young children listening to stories in storytelling applications, and it will also be useful for visually impaired individuals.


Author(s):  
R. Nirmalan ◽  
M. Javith Hussain Khan ◽  
V. Sounder ◽  
A. Manikkaraja

The evolution of modern computer technology, with its constant stream of new inventions, produces a huge amount of data. The algorithms traditionally used in machine learning might not support big data. Here we discuss and implement a solution to this problem in the context of predicting breast cancer using big data. DNA methylation (DM) and gene expression (GE) are the two types of data used for the prediction of breast cancer. The main objective is to classify each data set separately. To achieve this objective, we used the Apache Spark platform. We applied three classification algorithms: decision tree, random forest and support vector machine (SVM). These three algorithms were used to produce models for breast cancer prediction. An analysis was carried out to find which algorithm produces the better result, with good accuracy and a lower error rate. Additionally, the Weka and Spark platforms are compared to find which performs better when dealing with huge data. The outcome shows that the scalable Support Vector Machine classifier can give better performance than all the other classifiers, and it achieved the lowest error rate with the highest accuracy on the GE data set.


Author(s):  
Chai Wutiwiwatchai ◽  
Ausdang Thangthai ◽  
Ananlada Chotimongkol ◽  
Chatchawarn Hansakunbuntheung ◽  
Nattanun Thatphithakkul
