Persian sentences to phoneme sequences conversion based on recurrent neural networks

2016, Vol 6 (1), pp. 219-225
Author(s): Yasser Mohseni Behbahani, Bagher Babaali, Mussa Turdalyuly

Abstract
Grapheme-to-phoneme conversion is one of the main subsystems of text-to-speech (TTS) systems. Converting a sequence of written words into its corresponding phoneme sequence is more challenging for Persian than for other languages, because the standard orthography of Persian omits the short vowels and the pronunciation of a word depends on its position in the sentence. Common approaches used in commercial Persian TTS systems rely on several modules and complicated models for natural language processing and homograph disambiguation, which makes implementation harder and reduces the overall precision of the system. In this paper we define grapheme-to-phoneme conversion as a sequence labeling problem and use modified recurrent neural networks (RNNs) to create a compact, integrated model for this purpose. The recurrent networks are made bidirectional and equipped with Long Short-Term Memory (LSTM) blocks to capture as much past and future contextual information as possible for decision making. The experiments conducted in this paper show that, in addition to having a unified structure, the bidirectional RNN-LSTM recognizes the pronunciation of Persian sentences with a precision of more than 98 percent.
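To make the sequence-labeling framing concrete, here is a minimal sketch of a bidirectional LSTM that emits one phoneme label per grapheme position. This is not the authors' code: the framework (PyTorch), the vocabulary sizes, and all dimensions are illustrative assumptions.

import torch
import torch.nn as nn

class BiLSTMG2P(nn.Module):
    """Sequence labeler: one phoneme label per input grapheme position."""
    def __init__(self, n_graphemes, n_phonemes, emb_dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_graphemes, emb_dim)
        # A bidirectional LSTM gives every position access to both past
        # and future context, as the abstract describes.
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_phonemes)

    def forward(self, grapheme_ids):
        x = self.embed(grapheme_ids)   # (batch, seq, emb_dim)
        h, _ = self.lstm(x)            # (batch, seq, 2 * hidden)
        return self.out(h)             # (batch, seq, n_phonemes) logits

# Toy usage: a batch of 2 sentences, 10 grapheme ids each, with random
# phoneme targets standing in for a real labeled Persian corpus.
model = BiLSTMG2P(n_graphemes=40, n_phonemes=30)
batch = torch.randint(0, 40, (2, 10))
logits = model(batch)                  # (2, 10, 30)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 30),
                             torch.randint(0, 30, (20,)))
loss.backward()

Because short vowels are absent from the orthography, the per-position label set would include the vowels to be restored; the per-position cross-entropy objective used here is one standard way to train such a labeler.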

2020, Vol 11 (1)
Author(s): Sun-Ting Tsai, En-Jui Kuo, Pratyush Tiwary

Abstract
Recurrent neural networks have led to breakthroughs in natural language processing and speech recognition. Here we show that recurrent networks, specifically long short-term memory (LSTM) networks, can also capture the temporal evolution of chemical/biophysical trajectories. Our character-level language model learns a probabilistic model of 1-dimensional stochastic trajectories generated from higher-dimensional dynamics. The model captures Boltzmann statistics and also reproduces kinetics across a spectrum of timescales. We demonstrate that training the long short-term memory network is equivalent to learning a path entropy, and that its embedding layer, instead of representing contextual meaning of characters, here exhibits a nontrivial connectivity between different metastable states in the underlying physical system. We demonstrate our model's reliability on several benchmark systems and on a force spectroscopy trajectory for a multi-state riboswitch. We anticipate that our work represents a stepping stone toward understanding and using recurrent neural networks to study the dynamics of complex stochastic molecular systems.
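To illustrate the character-level language-model setup, the sketch below treats each discretized metastable state of the 1-D trajectory as a "character" and trains an LSTM to predict the next state. Again this is an illustrative assumption, not the authors' implementation: the framework (PyTorch), the state count, and the dimensions are all placeholders.

import torch
import torch.nn as nn

class TrajectoryLM(nn.Module):
    """Character-level LM over a discretized 1-D stochastic trajectory."""
    def __init__(self, n_states, emb_dim=16, hidden=64):
        super().__init__()
        # Per the abstract, this embedding layer ends up reflecting the
        # connectivity between metastable states rather than any
        # linguistic "meaning" of the characters.
        self.embed = nn.Embedding(n_states, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_states)

    def forward(self, state_ids, hc=None):
        x = self.embed(state_ids)
        h, hc = self.lstm(x, hc)
        return self.out(h), hc         # next-state logits at every step

# Toy usage: one discretized trajectory of 100 steps over 5 states;
# the model is trained to predict state t+1 from states 1..t.
model = TrajectoryLM(n_states=5)
traj = torch.randint(0, 5, (1, 100))
logits, _ = model(traj[:, :-1])
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 5),
                             traj[:, 1:].reshape(-1))
loss.backward()

Minimizing this next-state cross-entropy is what the abstract identifies with learning a path entropy: the network's predictive distribution over continuations encodes both the stationary (Boltzmann) statistics and the transition kinetics of the discretized dynamics.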

