A Pronunciation Rule-Based Speech Synthesis Technique for Odia Numerals

Author(s):  
Soumya Priyadarsini Panda ◽  
Ajit Kumar Nayak
2005 ◽  
Vol 40 ◽  
pp. 19-32
Author(s):  
Sascha Fagel

The author presents MASSY, the MODULAR AUDIOVISUAL SPEECH SYNTHESIZER. The system combines two approaches to visual speech synthesis. Two control models are implemented: a data-based di-viseme model and a rule-based dominance model, both of which produce control commands in a parameterized articulation space. Analogously, two visualization methods are implemented: an image-based (video-realistic) face model and a 3D synthetic head. Either face model can be driven by either the data-based or the rule-based articulation model. The high-level visual speech synthesis generates a sequence of control commands for the visible articulation. For every virtual articulator (articulation parameter), the 3D synthetic face model defines a set of displacement vectors for the vertices of the 3D objects of the head. The vertices of the 3D synthetic head are then moved by linear combinations of these displacement vectors to visualize articulation movements. For the image-based video synthesis, a single reference image is deformed to fit the facial properties derived from the control commands. Facial feature points and facial displacements have to be defined for the reference image. The algorithm can also use an image database with appropriately annotated facial properties; an example database was built automatically from video recordings. Both the 3D synthetic face and the image-based face generate visual speech capable of increasing the intelligibility of audible speech. Other well-known image-based audiovisual speech synthesis systems, such as MIKETALK and VIDEO REWRITE, concatenate pre-recorded single images or video sequences, respectively. Parametric talking heads like BALDI control a parametric face with a parametric articulation model. The presented system demonstrates the compatibility of parametric and data-based visual speech synthesis approaches.
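The deformation scheme described above, in which each articulation parameter contributes a per-vertex displacement field and the mesh is moved by a linear combination of those fields, can be sketched as follows. This is an illustrative reconstruction, not MASSY's actual code; the parameter names, shapes, and values are invented for the example.

```python
# Sketch of a linear-combination vertex deformation: for each vertex, sum
# the displacement vectors of all articulation parameters, weighted by the
# current control commands, and add the result to the rest position.
# All names and numbers here are illustrative assumptions.

def deform_mesh(neutral_vertices, displacement_fields, weights):
    """neutral_vertices: list of (x, y, z) rest positions.
    displacement_fields: one list of per-vertex (dx, dy, dz) vectors per
    articulation parameter (e.g. jaw opening, lip protrusion).
    weights: one control command per articulation parameter."""
    deformed = []
    for v, vertex in enumerate(neutral_vertices):
        offset = [0.0, 0.0, 0.0]
        for w, field in zip(weights, displacement_fields):
            for axis in range(3):
                offset[axis] += w * field[v][axis]  # weighted displacement
        deformed.append(tuple(p + o for p, o in zip(vertex, offset)))
    return deformed

# Toy example: two parameters ("jaw opening", "lip protrusion"), one vertex.
neutral = [(0.0, 0.0, 0.0)]
fields = [[(0.0, -1.0, 0.0)],   # "jaw opening" moves the vertex down
          [(0.0, 0.0, 1.0)]]    # "lip protrusion" moves it forward
print(deform_mesh(neutral, fields, [0.5, 0.2]))  # [(0.0, -0.5, 0.2)]
```

Because the model is linear in the control commands, intermediate articulation poses interpolate smoothly as the commands change over time.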


1995 ◽  
Vol 1 (2) ◽  
pp. 191-212 ◽  
Author(s):  
Joan Bachenko ◽  
Eileen Fitzpatrick ◽  
Jeffrey Daugherty

Text-to-speech systems are currently designed to work on complete sentences and paragraphs, thereby allowing front-end processors access to large amounts of linguistic context. Problems with this design arise when applications require text to be synthesized in near real time, as it is being typed. How does the system decide which incoming words should be collected and synthesized as a group when prior and subsequent word groups are unknown? We describe a rule-based parser that uses a three-cell buffer and phrasing rules to identify break points for incoming text. Words up to the break point are synthesized as new text is moved into the buffer; no hierarchical structure is built beyond the lexical level. The parser was developed for use in a system that synthesizes written telecommunications by Deaf and hard-of-hearing people. These are texts written entirely in upper case, with little or no punctuation, and in a nonstandard variety of English (e.g. WHEN DO I WILL CALL BACK YOU). The parser performed well in a three-month field trial involving tens of thousands of texts. Laboratory tests indicate that the parser exhibited a low error rate when compared with a human reader.
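The buffering idea can be sketched as follows. The phrasing rule below (break before a function word once the three-cell buffer is full, otherwise flush the whole buffer) is a toy stand-in invented for illustration; the paper's actual rules are not given here, and the function-word list is assumed.

```python
# Toy sketch of a three-cell lookahead buffer for incremental synthesis:
# words stream in one at a time, and when the buffer fills, a phrasing
# rule decides where to break. Rule and word list are illustrative only.

FUNCTION_WORDS = {"THE", "A", "TO", "OF", "AND", "WHEN", "I"}  # assumed

def phrase_stream(words):
    buffer, phrases = [], []
    for word in words:
        buffer.append(word)
        if len(buffer) == 3:                       # buffer is full
            if buffer[-1] in FUNCTION_WORDS:
                # Break before the function word: flush the first two
                # cells, keep the function word attached to what follows.
                phrases.append(" ".join(buffer[:-1]))
                buffer = buffer[-1:]
            else:
                phrases.append(" ".join(buffer))   # no break found: flush all
                buffer = []
    if buffer:                                     # end of input: flush rest
        phrases.append(" ".join(buffer))
    return phrases

print(phrase_stream("WHEN DO I WILL CALL BACK YOU".split()))
# ['WHEN DO', 'I WILL CALL', 'BACK YOU']
```

Each flushed group can be handed to the synthesizer immediately, so synthesis keeps pace with typing while only three words of lookahead are ever held.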


2021 ◽  
Author(s):  
Shaik Aftaab Zia

Internet load balancing algorithms can be categorised as static or dynamic. Static algorithms such as Round Robin and IP hash are rule-based and do not take dynamic information, such as the load on individual servers, into account. Dynamic algorithms such as Least Connections do take this into account and aim to distribute traffic more optimally, but they require monitors or polling mechanisms to obtain the information. Predictive load balancing algorithms aim to remove this requirement by predicting the load a request will induce on a server rather than measuring it directly. We aim to provide an improved implementation of the algorithm described by Patil et al. [1] and to compare it with a static algorithm, Round Robin, in terms of performance and resource utilisation. The implementation targets a web application that performs text-to-speech synthesis.
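The static/dynamic contrast described above can be sketched minimally: Round Robin ignores server state entirely, while Least Connections needs live connection counts from a monitor. Server names and counts below are invented for illustration and are not from the paper.

```python
import itertools

# Minimal sketch contrasting a static and a dynamic balancer.

class RoundRobin:
    """Static: cycles through servers in a fixed order, ignoring load."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self, connections):
        return next(self._cycle)       # server state is never consulted

class LeastConnections:
    """Dynamic: requires a monitor feeding it live connection counts."""
    def pick(self, connections):
        return min(connections, key=connections.get)

load = {"s1": 12, "s2": 3, "s3": 7}    # live counts from a monitor (assumed)
rr, lc = RoundRobin(list(load)), LeastConnections()
print(rr.pick(load), rr.pick(load))    # s1 s2  (fixed rotation)
print(lc.pick(load))                   # s2     (lowest current load)
```

A predictive balancer would keep the `pick(connections)` interface but estimate the counts from request features instead of polling the servers, which is the monitoring cost the paper aims to remove.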


2018 ◽  
Vol 18 (02) ◽  
pp. 1850010 ◽  
Author(s):  
T. Shreekanth ◽  
M. R. Deeksha ◽  
Karthikeya R. Kaushik

A communication gap exists between the sighted community and visually challenged people because they use different scripts to read and write. Bridging this gap requires a system that automatically converts Braille script to text and speech in the corresponding language. An Optical Braille Recognition (OBR) system converts hand-punched Braille characters into their equivalent natural-language characters, and a Text-to-Speech (TTS) system converts the recognized characters into audible speech using speech synthesis techniques. Existing literature reveals that OBR and TTS systems are well established independently for English, but there is scope for developing them for regional languages. Although Kannada is one of the most widely spoken regional languages in India, minimal work has been done on Kannada OBR and TTS, and no existing system directly converts Braille script to speech; this Kannada Braille-to-text-and-speech system is therefore one of a kind. The acquired image is processed, and feature extraction is performed using the k-means algorithm and heuristics to convert the Braille characters to Kannada script. Concatenation-based speech synthesis with the phoneme as the basic unit converts the recognized Kannada text to speech using the Festival TTS framework. Performance evaluation of the proposed system is done using an independently developed Kannada Braille database, and the results are satisfactory compared with existing methods in the literature.
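The concatenative, phoneme-unit synthesis step can be sketched as below: each phoneme maps to a pre-recorded unit, and an utterance is assembled by splicing units end to end. The unit inventory, phoneme labels, and sample values are invented for illustration; the paper itself uses Kannada units within the Festival framework, which also handles joins and prosody.

```python
# Toy sketch of concatenative synthesis with the phoneme as the basic unit.
# Real units are recorded waveforms; here they are short lists of samples.

UNIT_DB = {           # phoneme -> recorded unit samples (illustrative)
    "k": [0.1, 0.2],
    "a": [0.3, 0.4, 0.5],
    "n": [0.0, -0.1],
}

def synthesize(phonemes):
    """Concatenate the stored unit for each phoneme in sequence."""
    waveform = []
    for p in phonemes:
        waveform.extend(UNIT_DB[p])   # splice the unit onto the utterance
    return waveform

print(synthesize(["k", "a", "n"]))
# [0.1, 0.2, 0.3, 0.4, 0.5, 0.0, -0.1]
```

Phoneme-sized units keep the inventory small for a new language such as Kannada, at the cost of more unit joins than diphone or syllable units would require.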


Author(s):  
Rolf Carlson ◽  
Björn Granström
