Design and implementation of a Luganda text normalization module for a speech synthesis software program

As technology advances and more human work is carried out by machines, systems are needed that can read informal text in a proper, standard way even when its written form does not match standard English words. Informal text types include dates, currencies, abbreviations and acronyms of standard words, measurements, URLs, and phone numbers. This paper focuses on text normalization, the conversion of such informal text into its equivalent standard form, so that a system can produce the equivalent spoken form of non-standard words. Text normalization is a pre-processing step of a natural language processing system. The paper discusses various techniques and methods for converting non-standard words into standard words. Tokens are classified using regular expressions for simple pattern matching, Naïve Bayes classification for number sense disambiguation, and Stochastic Gradient Descent for resolving acronym and class ambiguity. Results and analysis are reported as the system's error rate, which indicates where the system can be improved further.
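The regular-expression stage mentioned in the abstract can be illustrated with a minimal sketch. The class labels, the patterns, and the function name `classify_token` are assumptions made for illustration and are not taken from the paper; a bare number such as "2013" stays ambiguous (year, quantity, code), which is the kind of case the abstract assigns to the statistical classifiers.

```python
import re

# Hypothetical token classes and patterns, for illustration only.
# Order matters: more specific patterns are tried before generic ones.
PATTERNS = [
    ("url", re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)),
    ("date", re.compile(r"^\d{1,2}[/-]\d{1,2}[/-]\d{2,4}$")),
    ("currency", re.compile(r"^[$€£]\d+(\.\d+)?$")),
    ("phone", re.compile(r"^\+?\d{10,13}$")),
    ("number", re.compile(r"^\d+(\.\d+)?$")),
    ("acronym", re.compile(r"^[A-Z]{2,}$")),
]


def classify_token(token):
    """Return the label of the first pattern that matches, else 'word'."""
    for label, pattern in PATTERNS:
        if pattern.match(token):
            return label
    return "word"


for tok in ["12/05/2013", "www.example.com", "$25.50", "2013", "UN", "hello"]:
    print(tok, "->", classify_token(tok))
```

A first-match rule list like this is easy to extend, but it cannot choose between competing readings of the same surface form, so it would only serve as the front end of a fuller system.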

Text Normalization for Telugu Text-to-Speech Synthesis

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽ 10.24297/ijct.v11i2.1176 ◽ 2013 ◽ Vol 11 (2) ◽ pp. 2241-2249

Author(s): Dr. K.V.N. Sunitha ◽ P. Sunitha Devi

Keyword(s): Speech Synthesis ◽ Text Processing ◽ Text To Speech ◽ Speech Technology ◽ Rule Based System ◽ Input Text ◽ Novel Approach ◽ Text To Speech Synthesis ◽ Processing Component ◽ Text Normalization

Most areas of language and speech technology require, directly or indirectly, the handling of unrestricted text, and text-to-speech systems in particular must work on real text. To build a natural-sounding speech synthesis system, it is essential that the text processing component produce an appropriate sequence of phonemic units for an arbitrary input text. Little work has been done on text normalization for the Telugu language. In this paper we discuss our efforts to design a rule-based system for text normalization in the context of building a Telugu text-to-speech system. A novel approach is used in which the input text is tokenized and each token is classified by type. Token sense is disambiguated using the semantic nature of the language, and expansion rules are then applied to obtain the normalized text.
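The tokenize, classify, and expand sequence described in this abstract can be sketched as follows. The abstract does not give the Telugu rules, so the abbreviation table and the digit-by-digit number reading below are English stand-ins chosen for illustration; all names (`ABBREVIATIONS`, `DIGIT_WORDS`, `normalize`) are hypothetical.

```python
import re

# Hypothetical expansion tables; the paper's Telugu rules are not shown
# in the abstract, so English stand-ins are used here.
ABBREVIATIONS = {"Dr.": "Doctor", "km": "kilometres"}
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def expand_token(token):
    """Classify one token by type and apply the matching expansion rule."""
    if token in ABBREVIATIONS:
        # Abbreviation: look up its full form.
        return ABBREVIATIONS[token]
    if re.fullmatch(r"[0-9]+", token):
        # Number: read digit by digit (a simplification of real number rules).
        return " ".join(DIGIT_WORDS[int(d)] for d in token)
    # Ordinary word: pass through unchanged.
    return token


def normalize(text):
    """Tokenize on whitespace, expand each token, and rejoin."""
    return " ".join(expand_token(tok) for tok in text.split())


print(normalize("Dr. Rao walked 12 km"))
```

The sketch omits the sense-disambiguation step entirely; in a rule-based system of the kind described, that step would sit between classification and expansion and decide, for example, whether a digit string is a year or a quantity.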

Die Rolle der Textnormierung bei der Sprachvollsynthese / The Role of Text Normalization in Text-to-Speech Synthesis

it - Information Technology ◽ 10.1524/itit.1989.31.5.342 ◽ 1989 ◽ Vol 31 (5) ◽ Cited By ~ 1

Author(s): D.S. Stall

Keyword(s): Speech Synthesis ◽ Text To Speech ◽ Text To Speech Synthesis ◽ Text Normalization

Design and Implementation of Multi-Protocol Self-Adaptation Gateway

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.687-691.2108 ◽ 2014 ◽ Vol 687-691 ◽ pp. 2108-2111

Author(s): Kai Lin Zhang ◽ Da Hua Li ◽ Shu Chen Shi ◽ Xue Song Yang ◽ Zhen Xiao ◽ ...

Keyword(s): Network Protocols ◽ It Use ◽ Design And Implementation ◽ Hardware Interface ◽ Software Program ◽ Self Adaptation ◽ Profibus Dp

A gateway has been designed to integrate four network protocols that are common in industry: PROFIBUS-DP, DeviceNet, EtherNet/IP and Modbus-RTU. Both the hardware and the software program of the multi-protocol self-adaptation gateway have been developed. It uses the STM32F407ZGT6 as its main chip. A Modbus-RTU interface has been designed that relies on the UART interface of the main chip to send and receive packets in hardware. COMX is used as the hardware interface for PROFIBUS-DP, DeviceNet, and EtherNet/IP to process their packets. The PROFIBUS-DP interface changes its address when the configuration of the master changes. In addition, the multi-protocol self-adaptation gateway can integrate several masters that support different protocols in one network, and the master controllers can communicate with each other in this network. After being tested and verified, the gateway has been used in the communication of DCS systems, where it converts packets of the four protocols mentioned above and meets the design requirements for a gateway.

Text normalization and ambiguity resolution in speech synthesis

The Journal of the Acoustical Society of America ◽ 10.1121/1.408245 ◽ 1993 ◽ Vol 94 (3) ◽ pp. 1841-1841

Author(s): David Yarowsky

Keyword(s): Ambiguity Resolution ◽ Speech Synthesis ◽ Text Normalization
