Design and implementation of a Luganda text normalization module for a speech synthesis software program

2020 ◽  
Vol 111 (4) ◽  
pp. 149-154
Author(s):  
Ronald Kizito ◽  
Wayne S. Okello ◽  
Sulaiman Kagumire
IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 36202-36209
Author(s):  
Lan Huang ◽  
Shunan Zhuang ◽  
Kangping Wang

As technology advances and more human work is carried out by machines, systems are needed that can read informal text in a proper, standard way even when its written form does not match standard English words. Informal text types include dates, currencies, abbreviations and acronyms of standard words, measurements, URLs, and phone numbers. This paper focuses on text normalization, which converts such informal text into its equivalent standard form so that the corresponding speech can be produced for non-standard words. Text normalization is a pre-processing step of a natural language processing system. The paper discusses various techniques and methods for converting non-standard words into standard words. Tokens are classified using regular expressions for simple pattern matching, Naïve Bayes classification for number sense disambiguation, and Stochastic Gradient Descent for resolving acronym and class ambiguity. Results and analysis are reported as the system's error rate, which indicates where the system can be improved further.
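
The regular-expression stage described in this abstract can be illustrated with a minimal sketch. The class names, patterns, and the `classify` function below are assumptions made for illustration, not the paper's actual rules, and the Naïve Bayes and Stochastic Gradient Descent stages are not covered.

```python
import re

# Hypothetical token classes and patterns, chosen only for illustration.
# Order matters: the first pattern that matches the whole token wins.
PATTERNS = [
    ("URL", re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)),
    ("DATE", re.compile(r"\d{1,2}[/-]\d{1,2}[/-]\d{2,4}")),
    ("CURRENCY", re.compile(r"[$€£]\d+(\.\d{1,2})?")),
    ("PHONE", re.compile(r"\d{3}-\d{3}-\d{4}")),
    ("MEASURE", re.compile(r"\d+(\.\d+)?(km|kg|cm|mm|m|g)")),
    ("ACRONYM", re.compile(r"[A-Z]{2,}")),
    ("NUMBER", re.compile(r"\d+")),
]


def classify(token: str) -> str:
    """Return the label of the first pattern matching the whole token."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return label
    # Anything unmatched is treated as an ordinary word.
    return "WORD"


if __name__ == "__main__":
    for tok in "Meet on 12/05/2020 at www.example.com with $45.50".split():
        print(tok, classify(tok))
```

Pattern matching of this kind only handles unambiguous surface forms; a bare number such as a year versus a quantity would still need a statistical disambiguation step like the one the abstract mentions.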


2013 ◽  
Vol 11 (2) ◽  
pp. 2241-2249
Author(s):  
Dr. K.V.N. Sunitha ◽  
P.Sunitha Devi

Most areas of language and speech technology require, directly or indirectly, the handling of unrestricted text, and text-to-speech systems must work directly on real text. To build a natural-sounding speech synthesis system, it is essential that the text processing component produce an appropriate sequence of phonemic units for arbitrary input text. A novel approach is used in which the input text is tokenized and each token is classified by type. Token sense disambiguation relies on the semantic nature of the language, after which expansion rules are applied to obtain the normalized text. Little work, however, has been done on text normalization for Telugu. In this paper we discuss our efforts to design a rule-based system for text normalization in the context of building a Telugu text-to-speech system.
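
The tokenize, classify, and expand pipeline outlined in this abstract might look roughly like the sketch below. It uses English words as a stand-in because the paper's Telugu rules are not given here; the lookup tables and the `number_to_words` and `normalize` functions are hypothetical, and the disambiguation step is omitted.

```python
import re

# Hypothetical English expansion tables standing in for language-specific rules.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
ABBREVIATIONS = {"Dr.": "Doctor", "km": "kilometres"}


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999."""
    if not 0 <= n < 1000:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def normalize(text: str) -> str:
    """Tokenize on whitespace, classify each token, apply an expansion rule."""
    out = []
    for token in text.split():
        if re.fullmatch(r"[0-9]{1,3}", token):   # token type: small number
            out.append(number_to_words(int(token)))
        elif token in ABBREVIATIONS:             # token type: abbreviation
            out.append(ABBREVIATIONS[token])
        else:                                    # ordinary word: unchanged
            out.append(token)
    return " ".join(out)


if __name__ == "__main__":
    print(normalize("Dr. Rao walked 42 km"))
```

A rule-based design like this keeps each token type's expansion in one place, so supporting another type means adding one classification branch and one rule.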


2014 ◽  
Vol 687-691 ◽  
pp. 2108-2111
Author(s):  
Kai Lin Zhang ◽  
Da Hua Li ◽  
Shu Chen Shi ◽  
Xue Song Yang ◽  
Zhen Xiao ◽  
...  

A gateway has been designed that integrates four network protocols common in industry: PROFIBUS-DP, DeviceNet, EtherNet/IP and Modbus-RTU. Both the hardware and the software of this multi-protocol self-adaptation gateway have been developed. It uses the STM32F407ZGT6 as its main chip. The Modbus-RTU interface relies on the UART interface of the main chip to send and receive packets in hardware. The gateway uses COMX as the hardware interface for PROFIBUS-DP, DeviceNet and EtherNet/IP to process their packets. The PROFIBUS-DP interface changes its address to follow changes in the master's configuration. In addition, the gateway can integrate several masters that support different protocols into one network, in which the master controllers can communicate with each other. After testing and verification, it has been used for communication in DCS systems, where it converts packets among the four protocols and meets the design requirements for a gateway.
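
As background for the Modbus-RTU side of such a gateway, the sketch below shows how a Modbus-RTU frame is typically built and checked in software: address, function code, data, then a CRC-16 sent low byte first. It is a host-side Python illustration, not the paper's STM32 firmware, and the function names are hypothetical.

```python
def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def build_frame(address: int, function: int, payload: bytes) -> bytes:
    """Assemble address + function + payload and append the CRC, low byte first."""
    body = bytes([address, function]) + payload
    crc = crc16_modbus(body)
    return body + bytes([crc & 0xFF, crc >> 8])


def is_valid_frame(frame: bytes) -> bool:
    """A frame with a correct trailing CRC yields a CRC of zero over all bytes."""
    return len(frame) >= 4 and crc16_modbus(frame) == 0


if __name__ == "__main__":
    # Read holding registers (function 0x03) from slave 1: start 0, count 10.
    request = build_frame(0x01, 0x03, bytes([0x00, 0x00, 0x00, 0x0A]))
    print(request.hex(" "), is_valid_frame(request))
```

A gateway would run a check like `is_valid_frame` on every frame received over the UART before translating its contents to another protocol.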

