Mask Region-Based Convolutional Neural Networks (R-CNN) for Sinhala Sign Language to Text Conversion

2021 ◽  
Author(s):  
R. D. Rusiru Sewwantha ◽  
T. N. D. S. Ginige

Sign language is a system of gestures and symbols used for communication, primarily by people whose speech or hearing impairments make spoken communication difficult. Because most hearing speakers do not know sign language, a communication gap exists between sign language users and spoken language users. Sign language also differs from country to country: American Sign Language is the most widely used, while Sri Lanka uses Sri Lankan/Sinhala sign language. In this research, the authors propose a mobile solution that uses a region-based convolutional neural network (Mask R-CNN) for object detection to narrow this gap by identifying Sinhala sign language gestures and interpreting them as Sinhala text using Natural Language Processing (NLP). The trained model identifies and interprets still gesture signs in real time, treating sign recognition as an object detection problem.
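The abstract gives no implementation details, but torchvision ships a standard Mask R-CNN that illustrates the detection step the authors describe. The following is a minimal sketch only, with a hypothetical label set and checkpoint path; nothing in it is taken from the paper.

    # A minimal sketch (not the authors' code) of applying a Mask R-CNN
    # detector to still sign gestures. Labels and checkpoint are hypothetical.
    import torch
    from torchvision.models.detection import maskrcnn_resnet50_fpn

    SIGN_CLASSES = ["<background>", "ayubowan", "sthuthiyi"]  # hypothetical sign labels

    model = maskrcnn_resnet50_fpn(num_classes=len(SIGN_CLASSES))
    model.load_state_dict(torch.load("sinhala_signs_maskrcnn.pt"))  # hypothetical checkpoint
    model.eval()

    def detect_signs(frame_tensor, threshold=0.7):
        """Return (label, score) pairs for signs detected in one video frame.

        frame_tensor: a 3xHxW float tensor with values in [0, 1].
        """
        with torch.no_grad():
            output = model([frame_tensor])[0]  # dict with boxes, labels, scores, masks
        return [
            (SIGN_CLASSES[label], score.item())
            for label, score in zip(output["labels"], output["scores"])
            if score >= threshold
        ]

In a real-time mobile pipeline, a function like detect_signs would run per captured frame, with the highest-scoring label passed on to the NLP stage that assembles Sinhala text.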

2014 ◽  
Vol 26 (3) ◽  
pp. 1015-1026 ◽  
Author(s):  
Naja Ferjan Ramirez ◽  
Matthew K. Leonard ◽  
Tristan S. Davenport ◽  
Christina Torres ◽  
Eric Halgren ◽  
...  

10.1038/nn775 ◽  
2001 ◽  
Vol 5 (1) ◽  
pp. 76-80 ◽  
Author(s):  
Aaron J. Newman ◽  
Daphne Bavelier ◽  
David Corina ◽  
Peter Jezzard ◽  
Helen J. Neville

1999 ◽  
Vol 42 (3) ◽  
pp. 568-582 ◽  
Author(s):  
Susan D. Fischer ◽  
Lorraine A. Delhorne ◽  
Charlotte M. Reed

Previous research on the visual reception of fingerspelled English suggests that communication rates are limited primarily by constraints on production. Studies of artificially accelerated fingerspelling indicate that reception of fingerspelled sentences is highly accurate at rates up to 2 to 3 times those that can be produced naturally. The current paper reports the results of a comparable study of the reception of American Sign Language (ASL). Fourteen deaf native ASL signers participated in an experiment in which videotaped productions of isolated ASL signs or ASL sentences were presented at normal playback speed and at 2, 3, 4, and 6 times normal speed. For isolated signs, identification scores decreased from 95% correct to 46% correct across the range of rates tested; for sentences, the ability to identify key signs decreased from 88% to 19% over the same range. The results indicate a breakdown in processing at around 2.5 to 3 times the normal rate, evidenced both by a substantial drop in intelligibility in this region and by a shift in error patterns away from semantic errors and toward formational ones. These results parallel those obtained in previous studies of the auditory reception of time-compressed speech and the visual reception of accelerated fingerspelling. Taken together, they suggest a modality-independent upper limit to language processing.


2020 ◽  
Vol 12 (1) ◽  
pp. 138-163
Author(s):  
Benjamin Anible

Reaction times are reported for a translation recognition study in which novice to expert English–ASL bilinguals rejected English translation distractors for ASL signs that were related to the correct translations through phonology, semantics, or both form and meaning (diagrammatic iconicity). Imageability ratings of concepts affected performance in all conditions: when imageability was high, participants showed interference for phonologically related distractors, and when imageability was low, participants showed interference for semantically related distractors, regardless of proficiency. For diagrammatically related distractors, high imageability caused interference in experts, but low imageability caused interference in novices. These patterns suggest that imageability and diagrammaticity interact with proficiency: experts process diagrammatically related distractors phonologically, while novices process them semantically. This implies that motivated signs depend on the entrenchment of language systematicity; rather than their impact on language processing decreasing as proficiency grows, they build on the original benefit conferred by iconic mappings.


2008 ◽  
Vol 02 (01) ◽  
pp. 21-45 ◽  
Author(s):  
Matt Huenerfauth

Software to generate animations of American Sign Language (ASL) has important accessibility benefits for the significant number of deaf adults with low levels of written language literacy. We have implemented a prototype software system to generate an important subset of ASL phenomena called "classifier predicates," complex and spatially descriptive types of sentences. The output of this prototype system has been evaluated by native ASL signers. Our generator includes several novel models of 3D space, spatial semantics, and temporal coordination motivated by linguistic properties of ASL. These classifier predicates have several similarities to iconic gestures that often co-occur with spoken language; these two phenomena will be compared. This article explores implications of the design of our system for research in multimodal gesture generation systems. A conceptual model of multimodal communication signals is introduced to show how computational linguistic research on ASL relates to the field of multimodal natural language processing.
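The article describes its models of 3D space and spatial semantics conceptually rather than as code. Purely as a rough illustration of what a classifier predicate plan might look like as a data structure, the sketch below invents its own field names and scene; only the "3" handshape, a standard ASL vehicle classifier, comes from ASL linguistics rather than from this paper.

    # An invented, illustrative data structure for a classifier predicate plan;
    # not the article's system.
    from dataclasses import dataclass, field

    @dataclass
    class ClassifierPredicate:
        """One classifier predicate: a handshape moving along a 3D path."""
        handshape: str                               # e.g. "3" for vehicles in ASL
        path: list = field(default_factory=list)     # (x, y, z) points in signing space
        orientation: tuple = (0.0, 0.0, 0.0)         # palm orientation as Euler angles

    # "The car drove past the house": a vehicle classifier traces a path
    # in front of a location in signing space associated with the house.
    car_passes_house = ClassifierPredicate(
        handshape="3",
        path=[(-0.4, 0.0, 0.2), (0.0, 0.0, 0.25), (0.4, 0.0, 0.2)],
    )

An animation backend would then interpolate the hand along the path while holding the handshape, which is the kind of temporal coordination the article's generator models.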


A recent surge of interest in creating translation systems that include sign languages has been driven not only by rapid developments in machine translation but also by increased awareness of the difficulty many in the deaf community have comprehending written English. This paper describes SILANT (SIgn LANguage Translator), a machine translation system that converts English to American Sign Language (ASL) using the principles of Natural Language Processing (NLP) and deep learning. The translation of English text is based on transformational rules that generate an intermediate representation, which in turn drives the appropriate ASL animations. Although rule-based translation of this kind is typically accurate but narrow in coverage, in this system we broaden the scope of the translation using a synonym network and a paraphrasing module that implement deep learning algorithms. In doing so, we achieve both the accuracy of a rule-based approach and the scale of a deep learning one.
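SILANT's source is not shown in the abstract. The following is a minimal sketch, under the assumption that the rule-based stage maps English words to ASL glosses and falls back on a synonym lookup to broaden coverage; the lexicon, synonym table, and the wh-word reordering rule are toy stand-ins, not the system's actual rules.

    # Toy sketch of a rule-based English -> ASL gloss stage with a synonym
    # fallback; all tables and rules here are invented for illustration.
    GLOSS_LEXICON = {"hello": "HELLO", "name": "NAME", "what": "WHAT"}
    SYNONYMS = {"hi": "hello", "title": "name"}  # stand-in for the synonym network

    def english_to_gloss(sentence: str) -> list:
        glosses = []
        for word in sentence.lower().strip("?.!").split():
            word = SYNONYMS.get(word, word)      # map unseen words onto known synonyms
            if word in GLOSS_LEXICON:
                glosses.append(GLOSS_LEXICON[word])
        # ASL commonly places wh-words clause-finally; move "WHAT" to the end.
        glosses.sort(key=lambda g: g == "WHAT")
        return glosses

    print(english_to_gloss("What is your name?"))  # ['NAME', 'WHAT'] (toy output)

The resulting gloss sequence is the kind of intermediate representation the abstract describes, which a downstream module would map to ASL animations.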


2011 ◽  
Author(s):  
M. Leonard ◽  
N. Ferjan Ramirez ◽  
C. Torres ◽  
M. Hatrak ◽  
R. Mayberry ◽  
...  

2018 ◽  
Author(s):  
Leslie Pertz ◽  
Missy Plegue ◽  
Kathleen Diehl ◽  
Philip Zazove ◽  
Michael McKee
