Smart Approach to Optical Character Recognition and Ubiquitous Speech Synthesis Using Real-Time Deep Learning Algorithms

Author(s):  
Bhargav Goradiya ◽  
Yagnik Mehta ◽  
Nisarg Patel ◽  
Neel Macwan ◽  
Vatsal Shah
2021 ◽  
Vol 4 ◽  
Author(s):  
Logan Froese ◽  
Joshua Dian ◽  
Carleen Batson ◽  
Alwyn Gomez ◽  
Amanjyot Singh Sainbhi ◽  
...  

Introduction: As real-time data processing is integrated into medical care for traumatic brain injury (TBI) patients, devices are increasingly required to provide digital output. However, many devices still lack the hardware to export real-time data in an acceptable digital format or in a continuously updating manner. This is particularly the case for many intravenous pumps and older technological systems. Accurate, digital, real-time data integration within TBI care and other fields is critical as we move towards digitizing healthcare information and integrating clinical data streams to improve bedside care. We propose to address this gap in technology by building a system that applies Optical Character Recognition (OCR) through computer vision to real-time images of a pump monitor, extracting the desired real-time information.

Methods: Using freely available software and readily available technology, we built a script that extracts real-time images from a medication pump and processes them with OCR to convert each image into digital text. This text was then transferred to the ICM+ real-time monitoring software in parallel with other retrieved physiological data.

Results: The prototype works effectively for our device, and its source code is openly available to interested end-users. However, future work is required for a more universal application of such a system.

Conclusion: Advances here can improve medical information collection in the clinical environment, eliminating human error in bedside charting, and aid data integration for biomedical research, where many complex data sets can be seamlessly integrated digitally. Our design demonstrates a simple adaptation of current technology to help with this integration.
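To make the described capture, OCR, and export pipeline concrete, a minimal sketch follows, assuming a frame-grabber video source and Tesseract as the OCR engine. The capture index, region-of-interest coordinates, character whitelist, and CSV output are illustrative assumptions rather than the authors' exact setup, which forwards the text stream to ICM+.

```python
# Minimal sketch of the capture -> OCR -> export loop described above.
# Assumes a capture device exposing the pump monitor as a video source;
# the ROI coordinates and output path are hypothetical.
import time
import cv2                      # OpenCV for frame capture and preprocessing
import pytesseract              # wrapper around the Tesseract OCR engine

CAPTURE_INDEX = 0               # hypothetical: index of the frame-grabber device
ROI = (100, 50, 400, 120)       # hypothetical: x, y, width, height of the readout

def read_pump_value(frame):
    """Crop the numeric readout, binarize it, and OCR it to text."""
    x, y, w, h = ROI
    crop = frame[y:y + h, x:x + w]
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding helps Tesseract on high-contrast monitor displays
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Treat the crop as a single text line, restricted to digits and a decimal point
    return pytesseract.image_to_string(
        binary, config="--psm 7 -c tessedit_char_whitelist=0123456789."
    ).strip()

cap = cv2.VideoCapture(CAPTURE_INDEX)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    value = read_pump_value(frame)
    if value:
        # In the authors' setup the text is forwarded to ICM+; here we
        # simply append timestamped values to a CSV file for illustration.
        with open("pump_readings.csv", "a") as f:
            f.write(f"{time.time()},{value}\n")
    time.sleep(1.0)             # sample once per second
```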


In this paper we introduce a new Pashtu numerals dataset of handwritten scanned images, which we make publicly available for scientific and research use. The Pashtu language is used by more than fifty million people for both oral and written communication, yet no effort has so far been devoted to an Optical Character Recognition (OCR) system for Pashtu. We introduce a new method for handwritten Pashtu numeral recognition based on deep learning models, using convolutional neural networks (CNNs) for both feature extraction and classification. We assess the performance of the proposed CNN-based model and obtain a recognition accuracy of 91.45%.
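A minimal sketch of such a CNN, with convolutional layers serving as the feature extractor and dense layers as the classifier, is given below in Keras. The 32x32 grayscale input, layer sizes, and training settings are assumptions; the abstract does not specify the architecture.

```python
# Minimal sketch of a CNN classifier for the 10 handwritten Pashtu numeral
# classes, in the spirit described above. Architecture details are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_numeral_cnn(input_shape=(32, 32, 1), num_classes=10):
    model = models.Sequential([
        # Convolutional blocks act as the feature extractor ...
        layers.Conv2D(32, 3, activation="relu", input_shape=input_shape),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        # ... and dense layers act as the classifier.
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage with hypothetical (x_train, y_train) arrays of grayscale scans:
# model = build_numeral_cnn()
# model.fit(x_train, y_train, epochs=20, validation_split=0.1)
```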


2020 ◽  
Vol 13 (1) ◽  
pp. 1-17
Author(s):  
Traian Rebedea ◽  
Vlad Florea

This paper proposes a deep learning solution for optical character recognition, specifically tuned to detect expiration dates printed on the packaging of food items. The method can help reduce food waste, has a significant potential impact on the design of smart refrigerators, and, combined with a speech synthesis engine, can prove especially useful for persons with vision difficulties. The main problem in designing an efficient solution for expiry date recognition is the lack of a dataset large enough to train deep neural networks. To tackle this issue, we propose to use an additional dataset composed of synthetically generated images. Both the synthetic and real image datasets are detailed in the paper, and we show that the proposed method offers a 9.4% accuracy improvement over using real images alone.
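A minimal sketch of the synthetic-image idea follows, rendering random date strings onto plain backgrounds with Pillow. The date formats, font file, and noise-free rendering are illustrative assumptions; the paper's actual generation pipeline is presumably more elaborate.

```python
# Minimal sketch of synthetic training-image generation for expiry dates,
# as motivated above. Formats, font, and image geometry are assumptions.
import random
from datetime import date, timedelta
from PIL import Image, ImageDraw, ImageFont

FORMATS = ["%d.%m.%Y", "%d/%m/%y", "%b %d %Y"]   # hypothetical date formats

def random_date_string():
    d = date(2020, 1, 1) + timedelta(days=random.randint(0, 1500))
    return d.strftime(random.choice(FORMATS))

def synth_sample(font_path="DejaVuSans.ttf"):    # hypothetical font file
    text = random_date_string()
    # Light random background and dark random foreground mimic print contrast
    img = Image.new("L", (200, 48), color=random.randint(180, 255))
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(font_path, size=random.randint(18, 28))
    draw.text((8, 8), text, fill=random.randint(0, 60), font=font)
    return img, text          # image plus its ground-truth transcription

# Generating a small batch of labeled samples:
# samples = [synth_sample() for _ in range(1000)]
```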


Author(s):  
Oyeniran Oluwashina Akinloye ◽  
Oyebode Ebenezer Olukunle

Numerous works have been proposed and implemented for the computerization of various human languages; nevertheless, only minuscule effort has been made to put handwritten Yorùbá characters on the map of Optical Character Recognition. This study presents a novel technique for developing a Yorùbá alphabet recognition system using deep learning. The model was implemented in the Matlab R2018a environment using the developed framework, with 10,500 dataset samples used for training and 2,100 for testing. Training was conducted over 30 epochs at 164 iterations per epoch, for a total of 4,920 iterations, and the training period was estimated at 11,296 minutes 41 seconds. The model yielded a network accuracy of 100%, while the test-set accuracy was 97.97%, with an F1 score of 0.9800, precision of 0.9803, and recall of 0.9797.
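The reported metrics are internally consistent: the F1 score is the harmonic mean of precision and recall, as the short check below shows. The scikit-learn call at the end assumes hypothetical y_true and y_pred label arrays.

```python
# Checking that the abstract's F1 score follows from its precision and recall.
from sklearn.metrics import precision_recall_fscore_support

precision, recall = 0.9803, 0.9797     # values reported in the abstract
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.4f}")                # -> 0.9800, matching the reported score

# On actual predictions, macro-averaged over the alphabet classes
# (y_true and y_pred are hypothetical label arrays):
# p, r, f, _ = precision_recall_fscore_support(y_true, y_pred, average="macro")
```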


2016 ◽  
Vol 2 (2) ◽  
pp. 194
Author(s):  
Andria Wahyudi ◽  
Andre Sumual ◽  
Jorgie Sumual

This study discusses combining several technologies to design a language translation application using Augmented Reality (AR) technology on smartphones running the Android operating system. The main objective of the research is to apply AR to translation between the Tombulu and Indonesian languages using the Vuforia SDK. Vuforia is used to display text in real time, with Optical Character Recognition (OCR) technology as a built-in feature used to perform text detection. Once the application was completed, its detection capability was tested. The tests covered detection of handwriting, colored text, different typefaces, typefaces containing symbols, and words containing spaces. Manual testing was also performed by typing text directly into the smartphone. The result obtained is the maximum limit of the text-detection capability under the previously defined tests.

Keywords: Augmented Reality, Translation, Vuforia SDK, OCR
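A minimal sketch of the dictionary-lookup translation step that would follow OCR text detection is shown below; the word list is a hypothetical placeholder, and the Vuforia-specific detection code is omitted since it belongs to the SDK.

```python
# Minimal sketch of the word-level translation step that follows text
# detection in the app described above. The lexicon entries are hypothetical
# placeholders; the actual Tombulu-Indonesian word list is the authors' own.
TOMBULU_TO_INDONESIAN = {
    # hypothetical entry for illustration only
    "example_word": "contoh",
}

def translate(detected_text: str) -> str:
    """Translate each OCR-detected word, keeping unknown words unchanged."""
    words = detected_text.lower().split()
    return " ".join(TOMBULU_TO_INDONESIAN.get(w, w) for w in words)

# Usage: translate(text_returned_by_the_OCR_feature)
```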

