An Online Cursive Handwritten Medical Words Recognition System for Busy Doctors in Developing Countries for Ensuring Efficient Healthcare Service Delivery

Author(s):  
Shaira Tabassum ◽  
Md Mahmudur Rahman ◽  
Nuren Abedin ◽  
Md Moshiur Rahman ◽  
Mostafa Taufiq Ahmed ◽  
...  

Abstract: Doctors in developing countries are too busy to write digital prescriptions. Ninety-seven percent of Bangladeshi doctors write handwritten prescriptions, the majority of which lack legibility, and the prescriptions are even harder to read because they mix multiple languages. This paper proposes a machine learning approach to recognizing doctors' handwriting in order to create digital prescriptions. A 'Handwritten Medical Term Corpus' dataset is developed containing 17,431 samples of 480 medical terms. To improve recognition efficiency, the paper introduces a data augmentation technique based on pattern Rotating, Shifting and Stretching (RSS) that widens the variety and increases the sample size. A sequence of line data is extracted from the 1,591,100 augmented image samples and fed to a Bidirectional LSTM. Eight different combinations are applied to evaluate the strength of the proposed method. The results show 93.0% average accuracy (max: 94.5%, min: 92.1%) using the Bidirectional LSTM with RSS data augmentation, which is 19.6% higher than the recognition result without data expansion. The proposed handwriting recognition technology can be installed in a smartpen for busy doctors, recognizing and digitizing their writing in real time. The smartpen is expected to help reduce medical errors, save medical costs, and ensure healthy living in developing countries.
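To make the RSS idea concrete, below is a minimal sketch of rotate/shift/stretch augmentation applied to an online handwriting sample stored as a sequence of (x, y) pen points. The function names, angle and scale ranges, and the synthetic sample are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rotate(points, angle_deg):
    """Rotate pen points around their centroid by angle_deg degrees."""
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    center = points.mean(axis=0)
    return (points - center) @ rot.T + center

def shift(points, dx, dy):
    """Translate pen points by (dx, dy)."""
    return points + np.array([dx, dy])

def stretch(points, sx, sy):
    """Scale pen points along x and y independently, keeping the centroid fixed."""
    center = points.mean(axis=0)
    return (points - center) * np.array([sx, sy]) + center

def rss_augment(points, rng):
    """Apply one random Rotate/Shift/Stretch combination to a sample."""
    out = rotate(points, rng.uniform(-10, 10))
    out = shift(out, rng.uniform(-5, 5), rng.uniform(-5, 5))
    return stretch(out, rng.uniform(0.9, 1.1), rng.uniform(0.9, 1.1))

# Example: generate a few augmented variants of one word sample.
rng = np.random.default_rng(0)
sample = rng.uniform(0, 100, size=(60, 2))   # 60 pen points of one word
variants = [rss_augment(sample, rng) for _ in range(5)]
```

Each variant keeps the stroke order intact, so the augmented sequences can be fed directly to the sequence model.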

2018 ◽  
Vol 218 ◽  
pp. 02014
Author(s):  
Arief Ramadhani ◽  
Achmad Rizal ◽  
Erwin Susanto

Computer vision is a field of research that can be applied to a variety of subjects. One application of computer vision is the hand gesture recognition system, since hand gestures are one way to interact with computers or machines. In this study, hand gesture recognition was used as a password for an electronic key system. The recognition relied on the depth sensor of the Microsoft Kinect Xbox 360: the depth sensor captured the hand image, which was then segmented using a threshold. By scanning each pixel, we detected the thumb and counted the other open fingers. The hand gesture recognition result was used as a password to unlock the electronic key. The system could recognize nine types of hand gesture, representing the numbers 1 through 9. The average accuracy of the hand gesture recognition system was 97.78% for a single hand sign and 86.5% for a password of three hand signs.
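The sketch below illustrates the depth-threshold segmentation step described above together with a common finger-counting heuristic. The depth range, the use of convexity defects (rather than the paper's pixel-scanning rule), and the synthetic frame are all assumptions for illustration.

```python
import cv2
import numpy as np

def count_open_fingers(depth_mm, near=500, far=800):
    """Segment the hand by a depth range and estimate open fingers from convexity defects."""
    mask = ((depth_mm > near) & (depth_mm < far)).astype(np.uint8) * 255  # depth threshold
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0
    hand = max(contours, key=cv2.contourArea)                # largest blob = hand
    hull = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull)
    if defects is None:
        return 0
    # Deep valleys between extended fingers show up as large convexity defects.
    deep = sum(1 for i in range(defects.shape[0]) if defects[i, 0, 3] / 256.0 > 20)
    return min(deep + 1, 5)

# Usage with a synthetic 16-bit depth frame (real frames come from the Kinect sensor).
frame = np.full((480, 640), 2000, dtype=np.uint16)
print(count_open_fingers(frame))
```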


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5732
Author(s):  
Shih-Wei Sun ◽  
Bao-Yun Liu ◽  
Pao-Chi Chang

We propose a violin bowing action recognition system that can accurately recognize distinct bowing actions in classical violin performance. The system recognizes bowing actions by analyzing signals from a depth camera and from inertial sensors worn by a violinist. The contribution of this study is threefold: (1) a dataset of violin bowing actions was constructed from data captured by a depth camera and multiple inertial sensors; (2) data augmentation was achieved for depth-frame data through rotation in three-dimensional world coordinates and for inertial sensing data through yaw, pitch, and roll angle transformations; and (3) bowing action classifiers were trained on the different modalities, balancing the strengths and weaknesses of each, using deep learning methods with a decision-level fusion process. In experiments, both the large external motions and the subtle local motions produced by violin bow manipulation were accurately recognized by the proposed system (average accuracy > 80%).
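A small sketch of the augmentation idea in contribution (2) follows: 3-D points derived from depth frames and inertial vectors are rotated by the same random yaw/pitch/roll pose. The angle range and the synthetic data are assumptions, not the authors' exact pipeline.

```python
import numpy as np

def rotation_matrix(yaw, pitch, roll):
    """Build a 3x3 rotation matrix from yaw (z), pitch (y) and roll (x) in radians."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    return rz @ ry @ rx

def augment(points_xyz, accel_xyz, rng):
    """Rotate depth-derived 3-D points and inertial vectors by one random pose."""
    r = rotation_matrix(*rng.uniform(-np.pi / 18, np.pi / 18, size=3))  # roughly +/-10 degrees
    return points_xyz @ r.T, accel_xyz @ r.T

rng = np.random.default_rng(1)
cloud = rng.normal(size=(1000, 3))   # 3-D points from one depth frame
imu = rng.normal(size=(200, 3))      # accelerometer samples from one bow stroke
aug_cloud, aug_imu = augment(cloud, imu, rng)
```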


2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. This paper contributes in three main aspects: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step includes pruning extra white space and de-skewing skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). MDLSTM has the advantage of scanning Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflections. Combining data augmentation with the deep learning approach yields a clear and promising improvement, raising Character Recognition (CR) from the 75.08% baseline to 80.02%.
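Below is a compact sketch of the recurrent + CTC recipe described above. PyTorch has no built-in multi-dimensional LSTM, so this illustrative stand-in uses a bidirectional LSTM over column features of a text-line image with CTC loss; the layer sizes, the 40-pixel feature height, and the 120-character alphabet are assumptions.

```python
import torch
import torch.nn as nn

class LineRecognizer(nn.Module):
    def __init__(self, feat_height=40, hidden=128, n_classes=120):
        super().__init__()
        self.lstm = nn.LSTM(feat_height, hidden, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, n_classes + 1)   # +1 for the CTC blank

    def forward(self, x):                 # x: (batch, width, feat_height)
        out, _ = self.lstm(x)
        return self.fc(out).log_softmax(-1)

model = LineRecognizer()
ctc = nn.CTCLoss(blank=120, zero_infinity=True)

images = torch.randn(4, 200, 40)          # 4 text-lines, 200 column features each
targets = torch.randint(0, 120, (4, 30))  # dummy character label sequences
log_probs = model(images).permute(1, 0, 2)            # (T, batch, classes) for CTCLoss
input_lens = torch.full((4,), 200, dtype=torch.long)
target_lens = torch.full((4,), 30, dtype=torch.long)
loss = ctc(log_probs, targets, input_lens, target_lens)
loss.backward()
```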


2021 ◽  
Vol 17 (7) ◽  
pp. 155014772110248
Author(s):  
Miaoyu Li ◽  
Zhuohan Jiang ◽  
Yutong Liu ◽  
Shuheng Chen ◽  
Marcin Wozniak ◽  
...  

Physical health problems caused by poor sitting posture are becoming increasingly serious and widespread, especially for sedentary students and workers. Existing video-based and sensor-based approaches can achieve high accuracy, but they have limitations such as breaching privacy and relying on specific sensor devices. In this work, we propose Sitsen, a non-contact wireless sitting posture recognition system that uses radio frequency signals alone, neither compromising privacy nor requiring specific sensors. We demonstrate that Sitsen can recognize five habitual sitting postures with just one lightweight and low-cost radio frequency identification tag. The intuition is that different postures induce different phase variations. Because the received phase readings are corrupted by environmental noise and hardware imperfections, we employ a series of signal processing schemes to obtain clean phase readings. Using a sliding window approach to extract effective features from the measured phase sequences and employing an appropriate machine learning algorithm, Sitsen achieves robust and high performance. Extensive experiments were conducted in an office with 10 volunteers. The results show that our system can recognize different sitting postures with an average accuracy of 97.02%.
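The sketch below illustrates the sliding-window feature extraction step on a cleaned phase sequence, followed by a standard classifier. The window length, the chosen statistics, the SVM classifier, and the synthetic recordings are assumptions; the abstract only specifies "an appropriate machine learning algorithm".

```python
import numpy as np
from sklearn.svm import SVC

def window_features(phase, win=50, step=25):
    """Slide a window over the phase sequence and compute simple statistics per window."""
    feats = []
    for start in range(0, len(phase) - win + 1, step):
        w = phase[start:start + win]
        feats.append([w.mean(), w.std(), w.max() - w.min()])
    return np.concatenate(feats)  # one flat feature vector per recording

# Dummy recordings: 5 sitting postures x 20 recordings of 500 phase samples each.
rng = np.random.default_rng(2)
X = np.array([window_features(rng.normal(p, 0.1, 500))
              for p in range(5) for _ in range(20)])
y = np.repeat(np.arange(5), 20)
clf = SVC().fit(X, y)
print(clf.score(X, y))
```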


2021 ◽  
Vol 10 (4) ◽  
pp. 570
Author(s):  
María A Callejon-Leblic ◽  
Ramon Moreno-Luna ◽  
Alfonso Del Cuvillo ◽  
Isabel M Reyes-Tejero ◽  
Miguel A Garcia-Villaran ◽  
...  

The COVID-19 outbreak has spread extensively around the world, and loss of smell and taste have emerged as main predictors for COVID-19. The objective of our study is to develop a comprehensive machine learning (ML) modelling framework to assess the predictive value of smell and taste disorders, along with other symptoms, in COVID-19 infection. A multicenter case-control study was performed in which suspected COVID-19 cases, tested by real-time reverse-transcription polymerase chain reaction (RT-PCR), reported the presence and severity of their symptoms using visual analog scales (VAS). ML algorithms were applied to the collected data to predict a COVID-19 diagnosis using a 50-fold cross-validation scheme that randomly split the patients into training (75%) and testing (25%) datasets. A total of 777 patients were included. Loss of smell and taste were the symptoms with the highest odds ratios for COVID-19 positivity, 6.21 and 2.42 respectively. The ML algorithms reached an average accuracy of 80%, a sensitivity of 82%, and a specificity of 78% when using VAS to predict a COVID-19 diagnosis. This study concludes that smell and taste disorders are accurate predictors and that ML algorithms constitute helpful tools for COVID-19 diagnostic prediction.
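A brief sketch of the evaluation protocol described above: 50 random 75/25 train/test splits with accuracy, sensitivity, and specificity averaged over the folds. The logistic-regression model and the synthetic VAS data are illustrative assumptions, not the study's actual models or data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, recall_score

rng = np.random.default_rng(3)
X = rng.uniform(0, 10, size=(777, 6))                     # VAS scores for 6 symptoms
y = (X[:, 0] + rng.normal(0, 2, 777) > 6).astype(int)     # synthetic RT-PCR label

accs, sens, specs = [], [], []
for seed in range(50):                                    # 50 random resampling folds
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=seed)
    pred = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict(X_te)
    accs.append(accuracy_score(y_te, pred))
    sens.append(recall_score(y_te, pred, pos_label=1))    # sensitivity
    specs.append(recall_score(y_te, pred, pos_label=0))   # specificity
print(np.mean(accs), np.mean(sens), np.mean(specs))
```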


Author(s):  
R Pattnaik ◽  
K Sharma ◽  
K Alabarta ◽  
D Altamirano ◽  
M Chakraborty ◽  
...  

Abstract: Low-mass X-ray binaries (LMXBs) are binary systems in which one component is either a black hole or a neutron star and the other is a less massive star. It is challenging to unambiguously determine whether an LMXB hosts a black hole or a neutron star. In the last few decades, multiple observational works have tried, with different levels of success, to address this problem. In this paper, we explore the use of machine learning to tackle this observational challenge. We train a random forest classifier to identify the type of compact object using the energy spectrum in the 5-25 keV range obtained from the Rossi X-ray Timing Explorer archive. We report an average accuracy of 87±13% in classifying the spectra of LMXB sources. We further use the trained model to predict the classes of LMXB systems with unknown or ambiguous classification. With the ever-increasing volume of astronomical data in the X-ray domain from present and upcoming missions (e.g., SWIFT, XMM-Newton, XARM, ATHENA, NICER), such methods can be extremely useful for faster and more robust classification of X-ray sources and can also be deployed as part of data reduction pipelines.
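Below is a short sketch of the classification setup described above: a random forest trained on binned 5-25 keV energy spectra to separate black-hole from neutron-star LMXBs. The 64-bin spectra, class labels, and cross-validation layout are synthetic placeholders, not RXTE data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n_bins = 64                                   # assumed number of energy channels
spectra = rng.lognormal(size=(400, n_bins))   # count-rate spectrum per observation
labels = rng.integers(0, 2, 400)              # 0 = neutron star, 1 = black hole (dummy)

rf = RandomForestClassifier(n_estimators=300, random_state=0)
scores = cross_val_score(rf, spectra, labels, cv=5)
print(scores.mean(), scores.std())            # average accuracy and spread across folds
```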


2021 ◽  
Vol 11 (10) ◽  
pp. 4602
Author(s):  
Farzin Piltan ◽  
Jong-Myon Kim

In this study, an intelligent digital twin integrated with machine learning is applied to bearing anomaly detection and crack size identification. The intelligent digital twin has two main sections: signal approximation and intelligent signal estimation. A mathematical approximation of the bearing vibration signal is integrated with machine learning-based signal approximation to model the bearing vibration signal in normal conditions. The combination of a Kalman filter, a high-order variable structure technique, and an adaptive neuro-fuzzy technique is then integrated with the proposed signal approximation technique to design the intelligent digital twin. Next, residual signals are generated from the proposed intelligent digital twin and the original raw signals. A machine learning approach is integrated with the intelligent digital twin to classify the bearing anomaly and crack sizes. The Case Western Reserve University bearing dataset is used to test the impact of the proposed scheme. In the experimental results, the average accuracies for bearing fault pattern recognition and crack size identification are 99.5% and 99.6%, respectively.
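The following is a heavily simplified sketch of the residual-generation idea: an estimate of the normal-condition signal (here a plain recursive smoother standing in for the digital twin) is subtracted from the raw vibration signal, and features of the residual are classified. All models, features, and parameters are illustrative assumptions, not the paper's design.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def smooth_estimate(x, gain=0.2):
    """Very simple recursive estimator used as a stand-in for the digital twin output."""
    est = np.empty_like(x)
    est[0] = x[0]
    for t in range(1, len(x)):
        est[t] = est[t - 1] + gain * (x[t] - est[t - 1])
    return est

def residual_features(x):
    r = x - smooth_estimate(x)                 # residual between raw signal and estimate
    return [r.std(), np.abs(r).mean(), r.max() - r.min(), np.mean(r ** 2)]

# Dummy vibration records for 4 classes (normal + 3 crack sizes).
rng = np.random.default_rng(5)
X = np.array([residual_features(rng.normal(0, 1 + 0.3 * c, 2048))
              for c in range(4) for _ in range(50)])
y = np.repeat(np.arange(4), 50)
clf = RandomForestClassifier(random_state=0).fit(X, y)
print(clf.score(X, y))
```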


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fahime Khozeimeh ◽  
Danial Sharifrazi ◽  
Navid Hoseini Izadi ◽  
Javad Hassannataj Joloudari ◽  
Afshin Shoeibi ◽  
...  

Abstract: COVID-19 has caused many deaths worldwide, and automating the diagnosis of this virus is highly desirable. Convolutional neural networks (CNNs) have shown outstanding classification performance on image datasets. To date, COVID computer-aided diagnosis systems based on CNNs and clinical information appear not to have been analysed or explored. We propose a novel method, named CNN-AE, to predict the survival chance of COVID-19 patients using a CNN trained on clinical information. Notably, the resources required to prepare CT images are expensive and limited compared to those required to collect clinical data such as blood pressure and liver disease status. We evaluated our method using a publicly available clinical dataset that we collected. The dataset properties were carefully analysed to extract important features and compute feature correlations. A data augmentation procedure based on autoencoders (AEs) was proposed to balance the dataset. The experimental results revealed that the average accuracy of the CNN-AE (96.05%) was higher than that of the CNN (92.49%). To demonstrate the generality of our augmentation method, we trained several existing mortality risk prediction methods on our dataset (with and without data augmentation) and compared their performance. We also evaluated our method on another dataset for further verification of generality. To show that clinical data can be used for COVID-19 survival chance prediction, the CNN-AE was compared with multiple pre-trained deep models that were tuned based on CT images.
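A compact sketch of the autoencoder-based augmentation idea described above: an autoencoder is trained to reconstruct minority-class clinical records, and decoded samples with small latent noise are added to balance the dataset. The network size, noise scale, and feature count are assumptions, not the paper's exact CNN-AE design.

```python
import torch
import torch.nn as nn

class AE(nn.Module):
    def __init__(self, n_features=20, latent=4):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_features, 16), nn.ReLU(), nn.Linear(16, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 16), nn.ReLU(), nn.Linear(16, n_features))

    def forward(self, x):
        return self.dec(self.enc(x))

minority = torch.randn(100, 20)                # clinical feature vectors of the rare class
ae = AE()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
mse = nn.MSELoss()
for _ in range(200):                           # train the autoencoder to reconstruct
    opt.zero_grad()
    loss = mse(ae(minority), minority)
    loss.backward()
    opt.step()

with torch.no_grad():                          # synthesize balancing samples
    z = ae.enc(minority) + 0.05 * torch.randn(100, 4)
    synthetic = ae.dec(z)
print(synthetic.shape)
```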


2022 ◽  
Vol 14 (2) ◽  
pp. 614
Author(s):  
Taniya Hasija ◽  
Virender Kadyan ◽  
Kalpna Guleria ◽  
Abdullah Alharbi ◽  
Hashem Alyami ◽  
...  

Speech recognition has been an active field of research over the last few decades because it facilitates better human-computer interaction. Native-language automatic speech recognition (ASR) systems are still underdeveloped, and Punjabi ASR systems are in their infancy because most research has been conducted on adult speech; far less work has been done on Punjabi children's ASR. This research aimed to build a prosodic feature-based automatic children's speech recognition system using discriminative modeling techniques. The corpus of Punjabi children's speech poses various runtime challenges, such as acoustic variation with speaker age. Out-domain data augmentation was implemented to overcome such issues using a Tacotron-based text-to-speech synthesizer. Prosodic features were extracted from the Punjabi children's speech corpus, and selected prosodic features were coupled with Mel Frequency Cepstral Coefficient (MFCC) features before being fed to the ASR framework. The system modeling process investigated several approaches, including Maximum Mutual Information (MMI), Boosted Maximum Mutual Information (bMMI), and feature-based Maximum Mutual Information (fMMI). The out-domain data augmentation was performed to enhance the corpus, after which prosodic features were also extracted from the extended corpus, and experiments were conducted on both individual and integrated prosodic acoustic features. The fMMI technique exhibited a 20% to 25% relative improvement in word error rate compared with the MMI and bMMI techniques. It was further enhanced using the augmented dataset and hybrid front-end features (MFCC + POV + F0 + voice quality), with a relative improvement of 13% over the earlier baseline system.
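A small sketch of the front-end idea described above: MFCCs are extracted per frame and concatenated with a frame-level prosodic feature (here, pitch estimated with YIN as a stand-in for the F0/POV/voice-quality set). The frame parameters, the use of librosa, and the dummy audio are assumptions for illustration.

```python
import numpy as np
import librosa

def mfcc_plus_pitch(y, sr, n_mfcc=13, hop=160):
    """Concatenate per-frame MFCCs with a per-frame pitch estimate."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, hop_length=hop)  # (n_mfcc, T)
    f0 = librosa.yin(y, fmin=80, fmax=400, sr=sr, hop_length=hop)           # (T',)
    T = min(mfcc.shape[1], f0.shape[0])                                     # align frame counts
    return np.vstack([mfcc[:, :T], f0[None, :T]]).T                         # (T, n_mfcc + 1)

# Usage with one second of dummy 16 kHz speech.
y = np.random.default_rng(6).normal(size=16000).astype(np.float32)
feats = mfcc_plus_pitch(y, sr=16000)
print(feats.shape)
```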


Author(s):  
Guoyin Wang ◽  
Yong Yang ◽  
Kun He

Cognitive informatics (CI) is a research area spanning several interdisciplinary topics. Visual tracking is not only an important topic in CI, but also a hot topic in computer vision and facial expression recognition. In this paper, a novel and robust facial feature tracking method based on Kanade-Lucas-Tomasi (KLT) optical flow is proposed. A prior measurement method, consisting of pupil detection, feature restriction, and error checking, is used to improve the predictions. Simulation experiment results show that the proposed method is superior to traditional optical flow tracking. Furthermore, the proposed method is used in a real-time emotion recognition system, where good recognition results are achieved.
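Below is a minimal sketch of KLT optical-flow feature tracking using OpenCV's pyramidal Lucas-Kanade tracker. The synthetic frames and corner-detection parameters are placeholders, and the paper's pupil-based measurement step is not shown.

```python
import cv2
import numpy as np

rng = np.random.default_rng(7)
prev = rng.integers(0, 255, (240, 320), dtype=np.uint8)   # previous grayscale frame
curr = np.roll(prev, shift=2, axis=1)                     # next frame, shifted 2 px right

# Detect facial feature candidates in the previous frame.
pts = cv2.goodFeaturesToTrack(prev, maxCorners=50, qualityLevel=0.01, minDistance=7)

# Track them into the current frame with pyramidal Lucas-Kanade (KLT).
new_pts, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None,
                                                winSize=(15, 15), maxLevel=2)
tracked = new_pts[status.ravel() == 1]
print(f"tracked {len(tracked)} of {len(pts)} points")
```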

