A Deep Learning based Arabic Script Recognition System: Benchmark on KHAT

2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. The paper contributes in three main aspects: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step includes pruning of extra white space and de-skewing of skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflections. Combining data augmentation with the deep learning approach yields a promising improvement, achieving a Character Recognition (CR) rate of 80.02% over the 75.08% baseline.
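A minimal sketch of the pre-processing step described above (white-space pruning and de-skewing of a text-line image), assuming OpenCV and NumPy; the function name, Otsu thresholding choice and skew-estimation heuristic are illustrative assumptions rather than the authors' implementation.

```python
# Sketch: prune surrounding white space and de-skew a handwritten text-line image.
import cv2
import numpy as np

def preprocess_text_line(gray_line: np.ndarray) -> np.ndarray:
    # Binarise (text assumed dark on a light background).
    _, binary = cv2.threshold(gray_line, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Prune extra white space: crop to the bounding box of the ink pixels.
    ys, xs = np.nonzero(binary)
    cropped = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    # Estimate the skew angle from the minimum-area rectangle of the ink pixels.
    coords = np.column_stack(np.nonzero(cropped)[::-1]).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]
    if angle > 45:            # heuristic: minAreaRect reports angles in (0, 90]
        angle -= 90

    # Rotate the line to remove the estimated skew.
    h, w = cropped.shape
    rotation = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(cropped, rotation, (w, h),
                          flags=cv2.INTER_LINEAR,
                          borderMode=cv2.BORDER_CONSTANT, borderValue=0)
```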

2021 ◽  
Vol 22 (Supplement_2) ◽  
Author(s):  
C Torlasco ◽  
D Papetti ◽  
R Mene ◽  
J Artico ◽  
A Seraphim ◽  
...  

Abstract. Funding Acknowledgements: Type of funding sources: None.

Introduction: The extent of ischemic scar detected by Cardiac Magnetic Resonance (CMR) with late gadolinium enhancement (LGE) is linked with long-term prognosis, but scar quantification is time-consuming. Deep Learning (DL) approaches appear promising in CMR segmentation.

Purpose: To train and apply a deep learning approach to dark-blood (DB) CMR-LGE for ischemic scar segmentation, comparing results to a 4-standard-deviation (4-SD) semi-automated method.

Methods: We trained and validated a dual neural network infrastructure on a dataset of DB-LGE short-axis stacks, acquired at 1.5 T from 33 patients with ischemic scar. The DL architectures were an evolution of the U-Net Convolutional Neural Network (CNN), using data augmentation to increase generalization. The CNNs worked together to identify and segment (1) the myocardium and (2) areas of LGE. The first CNN simultaneously cropped the region of interest (RoI) according to the bounding box of the heart and calculated the area of myocardium. The cropped RoI was then processed by the second CNN, which identified the overall LGE area. The extent of scar was calculated as the ratio of the two areas. For comparison, endo- and epicardial borders were manually contoured and scars segmented by a 4-SD technique with validated software.

Results: The two U-Net networks were implemented with two free and open-source machine learning software libraries. We performed 5-fold cross-validation over a dataset of 108 and 385 labelled CMR images of the myocardium and scar, respectively. For scar segmentation we obtained high performance (>0.85) on the training sets as measured by the Intersection over Union (IoU) metric. For heart recognition the performance was lower (>0.7), although it improved (~0.75) when detecting the cardiac area instead of heart boundaries. On the validation set, performance oscillated between 0.8 and 0.85 for scar-tissue recognition and dropped to ~0.7 for myocardium segmentation. We believe that underrepresented samples and noise might be affecting the overall performance, so additional data might be beneficial. Figure 1 shows examples of heart segmentation (upper left panel: training; upper right panel: validation) and of scar segmentation (lower left panel: training; lower right panel: validation).

Conclusion: Our CNNs show promising results in automatically segmenting the left ventricle and quantifying ischemic scars on DB-LGE CMR images. The performance of our method can be further improved by expanding the training data set. If implemented in clinical routine, this process can speed up CMR analysis and aid clinical decision-making.
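A minimal sketch of the two-stage pipeline described in the Methods: a first network segments the myocardium and defines the region of interest, a second segments LGE within the cropped region, and scar extent is taken as the ratio of the two areas. The model objects (`myocardium_net`, `lge_net`) and the 0.5 threshold are assumptions, not the trained networks used in the study.

```python
# Sketch: two-stage scar-extent estimation on a single short-axis slice.
import numpy as np

def scar_extent(slice_img: np.ndarray, myocardium_net, lge_net,
                threshold: float = 0.5) -> float:
    # Stage 1: myocardium probability map on the full slice
    # (assumes fully convolutional models that accept variable-sized inputs).
    myo_prob = myocardium_net.predict(slice_img[None, ..., None])[0, ..., 0]
    myo_mask = myo_prob > threshold

    # Crop the RoI to the bounding box of the predicted heart.
    ys, xs = np.nonzero(myo_mask)
    roi = slice_img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    # Stage 2: LGE probability map on the cropped RoI.
    lge_prob = lge_net.predict(roi[None, ..., None])[0, ..., 0]
    lge_mask = lge_prob > threshold

    # Scar extent as the ratio of LGE area to myocardial area.
    return lge_mask.sum() / max(myo_mask.sum(), 1)
```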


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier-transform-inspired method to classify human activities from time-series sensor data. Methods: Our method begins by decomposing a 1D input signal into 2D patterns, which is motivated by the Fourier transform. The decomposition is aided by a Long Short-Term Memory (LSTM) network, which captures the temporal dependency in the signal and produces encoded sequences. These sequences, once arranged into a 2D array, represent a fingerprint of the signal. The benefit of this transformation is that we can exploit recent advances in deep learning models for image classification, such as the Convolutional Neural Network (CNN). Results: The proposed model is therefore a combination of LSTM and CNN. We evaluate the model over two data sets. On the first data set, which is more standardized than the other, our model outperforms, or at least equals, previous works. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy exceeds 95% in some cases. We also analyze the effect of the parameters on the performance.
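A minimal Keras sketch of the LSTM-to-CNN combination described above: an LSTM encodes the 1D sensor window into a sequence, the sequence is arranged as a 2D "fingerprint", and a small CNN classifies it. The layer sizes, window length, and channel/class counts are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: LSTM encoder -> 2D fingerprint -> CNN classifier for activity recognition.
import tensorflow as tf
from tensorflow.keras import layers, models

window_len, n_channels, n_classes = 128, 3, 6   # assumed dataset parameters

model = models.Sequential([
    layers.Input(shape=(window_len, n_channels)),
    # Temporal encoding: one encoded vector per time step.
    layers.LSTM(64, return_sequences=True),
    # Arrange the encoded sequence as a 2D array (single-channel "image").
    layers.Reshape((window_len, 64, 1)),
    # Image-style classification on the fingerprint.
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```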


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Molham Al-Maleh ◽  
Said Desouki

An amendment to this paper has been published and can be accessed via the original article.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
BinBin Zhang ◽  
Fumin Zhang ◽  
Xinghua Qu

Purpose: Laser-based measurement techniques offer various advantages over conventional measurement techniques, such as being non-destructive and non-contact, with fast acquisition and long measuring distances. In cooperative laser ranging systems, it is crucial to extract the center coordinates of retroreflectors to accomplish automatic measurement. To solve this problem, this paper proposes a novel method.

Design/methodology/approach: We propose a method using Mask RCNN (Region Convolutional Neural Network), with ResNet101 (Residual Network 101) and FPN (Feature Pyramid Network) as the backbone, to localize retroreflectors and realize automatic recognition against different backgrounds. Compared with two other deep learning algorithms, experiments show that the recognition rate of Mask RCNN is better, especially for small-scale targets. Based on this, an ellipse detection algorithm is introduced to obtain the ellipses of the retroreflectors from the recognized target areas. The center coordinates of the retroreflectors in the camera coordinate system are then obtained by a mathematical method.

Findings: To verify the accuracy of this method, an experiment was carried out: the distance between two retroreflectors with a known separation of 1,000.109 mm was measured, with a root-mean-square error of 2.596 mm, meeting the requirements of coarse location of retroreflectors.

Research limitations/implications: (i) As the data set has only 200 pictures, although we have used data augmentation methods such as rotating, mirroring and cropping, there is still room for improvement in the generalization ability of detection. (ii) The ellipse detection algorithm needs to work in relatively dark conditions, as the retroreflector is made of stainless steel, which easily reflects light.

Originality/value: The method can obtain the center coordinates of multiple retroreflectors automatically even against a cluttered background; it can recognize retroreflectors of different sizes, especially small targets; it meets the recognition requirement of multiple targets in a large field of view; and it obtains the 3D centers of targets by monocular model-based vision.
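A minimal sketch of the ellipse-detection step that follows the Mask RCNN stage, assuming the detector has already produced a binary instance mask for one retroreflector; the contour extraction and ellipse fit use OpenCV, and the function name and mask format are assumptions rather than the authors' implementation.

```python
# Sketch: fit an ellipse to a detected retroreflector mask and return its centre.
import cv2
import numpy as np

def retroreflector_center(mask: np.ndarray):
    # Extract the outer contour of the predicted instance mask.
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None
    contour = max(contours, key=cv2.contourArea)
    if len(contour) < 5:          # fitEllipse requires at least 5 points
        return None

    # Fit an ellipse; its centre approximates the retroreflector centre.
    (cx, cy), (major_axis, minor_axis), angle = cv2.fitEllipse(contour)
    return cx, cy
```

The recovered image-plane centre would then feed the monocular, model-based step that yields the 3D center in the camera coordinate system, which is outside the scope of this sketch.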


2020 ◽  
Vol 12 (7) ◽  
pp. 1092
Author(s):  
David Browne ◽  
Michael Giering ◽  
Steven Prestwich

Scene classification is an important aspect of image/video understanding and segmentation. However, remote-sensing scene classification is a challenging image recognition task, partly due to the limited training data, which causes deep-learning Convolutional Neural Networks (CNNs) to overfit. Another difficulty is that images often have very different scales and orientations (viewing angles). Yet another is that the resulting networks may be very large, again making them prone to overfitting and unsuitable for deployment on memory- and energy-limited devices. We propose an efficient deep-learning approach to tackle these problems. We use transfer learning to compensate for the lack of data, and data augmentation to handle varying scale and orientation. To reduce network size, we use a novel unsupervised learning approach based on k-means clustering, applied to all parts of the network: most network-reduction methods use computationally expensive supervised learning and apply only to the convolutional or fully connected layers, but not both. In experiments, we set new standards in classification accuracy on four remote-sensing and two scene-recognition image datasets.
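A minimal sketch, under stated assumptions, of the k-means-based network-reduction idea: the filters of a layer are flattened, clustered, and replaced by their cluster centroids, shrinking the layer from n_filters to k filters. This illustrates the principle only; the paper applies clustering to both convolutional and fully connected layers, and reducing a layer's outputs also requires adjusting the next layer's input weights, which is omitted here.

```python
# Sketch: replace a convolutional layer's filters by k cluster centroids.
import numpy as np
from sklearn.cluster import KMeans

def reduce_filters(weights: np.ndarray, k: int) -> np.ndarray:
    """weights: conv kernel of shape (h, w, in_ch, n_filters) -> (h, w, in_ch, k)."""
    h, w, in_ch, n_filters = weights.shape
    flat = weights.reshape(-1, n_filters).T            # one row per filter
    centroids = KMeans(n_clusters=k, n_init=10).fit(flat).cluster_centers_
    return centroids.T.reshape(h, w, in_ch, k)
```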


Viruses ◽  
2020 ◽  
Vol 12 (7) ◽  
pp. 769 ◽  
Author(s):  
Ahmed Sedik ◽  
Abdullah M Iliyasu ◽  
Basma Abd El-Rahiem ◽  
Mohammed E. Abdel Samea ◽  
Asmaa Abdel-Raheem ◽  
...  

This generation faces existential threats because of the global assault of the novel coronavirus 2019 (COVID-19). With more than thirteen million infected and nearly 600,000 fatalities in 188 countries/regions, COVID-19 is the worst calamity since World War II. These misfortunes are traced to various causes, including late detection of latent or asymptomatic carriers, migration, and inadequate isolation of infected people. This makes detection, containment, and mitigation global priorities aimed at limiting exposure via quarantine, lockdowns, work/stay-at-home measures, and social distancing focused on “flattening the curve”. While medical and healthcare givers are at the frontline in the battle against COVID-19, it is a crusade for all of humanity. Meanwhile, machine and deep learning models have been revolutionary across numerous domains and applications, and their potency has been exploited to birth numerous state-of-the-art technologies used in disease detection, diagnosis, and treatment. Despite this potential, machine and, particularly, deep learning models are data sensitive, because their effectiveness depends on the availability and reliability of data. The unavailability of such data hinders the efforts of engineers and computer scientists to contribute fully to the ongoing assault against COVID-19. Faced with a calamity on one side and an absence of reliable data on the other, this study presents two data-augmentation models to enhance the learnability of Convolutional Neural Network (CNN)- and Convolutional Long Short-Term Memory (ConvLSTM)-based deep learning models (DADLMs) and, by doing so, boost the accuracy of COVID-19 detection. Experimental results reveal improvements in detection accuracy, logarithmic loss, and testing time relative to DLMs without such data augmentation. Furthermore, average increases of 4% to 11% in COVID-19 detection accuracy are reported in favour of the proposed data-augmented deep learning models relative to machine learning techniques. Therefore, the proposed algorithm is effective in performing rapid and consistent coronavirus diagnosis, primarily aimed at assisting clinicians in making accurate identification of the virus.
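A minimal Keras sketch of the data-augmentation idea described above: random geometric transformations expand the scarce training images before they reach a CNN classifier. The augmentation parameters, input size, and classifier layout are illustrative assumptions, not the specific CNN/ConvLSTM DADLMs proposed in the paper.

```python
# Sketch: random image augmentation feeding a small binary CNN classifier.
import tensorflow as tf
from tensorflow.keras import layers, models

augment = models.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

classifier = models.Sequential([
    layers.Input(shape=(128, 128, 1)),     # assumed image size
    augment,                               # random layers are active only in training
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"), # COVID-19 vs. non-COVID-19
])
classifier.compile(optimizer="adam", loss="binary_crossentropy",
                   metrics=["accuracy"])
```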


2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Mamunur Rashid ◽  
Minarul Islam ◽  
Norizam Sulaiman ◽  
Bifta Sama Bari ◽  
Ripon Kumar Saha ◽  
...  
