digit recognition
Recently Published Documents





2022 ◽  
Vol 15 (1) ◽  
pp. 1-35
Vladimir Rybalkin ◽  
Jonas Ney ◽  
Menbere Kina Tekleyohannes ◽  
Norbert Wehn

The Multidimensional Long Short-Term Memory (MD-LSTM) neural network is an extension of the one-dimensional LSTM for data with more than one dimension. MD-LSTM achieves state-of-the-art results in various applications, including handwritten text recognition and medical imaging. However, its implementation suffers from inherently sequential execution, which tremendously slows down both training and inference compared to other neural networks. The main goal of the current research is to accelerate MD-LSTM inference. We argue that the Field-Programmable Gate Array (FPGA) is an alternative platform for deep learning that can offer a solution when the massive parallelism of GPUs does not provide the performance required by the application. In this article, we present the first hardware architecture for MD-LSTM. We conduct a systematic exploration to analyze the tradeoff between precision and accuracy. We use a challenging dataset for semantic segmentation, namely historical document image binarization from the DIBCO 2017 contest, and the well-known MNIST dataset for handwritten digit recognition. Based on our new architecture, we implement FPGA-based accelerators that outperform the Nvidia Geforce RTX 2080 Ti with respect to throughput by up to 9.9× and the Nvidia Jetson AGX Xavier with respect to energy efficiency by up to 48×. Our accelerators achieve higher throughput, energy efficiency, and resource efficiency than FPGA-based implementations of convolutional neural networks (CNNs) for semantic segmentation tasks. For the handwritten digit recognition task, our FPGA implementations provide higher accuracy and can be considered a solution when accuracy is a priority. Furthermore, they outperform earlier FPGA implementations of one-dimensional LSTMs with respect to throughput, energy efficiency, and resource efficiency.

T. Senthil ◽  
C. Rajan ◽  
J. Deepika

Predicting characters, text, and digits from handwritten images has drawn the research community's attention to recognition. Numerous applications, together with the ambiguity of handwriting, have motivated prediction with Deep Learning (DL) approaches. Handwriting prediction involves four necessary steps. First, a dataset appropriate for validating DL approaches in an efficient manner is selected. Here, Special Database 1 and Special Database 2 are used, which are combined and modified by the National Institute of Standards and Technology (NIST). Next, the input handwritten digit data is pre-processed by data normalization, and efficient features are extracted to provide better prediction accuracy. The proposed idea uses pixel values as features, with an analysis of hyper-parameters, to approach near-human performance. With SVM, non-linear and linear models are built to extract the appropriate features for further processing. The features are separated and placed in a Bag of Features (BoF), which is used by the next processing stage. Finally, a novel Convolutional Neural Network (CNN) is built by modifying the network structure with Orthogonal Learning Particle Swarm Optimization (CNN-OLPSO). This modification is adopted for evolutionarily optimizing the number of hyper-parameters. The proposed optimizer predicts the optimal values from the fitness computation and shows better efficiency than various conventional approaches. The novelty, which relies on CNN adoption, is to pursue a suitable path towards digitalization, preserve the handwritten structure, and support automatic feature extraction using CNN while offering better computation accuracy. The optimization approach helps to avoid over-fitting and under-fitting issues. Here, metrics such as accuracy, elapsed time, recall, precision, and [Formula: see text]-measure are evaluated. The results of CNN-OLPSO show better accuracy, a reduced error rate, and better execution time (s) compared to other existing methods. Thus, the proposed model shows a better tradeoff in the recognition rate of handwritten digits.

2021 ◽  
Vol 49 (1) ◽  
Toufik Datsi ◽  
Khalid Aznag ◽  
Ahmed El Oirrak ◽  

Current artificial neural network image recognition techniques use all the pixels of an image as input. In this paper, we present an efficient method for handwritten digit recognition that extracts the characteristics of a digit image by coding each row of the image as a decimal value, i.e., by transforming the row's binary representation into a decimal value. This method is called the decimal coding of rows. The set of decimal values calculated from the initial image is arranged as a vector and normalized; these values are the inputs to the artificial neural network. The approach proposed in this work uses a multilayer perceptron neural network for the classification, recognition, and prediction of handwritten digits from 0 to 9. In this study, a dataset of 1797 samples was obtained from a digit database imported from the Scikit-learn library. Backpropagation was used as the learning algorithm to train the multilayer perceptron neural network. The results show that the proposed approach achieves better performance than two other schemes in terms of recognition accuracy and execution time.
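The decimal coding of rows described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the most-significant-bit-first ordering, the binarization threshold, and the normalization by the largest possible row value are all assumptions, since the abstract does not specify them.

```python
def decimal_code_rows(image, threshold=0):
    """Encode each row of an image as the integer spelled out by its bits.

    Assumption: a pixel counts as 1 when its value exceeds `threshold`,
    and the leftmost pixel is the most significant bit.
    """
    codes = []
    for row in image:
        value = 0
        for pixel in row:
            bit = 1 if pixel > threshold else 0
            value = (value << 1) | bit  # shift left, append the new bit
        codes.append(value)
    return codes


def normalize_codes(codes, width):
    """Scale row codes to [0, 1] by the largest value a row of `width` bits can take.

    Assumption: the abstract only says the vector is "normalized".
    """
    max_value = (1 << width) - 1
    return [code / max_value for code in codes]


# A 3x3 binary image: rows read as 011, 100, 111 in binary.
image = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
]
codes = decimal_code_rows(image)
features = normalize_codes(codes, width=3)
print(codes)  # [3, 4, 7]
```

With this coding, an image with R rows yields only R input values for the multilayer perceptron, rather than one input per pixel.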

2021 ◽  
Vol 2138 (1) ◽  
pp. 012002
Yang Gong ◽  
Pan Zhang

In view of the increasing demand for handwritten digit recognition, a handwritten digit recognition model based on a convolutional neural network is proposed. The model includes 1 input layer, 2 convolutional layers (5×5 convolution kernels), 2 pooling layers (2×2 pooling kernels), 1 fully connected layer, and 1 output layer, and uses the MNIST dataset for model training and prediction. After extensive training and tuning, the accuracy on the training set reached 100%, and an accuracy of 99.25% was achieved on the test set, which meets the requirements of recognizing handwritten digits.
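To make the layer description concrete, the following sketch traces how the side length of a 28×28 MNIST image would shrink through two 5×5 convolutions and two 2×2 poolings. It is only a shape calculation under stated assumptions: the abstract does not give the stride, padding, layer order, or channel counts, so stride 1, no padding, non-overlapping pooling, and a conv-pool-conv-pool order are assumed here, and the function names are illustrative.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Side length after a square convolution (standard output-size formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window):
    """Side length after non-overlapping pooling with a square window."""
    return size // window


def trace_feature_map(size=28):
    """Return the side length after each layer, assuming conv-pool-conv-pool."""
    sizes = [size]
    for _ in range(2):
        size = conv_out(size, kernel=5)  # 5x5 convolution, assumed stride 1, no padding
        sizes.append(size)
        size = pool_out(size, window=2)  # 2x2 pooling
        sizes.append(size)
    return sizes


print(trace_feature_map())  # [28, 24, 12, 8, 4]
```

Under these assumptions, the fully connected layer would receive 4×4 feature maps, multiplied by however many channels the second convolution produces.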

David Noever ◽  
Samantha E. Miller Noever

A malicious firmware update may prove devastating to the embedded devices that make up the Internet of Things (IoT) and that typically lack the security verifications now applied to full operating systems. This work converts the binary headers of 40,000 firmware examples from bytes into 1024-pixel thumbnail images to train a deep neural network. The aim is to distinguish benign and malicious variants using modern deep learning methods without needing detailed functional or forensic analysis tools. One outcome of this image conversion is that it connects the problem to the vast machine learning literature already applied to digit recognition (MNIST). Another result indicates that classification accuracy greater than 90% is possible using image-based convolutional neural networks (CNNs) combined with transfer learning methods. The envisioned CNN application would intercept firmware updates before their distribution to IoT networks and score their likelihood of containing malicious variants. To explain how the model makes classification decisions, the research applies traditional statistical methods, such as single decision trees and ensembles of decision trees, with identifiable pixel or byte values that contribute to the malicious or benign determination.
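The byte-to-image conversion can be illustrated with a short sketch. It is an assumption-laden example rather than the authors' pipeline: the 32×32 layout (32 × 32 = 1024 pixels), the row-major ordering, the zero padding of short headers, and the name `header_to_thumbnail` are all hypothetical choices made for illustration.

```python
def header_to_thumbnail(header, side=32):
    """Map the first side*side bytes of a firmware header to a grayscale grid.

    Each byte (0-255) becomes one pixel intensity. Assumptions: row-major
    layout, truncation of long headers, zero padding of short ones.
    """
    n_pixels = side * side
    pixels = header[:n_pixels].ljust(n_pixels, b"\x00")  # truncate, then pad
    return [list(pixels[r * side:(r + 1) * side]) for r in range(side)]


# A toy 3-byte "header": the remaining 1021 pixels are zero padding.
thumbnail = header_to_thumbnail(bytes([255, 0, 7]))
print(len(thumbnail), len(thumbnail[0]))  # 32 32
print(thumbnail[0][:4])  # [255, 0, 7, 0]
```

A grid in this form has the same single-channel structure as an MNIST digit, which is what allows image classifiers to be applied to it.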

2021 ◽  
Ankit Kumar ◽  
Kunal Jani ◽  
Visaj Nirav Shah ◽  
Divyansh Khatri ◽  
Nabin Kumar Sahu ◽  
