A Study of Spatial Attention and Squeeze Excitation Block Fusion Improved ResNet for Identifying Bank Notes

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Junjun Huo

Based on deep learning and digital image processing algorithms, we design and implement an accurate automatic recognition system for bank note text and propose an improved ResNet-based recognition method to address difficult image text extraction and insufficient recognition accuracy. First, depthwise over-parameterized convolution (DO-Conv) replaces the traditional convolution in the network to improve the recognition rate while reducing the model parameters. Then, the spatial attention model (SAM) and the squeeze-and-excitation block (SE-Block) are fused and applied to a modified ResNet to extract detailed features of bank note images in the channel and spatial domains. Finally, the label-smoothed cross-entropy (LSCE) loss function is used to train the model, calibrating the network to prevent classification errors. The experimental results demonstrate that the improved model is not easily affected by image quality and performs well in text detection and recognition in specific business ticket scenarios.
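As a rough illustration of the label-smoothed cross-entropy (LSCE) loss named in this abstract, the sketch below uses the common formulation in which the true class receives probability 1 - epsilon + epsilon/K and every other class receives epsilon/K. The abstract does not give the smoothing factor or the exact variant used, so `epsilon=0.1` and the function name are assumptions.

```python
import math


def label_smoothed_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy of one sample against a smoothed target distribution.

    logits:  raw class scores for the sample.
    target:  index of the true class.
    epsilon: smoothing factor (assumed value; not stated in the abstract).
    """
    k = len(logits)
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]

    loss = 0.0
    for i, lp in enumerate(log_probs):
        # Smoothed target: epsilon / K everywhere, plus 1 - epsilon on the true class.
        q = epsilon / k + (1.0 - epsilon if i == target else 0.0)
        loss -= q * lp
    return loss
```

With `epsilon=0.0` this reduces to ordinary cross-entropy; a positive `epsilon` penalizes over-confident predictions, which is the usual motivation for label smoothing.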

2014 ◽  
Vol 989-994 ◽  
pp. 4187-4190
Author(s):  
Lin Zhang

An adaptive gender recognition method is proposed in this paper. First, a multiwavelet transform is applied to the face image to obtain its low-frequency information; features are then extracted from this low-frequency information using compressive sensing (CS); finally, an extreme learning machine (ELM) performs the gender recognition. During feature extraction, a genetic algorithm (GA) selects the number of CS measurements that yields the highest recognition rate, so the method adaptively reaches its optimal performance. Experimental results show that, compared with PDA and LDA, the new method improves recognition accuracy substantially.


Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2056
Author(s):  
Junjie Wu ◽  
Jianfeng Xu ◽  
Deyu Lin ◽  
Min Tu

The recognition accuracy of micro-expressions in the field of facial expressions is still understudied, as current research methods mainly focus on feature extraction and classification. Based on optical flow and decision thinking theory, we propose a novel micro-expression recognition method that can filter out low-quality micro-expression video clips. Using preset thresholds, we develop two optical flow filtering mechanisms: one based on two-branch decisions (OFF2BD) and the other based on three-way decisions (OFF3WD). OFF2BD uses classical binary logic to classify images, dividing them into a positive or a negative domain for further filtering. Unlike OFF2BD, OFF3WD adds a boundary domain that defers the judgment of the images' motion quality. In this way, video clips with a low degree of morphological change can be eliminated, directly improving the quality of micro-expression features and the recognition rate. The experimental results show recognition accuracies of 61.57% and 65.41% on the CASMEII and SMIC datasets, respectively. Comparative analysis shows that the scheme can effectively improve recognition performance.
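The difference between the two-branch and three-way filtering rules can be sketched as plain threshold logic. This is only an illustration of the general idea: the `score` input (a hypothetical per-clip optical-flow motion magnitude), the threshold names `alpha` and `beta`, and the region labels are assumptions, not the paper's actual definitions.

```python
def two_branch_decision(score, threshold):
    """Binary rule: every clip lands in the positive or the negative domain."""
    return "positive" if score >= threshold else "negative"


def three_way_decision(score, alpha, beta):
    """Three-way rule with an extra boundary domain between two thresholds.

    score: hypothetical optical-flow motion magnitude of a clip.
    alpha, beta: preset thresholds with alpha > beta (assumed values).
    """
    if not alpha > beta:
        raise ValueError("alpha must be greater than beta")
    if score >= alpha:
        return "positive"  # enough motion: keep the clip
    if score <= beta:
        return "negative"  # too little motion: discard the clip
    return "boundary"      # undecided: defer the judgment
```

A clip that falls in the boundary domain is neither kept nor discarded immediately, which is what distinguishes the three-way rule from the binary one.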


Author(s):  
Hui Su ◽  
Bin Zhao ◽  
Feng Ma ◽  
Song Wang ◽  
Shaowei Xia

In this paper, a complete fault-tolerant check recognition system is proposed that produces no check substitution errors under secret code verification. The proposed fault-tolerant recognition method creates all possible candidates for verification within a limited fault-tolerant rate. With three classifiers that each have a high isolated-digit recognition rate, the system can always find the correct recognition result for a check as long as the correct labels of all the digits are present. Because the three classifiers are designed independently by different methods and extract different features of handwritten digits, they can compensate for each other when confusing digits are encountered. The segmentation stage combines the three most popular strategies and provides a way to segment unconstrained handwritten numeral strings on Chinese checks.
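The candidate-generation idea can be sketched as follows, under strong simplifying assumptions: each of the three classifiers proposes one label per digit position, candidates are all combinations of the distinct proposals, and `verify` is a hypothetical stand-in for the secret code check (the abstract does not describe it). The limit on the fault-tolerant rate is not modelled here.

```python
from itertools import product


def candidate_strings(predictions):
    """Enumerate digit strings from per-position classifier outputs.

    predictions: one tuple per digit position, holding the label that each
    classifier proposed for that position.
    """
    # Deduplicate labels per position while preserving classifier order.
    per_position = [list(dict.fromkeys(labels)) for labels in predictions]
    for combo in product(*per_position):
        yield "".join(combo)


def recognize(predictions, verify):
    """Return the first candidate accepted by `verify`, or None to reject."""
    for candidate in candidate_strings(predictions):
        if verify(candidate):
            return candidate
    return None


# Hypothetical example: the classifiers disagree on the first two digits.
predictions = [("1", "1", "7"), ("3", "8", "3"), ("5", "5", "5")]


def toy_verify(s):
    # Placeholder check (digit sum divisible by 10); not the real secret code.
    return sum(int(ch) for ch in s) % 10 == 0


result = recognize(predictions, toy_verify)
```

If no candidate passes verification the sketch returns `None`, which corresponds to rejecting the check rather than risking a substitution error.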


2014 ◽  
Vol 989-994 ◽  
pp. 2569-2575
Author(s):  
Feng Gao ◽  
Zhong Jian Dai ◽  
Kun Zhou ◽  
Ya Ping Dai

In order to improve license plate recognition accuracy in complex environments, a new license plate location algorithm combining vertical edge detection, color information of the license plate, and mathematical morphology is presented in this paper. To balance computing load and recognition accuracy, a “200-d” character feature rule is designed, and the “200-d” feature is used as the input of a BP neural network to recognize the characters. Based on these methods, a license plate recognition system is set up that can locate and recognize the license plate effectively, even when the resolution of the pictures and the position of vehicles in the pictures are not fixed. Experimental results indicate that the recognition rate of the algorithm reaches 90.5%.


Speech recognition technology has been developing very fast lately. One of its applications is looking up the meaning of terms included in a geographic dictionary. When a subject speaks a word to the system, it outputs the word together with its meaning and explanation. Many methods are applied to speech recognition. One method that can improve the accuracy of speech recognition is deep learning, i.e. the Convolutional Neural Network (CNN). In this research, CNN's speech recognition accuracy for the Indonesian geographic dictionary is analyzed to show that CNN can improve the accuracy of speech recognition compared to speech recognition with Gaussian mixture model and hidden Markov model (GMM-HMM). CNN is a deep learning method that analyzes and finds similarity in Mel-frequency cepstral coefficients (MFCC) from sound waves. This research is performed by making models of the spoken words using CNN under Python and TensorFlow. CNN is trained with these models from speech data collected and prepared from 20 students, consisting of 19 men and one woman aged 19 to 23 years. The vocabulary of the database consists of 50 words. The result of this research is a desktop application with the trained models implemented. Our application recognizes the spoken words from subjects well. The trained models were tested to examine the accuracy of the built speech recognition system. The CNN speech recognition method for the Indonesian geographic dictionary achieves 80% accuracy for isolated words and 72.67% for continuous words in our research.


Agriculture ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 73
Author(s):  
Kaidong Lei ◽  
Chao Zong ◽  
Ting Yang ◽  
Shanshan Peng ◽  
Pengfei Zhu ◽  
...  

In large-scale sow production, real-time detection and recognition of sows is a key step towards the application of precision livestock farming techniques. In the pig house, the overlap of railings, floors, and sows usually challenges the accuracy of sow target detection. In this paper, a non-contact machine vision method was used for sow target perception in complex scenarios, so that the number and position of sows in the pen could be detected. Two multi-target sow detection and recognition models based on the deep learning algorithms Mask-RCNN and UNet-Attention were developed, and the model parameters were tuned. A field experiment was carried out, and the data set obtained from it was used for algorithm training and validation. The Mask-RCNN model showed a higher recognition rate than the UNet-Attention model, with a final recognition rate of 96.8% and complete object detection outlines. In the process of image segmentation, the area distribution of sows in the pens was analyzed, along with the position of the sow’s head in the pen and the pixel area value of the sow segmentation. The feeding, drinking, and lying behaviors of the sow were identified on the basis of image recognition. The results showed that the average daily lying time, standing time, feeding time, and drinking time of sows were 12.67 h (MSE 1.08), 11.33 h (MSE 1.08), 3.25 h (MSE 0.27), and 0.391 h (MSE 0.10), respectively. The proposed method could solve the problem of target perception of sows in complex scenes and would be a powerful tool for the recognition of sows.


2013 ◽  
Vol 433-435 ◽  
pp. 316-321
Author(s):  
Lian Hai Zhang ◽  
Qi Chen ◽  
Dan Qu

Two kinds of imperfections, namely detection errors and the asynchrony between phonological attributes and phone boundaries, can cause a substantial decline in the recognition accuracy of a detection-based automatic speech recognition system. To solve these problems, an adjustment method between phonological attributes and phone boundaries is proposed in this paper. First, the prior knowledge of the corpus and the detection results are combined; then the asynchronies in the phone boundary area are compensated and the detection errors are corrected. Additionally, by selectively deleting some frames with errors, the precision of the phone models is improved. With this adjustment method, the phoneme recognition rate improves by 1.4% in the TIMIT phone classification experiments based on Conditional Random Fields.


2011 ◽  
Vol 268-270 ◽  
pp. 82-87
Author(s):  
Zhi Peng Zhao ◽  
Yi Gang Cen ◽  
Xiao Fang Chen

In this paper, we propose a new noisy-speech recognition method based on compressive sensing theory. Through compressive sensing, our method greatly increases the noise robustness of the speech recognition system, which improves recognition accuracy. In our experiments, the proposed method achieved better recognition performance than the traditional isolated word recognition method based on the DTW algorithm.
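For context, the DTW baseline mentioned in this abstract can be sketched with the textbook dynamic-programming recurrence. This illustrates the baseline only, not the proposed compressive sensing method. Real isolated-word recognizers compare sequences of feature vectors; using scalar samples with an absolute-difference cost is a simplifying assumption here.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

In a template-matching recognizer, an unknown utterance would be assigned the label of the stored template with the smallest DTW distance.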


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Haixia Yang ◽  
Zhaohui Ji ◽  
Jun Sun ◽  
Fanan Xing ◽  
Yixian Shen ◽  
...  

Human gestures are considered one of the important human-computer interaction modes. With the fast development of wireless technology in the urban Internet of Things (IoT) environment, Wi-Fi not only provides high-speed network communication but also has great development potential in the field of environmental perception. This paper proposes a gesture recognition system based on the channel state information (CSI) within the physical layer of Wi-Fi transmission. To solve the problems of noise interference and phase offset in the CSI, we adopt a model based on the CSI quotient. Then, the amplitude and phase curves of the CSI are smoothed using a Savitzky-Golay filter, and a one-dimensional convolutional neural network (1D-CNN) is used to extract the gesture features. Finally, a support vector machine (SVM) classifier is adopted to recognize the gestures. The experimental results show that our system can achieve a recognition rate of about 90% for three common gestures: pushing forward, left stroke, and waving. The effects of different human orientations and model parameters on the recognition results are analyzed as well.
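The Savitzky-Golay smoothing step can be illustrated with the classic 5-point quadratic kernel, whose coefficients are (-3, 12, 17, 12, -3)/35. The abstract does not state the window length or polynomial order used, so both are assumptions, as is the choice to leave the two samples at each end unchanged.

```python
def savgol_5pt(signal):
    """Smooth a list of samples with the 5-point quadratic Savitzky-Golay kernel."""
    coeffs = (-3.0, 12.0, 17.0, 12.0, -3.0)
    out = [float(x) for x in signal]
    # Interior points only; the two samples at each end are left unchanged.
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        out[i] = sum(c * x for c, x in zip(coeffs, window)) / 35.0
    return out
```

Because the kernel fits a local quadratic, it reproduces any quadratic trend exactly while attenuating isolated spikes, which is why it is a common choice for smoothing amplitude and phase curves without flattening their peaks.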


Author(s):  
Teddy Surya Gunawan ◽  
Ahmad Fakhrur Razi Mohd Noor ◽  
Mira Kartiwi

Due to advances in GPUs and CPUs, Deep Neural Networks (DNNs) have in recent years become popular as both feature extractors and classifiers. This paper aims to develop an offline handwriting recognition system using a DNN. First, two popular English digit and letter databases, i.e. MNIST and EMNIST, were selected to provide datasets for the training and testing phases of the DNN. Altogether, there are 10 digits [0-9] and 52 letters [a-z, A-Z]. The proposed DNN uses two stacked autoencoder layers and one softmax layer. Recognition accuracy for English digits and letters is 97.7% and 88.8%, respectively. Performance comparison with other neural network structures revealed that the weighted average recognition rates for patternnet, feedforwardnet, and the proposed DNN were 80.3%, 68.3%, and 90.4%, respectively. This shows that our proposed system is able to recognize handwritten English digits and letters with high accuracy.

