UL-CNN: An Ultra-Lightweight Convolutional Neural Network Aiming at Flash-Based Computing-In-Memory Architecture for Pedestrian Recognition

Pedestrian recognition has achieved the state-of-the-art performance due to the progress of recent convolutional neural network (CNN). However, mainstream CNN models are too complicated to emerging Computing-In-Memory (CIM) architectures for hardware implementation, because enormous parameters and massive intermediate processing results may incur severe “memory bottleneck”. This paper proposed a design methodology of Parameter Substitution with Nodes Compensation (PSNC) to significantly reduce parameters of CNN model without inference accuracy degradation. Based on the PSNC methodology, an ultra-lightweight convolutional neural network (UL-CNN) was designed. The UL-CNN model is a specially optimized convolutional neural network aiming at a flash-based CIM architecture (Conv-Flash) and to apply for recognizing person. The implementation result of running UL-CNN on Conv-Flash shows that the inference accuracy is up to 94.7%. Compared to LeNet-5, on the premise of the similar operations and accuracy, the amounts of UL-CNN’s parameters are less than 37% of LeNet-5 at the same dataset benchmark. Such parameter reduction can dramatically speed up the training process and economize on-chip storage overhead, as well as save the power consumption of the memory access. With the aid of UL-CNN, the Conv-Flash architecture can provide the best energy efficiency compared to other platforms (CPU, GPU, FPGA, etc.), which consumes only 2.2[Formula: see text] 105J to complete pedestrian recognition for one frame.

Download Full-text

A Multibranch Object Detection Method for Traffic Scenes

Computational Intelligence and Neuroscience ◽

10.1155/2019/3679203 ◽

2019 ◽

Vol 2019 ◽

pp. 1-16

Author(s):

Jiangfan Feng ◽

Fanjie Wang ◽

Siqin Feng ◽

Yongrong Peng

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Detection Method ◽

State Of The Art ◽

Recall Rate ◽

Small Scale ◽

Feature Maps ◽

Time Requirements ◽

Speed Up

The performance of convolutional neural network- (CNN-) based object detection has achieved incredible success. Howbeit, existing CNN-based algorithms suffer from a problem that small-scale objects are difficult to detect because it may have lost its response when the feature map has reached a certain depth, and it is common that the scale of objects (such as cars, buses, and pedestrians) contained in traffic images and videos varies greatly. In this paper, we present a 32-layer multibranch convolutional neural network named MBNet for fast detecting objects in traffic scenes. Our model utilizes three detection branches, in which feature maps with a size of 16 × 16, 32 × 32, and 64 × 64 are used, respectively, to optimize the detection for large-, medium-, and small-scale objects. By means of a multitask loss function, our model can be trained end-to-end. The experimental results show that our model achieves state-of-the-art performance in terms of precision and recall rate, and the detection speed (up to 33 fps) is fast, which can meet the real-time requirements of industry.

Download Full-text

Low-Rank and Sparse Based Deep-Fusion Convolutional Neural Network for Crowd Counting

Mathematical Problems in Engineering ◽

10.1155/2017/5046727 ◽

2017 ◽

Vol 2017 ◽

pp. 1-11 ◽

Cited By ~ 2

Author(s):

Siqi Tang ◽

Zhisong Pan ◽

Xingyu Zhou

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

State Of The Art ◽

Regression Method ◽

Low Rank ◽

Counting Method ◽

Direct Integral ◽

Crowd Counting ◽

Counting Methods ◽

Density Map

This paper proposes an accurate crowd counting method based on convolutional neural network and low-rank and sparse structure. To this end, we firstly propose an effective deep-fusion convolutional neural network to promote the density map regression accuracy. Furthermore, we figure out that most of the existing CNN based crowd counting methods obtain overall counting by direct integral of estimated density map, which limits the accuracy of counting. Instead of direct integral, we adopt a regression method based on low-rank and sparse penalty to promote accuracy of the projection from density map to global counting. Experiments demonstrate the importance of such regression process on promoting the crowd counting performance. The proposed low-rank and sparse based deep-fusion convolutional neural network (LFCNN) outperforms existing crowd counting methods and achieves the state-of-the-art performance.

Download Full-text

Performance Analysis of State of the Art Convolutional Neural Network Architectures in Bangla Handwritten Character Recognition

Pattern Recognition and Image Analysis ◽

10.1134/s1054661821010089 ◽

2021 ◽

Vol 31 (1) ◽

pp. 60-71

Author(s):

Tapotosh Ghosh ◽

Min-Ha-Zul Abedin ◽

Hasan Al Banna ◽

Nasirul Mumenin ◽

Mohammad Abu Yousuf

Keyword(s):

Neural Network ◽

Performance Analysis ◽

Convolutional Neural Network ◽

Character Recognition ◽

State Of The Art ◽

Network Architectures ◽

Handwritten Character Recognition ◽

Handwritten Character ◽

Neural Network Architectures

Download Full-text

PedNet: A Spatio-Temporal Deep Convolutional Neural Network for Pedestrian Segmentation

Journal of Imaging ◽

10.3390/jimaging4090107 ◽

2018 ◽

Vol 4 (9) ◽

pp. 107 ◽

Cited By ~ 5

Author(s):

Mohib Ullah ◽

Ahmed Mohammed ◽

Faouzi Alaya Cheikh

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Performance Metrics ◽

State Of The Art ◽

Temporal Information ◽

Feature Maps ◽

Current Frame ◽

Low Level ◽

Art Methods ◽

Spatio Temporal

Articulation modeling, feature extraction, and classification are the important components of pedestrian segmentation. Usually, these components are modeled independently from each other and then combined in a sequential way. However, this approach is prone to poor segmentation if any individual component is weakly designed. To cope with this problem, we proposed a spatio-temporal convolutional neural network named PedNet which exploits temporal information for spatial segmentation. The backbone of the PedNet consists of an encoder–decoder network for downsampling and upsampling the feature maps, respectively. The input to the network is a set of three frames and the output is a binary mask of the segmented regions in the middle frame. Irrespective of classical deep models where the convolution layers are followed by a fully connected layer for classification, PedNet is a Fully Convolutional Network (FCN). It is trained end-to-end and the segmentation is achieved without the need of any pre- or post-processing. The main characteristic of PedNet is its unique design where it performs segmentation on a frame-by-frame basis but it uses the temporal information from the previous and the future frame for segmenting the pedestrian in the current frame. Moreover, to combine the low-level features with the high-level semantic information learned by the deeper layers, we used long-skip connections from the encoder to decoder network and concatenate the output of low-level layers with the higher level layers. This approach helps to get segmentation map with sharp boundaries. To show the potential benefits of temporal information, we also visualized different layers of the network. The visualization showed that the network learned different information from the consecutive frames and then combined the information optimally to segment the middle frame. We evaluated our approach on eight challenging datasets where humans are involved in different activities with severe articulation (football, road crossing, surveillance). The most common CamVid dataset which is used for calculating the performance of the segmentation algorithm is evaluated against seven state-of-the-art methods. The performance is shown on precision/recall, F 1 , F 2 , and mIoU. The qualitative and quantitative results show that PedNet achieves promising results against state-of-the-art methods with substantial improvement in terms of all the performance metrics.

Download Full-text

Spam Detection on Social Media Using Semantic Convolutional Neural Network

International Journal of Knowledge Discovery in Bioinformatics ◽

10.4018/ijkdb.2018010102 ◽

2018 ◽

Vol 8 (1) ◽

pp. 12-26 ◽

Cited By ~ 16

Author(s):

Gauri Jain ◽

Manisha Sharma ◽

Basant Agarwal

Keyword(s):

Neural Network ◽

Social Media ◽

Convolutional Neural Network ◽

State Of The Art ◽

Spam Detection ◽

Learning Technology ◽

The Social ◽

Social Media Text ◽

Current Article ◽

Semantic Layer

This article describes how spam detection in the social media text is becoming increasing important because of the exponential increase in the spam volume over the network. It is challenging, especially in case of text within the limited number of characters. Effective spam detection requires more number of efficient features to be learned. In the current article, the use of a deep learning technology known as a convolutional neural network (CNN) is proposed for spam detection with an added semantic layer on the top of it. The resultant model is known as a semantic convolutional neural network (SCNN). A semantic layer is composed of training the random word vectors with the help of Word2vec to get the semantically enriched word embedding. WordNet and ConceptNet are used to find the word similar to a given word, in case it is missing in the word2vec. The architecture is evaluated on two corpora: SMS Spam dataset (UCI repository) and Twitter dataset (Tweets scrapped from public live tweets). The authors' approach outperforms the-state-of-the-art results with 98.65% accuracy on SMS spam dataset and 94.40% accuracy on Twitter dataset.

Download Full-text

Dangerous Scenes Recognition During Hoisting Based on Faster Region-Based Convolutional Neural Network

Volume 2: Mechanics and Behavior of Active Materials; Structural Health Monitoring; Bioinspired Smart Materials and Systems; Energy Harvesting; Emerging Technologies ◽

10.1115/smasis2018-8226 ◽

2018 ◽

Author(s):

Hongguo Su ◽

Mingyuan Zhang ◽

Shengyuan Li ◽

Xuefeng Zhao

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Image Classification ◽

State Of The Art ◽

Spatial Location ◽

Average Precision ◽

Practical Applications ◽

Security Enhancement ◽

Human Interactions

In the last couple of years, advancements in the deep learning, especially in convolutional neural networks, proved to be a boon for the image classification and recognition tasks. One of the important practical applications of object detection and image classification can be for security enhancement. If dangerous objects or scenes can be identified automatically, then a lot of accidents can be prevented. For this purpose, in this paper we made use of state-of-the-art implementation of Faster Region-based Convolutional Neural Network (Faster R-CNN) based on the monitoring video of hoisting sites to train a model to detect the dangerous object and the worker. By extracting the locations of them, object-human interactions during hoisting, mainly for changes in their spatial location relationship, can be understood whereby estimating whether the scene is safe or dangerous. Experimental results showed that the pre-trained model achieved good performance with a high mean average precision of 97.66% on object detection and the proposed method fulfilled the goal of dangerous scenes recognition perfectly.

Download Full-text

A design methodology for approximate multipliers in convolutional neural networks: A case of MNIST

International Journal of Reconfigurable and Embedded Systems (IJRES) ◽

10.11591/ijres.v10.i1.pp1-10 ◽

2021 ◽

Vol 10 (1) ◽

pp. 1

Author(s):

Kenta Shirane ◽

Takahiro Yamamoto ◽

Hiroyuki Tomiyama

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Design Methodology ◽

Critical Path ◽

High Accuracy ◽

Path Delay ◽

Trade Off ◽

Critical Path Delay

In this paper, we present a case study on approximate multipliers for MNIST Convolutional Neural Network (CNN). We apply approximate multipliers with different bit-width to the convolution layer in MNIST CNN, evaluate the accuracy of MNIST classification, and analyze the trade-off between approximate multiplier’s area, critical path delay and the accuracy. Based on the results of the evaluation and analysis, we propose a design methodology for approximate multipliers. The approximate multipliers consist of some partial products, which are carefully selected according to the CNN input. With this methodology, we further reduce the area and the delay of the multipliers with keeping high accuracy of the MNIST classification.

Download Full-text

WTRPNet: An Explainable Graph Feature Convolutional Neural Network for Epileptic EEG Classification

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3460522 ◽

2021 ◽

Vol 17 (3s) ◽

pp. 1-18

Author(s):

Qi Xin ◽

Shaohao Hu ◽

Shuaiqi Liu ◽

Ling Zhao ◽

Shuihua Wang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

State Of The Art ◽

Traumatic Injury ◽

Recurrence Plot ◽

Superior Performance ◽

Eeg Classification ◽

Epilepsy Diagnosis ◽

Electroencephalogram Eeg

As one of the important tools of epilepsy diagnosis, the electroencephalogram (EEG) is noninvasive and presents no traumatic injury to patients. It contains a lot of physiological and pathological information that is easy to obtain. The automatic classification of epileptic EEG is important in the diagnosis and therapeutic efficacy of epileptics. In this article, an explainable graph feature convolutional neural network named WTRPNet is proposed for epileptic EEG classification. Since WTRPNet is constructed by a recurrence plot in the wavelet domain, it can fully obtain the graph feature of the EEG signal, which is established by an explainable graph features extracted layer called WTRP block . The proposed method shows superior performance over state-of-the-art methods. Experimental results show that our algorithm has achieved an accuracy of 99.67% in classification of focal and nonfocal epileptic EEG, which proves the effectiveness of the classification and detection of epileptic EEG.

Download Full-text

Tomato pest classification using deep convolutional neural network with transfer learning, fine tuning and scratch learning

Intelligent Decision Technologies ◽

10.3233/idt-200192 ◽

2021 ◽

pp. 1-10

Author(s):

Gayatri Pattnaik ◽

Vimal K. Shrivastava ◽

K. Parvathi

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Transfer Learning ◽

Data Augmentation ◽

State Of The Art ◽

Deep Convolutional Neural Network ◽

Fine Tuning ◽

Tomato Plants ◽

Random Weights

Pests are major threat to economic growth of a country. Application of pesticide is the easiest way to control the pest infection. However, excessive utilization of pesticide is hazardous to environment. The recent advances in deep learning have paved the way for early detection and improved classification of pest in tomato plants which will benefit the farmers. This paper presents a comprehensive analysis of 11 state-of-the-art deep convolutional neural network (CNN) models with three configurations: transfers learning, fine-tuning and scratch learning. The training in transfer learning and fine tuning initiates from pre-trained weights whereas random weights are used in case of scratch learning. In addition, the concept of data augmentation has been explored to improve the performance. Our dataset consists of 859 tomato pest images from 10 categories. The results demonstrate that the highest classification accuracy of 94.87% has been achieved in the transfer learning approach by DenseNet201 model with data augmentation.

Download Full-text

Textual Deblurring using Convolutional Neural Network

10.36227/techrxiv.16760632 ◽

2021 ◽

Author(s):

Muhammad Shahroz Nadeem ◽

Sibt Hussain ◽

Fatih Kurugollu

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Loss Function ◽

State Of The Art ◽

Empirical Evaluation ◽

The State

This paper solves the textual deblurring problem, In this paper we propose a new loss function, we provide empirical evaluation of the design choices based on which a memory friendly CNN model is proposed, that performs better then the state of the art CNN method.

Download Full-text