Real-Time Pedestrian Detection Using Convolutional Neural Networks

Author(s):  
Ping Kuang ◽  
Tingsong Ma ◽  
Fan Li ◽  
Ziwei Chen

Pedestrian detection gives smart-city managers a great opportunity to manage their cities effectively and automatically. Specifically, pedestrian detection technology can improve public safety and make traffic more efficient. In this paper, all of our modifications and improvements are built on YOLO, a real-time convolutional neural network detector. We extend YOLO’s original network structure and give a new definition of the loss function to boost pedestrian detection performance, especially when the targets are small, which is exactly what YOLO is not good at. In our experiments, the proposed model is tested on the INRIA data set, the UCF YouTube Action Data Set, and the Caltech Pedestrian Detection Benchmark. Experimental results indicate that, after our modifications and improvements, the revised YOLO network outperforms the original version as well as other solutions.
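The abstract does not spell out the redefined loss, but the small-target issue it addresses is the one the original YOLO loss tackles by regressing square roots of box dimensions; a minimal sketch of that idea (illustrative, not the authors' exact formulation):

```python
import math

def box_size_loss(pred_w, pred_h, true_w, true_h):
    """Squared error on sqrt(width) and sqrt(height), as in the
    original YOLO loss: the square root makes a fixed absolute
    error count more for small boxes than for large ones."""
    return ((math.sqrt(pred_w) - math.sqrt(true_w)) ** 2
            + (math.sqrt(pred_h) - math.sqrt(true_h)) ** 2)

# The same 5-pixel error hurts more on a 10-px pedestrian
# than on a 100-px one:
small = box_size_loss(15, 15, 10, 10)
large = box_size_loss(105, 105, 100, 100)
```

Without the square root, both cases would incur the same penalty, so the detector would under-weight errors on small pedestrians.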

2019 ◽  
Vol 13 ◽  
pp. 174830261987360 ◽  
Author(s):  
Chuan-Wei Zhang ◽  
Meng-Yue Yang ◽  
Hong-Jun Zeng ◽  
Jian-Ping Wen

In this article, to meet the real-time and accuracy requirements of advanced driver assistance in pedestrian detection, an improved LeNet-5 convolutional neural network is proposed. First, the structure of the LeNet-5 network model is analyzed, and its structure and parameters are improved and optimized to obtain a new LeNet network model, which is then used to detect pedestrians. Finally, comparative analysis shows that the miss rate of the improved LeNet convolutional neural network is 25%. The experiments prove that this method outperforms SA-Fast R-CNN and the classical LeNet-5 CNN algorithm.
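As context on the base architecture, the feature-map sizes of the classic LeNet-5 pipeline can be checked with the standard convolution output formula (the paper's improved variant changes the structure and parameters, which are not detailed in the abstract):

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output spatial size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

# Classic LeNet-5 feature-map sizes on a 32x32 input:
s = 32
s = conv_out(s, 5)       # C1: 5x5 conv -> 28
s = conv_out(s, 2, 2)    # S2: 2x2 pool -> 14
s = conv_out(s, 5)       # C3: 5x5 conv -> 10
s = conv_out(s, 2, 2)    # S4: 2x2 pool -> 5
```

The resulting 5x5 maps feed the fully connected stages in the classic design.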


2018 ◽  
Vol 13 (3) ◽  
pp. 408-428 ◽  
Author(s):  
Phu Vo Ngoc

We have surveyed many significant approaches over the years, because sentiment classification makes crucial contributions that can be applied in everyday life, such as in political activities, commodity production, and commercial activities. We propose a novel model using Latent Semantic Analysis (LSA) and a Dennis Coefficient (DNC) for big-data sentiment classification in English. Many LSA vectors (LSAVs) are successfully reformed using the DNC. We use the DNC and the LSAVs to classify 11,000,000 documents of our testing data set against 5,000,000 documents of our training data set in English. This novel model uses the sentiment lexicons of our basis English sentiment dictionary (bESD). We have tested the proposed model in both a sequential environment and a distributed network system; the results of the sequential system are not as good as those of the parallel environment. We achieved 88.76% accuracy on the testing data set, which is better than the accuracies of many previous semantic-analysis models. We have also compared the novel model with previous models, and the experimental results of our proposed model are better than theirs. The results of the novel model can be widely used across many fields, in commercial applications and in sentiment-classification surveys.
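The core LSA step can be sketched with a truncated SVD of a term-document matrix; the toy example below stops short of the Dennis-coefficient scoring, whose exact form is given in the paper:

```python
import numpy as np

# Toy term-document matrix (rows: terms, columns: documents).
X = np.array([[2., 0., 1.],
              [1., 1., 0.],
              [0., 2., 1.]])

# Latent Semantic Analysis: a truncated SVD keeps the top-k
# singular directions; each document becomes a k-dim LSA vector.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dim vector per document
```

At full rank the factorization reconstructs X exactly; truncating to k dimensions is what projects documents into the latent semantic space that the LSAVs live in.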


Author(s):  
Pranav Kale ◽  
Mayuresh Panchpor ◽  
Saloni Dingore ◽  
Saloni Gaikwad ◽  
Prof. Dr. Laxmi Bewoor

In today's world, the field of deep learning is advancing with increasing speed, and many innovations and algorithms are being developed. In computer vision for autonomous driving, traffic signs play an important role in providing real-time data about the environment. Different algorithms have been developed to classify these signs, but their performance still needs to improve for real-time environments, and the computational power required to train such models is high. In this paper, a convolutional neural network model is used to classify traffic signs. The experiments are conducted on a real-world data set with images and videos captured from ordinary car driving, as well as on the GTSRB dataset [15] available on Kaggle. The proposed model outperforms previous models, achieving an accuracy of 99.6% on the validation set. This idea has been granted an Innovation Patent by Australian IP to the authors of this paper. [24]


Author(s):  
Tanoy Debnath ◽  
Md. Mahfuz Reza ◽  
Anichur Rahman ◽  
Shahab Band ◽  
Hamid Alinejad Rokny

Emotion recognition is defined as identifying human emotion and is directly related to fields such as human-computer interfaces, human emotional processing, irrational analysis, medical diagnostics, data-driven animation, human-robot communication, and many more. The purpose of this study is to propose a new facial emotion recognition model using a convolutional neural network. Our proposed model, “ConvNet”, detects seven specific emotions from image data: anger, disgust, fear, happiness, neutrality, sadness, and surprise. This research focuses on the model’s training accuracy within a small number of epochs, so that a real-time schema can be developed that easily fits the model and senses emotions. Furthermore, this work addresses a person’s mental or emotional state through behavioral aspects. To train the CNN model, we use the FER2013 database, and we test the system’s success by identifying facial expressions in real time. ConvNet consists of four convolutional layers together with two fully connected layers. The experimental results show that ConvNet achieves 96% training accuracy, which is much better than current existing models. ConvNet also achieved a validation accuracy of 65% to 70% (across the different datasets used in the experiments), a higher classification accuracy than other existing models. We have made all the materials publicly accessible to the research community at: https://github.com/Tanoy004/Emotion-recognition-through-CNN.
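The final classification step over the seven emotion classes reduces to an argmax over the network's output scores; a toy sketch (the label order here follows the abstract's listing and is purely illustrative):

```python
EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "neutrality", "sadness", "surprise"]

def predict_emotion(logits):
    """Map the network's seven output scores to an emotion label.
    Softmax is monotonic, so taking the argmax of the raw logits
    gives the same label as taking it after softmax."""
    return EMOTIONS[max(range(len(logits)), key=logits.__getitem__)]

scores = [0.1, 0.0, 0.2, 2.5, 0.4, 0.3, 0.1]
print(predict_emotion(scores))  # -> happiness
```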


Robotics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 83
Author(s):  
Andrea Raviola ◽  
Roberto Guida ◽  
Andrea De Martin ◽  
Stefano Pastorelli ◽  
Stefano Mauro ◽  
...  

Dynamic parameters are crucial for the definition of high-fidelity models of industrial manipulators. However, since they are often partially unknown, a mathematical model able to identify them is discussed and validated with the UR3 and the UR5 collaborative robots from Universal Robots. According to the acquired experimental data, this procedure reduces the error on the estimated joint torques by about 90% with respect to the estimate obtained using only the information provided by the manufacturer. The present research also highlights how changes in the robot operating conditions affect its dynamic behavior. In particular, the identification process has been applied to a data set obtained by commanding the same trajectory multiple times to both robots under rising joint temperatures. Average reductions of the viscous friction coefficients of about 20% and 17% for the UR3 and the UR5 robots, respectively, have been observed. Moreover, it is shown how the manipulator mounting configuration affects the number of base dynamic parameters necessary to properly estimate the robots’ joint torques. The ability of the proposed model to take into account different mounting configurations is then verified by performing the identification procedure on a data set generated through a digital twin of a UR5 robot mounted on the ceiling.
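Identification procedures of this kind typically rewrite the rigid-body dynamics linearly in the base parameters, tau = Y(q, qd, qdd) * theta, and solve a least-squares problem over many samples; a synthetic numpy sketch of that step (random regressor, not the UR3/UR5 model itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# Regressor matrix Y stacks one row per torque sample; theta holds
# the base dynamic parameters (masses, inertias, viscous friction
# coefficients, ...). All values here are synthetic placeholders.
Y = rng.standard_normal((200, 6))
theta_true = np.array([1.2, 0.4, 0.05, 2.0, 0.7, 0.1])
tau = Y @ theta_true + 0.01 * rng.standard_normal(200)  # noisy "measured" torques

# Ordinary least squares recovers the base parameters.
theta_hat, *_ = np.linalg.lstsq(Y, tau, rcond=None)
```

With enough well-excited samples, the estimate converges to the true parameters despite measurement noise, which is why the choice of excitation trajectory matters in practice.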


2021 ◽  
pp. 154-162
Author(s):  
Rock K C Ho ◽  
Zhangyu Wang ◽  
Simon S C Tang ◽  
Qiang Zhang

Development of new technology to enhance train operability, in particular through real-time object detection on the track during manual driving, is one of the rising trends in the railway industry. The object detection function can provide train operators with reminder alerts whenever an object is detected close to a train, e.g. within a defined distance. In this paper, a two-stage vision-based method is proposed to achieve this goal. First, the Targets Generation Stage extracts all potential targets by identifying their centre points. Then, the Targets Reconfirmation Stage re-analyses the potential targets from the previous stage to filter out incorrect ones from the output. The experimental evaluation shows that, at the methodological level, the proposed method achieves an Average Precision (AP) of 0.876 and 0.526 on the typical and extreme scenario sub-groups, respectively, of a data set collected from a real railway environment. Furthermore, at the application level, a False Alarm Rate (FAR) of 0.01% and a Missed Detection Rate (MDR) of 0.94%, performance suitable for practical application, were achieved during operation on the Tsuen Wan Line (TWL) in Hong Kong.
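The application-level metrics follow directly from a confusion matrix under their standard definitions (assumed here, since the abstract does not define them; counts are toy values):

```python
def alarm_rates(tp, fp, fn, tn):
    """False Alarm Rate  = FP / (FP + TN)  (alarms on clear track)
    Missed Detection Rate = FN / (TP + FN) (objects the system missed)."""
    far = fp / (fp + tn)
    mdr = fn / (tp + fn)
    return far, mdr

# Toy counts: 1 false alarm in 10,000 clear frames,
# 10 missed objects out of 1,000 present.
far, mdr = alarm_rates(tp=990, fp=1, fn=10, tn=9999)
```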


2020 ◽  
Vol 9 (1) ◽  
pp. 2480-2485

Pedestrian detection is one of the important tasks in object detection. Pedestrian detection algorithms have been used in applications such as intelligent video surveillance, traffic analysis, and autonomous driving. In recent years, many pedestrian detection algorithms have been proposed, but their key drawbacks are accuracy and speed, which can be improved by integrating efficient algorithms. The proposed model improves pedestrian detection by integrating two efficient algorithms: a joint version of ResNet and YOLO v2, which perform feature extraction and classification, respectively. This model increases the efficiency of the system by improving the accuracy rate, so that it can be used in real-time applications. The model has been compared with existing models such as SSD, Faster R-CNN, and Mask R-CNN. When tested on the INRIA dataset, the proposed model provides a higher mAP value than these existing models, with a lower loss.
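The mAP comparison rests on matching predicted boxes to ground truth by Intersection-over-Union; a minimal IoU helper shows the underlying computation:

```python
def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Half-overlapping boxes: intersection 50, union 150 -> IoU = 1/3.
score = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

A detection typically counts as a true positive when its IoU with a ground-truth box exceeds a threshold (commonly 0.5), and AP is then averaged over classes to give mAP.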


Mathematics ◽  
2021 ◽  
Vol 9 (16) ◽  
pp. 1880
Author(s):  
Yu-Shiuan Tsai ◽  
Nai-Chi Chen ◽  
Yi-Zeng Hsieh ◽  
Shih-Syun Lin

In this study, we use OpenPose to capture many facial feature nodes, create and label a data set, and finally feed it into the neural network model we created. The purpose is to predict the direction of a person’s line of sight from the face and facial feature nodes, and then add object detection to determine the object the person is observing. After implementing this method, we found that it can correctly estimate the human body’s form. If multiple lenses are available, more information can be obtained and the result is better than with a single lens, evaluating the observed objects more accurately. We also found that the head in the image can be used to judge the direction of view. In addition, in tests with tilted faces, the facial nodes can still be captured at tilt angles of up to approximately 60 degrees; when the inclination angle exceeds 60 degrees, the facial nodes can no longer be captured.
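As a purely geometric toy proxy for the idea (the paper uses a learned neural model on the OpenPose nodes, not this heuristic), a 2D line-of-sight direction can be sketched from three facial nodes:

```python
def gaze_direction(left_eye, right_eye, nose):
    """Crude 2D head-orientation proxy from three facial nodes:
    the direction from the midpoint of the eye line through the
    nose tip. Illustrative only; a learned model would map the
    full set of OpenPose facial nodes to a gaze vector."""
    mid = ((left_eye[0] + right_eye[0]) / 2,
           (left_eye[1] + right_eye[1]) / 2)
    return (nose[0] - mid[0], nose[1] - mid[1])

# Nose slightly right of the eye midpoint -> gaze vector leans right.
vec = gaze_direction((0, 0), (10, 0), (6, 5))
```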


2021 ◽  
Vol 17 (3) ◽  
pp. 249-271
Author(s):  
Tanmay Singha ◽  
Duc-Son Pham ◽  
Aneesh Krishna

Urban street scene analysis is an important problem in computer vision, with many off-line models achieving outstanding semantic segmentation results. However, it is an ongoing challenge for the research community to develop and optimize deep neural architectures with low real-time computing requirements whilst maintaining good performance. Balancing model complexity against performance has been a major hurdle, with many models dropping too much accuracy for a slight reduction in model size and unable to handle high-resolution input images. This study aims to address the issue with a novel model, named M2FANet, that provides a much better balance between efficiency and accuracy for scene segmentation than other alternatives. The proposed optimised backbone helps to increase the model’s efficiency, whereas the suggested Multi-level Multi-path (M2) feature aggregation approach enhances its performance in real-time environments. By exploiting a multi-feature scaling technique, M2FANet produces state-of-the-art results in resource-constrained situations while handling the full input resolution. On the Cityscapes benchmark data set, the proposed model produces 68.5% and 68.3% class accuracy on the validation and test sets respectively, whilst having only 1.3 million parameters. Compared with all real-time models of fewer than 5 million parameters, the proposed model is the most competitive in both performance and real-time capability.
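Multi-level feature aggregation can be illustrated in miniature on a 1-D feature map: coarser-scale features are pooled, upsampled back, and fused with the fine scale (a toy sketch of the general idea, not M2FANet's actual fusion):

```python
import numpy as np

def avg_pool(x, k):
    """Average-pool a 1-D feature map by factor k."""
    return x[: len(x) // k * k].reshape(-1, k).mean(axis=1)

def upsample(x, k):
    """Nearest-neighbour upsample by factor k."""
    return np.repeat(x, k)

# Aggregation across levels: features computed at coarser scales
# are upsampled back to full resolution and fused (summed) with
# the fine scale, mixing local detail with wider context.
feat = np.arange(8, dtype=float)
agg = (feat
       + upsample(avg_pool(feat, 2), 2)
       + upsample(avg_pool(feat, 4), 4))
```

Real segmentation backbones do the same thing on 2-D feature maps with learned convolutions at each level; the pooling/upsampling pattern is the part sketched here.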



