An Efficient Deep Convolutional Neural Network Approach for Object Detection and Recognition Using a Multi-Scale Anchor Box in Real-Time

2021 ◽  
Vol 13 (12) ◽  
pp. 307
Author(s):  
Vijayakumar Varadarajan ◽  
Dweepna Garg ◽  
Ketan Kotecha

Deep learning is a relatively new branch of machine learning in which computers are taught to recognize patterns in massive volumes of data. It primarily describes learning at multiple levels of representation, which aids in understanding data such as text, voice, and images. Convolutional neural networks have been used to solve challenges in computer vision, including object identification, image classification, and semantic segmentation. Object detection in videos involves confirming the presence of an object in the image or video and then locating it accurately for recognition. For video, modelling techniques suffer from high computation and memory costs, which may degrade performance measures such as accuracy and efficiency when identifying objects in real time. Current object detection techniques based on deep convolutional neural networks require executing multilevel convolution and pooling operations on the entire image to extract deep semantic properties from it. For large objects, detection models can provide superior results; however, those models fail to detect objects of varying size that have low resolution and are greatly influenced by noise, because the features remaining after the repeated convolution operations of existing models do not fully represent the essential characteristics of the objects in real time. With the help of a multi-scale anchor box, the proposed approach reported in this paper enhances detection accuracy by extracting features at multiple convolution levels of the object. The major contribution of this paper is the design of a model to better understand the parameters and hyper-parameters that affect the detection and recognition of objects of varying sizes and shapes, and to achieve real-time object detection and recognition speeds while improving accuracy. The proposed model achieved 84.49 mAP on the test set of the Pascal VOC-2007 dataset at 11 FPS, which is comparatively better than other real-time object detection models.
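As a rough illustration of what a multi-scale anchor box scheme can look like, the sketch below tiles anchors over several feature-map resolutions, with larger anchors on coarser maps. It is not the authors' implementation: the function names, the 256-pixel image size, the three pyramid levels, and the aspect ratios are all hypothetical values chosen only for the example.

```python
import math

def make_anchors(feature_size, image_size, scales, aspect_ratios):
    """Return anchors as (cx, cy, w, h) in pixels for one square feature map."""
    stride = image_size / feature_size  # pixels covered by one feature cell
    anchors = []
    for row in range(feature_size):
        for col in range(feature_size):
            # Anchor centre sits in the middle of the feature cell.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in aspect_ratios:
                    # ratio = w / h, and w * h stays equal to scale ** 2.
                    w = scale * math.sqrt(ratio)
                    h = scale / math.sqrt(ratio)
                    anchors.append((cx, cy, w, h))
    return anchors

def multi_scale_anchors(image_size, levels, aspect_ratios):
    """Concatenate anchors from several (feature_size, scales) levels."""
    all_anchors = []
    for feature_size, scales in levels:
        all_anchors.extend(
            make_anchors(feature_size, image_size, scales, aspect_ratios)
        )
    return all_anchors

# Hypothetical configuration: fine maps get small anchors, coarse maps large ones.
levels = [(8, [32.0]), (4, [64.0]), (2, [128.0])]
anchors = multi_scale_anchors(256, levels, [0.5, 1.0, 2.0])
print(len(anchors))  # (64 + 16 + 4) cells * 3 ratios = 252
```

Assigning small anchors to high-resolution maps and large anchors to low-resolution maps is one common way to cover objects of varying size; the paper's exact assignment may differ.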

Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1541
Author(s):  
Xavier Alphonse Inbaraj ◽  
Charlyn Villavicencio ◽  
Julio Jerison Macrohon ◽  
Jyh-Horng Jeng ◽  
Jer-Guang Hsieh

One of the fundamental advancements in deploying object detectors in real-time applications is improving object recognition in the presence of obstruction, obscurity, and noise in images. In addition, object detection is a challenging task since it requires the correct detection of objects in images. Semantic segmentation and localization are important modules for recognizing an object in an image. The object localization method Grad-CAM++ is widely used by researchers; it uses the gradient with respect to a convolution layer to build a localization map of the important regions in the image. This paper proposes a method called Combined Grad-CAM++ with the Mask Regional Convolution Neural Network (GC-MRCNN) to both detect and localize objects in the image. The major advantage of the proposed method is that it outperforms all counterpart methods in the domain and can also be used in unsupervised environments. The proposed detector based on GC-MRCNN provides a robust and feasible ability to detect and classify the objects present, and their shapes, in real time. The proposed method is found to perform effectively and efficiently on a wide range of images and provides a higher-resolution visual representation than existing methods (Grad-CAM, Grad-CAM++), as shown by comparing various algorithms.
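To make the idea of a gradient-based localization map concrete, the sketch below implements the simpler Grad-CAM weighting rather than Grad-CAM++ or the proposed GC-MRCNN: each channel of a convolution layer is weighted by the spatial mean of its gradient, the weighted channels are summed, and negative values are clipped. Grad-CAM++ replaces the plain mean with weights built from higher-order derivatives, which is omitted here. The function name and the toy activations and gradients are assumptions made for illustration.

```python
def grad_cam(activations, gradients):
    """Plain Grad-CAM heatmap from K channels of HxW nested lists.

    heatmap = ReLU(sum_k alpha_k * A_k), where alpha_k is the spatial
    mean of the class-score gradient for channel k.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    # One importance weight per channel: global average of its gradient.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    heatmap = [[0.0] * width for _ in range(height)]
    for alpha, channel in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += alpha * channel[i][j]
    # ReLU keeps only regions that push the class score up.
    return [[max(0.0, v) for v in row] for row in heatmap]

# Toy example with two 2x2 channels (values are made up).
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # mean gradient 0.5
    [[-1.0, -1.0], [-1.0, -1.0]],  # mean gradient -1.0
]
heatmap = grad_cam(activations, gradients)
print(heatmap)  # [[0.5, 0.0], [0.0, 1.0]]
```

In a real network the activations and gradients would come from a forward and backward pass through the chosen convolution layer, and the heatmap would be upsampled to the input resolution.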


SinkrOn ◽  
2019 ◽  
Vol 4 (1) ◽  
pp. 260 ◽  
Author(s):  
Kevin Kevin ◽  
Nico Gunawan ◽  
Mariana Erfan Kristiani Zagoto ◽  
Laurentius Laurentius ◽  
Amir Mahmud Husein

Abstract: The purpose of this study is to compare the video quality of a Samsung mobile phone camera and a Xiaomi mobile phone camera. The subjects of the study were UNPRI students walking through the front yard of the UNPRI SEKIP campus. We test how accurate the capture capability of each phone camera is when recording video. The method used for this test is the Convolutional Neural Network. Object detection and recognition aim to detect and classify objects and can be applied to various fields such as face, human, pedestrian, and vehicle detection (Pedoeem & Huang, 2018), in addition to finding, identifying, tracking and stabilizing objects in various poses and backgrounds, which is important in many real-time video applications. Object detection, tracking, alignment and stabilization have become very interesting fields of research in computer vision and pattern recognition due to the challenging nature of several slightly different tasks such as object detection, where the algorithm must be precise enough to identify, track and center an object among the others.


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5080
Author(s):  
Baohua Qiang ◽  
Ruidong Chen ◽  
Mingliang Zhou ◽  
Yuanchao Pang ◽  
Yijie Zhai ◽  
...  

In recent years, an increasing amount of image data has come from various sensors, and object detection plays a vital role in image understanding. For object detection in complex scenes, more detailed information in the image should be obtained to improve the accuracy of the detection task. In this paper, we propose an object detection algorithm by jointing semantic segmentation (SSOD) for images. First, we construct a feature extraction network that integrates the hourglass structure network with an attention mechanism layer to extract and fuse multi-scale features, generating high-level features with rich semantic information. Second, the semantic segmentation task is used as an auxiliary task to allow the algorithm to perform multi-task learning. Finally, multi-scale features are used to predict the location and category of the object. The experimental results show that our algorithm substantially enhances object detection performance and consistently outperforms the three comparison algorithms, while the detection speed is sufficient for real-time detection.


2019 ◽  
Vol 277 ◽  
pp. 02005
Author(s):  
Ning Feng ◽  
Le Dong ◽  
Qianni Zhang ◽  
Ning Zhang ◽  
Xi Wu ◽  
...  

Real-time semantic segmentation has become crucial in many applications such as medical image analysis and autonomous driving. In this paper, we introduce a single semantic segmentation network, called DNS, for the joint object detection and segmentation task. We take advantage of a multi-scale deconvolution mechanism to perform real-time computations. To this end, down-scale and up-scale streams are utilized to combine the multi-scale features for the final detection and segmentation task. With the proposed DNS, both the tradeoff between accuracy and cost and the balance between detection and segmentation performance are addressed. Experimental results on PASCAL VOC datasets show competitive performance for the joint object detection and segmentation task.


2021 ◽  
Vol 104 (2) ◽  
pp. 003685042110113
Author(s):  
Xianghua Ma ◽  
Zhenkun Yang

Real-time object detection on mobile platforms is a crucial but challenging computer vision task. It is widely recognized that although lightweight object detectors have a high detection speed, their detection accuracy is relatively low. To improve detection accuracy, it is beneficial to extract complete multi-scale image features in visual cognitive tasks. Asymmetric convolutions have a useful quality: they have different aspect ratios, which can be used to extract image features of objects, especially objects with multi-scale characteristics. In this paper, we exploit three different asymmetric convolutions in parallel and propose a new multi-scale asymmetric convolution unit, namely the MAC block, to enhance the multi-scale representation ability of CNNs. In addition, the MAC block can adaptively merge features at different scales by allocating learnable weighted parameters to the three asymmetric convolution branches. The proposed MAC blocks can be inserted into a state-of-the-art backbone such as ResNet-50 to form a new multi-scale backbone network for object detectors. To evaluate the performance of the MAC block, we conduct experiments on the CIFAR-100, PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO 2014 datasets. Experimental results show that detection precision can be greatly improved while a fast detection speed is maintained.
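The parallel-branch idea can be sketched as follows. This is only a minimal illustration and not the MAC block itself: the abstract does not state the kernel shapes or how the branch weights are normalized, so the 1x3, 3x1 and 1x5 kernels, the softmax over branch logits, and all function names are assumptions.

```python
import math

def conv2d_same(image, kernel):
    """Cross-correlation with zero padding; output has the input's size.

    Assumes odd kernel height and width so the kernel has a centre cell.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:  # outside pixels count as zero
                        total += image[y][x] * kernel[a][b]
            out[i][j] = total
    return out

def softmax(values):
    """Numerically stable softmax over a short list of logits."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    s = sum(exps)
    return [e / s for e in exps]

def mac_like_block(image, kernels, branch_logits):
    """Run asymmetric branches in parallel and merge them with weights.

    branch_logits stand in for the learnable per-branch parameters.
    """
    weights = softmax(branch_logits)
    h, w = len(image), len(image[0])
    merged = [[0.0] * w for _ in range(h)]
    for weight, kernel in zip(weights, kernels):
        branch = conv2d_same(image, kernel)
        for i in range(h):
            for j in range(w):
                merged[i][j] += weight * branch[i][j]
    return merged

# Hypothetical asymmetric kernels with different aspect ratios.
kernels = [
    [[1.0, 1.0, 1.0]],                # 1x3, wide
    [[1.0], [1.0], [1.0]],            # 3x1, tall
    [[1.0, 1.0, 1.0, 1.0, 1.0]],      # 1x5, wider
]
image = [[1.0] * 5 for _ in range(5)]
output = mac_like_block(image, kernels, [0.0, 0.0, 0.0])
```

With equal logits each branch contributes one third, so the centre pixel of the all-ones image receives (3 + 3 + 5) / 3. During training, the logits would be updated by gradient descent so that the network learns which aspect ratio matters most.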


2015 ◽  
Vol 137 (6) ◽  
Author(s):  
Yanfang Wang ◽  
Saeed Salehi

Real-time drilling optimization improves drilling performance by providing early warnings during operation. Mud hydraulics is a key aspect of drilling that can be optimized through access to real-time data. Different from the investigated references, reliable prediction of pump pressure provides an early warning of circulation problems, washout, lost circulation, underground blowout, and kicks. This helps the driller make the necessary corrections to mitigate potential problems. In this study, an artificial neural network (ANN) model to predict hydraulics was implemented through the fitting tool of MATLAB. Following the determination of the optimum model, a sensitivity analysis of the input parameters on the created model was carried out using the forward regression method. Next, the remaining data from the selected well samples was used in simulation to verify the quality of the developed model. The novelty of this paper is the validation of computer models with actual field data collected from an operator in LA. The simulation results were promising when compared with the collected field data. This model can accurately predict pump pressure versus depth in analogous formations. The results show the potential of the NN-based approach developed in this work for predicting real-time drilling hydraulics.

