Deep Learning Based Object Recognition Using Physically-Realistic Synthetic Depth Scenes

The objective of this thesis was to evaluate the viability of implementing a deep-learning-driven object recognition algorithm for aerospace manufacturing, maintenance and assembly tasks. Comparative research found that current computer vision methods, such as spatial mapping, are limited to macro-object recognition because of their nodal wireframe analysis. An optical object recognition algorithm was trained to learn complex geometric and chromatic characteristics, thereby allowing micro-object recognition of items such as cables and other critical components. This thesis investigated the use of a convolutional neural network with object recognition algorithms. The viability of two categories of object recognition algorithms was analyzed: image prediction and object detection. Due to a viral epidemic, this thesis was limited in analytical consistency, as resources were not readily available. The prediction-class algorithm was analyzed using a custom dataset of 15 552 images of the MaxFlight V2002 Full Motion Simulator’s inverter system, and a model was created by transfer-learning that dataset onto the InceptionV3 convolutional neural network (CNN). The detection-class algorithm was analyzed using a custom dataset of 100 images of two SUVs of different brand and style, and a model was created by transfer-learning that dataset onto the YOLOv3 deep learning architecture. The tests showed that the object recognition algorithms successfully identified the components with good accuracy: 99.97% mAP for the prediction class and 89.54% mAP for the detection class. The accuracies and data collected, together with the literature review, indicated that object detection algorithms are accurate, are designed for live-feed analysis, and are suitable for the significant applications of AVI and aircraft assembly.
In the future, a larger dataset needs to be compiled to increase reliability, and a custom convolutional neural network and deep learning algorithm need to be developed specifically for aerospace assembly, maintenance and manufacturing applications.

Absolute Distance Prediction Based on Deep Learning Object Detection and Monocular Depth Estimation Models

10.3233/faia210151 ◽ 2021

Author(s): Armin Masoumian ◽ David G.F. Marei ◽ Saddam Abdulwahab ◽ Julián Cristiano ◽ Domenec Puig ◽ ...

Keyword(s): Deep Learning ◽ Object Detection ◽ Distance Estimation ◽ Depth Estimation ◽ Depth Image ◽ Absolute Distance ◽ Learning Framework ◽ Depth Images ◽ Monocular Depth ◽ Distance Prediction

Determining the distance between the objects in a scene and the camera sensor from 2D images is feasible by estimating depth images using stereo cameras or 3D cameras. Depth estimation yields relative distances, which can then be converted into absolute distances that are usable in practice. However, distance estimation is very challenging with 2D monocular cameras. This paper presents a deep learning framework that consists of two deep networks for depth estimation and object detection using a single image. First, objects in the scene are detected and localized using the You Only Look Once (YOLOv5) network. In parallel, a depth image is estimated by a deep autoencoder network to obtain the relative distances. The YOLO-based object detector was trained with supervised learning, whereas the depth estimation network was trained in a self-supervised manner. The framework was evaluated on real images of outdoor scenes. The results show that the proposed framework is promising: it yields an accuracy of 96% with an RMSE of 0.203 relative to the correct absolute distance.
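A minimal sketch of how a detected bounding box and a relative depth map could be combined into one absolute distance, and how an RMSE could be computed afterwards. The median pooling over the box, the single linear `scale` factor, and all function names are assumptions made for illustration; the abstract does not state how the authors perform this conversion.

```python
from math import sqrt
from statistics import median


def box_relative_depth(depth, box):
    """Median relative depth inside box = (x1, y1, x2, y2), end-exclusive pixels."""
    x1, y1, x2, y2 = box
    values = [depth[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return median(values)


def to_absolute(relative, scale):
    """Convert relative depth to absolute distance (assumed linear scaling)."""
    return relative * scale


def rmse(predicted, actual):
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared) / len(squared))


# Toy relative depth map: a near object (small values) on a far background.
depth_map = [
    [0.9, 0.9, 0.9, 0.9],
    [0.9, 0.2, 0.3, 0.9],
    [0.9, 0.2, 0.2, 0.9],
]
detection = (1, 1, 3, 3)  # hypothetical detector output for the object

relative = box_relative_depth(depth_map, detection)
distance = to_absolute(relative, scale=10.0)  # scale is a made-up calibration
error = rmse([2.0, 4.0], [2.0, 5.0])
```

The median is used here because it is less sensitive than the mean to background pixels that fall inside the box.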

Implementation of Object Recognition Algorithm to enhance Manufacturing and Maintenance Tasks on an Aircraft

10.32920/ryerson.14637387 ◽ 2021

Author(s): Abhinav Sundar

Keyword(s): Neural Network ◽ Deep Learning ◽ Object Recognition ◽ Object Detection ◽ Convolutional Neural Network ◽ Transfer Learning ◽ Learning Algorithm ◽ Recognition Algorithm ◽ Recognition Algorithms ◽ Aerospace Assembly

The objective of this thesis was to evaluate the viability of implementing a deep-learning-driven object recognition algorithm for aerospace manufacturing, maintenance and assembly tasks. Comparative research found that current computer vision methods, such as spatial mapping, are limited to macro-object recognition because of their nodal wireframe analysis. An optical object recognition algorithm was trained to learn complex geometric and chromatic characteristics, thereby allowing micro-object recognition of items such as cables and other critical components. This thesis investigated the use of a convolutional neural network with object recognition algorithms. The viability of two categories of object recognition algorithms was analyzed: image prediction and object detection. Due to a viral epidemic, this thesis was limited in analytical consistency, as resources were not readily available. The prediction-class algorithm was analyzed using a custom dataset of 15 552 images of the MaxFlight V2002 Full Motion Simulator’s inverter system, and a model was created by transfer-learning that dataset onto the InceptionV3 convolutional neural network (CNN). The detection-class algorithm was analyzed using a custom dataset of 100 images of two SUVs of different brand and style, and a model was created by transfer-learning that dataset onto the YOLOv3 deep learning architecture. The tests showed that the object recognition algorithms successfully identified the components with good accuracy: 99.97% mAP for the prediction class and 89.54% mAP for the detection class. The accuracies and data collected, together with the literature review, indicated that object detection algorithms are accurate, are designed for live-feed analysis, and are suitable for the significant applications of AVI and aircraft assembly.
In the future, a larger dataset needs to be compiled to increase reliability, and a custom convolutional neural network and deep learning algorithm need to be developed specifically for aerospace assembly, maintenance and manufacturing applications.
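The abstract reports mAP figures without stating how they were computed. As an illustration only, the sketch below shows one common way to obtain mean average precision: rank the detections of each class by confidence, build the precision-recall curve, take the area under its monotone envelope (all-point interpolation), and average over classes. The class names, scores and function names are hypothetical.

```python
def average_precision(scored_hits, num_ground_truth):
    """AP for one class.

    scored_hits: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of annotated objects of this class (must be > 0).
    """
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precisions, recalls = [], []
    for rank, (_, hit) in enumerate(ranked, start=1):
        if hit:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangle areas under the resulting step curve.
    area, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def mean_average_precision(per_class):
    """per_class maps a class name to (scored_hits, num_ground_truth)."""
    aps = [average_precision(hits, n) for hits, n in per_class.values()]
    return sum(aps) / len(aps)


# Made-up detections for two hypothetical classes.
results = {
    "cable": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "inverter": ([(0.95, True)], 1),
}
cable_ap = average_precision(*results["cable"])
overall = mean_average_precision(results)
```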

Augmented Reality Maintenance Assistant Using YOLOv5

Applied Sciences ◽ 10.3390/app11114758 ◽ 2021 ◽ Vol 11 (11) ◽ pp. 4758

Author(s): Ana Malta ◽ Mateus Mendes ◽ Torres Farinha

Keyword(s): Neural Network ◽ Deep Learning ◽ Object Recognition ◽ Augmented Reality ◽ Real Time ◽ Recognition System ◽ High Accuracy ◽ Video Streams ◽ The Neural Network ◽ Deep Learning Neural Network

Maintenance professionals and other technical staff regularly need to learn to identify new parts in car engines and other equipment. The present work proposes a model of a task assistant based on a deep learning neural network. A YOLOv5 network is used to recognize some of the constituent parts of an automobile. A dataset of car engine images was created, and eight car parts were marked in the images. The neural network was then trained to detect each part. The results show that YOLOv5s successfully detects the parts in real-time video streams with high accuracy, making it useful as an augmented reality aid for training professionals who are learning to deal with new equipment. The architecture of an object recognition system using augmented reality glasses is also designed.

Iranian kinect face database (IKFDB): a color-depth based face database collected by kinect v.2 sensor

SN Applied Sciences ◽ 10.1007/s42452-020-03999-y ◽ 2021 ◽ Vol 3 (1)

Author(s): Seyed Muhammad Hossein Mousavi ◽ S. Younes Mirinezhad

Keyword(s): Neural Network ◽ Facial Expression ◽ Facial Expression Recognition ◽ Depth Image ◽ Sensor Technology ◽ Support Vector ◽ Expression Recognition ◽ Face Database ◽ Depth Data ◽ Color Depth

This study presents a new color-depth based face database gathered from Iranian subjects of different genders and age ranges. Suitable databases make it possible to validate and assess available methods in different research fields. This database has applications in fields such as face recognition, age estimation, Facial Expression Recognition and Facial Micro Expressions Recognition. Image databases are mostly large, depending on their size and resolution. Color images usually consist of three channels, namely Red, Green and Blue. In the last decade, however, another image type has emerged, named the “depth image”. Depth images are used in calculating the range and distance between objects and the sensor. Depending on the depth sensor technology, range data can be acquired in different ways. Kinect sensor version 2 is capable of acquiring color and depth data simultaneously. Facial expression recognition is an important field in image processing, with multiple uses from animation to psychology. Currently, only a few color-depth (RGB-D) facial micro expressions recognition databases exist. Adding depth data to color data increases the accuracy of the final recognition. Due to the shortage of color-depth based facial expression databases and some weaknesses in the available ones, a new and almost perfect RGB-D face database is presented in this paper, covering the Middle-Eastern face type. In the validation section, the database is compared with some well-known benchmark face databases. For evaluation, Histogram of Oriented Gradients features are extracted, and classification algorithms such as Support Vector Machine, Multi-Layer Neural Network and a deep learning method, the Convolutional Neural Network, are employed. The results are promising.
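To make the Histogram of Oriented Gradients step more concrete, here is a deliberately simplified sketch of the per-cell histogram at the core of that descriptor. It assumes 9 unsigned orientation bins over 0 to 180 degrees, central-difference gradients, and hard (non-interpolated) binning; block normalization and the other refinements of a full HOG pipeline are omitted, and the abstract does not say which settings the authors used.

```python
from math import atan2, degrees, hypot


def orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    cell: 2D list of grayscale (or depth) values; border pixels are skipped
    because central differences need both neighbours.
    """
    height, width = len(cell), len(cell[0])
    histogram = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = hypot(gx, gy)
            angle = degrees(atan2(gy, gx)) % 180.0  # fold to unsigned range
            index = min(int(angle // bin_width), bins - 1)
            histogram[index] += magnitude
    return histogram


# A vertical edge produces purely horizontal gradients (orientation 0).
vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
# A horizontal edge produces purely vertical gradients (orientation 90).
horizontal_edge = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]

hist_vertical = orientation_histogram(vertical_edge)
hist_horizontal = orientation_histogram(horizontal_edge)
```

The same function applies unchanged to a depth channel, since it only needs a 2D grid of numeric values.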

Food Volume Estimation Based on Deep Learning View Synthesis from a Single Depth Map

Nutrients ◽ 10.3390/nu10122005 ◽ 2018 ◽ Vol 10 (12) ◽ pp. 2005 ◽ Cited By ~ 12

Author(s): Frank Lo ◽ Yingnan Sun ◽ Jianing Qiu ◽ Benny Lo

Keyword(s): Neural Network ◽ Deep Learning ◽ Point Cloud ◽ Volume Estimation ◽ Assessment System ◽ View Synthesis ◽ Depth Image ◽ 3D Point Cloud ◽ Viewing Angle ◽ 3D Point Clouds

An objective dietary assessment system can help users to understand their dietary behavior and enable targeted interventions to address underlying health problems. To accurately quantify dietary intake, measurement of the portion size or food volume is required. For volume estimation, previous research studies mostly focused on using model-based or stereo-based approaches which rely on manual intervention or require users to capture multiple frames from different viewing angles which can be tedious. In this paper, a view synthesis approach based on deep learning is proposed to reconstruct 3D point clouds of food items and estimate the volume from a single depth image. A distinct neural network is designed to use a depth image from one viewing angle to predict another depth image captured from the corresponding opposite viewing angle. The whole 3D point cloud map is then reconstructed by fusing the initial data points with the synthesized points of the object items through the proposed point cloud completion and Iterative Closest Point (ICP) algorithms. Furthermore, a database with depth images of food object items captured from different viewing angles is constructed with image rendering and used to validate the proposed neural network. The methodology is then evaluated by comparing the volume estimated by the synthesized 3D point cloud with the ground truth volume of the object items.
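Reconstructing a point cloud from a depth image typically starts by back-projecting each pixel into 3D. The sketch below does this with the standard pinhole camera model; the intrinsics `fx`, `fy`, `cx`, `cy` and the convention that non-positive depth means "missing" are assumptions for illustration, and the view synthesis, point cloud completion and ICP steps described in the abstract are not covered.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image into a list of (X, Y, Z) camera-frame points.

    depth: 2D list where depth[v][u] is the distance along the optical axis.
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # assumed marker for missing measurements
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 depth image with two valid pixels and hypothetical intrinsics.
toy_depth = [
    [0.0, 2.0],
    [1.0, 0.0],
]
cloud = depth_to_points(toy_depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

A synthesized depth image of the opposite view could be back-projected the same way and transformed into a common frame before the two clouds are fused.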

MultiDefectNet: Multi-Class Defect Detection of Building Façade Based on Deep Convolutional Neural Network

Sustainability ◽ 10.3390/su12229785 ◽ 2020 ◽ Vol 12 (22) ◽ pp. 9785

Author(s): Kisu Lee ◽ Goopyo Hong ◽ Lee Sael ◽ Sanghyo Lee ◽ Ha Young Kim

Keyword(s): Neural Network ◽ Deep Learning ◽ Object Detection ◽ Convolutional Neural Network ◽ Defect Detection ◽ Structural Integrity ◽ Residential Building ◽ Training Environment ◽ Building Facade ◽ Building Facades

Defects in residential building façades affect the structural integrity of buildings and degrade external appearances. Defects in a building façade are typically managed using manpower during maintenance. This approach is time-consuming, yields subjective results, and can lead to accidents or casualties. To address this, we propose a building façade monitoring system that utilizes an object detection method based on deep learning to efficiently manage defects by minimizing the involvement of manpower. The dataset used for training a deep-learning-based network contains actual residential building façade images. Various building designs in these raw images make it difficult to detect defects because of their various types and complex backgrounds. We employed the faster regions with convolutional neural network (Faster R-CNN) structure for more accurate defect detection in such environments, achieving an average precision (intersection over union (IoU) = 0.5) of 62.7% for all types of trained defects. As it is difficult to detect defects in a training environment, it is necessary to improve the performance of the network. However, the object detection network employed in this study yields an excellent performance in complex real-world images, indicating the possibility of developing a system that would detect defects in more types of building façades.
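The reported average precision is tied to an intersection over union (IoU) threshold of 0.5, meaning a predicted box counts as correct only if it overlaps a ground-truth box by at least that ratio. A minimal sketch of the IoU computation follows, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` corner coordinates; the box values and function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    overlap_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    overlap_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = overlap_w * overlap_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def is_true_positive(predicted, ground_truth, threshold=0.5):
    """A prediction matches when its IoU reaches the threshold."""
    return iou(predicted, ground_truth) >= threshold


partial = iou((0, 0, 2, 2), (1, 0, 3, 2))     # boxes share half of each area
nested = iou((0, 0, 4, 4), (0, 0, 4, 2))      # smaller box inside larger one
disjoint = iou((0, 0, 1, 1), (5, 5, 6, 6))    # no overlap at all
```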

Identifying Ethnics of People through Face Recognition: A Deep CNN Approach

Scientific Programming ◽ 10.1155/2020/6385281 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-7

Author(s): Ahmed Jawad A. AlBdairi ◽ Zhu Xiao ◽ Mohammed Alghaili

Keyword(s): Neural Network ◽ Deep Learning ◽ Face Recognition ◽ Convolutional Neural Network ◽ State Of The Art ◽ Research Community ◽ Facial Features ◽ New Model ◽ Image Dataset ◽ Deep CNN

Interest in face recognition studies has grown rapidly in the last decade. One of the most important problems in face recognition is identifying the ethnicity of people. In this study, a new deep learning convolutional neural network is designed to create a new model that can recognize the ethnicity of people through their facial features. The new ethnicity dataset consists of 3141 images collected from three different nationalities. To the best of our knowledge, this is the first image dataset collected for the ethnicity of people, and it will be made available to the research community. The new model was compared with two state-of-the-art models, VGG and Inception V3, and the validation accuracy was calculated for each convolutional neural network. The generated models were tested on several images of people, and the results show that the best performance was achieved by our model, with a verification accuracy of 96.9%.

RGB-D Object Recognition Using Multi-Modal Deep Neural Network and DS Evidence Theory

Sensors ◽ 10.3390/s19030529 ◽ 2019 ◽ Vol 19 (3) ◽ pp. 529 ◽ Cited By ~ 2

Author(s): Hui Zeng ◽ Bin Yang ◽ Xiuqing Wang ◽ Jiwei Liu ◽ Dongmei Fu

Keyword(s): Neural Network ◽ Object Recognition ◽ Deep Neural Network ◽ Low Cost ◽ Evidence Theory ◽ Feature Learning ◽ Decision Fusion ◽ Support Vector ◽ Depth Sensors ◽ Depth Images

With the development of low-cost RGB-D (Red Green Blue-Depth) sensors, RGB-D object recognition has attracted increasing attention from researchers in recent years. Deep learning has become popular in the field of image analysis and has achieved competitive results. To make full use of the effective identification information in the RGB and depth images, we propose an RGB-D object recognition method based on a multi-modal deep neural network and DS (Dempster-Shafer) evidence theory. First, the RGB and depth images are preprocessed and two convolutional neural networks are trained, one for each modality. Next, we perform multi-modal feature learning using the proposed objective function, which is based on quadruplet samples, to fine-tune the network parameters. Then, two probability classification results are obtained using two sigmoid SVMs (Support Vector Machines) with the learned RGB and depth features. Finally, the decision fusion method based on DS evidence theory is used to integrate the two classification results. Compared with other RGB-D object recognition methods, our proposed method adopts two fusion strategies: multi-modal feature learning and DS decision fusion. Both the discriminative information of each modality and the correlation information between the two modalities are exploited. Extensive experimental results have validated the effectiveness of the proposed method.
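The decision fusion step can be illustrated with Dempster's rule of combination, which multiplies the masses of two evidence sources, keeps the products whose focal sets intersect, and renormalizes by the non-conflicting mass. The sketch below is a generic implementation of that rule; the class names and mass values are made up, and how the paper derives masses from the sigmoid SVM outputs is not specified in the abstract.

```python
def dempster_combine(mass_a, mass_b):
    """Combine two mass functions whose keys are frozensets of class labels."""
    combined = {}
    conflict = 0.0
    for set_a, weight_a in mass_a.items():
        for set_b, weight_b in mass_b.items():
            intersection = set_a & set_b
            product = weight_a * weight_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + product
            else:
                conflict += product  # evidence that contradicts itself
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {key: value / (1.0 - conflict) for key, value in combined.items()}


CUP = frozenset({"cup"})
BOWL = frozenset({"bowl"})
EITHER = frozenset({"cup", "bowl"})  # mass left on "undecided"

# Hypothetical evidence from the RGB branch and the depth branch.
rgb_mass = {CUP: 0.6, BOWL: 0.1, EITHER: 0.3}
depth_mass = {CUP: 0.5, BOWL: 0.3, EITHER: 0.2}

fused = dempster_combine(rgb_mass, depth_mass)
best = max(fused, key=fused.get)  # focal set with the highest fused mass
```

In this toy case both branches lean towards "cup", so the fused mass on "cup" (about 0.74) is higher than either branch assigned alone.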
