Underwater Object Detection using Transfer Learning with Deep Learning

Author(s):  
Zhu Kaiyan ◽  
Li Xiang ◽  
Song Weibo
2019 ◽  
Vol 52 (21) ◽  
pp. 78-81 ◽  
Author(s):  
MyungHwan Jeon ◽  
Yeongjun Lee ◽  
Young-Sik Shin ◽  
Hyesu Jang ◽  
Ayoung Kim

2021 ◽  
Vol 14 (1) ◽  
pp. 103
Author(s):  
Dongchuan Yan ◽  
Hao Zhang ◽  
Guoqing Li ◽  
Xiangqiang Li ◽  
Hua Lei ◽  
...  

The breaching of tailings pond dams may lead to casualties and environmental pollution; therefore, timely and accurate monitoring is an essential aspect of managing such structures and preventing accidents. Remote sensing technology is suitable for the regular extraction and monitoring of tailings pond information. However, traditional remote sensing is inefficient and unsuitable for the frequent extraction of large volumes of highly precise information. Object detection, based on deep learning, provides a solution to this problem. Most remote sensing imagery applications for tailings pond object detection using deep learning are based on computer vision, utilizing the true-color triple-band data of high spatial resolution imagery for information extraction. The advantage of remote sensing image data is their greater number of spectral bands (more than three), providing more abundant spectral information. There is a lack of research on fully harnessing multispectral band information to improve the detection precision of tailings ponds. Accordingly, using a sample dataset of tailings pond satellite images from the Gaofen-1 high-resolution Earth observation satellite, we improved the Faster R-CNN deep learning object detection model by increasing the inputs from three true-color bands to four multispectral bands. Moreover, we used an attention mechanism to recalibrate the input contributions. Subsequently, we used a step-by-step transfer learning method to gradually train our improved model. The improved model could fully utilize the near-infrared (NIR) band information of the images to improve the precision of tailings pond detection. Compared with the model using only the three true-color bands as input, our model notably improved tailings pond detection, with the average precision (AP) increasing from 82.3% to 85.9% and the recall increasing from 65.4% to 71.9%.
This research could serve as a reference for using multispectral band information from remote sensing images in the construction and application of deep learning models.
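The band-recalibration idea described above can be sketched as a squeeze-and-excitation-style gate applied to a four-band (R, G, B, NIR) input before the detector backbone. This is a minimal PyTorch sketch, not the authors' implementation; the module and parameter names are illustrative.

```python
import torch
import torch.nn as nn

class BandAttention(nn.Module):
    """Squeeze-and-excitation-style gate that recalibrates the
    contribution of each input spectral band (R, G, B, NIR)."""
    def __init__(self, n_bands: int = 4, reduction: int = 2):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # per-band global statistics
        self.gate = nn.Sequential(
            nn.Linear(n_bands, n_bands // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(n_bands // reduction, n_bands),
            nn.Sigmoid(),                     # per-band weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.gate(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                          # reweighted 4-band image

# A hypothetical batch of 4-band Gaofen-1 patches: (batch, bands, H, W)
patch = torch.randn(2, 4, 64, 64)
out = BandAttention(n_bands=4)(patch)
print(out.shape)  # torch.Size([2, 4, 64, 64])
```

The gated output keeps the input shape, so the module can sit in front of a standard backbone whose first convolution accepts four channels.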


2021 ◽  
Author(s):  
Abhinav Sundar

The objective of this thesis was to evaluate the viability of implementing an object recognition algorithm driven by deep learning for aerospace manufacturing, maintenance, and assembly tasks. Comparative research found that current computer vision methods, such as spatial mapping, are limited to macro-object recognition because of their nodal wireframe analysis. An optical object recognition algorithm was trained to learn complex geometric and chromatic characteristics, thereby allowing for micro-object recognition of items such as cables and other critical components. This thesis investigated the use of a convolutional neural network with object recognition algorithms. The viability of two categories of object recognition algorithms was analyzed: image prediction and object detection. Due to a viral epidemic, this thesis was limited in analytical consistency, as resources were not readily available. The prediction-class algorithm was analyzed using a custom dataset comprising 15,552 images of the MaxFlight V2002 Full Motion Simulator's inverter system, and a model was created by transfer-learning that dataset onto the InceptionV3 convolutional neural network (CNN). The detection-class algorithm was analyzed using a custom dataset comprising 100 images of two SUVs of different brand and style, and a model was created by transfer-learning that dataset onto the YOLOv3 deep learning architecture. The tests showed that the object recognition algorithms successfully identified the components with good accuracy: 99.97% mAP for the prediction-class model and 89.54% mAP for the detection-class model. The accuracies obtained, together with the literature review, indicated that object detection algorithms are accurate, designed for live-feed analysis, and suitable for the significant applications of AVI and aircraft assembly.
In the future, a larger dataset needs to be compiled to increase reliability, and a custom convolutional neural network and deep learning algorithm need to be developed specifically for aerospace assembly, maintenance, and manufacturing applications.


2021 ◽  
pp. 1-11
Author(s):  
Yike Li ◽  
Jiajie Guo ◽  
Peikai Yang

Background: The Pentagon Drawing Test (PDT) is a common assessment for visuospatial function. Evaluating the PDT by artificial intelligence can improve efficiency and reliability in the big data era. This study aimed to develop a deep learning (DL) framework for automatic scoring of the PDT based on image data. Methods: A total of 823 PDT photos were retrospectively collected and preprocessed into black-and-white, square-shape images. Stratified fivefold cross-validation was applied for training and testing. Two strategies based on convolutional neural networks were compared. The first strategy was to perform an image classification task using supervised transfer learning. The second strategy was designed with an object detection model for recognizing the geometric shapes in the figure, followed by a predetermined algorithm to score based on their classes and positions. Results: On average, the first framework demonstrated 62% accuracy, 62% recall, 65% precision, 63% specificity, and 0.72 area under the receiver operating characteristic curve. The second framework substantially outperformed it, with averages of 94%, 95%, 93%, 93%, and 0.95, respectively. Conclusion: An image-based DL framework based on the object detection approach may be clinically applicable for automatic scoring of the PDT with high efficiency and reliability. With a limited sample size, transfer learning should be used with caution if the new images are distinct from the previous training data. Partitioning the problem-solving workflow into multiple simple tasks should facilitate model selection, improve performance, and allow comprehensible logic of the DL framework.
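The second strategy's detect-then-score logic can be sketched as a rule over the detector's output classes and positions. The abstract does not give the actual scoring rubric, so the pass/fail rule below is a simplified, hypothetical stand-in that only illustrates the idea.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def score_pdt(detections: List[Tuple[str, Box]]) -> int:
    """Toy pass/fail score from detector output: full credit only if
    exactly two pentagons were detected and they intersect (a correct
    copy of the stimulus). The study's real rubric is more detailed."""
    pentagons = [box for cls, box in detections if cls == "pentagon"]
    if len(pentagons) == 2 and boxes_overlap(*pentagons):
        return 1
    return 0

# Two intersecting pentagons -> correct copy
good = [("pentagon", (0, 0, 10, 10)), ("pentagon", (8, 0, 18, 10))]
# Only one recognisable pentagon -> fail
bad = [("pentagon", (0, 0, 10, 10)), ("circle", (20, 0, 30, 10))]
print(score_pdt(good), score_pdt(bad))  # 1 0
```

Splitting the task this way keeps the learned component (shape detection) separate from the deterministic component (scoring), which is exactly the "comprehensible logic" the conclusion highlights.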


Author(s):  
Pritam Ghosh ◽  
Subhranil Mustafi ◽  
Satyendra Nath Mandal

In this paper, an attempt has been made to identify six different goat breeds from purebred goat images. The images of goat breeds were captured at different organized, registered goat farms in India; almost two thousand digital images of individual goats were captured in restricted (to obtain a similar image background) and unrestricted (natural) environments without imposing stress on the animals. A pre-trained deep learning-based object detection model called Faster R-CNN was fine-tuned using transfer learning on the acquired images for automatic classification and localization of goat breeds. The fine-tuned model is able to locate the goat in the image (localization) and classify its breed (identification). The Pascal VOC object detection evaluation metrics were used to evaluate this model. Finally, a comparison was made with the prediction accuracies of different technologies used for identifying breeds of different animals.


2020 ◽  
Author(s):  
Than Le

In this paper, we propose a simple method to optimize dataset noise under uncertainty, applicable to many industrial settings. Specifically, we first use a deep learning module with transfer learning, based on Mask R-CNN, to detect objects and segment them effectively, and then return only their contours. We then apply a shortest-path step to reduce noise in the contours and increase processing speed in industrial applications. We illustrate how the approach adapts to many applications, such as web and mobile applications, where computing power is a limited resource.
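One plausible reading of the contour noise-reduction step is classical polyline simplification: shrinking the contour returned by Mask R-CNN to fewer points so downstream processing is faster. The sketch below uses Ramer-Douglas-Peucker, which the abstract does not name; it illustrates the idea, not the authors' method.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def _dist_to_segment(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to segment ab."""
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(contour: List[Point], tol: float) -> List[Point]:
    """Ramer-Douglas-Peucker: drop contour points that deviate less
    than `tol` from the chord, shrinking the contour for speed."""
    if len(contour) < 3:
        return contour
    i, dmax = 0, 0.0
    for k in range(1, len(contour) - 1):
        d = _dist_to_segment(contour[k], contour[0], contour[-1])
        if d > dmax:
            i, dmax = k, d
    if dmax <= tol:
        return [contour[0], contour[-1]]
    return simplify(contour[: i + 1], tol)[:-1] + simplify(contour[i:], tol)

# A noisy, nearly straight contour segment collapses to its endpoints.
noisy = [(0, 0), (1, 0.1), (2, -0.1), (3, 0.05), (4, 0)]
print(simplify(noisy, tol=0.2))  # [(0, 0), (4, 0)]
```

Because the simplified contour has far fewer points, it suits the resource-limited web and mobile deployments the abstract mentions.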


2020 ◽  
Author(s):  
Andrew Shepley ◽  
Greg Falzon ◽  
Paul Meek ◽  
Paul Kwan

A time-consuming challenge faced by camera trap practitioners all over the world is the extraction of meaningful data from images to inform ecological management. The primary methods of image processing used by practitioners include manual analysis and citizen science. An increasingly popular alternative is automated image classification software. However, most automated solutions are not sufficiently robust to be deployed on a large scale. Key challenges include limited access to images for each species and a lack of location invariance when transferring models between sites. This prevents optimal use of ecological data and results in significant expenditure of time and resources to annotate and retrain deep learning models. In this study, we aimed to (a) assess the value of publicly available non-iconic FlickR images in the training of deep learning models for camera trap object detection, (b) develop an out-of-the-box, location-invariant, automated camera trap image processing solution for ecologists using deep transfer learning, and (c) explore the use of small subsets of camera trap images in the optimisation of a FlickR-trained deep learning model for high-precision ecological object detection. We collected and annotated a dataset of images of “pigs” (Sus scrofa and Phacochoerus africanus) from the consumer image-sharing website FlickR. These images were used to achieve transfer learning using a RetinaNet model in the task of object detection. We compared the performance of this model to that of models trained on combinations of camera trap images obtained from five different projects, each characterised by a different geographical region.
Furthermore, we explored optimisation of the FlickR model via infusion of small subsets of camera trap images to increase robustness on difficult images. In most cases, the mean Average Precision (mAP) of the FlickR-trained model when tested on out-of-sample camera trap sites (67.21-91.92%) was significantly higher than the mAP achieved by models trained on only one geographical location (4.42-90.8%) and rivalled the mAP of models trained on mixed camera trap datasets (68.96-92.75%). The infusion of camera trap images into the FlickR training further improved AP by 5.10-22.32%, to 83.60-97.02%. Ecology researchers can use FlickR images in the training of automated deep learning solutions for camera trap image processing to significantly reduce time and resource expenditure by allowing the development of location-invariant, highly robust, out-of-the-box solutions. This would allow AI technologies to be deployed on a large scale in ecological applications.
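The "infusion" experiments can be sketched as mixing the FlickR training set with a small, fixed-size sample from each camera trap site. The helper below is hypothetical (its name, signature, and sampling policy are assumptions, not the paper's code) and only illustrates the data-mixing idea.

```python
import random
from typing import Dict, List

def infuse(flickr: List[str], camera_traps: Dict[str, List[str]],
           n_per_site: int, seed: int = 0) -> List[str]:
    """Augment a FlickR training list with a small random sample of
    camera trap images from each site ('infusion'). A fixed seed keeps
    the experiment reproducible."""
    rng = random.Random(seed)
    mixed = list(flickr)
    for site in sorted(camera_traps):  # sorted for determinism
        images = camera_traps[site]
        mixed += rng.sample(images, min(n_per_site, len(images)))
    return mixed

# Illustrative file lists: 100 FlickR images, 5 sites of 50 images each.
flickr = [f"flickr_{i}.jpg" for i in range(100)]
traps = {f"site{s}": [f"site{s}_{i}.jpg" for i in range(50)] for s in range(5)}
train = infuse(flickr, traps, n_per_site=10)
print(len(train))  # 100 FlickR + 5 sites x 10 = 150
```

Varying `n_per_site` is how one would probe how little site-specific data is needed to recover robustness on difficult images.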


