Seeing the Trees from the Forest: Using Modern Methods to Identify Individual Objects in a Cluttered Environment for Robots

2021 ◽  
Author(s):  
Josh Prow

Robotics and computer vision are areas of high growth across both industrial and personal environments. Robots in industrial settings have been used to work in environments that are hazardous to humans, or to perform tasks requiring finer detail than human operators can reliably achieve. These robotic solutions require a variety of sensors and cameras to navigate and identify objects within their working environment, as well as software and intelligent detection systems. Such solutions generally rely on high-definition depth cameras, laser range finders and computer vision algorithms, which are expensive in themselves and require expensive graphics processors to run practically.

This thesis explores the option of a low-cost, computer-vision-enabled robotic solution that can operate within a forestry environment. It begins with the accuracy of camera technologies, testing two of the main cameras available for robotic vision and demonstrating the advantages of the Intel RealSense D435 over the Kinect for Xbox One. It then tests common object detection and recognition algorithms on different devices, weighing the strengths and weaknesses of the selected models for the intended forestry application.

These tests support other research in finding that the MobileNet Single Shot Detector has the fastest recognition speed with acceptable precision; however, it struggles when multiple objects are present or the background is complex. In comparison, the Mask R-CNN achieved high accuracy and identified objects consistently, even with large numbers of objects overlaid within a single frame.

Based on these findings, a combined method built on the Faster R-CNN architecture with a MobileNet backbone and masking layers is proposed, developed and tested. This method uses the feature extraction and object detection abilities of the faster MobileNet in place of the traditionally ResNet-based feature proposal networks, while still capitalizing on the benefits of region-of-interest (ROI) align and masking from the Mask R-CNN architecture.

The results from this model did not meet the criteria required to recommend it as an operational solution for the forestry environment. However, they do show that the model achieves higher performance and average precision than other models with similar frame rates on the non-CUDA-enabled testing device, demonstrating that the technology and methodology have the potential to form the basis of a future solution to the problem of balancing accuracy and performance on a low-performance or non-GPU-enabled robotic unit.
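The proposed hybrid can be pictured with torchvision's detection building blocks: a MobileNet feature extractor dropped into the Mask R-CNN scaffolding in place of the usual ResNet backbone, keeping ROI Align and the mask head. The sketch below is a minimal illustration of that idea under assumed hyperparameters (MobileNetV2, single feature map, two classes), not the thesis's actual implementation.

```python
# Sketch: Mask R-CNN with a MobileNet backbone, assembled from torchvision
# building blocks. All hyperparameters here are illustrative assumptions.
import torch
import torchvision
from torchvision.models.detection import MaskRCNN
from torchvision.models.detection.rpn import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

# MobileNetV2 feature extractor as the backbone; torchvision's detection
# heads only require the module to expose an `out_channels` attribute.
backbone = torchvision.models.mobilenet_v2(weights="DEFAULT").features
backbone.out_channels = 1280

anchor_generator = AnchorGenerator(
    sizes=((32, 64, 128, 256, 512),),   # five scales on a single feature map
    aspect_ratios=((0.5, 1.0, 2.0),),
)

# ROI Align for both the box head and the mask head, as in Mask R-CNN.
box_roi_pool = MultiScaleRoIAlign(featmap_names=["0"], output_size=7, sampling_ratio=2)
mask_roi_pool = MultiScaleRoIAlign(featmap_names=["0"], output_size=14, sampling_ratio=2)

model = MaskRCNN(
    backbone,
    num_classes=2,                      # e.g. background + "tree" (assumed)
    rpn_anchor_generator=anchor_generator,
    box_roi_pool=box_roi_pool,
    mask_roi_pool=mask_roi_pool,
)
model.eval()

with torch.no_grad():
    predictions = model([torch.rand(3, 480, 640)])  # boxes, labels, scores, masks
```

Swapping the backbone this way trades the ResNet-FPN's representational power for MobileNet's speed while leaving the ROI Align and masking stages intact, which is the balance the thesis is after.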


Sensors ◽  
2019 ◽  
Vol 19 (4) ◽  
pp. 866 ◽  
Author(s):  
Tanguy Ophoff ◽  
Kristof Van Beeck ◽  
Toon Goedemé

In this paper, we investigate whether fusing depth information on top of normal RGB data for camera-based object detection can help to increase the performance of current state-of-the-art single-shot detection networks. Indeed, depth information is easily acquired using depth cameras such as the Kinect or stereo setups. We investigate the optimal manner to perform this sensor fusion with a special focus on lightweight single-pass convolutional neural network (CNN) architectures, enabling real-time processing on limited hardware. For this, we implement a network architecture that allows us to parameterize at which network layer the two information sources are fused together. We performed exhaustive experiments to determine the optimal fusion point in the network, from which we can conclude that fusion in the mid-to-late layers provides the best results. Our best fusion models significantly outperform the baseline RGB network in both accuracy and localization of the detections.
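The parameterized fusion point can be sketched as a small two-stream network in which a constructor argument chooses where the depth features are concatenated into the RGB stream. The layer counts, channel widths, and the use of concatenation below are assumptions for illustration, not the authors' exact architecture.

```python
# Sketch: RGB + depth streams fused by concatenation at a configurable layer.
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class RGBDFusionNet(nn.Module):
    """fuse_at=0 is early fusion; larger values move the fusion point
    toward the late layers of the network."""
    def __init__(self, fuse_at: int, widths=(16, 32, 64, 128)):
        super().__init__()
        assert 0 <= fuse_at < len(widths)
        self.rgb_stream = nn.Sequential(
            *[block(3 if i == 0 else widths[i - 1], widths[i])
              for i in range(fuse_at + 1)])
        self.depth_stream = nn.Sequential(
            *[block(1 if i == 0 else widths[i - 1], widths[i])
              for i in range(fuse_at + 1)])
        trunk_in = 2 * widths[fuse_at]       # channels after concatenation
        self.trunk = nn.Sequential(
            *[block(trunk_in if i == fuse_at + 1 else widths[i - 1], widths[i])
              for i in range(fuse_at + 1, len(widths))])

    def forward(self, rgb, depth):
        fused = torch.cat([self.rgb_stream(rgb), self.depth_stream(depth)], dim=1)
        return self.trunk(fused)

# Sweep the fusion point, mirroring the paper's experimental setup:
for fuse_at in range(4):
    net = RGBDFusionNet(fuse_at)
    out = net(torch.randn(1, 3, 128, 128), torch.randn(1, 1, 128, 128))
```

In a real detector the trunk would feed the single-shot detection head; sweeping `fuse_at` is what lets the fusion point be treated as just another hyperparameter.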


2021 ◽  
Vol 38 (6) ◽  
pp. 1647-1655
Author(s):  
Qilin Bi ◽  
Minling Lai ◽  
Huiling Tang ◽  
Yanyao Guo ◽  
Jinyuan Li ◽  
...  

The precise inspection of geometric parameters is crucial for quality control in the context of Industry 4.0. The current technique of precise inspection depends on the operation of professional personnel, and the measuring accuracy is restricted by the proficiency of operators. To overcome these limitations, this paper proposes a precise inspection framework for the geometric parameters of polyvinyl chloride (PVC) pipe sections (G-PVC), using low-cost visual sensors and high-precision computer vision algorithms. Firstly, a robust imaging system was built to acquire images of a PVC pipe section under irregular illumination changes. Next, an engineering semantic model was established to calculate G-PVC such as inner diameter, outer diameter, wall thickness, and roundness. After that, a region-of-interest (ROI) extraction algorithm was combined with an improved edge operator to obtain the coordinates of measured points on the PVC end-face image in a stable and precise manner. Finally, experiments proved our framework to be highly precise and robust.
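To give a flavor of such end-face measurements, the sketch below fits circles to the inner and outer edges of a pipe cross-section with OpenCV. The Otsu thresholding, simple contour fitting, and pixel-to-millimeter scale are assumptions for illustration; the paper's pipeline uses its own ROI extraction and an improved edge operator instead.

```python
# Sketch: estimate inner/outer diameter and wall thickness from an end-face
# image of a pipe. Assumes a calibrated scale (mm per pixel) and a
# well-contrasted, roughly circular cross-section.
import cv2

MM_PER_PX = 0.05  # assumed calibration from a reference target

img = cv2.imread("pvc_end_face.png", cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(img, (5, 5), 0)
_, mask = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# The pipe wall appears as an annulus: the outer edge is the largest
# contour, the inner edge the largest contour inside it.
contours, _ = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
contours = sorted(contours, key=cv2.contourArea, reverse=True)
(_, outer_r) = cv2.minEnclosingCircle(contours[0])
(_, inner_r) = cv2.minEnclosingCircle(contours[1])

outer_d = 2 * outer_r * MM_PER_PX
inner_d = 2 * inner_r * MM_PER_PX
print(f"outer diameter: {outer_d:.2f} mm")
print(f"inner diameter: {inner_d:.2f} mm")
print(f"wall thickness: {(outer_d - inner_d) / 2:.2f} mm")
```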


Cryptography ◽  
2021 ◽  
Vol 5 (1) ◽  
pp. 9
Author(s):  
Mukhil Azhagan Mallaiyan Sathiaseelan ◽  
Olivia P. Paradis ◽  
Shayan Taheri ◽  
Navid Asadizanjani

In this paper, we present the need for specialized artificial intelligence (AI) for counterfeit and defect detection of PCB components. Popular computer vision object detection techniques are not sufficient for such dense, low inter-class/high intra-class variation, and limited-data hardware assurance scenarios in which accuracy is paramount. Hence, we explored the limitations of existing object detection methodologies, such as region-based convolutional neural networks (RCNNs) and single shot detectors (SSDs), and compared them with our proposed method, the electronic component localization and detection network (ECLAD-Net). The results indicate that, of the compared methods, ECLAD-Net demonstrated the highest performance, with a precision of 87.2% and a recall of 98.9%. Though ECLAD-Net demonstrated decent performance, there is still much progress and collaboration needed from the hardware assurance, computer vision, and deep learning communities for automated, accurate, and scalable PCB assurance.


Author(s):  
R.J. Mount ◽  
R.V. Harrison

The sensory end organ of the ear, the organ of Corti, rests on a thin basilar membrane which lies between the bone of the central modiolus and the bony wall of the cochlea. In vivo, the organ of Corti is protected by the bony wall which totally surrounds it. In order to examine the sensory epithelium by scanning electron microscopy it is necessary to dissect away the protective bone and expose the region of interest (Fig. 1). This leaves the fragile organ of Corti susceptible to physical damage during subsequent handling. In our laboratory, cochlear specimens are routinely prepared after dissection by the O-T-O-T-O technique, critical point dried and then lightly sputter coated with gold. This processing involves considerable specimen handling, including several hours on a rotator, during which the organ of Corti is at risk of being physically damaged. The following procedure uses low-cost, readily available materials to hold the specimen during processing, preventing physical damage while allowing an unhindered exchange of fluids. Following fixation, the cochlea is dehydrated to 70% ethanol, then dissected under ethanol to prevent air drying. The holder is prepared by punching a hole in the flexible snap cap of a Wheaton vial with a paper hole punch. A small amount of two-component epoxy putty is well mixed, then pushed through the hole in the cap. The putty on the inner cap is formed into a "cup" to hold the specimen (Fig. 2); the putty on the outside is smoothed into a "button" to give good attachment even when the cap is flexed during handling (Fig. 3). The cap is submerged in the 70% ethanol, the bone at the base of the cochlea is seated into the cup, and the sides of the cup are squeezed with forceps to grip it (Fig. 4). Several types of epoxy putty have been tried; most are either soluble in ethanol to some degree or do not set in ethanol. The only putty we have found successful is "DURO™ MASTERMEND™ Epoxy Extra Strength Ribbon" (Loctite Corp., Cleveland, Ohio), a blue and yellow ribbon which is kneaded to form a green putty; it is available at many hardware stores.


2017 ◽  
Vol 2 (1) ◽  
pp. 80-87
Author(s):  
Puyda V. ◽  
Stoian A.

Detecting objects in a video stream is a typical problem in modern computer vision systems that are used in multiple areas. Object detection can be done on both static images and on frames of a video stream. Essentially, object detection means finding color and intensity non-uniformities which can be treated as physical objects. Besides that, the operations of finding coordinates, size and other characteristics of these non-uniformities can be executed, and the results can be used to solve other computer vision related problems such as object identification. In this paper, we study three algorithms which can be used to detect objects of different nature and are based on different approaches: detection of color non-uniformities, frame difference and feature detection. As the input data, we use a video stream obtained from a video camera or from an mp4 video file. Simulations and testing of the algorithms were done on a universal computer based on open-source hardware, built on the Broadcom BCM2711 quad-core Cortex-A72 (ARM v8) 64-bit SoC running at 1.5 GHz. The software was created in Visual Studio 2019 using OpenCV 4 on Windows 10 and on a universal computer operated under Linux (Raspbian Buster OS) for open-source hardware. In the paper, the methods under consideration are compared. The results of the paper can be used in research and development of modern computer vision systems used for different purposes. Keywords: object detection, feature points, keypoints, ORB detector, computer vision, motion detection, HSV color model
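Of the three approaches compared, frame differencing is the simplest to illustrate. Below is a minimal OpenCV sketch of motion detection by differencing consecutive grayscale frames; the threshold and minimum-area values are assumptions for the example, not the paper's tuned parameters.

```python
# Sketch: object/motion detection via frame differencing with OpenCV.
import cv2

cap = cv2.VideoCapture("input.mp4")  # or 0 for a live camera
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Difference against the previous frame, threshold, and clean up noise.
    diff = cv2.absdiff(gray, prev_gray)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=2)

    # Each sufficiently large blob of change is treated as a moving object.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:          # assumed minimum object size
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    cv2.imshow("motion", frame)
    prev_gray = gray
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```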


Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1718
Author(s):  
Chien-Hsing Chou ◽  
Yu-Sheng Su ◽  
Che-Ju Hsu ◽  
Kong-Chang Lee ◽  
Ping-Hsuan Han

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. This system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and to further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the animated video contained scenes of fire-related objects. The hardware module was designed to provide six types of haptic sensations, arranged with line symmetry to provide a better user experience. Based on the object detection results from the scene recognition system, the system generates corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.
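As a sketch of the scene recognition step, the Google Cloud Vision API can return labels for a video frame, which can then be mapped to haptic effects. The label-to-effect mapping and trigger threshold below are invented for illustration; the client call itself follows the google-cloud-vision Python library.

```python
# Sketch: label a video frame with Google Cloud Vision and map the labels
# to haptic effects. HAPTIC_MAP and min_score are hypothetical.
from google.cloud import vision

HAPTIC_MAP = {              # hypothetical label -> actuator mapping
    "fire": "heat",
    "explosion": "vibration",
    "rain": "water_mist",
    "snow": "cold_airflow",
    "wind": "airflow",
}

def haptics_for_frame(jpeg_bytes: bytes, min_score: float = 0.7):
    client = vision.ImageAnnotatorClient()
    response = client.label_detection(image=vision.Image(content=jpeg_bytes))
    effects = set()
    for label in response.label_annotations:
        name = label.description.lower()
        if name in HAPTIC_MAP and label.score >= min_score:
            effects.add(HAPTIC_MAP[name])
    return effects

with open("frame.jpg", "rb") as f:
    print(haptics_for_frame(f.read()))   # e.g. {"heat", "vibration"}
```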


2021 ◽  
Vol 11 (6) ◽  
pp. 522
Author(s):  
Feng-Yu Liu ◽  
Chih-Chi Chen ◽  
Chi-Tung Cheng ◽  
Cheng-Ta Wu ◽  
Chih-Po Hsu ◽  
...  

Automated detection of the region of interest (ROI) is a critical step in the two-step classification system in several medical image applications. However, key information such as model parameter selection, image annotation rules, and ROI confidence score is essential but usually not reported. In this study, we proposed a practical framework of ROI detection by analyzing hip joints seen on 7399 anteroposterior pelvic radiographs (PXR) from three diverse sources. We presented a deep learning-based ROI detection framework utilizing a single-shot multi-box detector with a customized head structure based on the characteristics of the obtained datasets. Our method achieved average intersection over union (IoU) = 0.8115, average confidence = 0.9812, and average precision with threshold IoU = 0.5 (AP50) = 0.9901 in the independent testing set, suggesting that the detected hip regions appropriately covered the main features of the hip joints. The proposed approach featured flexible loose-fitting labeling, customized model design, and heterogeneous data testing. We demonstrated the feasibility of training a robust hip region detector for PXRs. This practical framework has a promising potential for a wide range of medical image applications.
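The reported IoU measures how well a detected box covers the annotated hip region; below is the conventional definition as a short sketch. The example boxes are hypothetical, not data from the study.

```python
# Sketch: intersection-over-union (IoU) for axis-aligned boxes (x1, y1, x2, y2).
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A detection counts toward AP50 when its IoU with the ground truth >= 0.5.
pred = (40, 50, 200, 210)   # hypothetical detected hip box
gt   = (50, 60, 190, 200)   # hypothetical loose-fitting annotation
print(f"IoU = {iou(pred, gt):.4f}")
```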


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 343
Author(s):  
Kim Bjerge ◽  
Jakob Bonde Nielsen ◽  
Martin Videbæk Sepstrup ◽  
Flemming Helsing-Nielsen ◽  
Toke Thomas Høye

Insect monitoring methods are typically very time-consuming and involve substantial investment in species identification following manual trapping in the field. Insect traps are often only serviced weekly, resulting in low temporal resolution of the monitoring data, which hampers the ecological interpretation. This paper presents a portable computer vision system capable of attracting and detecting live insects. More specifically, the paper proposes detection and classification of species by recording images of live individuals attracted to a light trap. An Automated Moth Trap (AMT) with multiple light sources and a camera was designed to attract and monitor live insects during twilight and night hours. A computer vision algorithm referred to as Moth Classification and Counting (MCC), based on deep learning analysis of the captured images, tracked and counted the number of insects and identified moth species. Observations over 48 nights resulted in the capture of more than 250,000 images with an average of 5675 images per night. A customized convolutional neural network was trained on 2000 labeled images of live moths represented by eight different classes, achieving a high validation F1-score of 0.93. The algorithm achieved an average classification and tracking F1-score of 0.71 and a tracking detection rate of 0.79. Overall, the proposed computer vision system and algorithm showed promising results as a low-cost solution for non-destructive and automatic monitoring of moths.
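The species classification stage can be pictured as a small CNN over cropped insect images. The sketch below is a generic eight-class classifier with assumed layer sizes and input resolution; the paper's customized network differs.

```python
# Sketch: a small 8-class moth classifier in PyTorch. Architecture and
# input size are assumptions for illustration, not the paper's network.
import torch
import torch.nn as nn

class MothCNN(nn.Module):
    def __init__(self, num_classes: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),          # global pooling to 128 features
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = MothCNN()
logits = model(torch.randn(4, 3, 224, 224))   # batch of 4 cropped moth images
print(logits.shape)                            # torch.Size([4, 8])
```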

