Continuous detection and recognition of actions of interest among actions of non-interest using a depth camera

Author(s):  
Neha Dawar ◽  
Nasser Kehtarnavaz

Symmetry ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 148
Author(s):  
Nikita Andriyanov ◽  
Ilshat Khasanshin ◽  
Daniil Utkin ◽  
Timur Gataullin ◽  
Stefan Ignar ◽  
...  

Despite the great capabilities of modern neural network architectures for object detection and recognition, the output of such models consists of the local (pixel) coordinates of objects' bounding boxes in the image and their predicted classes. However, several practical tasks require more complete information about the object. In particular, for robotic apple picking, it is necessary to know precisely where, and how far, to move the gripper. To determine the real position of an apple relative to the image sensor, it is proposed to use an Intel RealSense depth camera and to aggregate information from its depth and brightness channels. Apple detection is carried out using the YOLOv3 architecture; then, based on the distance to the object and its localization in the image, the relative distances are calculated along all coordinate axes. To determine the coordinates of the apples, a transition to a symmetric coordinate system is made by means of simple linear transformations. Estimating the position in a symmetric coordinate system yields not only the magnitude of the shift but also the location of the object relative to the camera. The proposed approach achieves high accuracy: the approximate root mean square error is 7–12 mm, depending on the range and axis, while precision is 100% and recall is 90%.
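The transition from a detection's pixel coordinates plus its depth reading to symmetric camera-centred coordinates can be sketched with the standard pinhole model. This is an illustrative sketch, not the authors' exact implementation: the function name `pixel_to_camera_xyz` and the intrinsic parameters (focal lengths `FX`, `FY` and principal point `CX`, `CY`) are assumed values, since the abstract does not report the camera calibration.

```python
# Hypothetical intrinsics for the depth camera (assumed, not from the paper).
# FX, FY: focal lengths in pixels; (CX, CY): principal point, here the
# image centre of a 640x480 sensor.
FX, FY = 600.0, 600.0
CX, CY = 320.0, 240.0

def pixel_to_camera_xyz(u, v, depth_mm):
    """Map a detected apple's pixel coordinates (u, v) and depth (mm)
    to camera-centred coordinates (X, Y, Z) in millimetres.

    Shifting by the principal point makes the coordinate system symmetric
    about the optical axis: X < 0 means the object lies left of the camera
    centre, Y < 0 above it (image y grows downward), and Z is the forward
    range. These are the "simple linear transformations" the abstract
    refers to, under the pinhole-model assumption.
    """
    z = float(depth_mm)
    x = (u - CX) * z / FX   # horizontal offset from the optical axis
    y = (v - CY) * z / FY   # vertical offset from the optical axis
    return x, y, z

# A detection at the image centre lies on the optical axis (X = Y = 0),
# while a detection right of centre has positive X.
centre = pixel_to_camera_xyz(320, 240, 500.0)
right = pixel_to_camera_xyz(420, 240, 500.0)
```

Because the system is symmetric about the optical axis, the sign of each coordinate directly tells the gripper which way to move, and the magnitude tells it how far.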



2020 ◽  
Vol 2020 (1) ◽  
pp. 78-81
Author(s):  
Simone Zini ◽  
Simone Bianco ◽  
Raimondo Schettini

Rain removal from pictures taken in bad weather conditions is a challenging task that aims to improve the overall quality and visibility of a scene. The enhanced images usually constitute the input to subsequent Computer Vision tasks such as detection and classification. In this paper, we present a Convolutional Neural Network, based on the Pix2Pix model, for removing rain streaks from images, with specific interest in evaluating the results of the processing with respect to the Optical Character Recognition (OCR) task. In particular, we present a way to generate a rainy version of the Street View Text Dataset (R-SVTD) for evaluating text detection and recognition in bad weather conditions. Experimental results on this dataset show that our model outperforms the state of the art in terms of two commonly used image quality metrics, and that it improves the performance of an OCR model in detecting and recognising text in the wild.
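One common way to synthesise a rainy variant of a clean dataset is to smear sparse random noise along a slanted direction and blend it additively with each image. The sketch below illustrates that general recipe; it is an assumed, simplified procedure for illustration, not the exact method used to build R-SVTD, and the function name `add_rain_streaks` and its parameters are hypothetical.

```python
import numpy as np

def add_rain_streaks(image, density=0.01, length=12, angle_slope=0.2,
                     intensity=0.6, seed=0):
    """Overlay simple synthetic rain streaks on an RGB image.

    image: float array in [0, 1] with shape (H, W, 3).
    density: fraction of pixels seeding a streak.
    length: streak length in pixels; angle_slope: horizontal drift per row.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    # Sparse seed points that will become streak origins.
    noise = (rng.random((h, w)) < density).astype(np.float32)
    # Smear each seed along a slanted line to form a streak.
    streaks = np.zeros((h, w), dtype=np.float32)
    for step in range(length):
        dy = step
        dx = int(round(step * angle_slope))
        streaks += np.roll(np.roll(noise, dy, axis=0), dx, axis=1)
    streaks = np.clip(streaks, 0.0, 1.0) * intensity
    # Additive blend: rain brightens the pixels it covers.
    return np.clip(image + streaks[..., None], 0.0, 1.0)

clean = np.zeros((64, 64, 3), dtype=np.float32)
rainy = add_rain_streaks(clean)
```

Pairing each clean image with its synthetic rainy counterpart yields the aligned image pairs that a Pix2Pix-style model needs for supervised training.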

