DETECTION OF TEXTURE-LESS OBJECTS BY LINE-BASED APPROACH

2020 ◽  
Vol 18 (2) ◽  
pp. 079
Author(s):  
Stevica Cvetković ◽  
Nemanja Grujić ◽  
Slobodan Ilić ◽  
Goran Stančić

This paper proposes a method for tackling the problem of scalable object instance detection in the presence of clutter and occlusions. It combines the advantages of state-of-the-art object detection approaches while scaling favorably with the number of models, remaining computationally efficient, and handling texture-less objects. The proposed method has the following advantages: a) generality: it works for both texture-less and textured objects; b) scalability: it scales sub-linearly with the number of objects stored in the object database; and c) computational efficiency: it runs in near real-time. In contrast to traditional affine-invariant detectors/descriptors, which are local and not discriminative for texture-less objects, our method is based on line segments, around which it computes a semi-global descriptor by encoding gradient information in a scale- and rotation-invariant manner (see the sketch below). It relies on both texture and shape information and is therefore suited to both textured and texture-less objects. The descriptor is integrated into an efficient object detection procedure that exploits the fact that a line segment, through its two endpoints, determines the scale, orientation, and position of an object. This observation underpins several effective techniques for object hypothesis generation, scoring, and multiple-object reasoning, which are integrated into the proposed detection procedure. Thanks to its ability to detect objects even if only one correct line match is found, our method allows detection of objects under heavy clutter and occlusions. Extensive evaluation on several public benchmark datasets for texture-less and textured object detection demonstrates its scalability and high effectiveness.
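
The following is a minimal sketch of the core idea of a scale- and rotation-invariant line-segment descriptor, not the authors' exact formulation: the `line_segment_descriptor` helper, the fixed-sample-count scheme, and the bin count are all assumptions made for illustration.

```python
import numpy as np

def line_segment_descriptor(grad_x, grad_y, p0, p1, n_samples=32, n_bins=8):
    """Hypothetical sketch: histogram gradient orientations sampled along a
    line segment, expressed relative to the segment's own direction so the
    result is rotation-invariant, with a fixed number of samples so it is
    insensitive to segment length (scale)."""
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    direction = p1 - p0
    seg_angle = np.arctan2(direction[1], direction[0])
    # Sample a fixed number of points along the segment regardless of length.
    ts = np.linspace(0.0, 1.0, n_samples)
    pts = p0[None, :] + ts[:, None] * direction[None, :]
    xs, ys = pts[:, 0].round().astype(int), pts[:, 1].round().astype(int)
    h, w = grad_x.shape
    xs, ys = xs.clip(0, w - 1), ys.clip(0, h - 1)
    gx, gy = grad_x[ys, xs], grad_y[ys, xs]
    mag = np.hypot(gx, gy)
    # Express each gradient orientation relative to the segment direction.
    rel = (np.arctan2(gy, gx) - seg_angle) % (2 * np.pi)
    hist, _ = np.histogram(rel, bins=n_bins, range=(0, 2 * np.pi), weights=mag)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```

Because the descriptor is anchored to the segment's endpoints, a single correct match already fixes an object's position, scale, and orientation, which is what makes single-match detection under occlusion possible.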

2020 ◽  
Vol 10 (9) ◽  
pp. 3280 ◽  
Author(s):  
Chinthakindi Balaram Murthy ◽  
Mohammad Farukh Hashmi ◽  
Neeraj Dhanraj Bokde ◽  
Zong Woo Geem

In recent years there has been remarkable progress in one computer vision application area: object detection. One of the most challenging and fundamental problems in object detection is locating a specific object among the multiple objects present in a scene. Traditional detection methods were used before the introduction of convolutional neural networks; from 2012 onward, deep learning-based techniques have been used for feature extraction, leading to remarkable breakthroughs in this area. This paper presents a detailed survey of recent advancements and achievements in object detection using various deep learning techniques. Several topics are covered, such as Viola–Jones (VJ), histogram of oriented gradients (HOG), one-shot and two-shot detectors, benchmark datasets, evaluation metrics, speed-up techniques, and current state-of-the-art object detectors. Detailed discussions of some important applications of object detection, including pedestrian detection, crowd detection, and real-time object detection on GPU-based embedded systems, are presented. Finally, we conclude by identifying promising future directions.
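
As a concrete illustration of the evaluation metrics such surveys cover, the sketch below computes intersection over union (IoU), the box-overlap criterion underlying mAP; the (x1, y1, x2, y2) corner convention is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# e.g. iou((0, 0, 10, 10), (5, 5, 15, 15)) == 25 / 175, roughly 0.143
```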


2021 ◽  
Vol 28 (1) ◽  
pp. 1-46
Author(s):  
Eugene M. Taranta II ◽  
Corey R. Pittman ◽  
Mehran Maghoumi ◽  
Mykola Maslych ◽  
Yasmine M. Moolenaar ◽  
...  

We present Machete, a straightforward segmenter one can use to isolate custom gestures in continuous input. Machete uses traditional continuous dynamic programming with a novel dissimilarity measure to align incoming data with gesture class templates in real time (a generic sketch of such an alignment appears below). Machete's advantages over alternative techniques are that it is computationally efficient, accurate, device-agnostic, and works with a single training sample. We demonstrate Machete's effectiveness through an extensive evaluation using four new high-activity datasets that combine puppeteering, direct manipulation, and gestures. We find that Machete outperforms three alternative techniques in segmentation accuracy and latency, making it the most performant segmenter. We further show that, when combined with a custom gesture recognizer, Machete is the only option that achieves both high recognition accuracy and low latency in a video game application.
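
To make the mechanism concrete, here is a minimal subsequence dynamic-programming alignment of a gesture template against a continuous stream. This is not Machete itself: its novel dissimilarity measure is replaced by plain Euclidean distance as a placeholder, and the function name is hypothetical.

```python
import numpy as np

def continuous_dp_segment(stream, template):
    """Align `template` (T x d) against any window of `stream` (N x d).
    D[i, j] is the cost of the best alignment of template[:i] ending at
    stream position j; initializing row 0 to zero lets a match begin at
    any point in the stream, which is what makes the DP 'continuous'."""
    N, T = len(stream), len(template)
    D = np.full((T + 1, N + 1), np.inf)
    D[0, :] = 0.0  # a gesture may start anywhere in the stream
    for i in range(1, T + 1):
        for j in range(1, N + 1):
            d = np.linalg.norm(stream[j - 1] - template[i - 1])
            D[i, j] = d + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    end = int(np.argmin(D[T, 1:])) + 1   # best end point of the gesture
    return end, D[T, end] / T            # end index and normalized cost
```

In a real-time setting one would fill the table column by column as samples arrive, keeping only the previous column, so per-sample cost stays O(T).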


Author(s):  
Pavel Karpov ◽  
Guillaume Godin ◽  
Igor Tetko

We present SMILES embeddings derived from the internal encoder state of a Transformer model trained to canonicalize SMILES as a Seq2Seq problem. Using a CharNN architecture on top of the embeddings results in higher-quality QSAR/QSPR models on diverse benchmark datasets, including both regression and classification tasks. The proposed Transformer-CNN method uses SMILES augmentation for training and inference, so its predictions are grounded in an internal consensus (a minimal augmentation sketch follows below). Both the augmentation and embedding-based transfer learning allow the method to provide good results for small datasets. We discuss the reasons for this effectiveness and draft future directions for the development of the method. The source code and the embeddings are available at https://github.com/bigchem/transformer-cnn, while the OCHEM environment (https://ochem.eu) hosts an online implementation.
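
The sketch below shows one common way to perform SMILES augmentation, assuming an RDKit installation; it is an illustration of the general technique, not the repository's own code. `doRandom=True` asks RDKit for a randomized atom ordering, producing different but chemically equivalent strings.

```python
from rdkit import Chem

def augment_smiles(smiles, n_variants=10):
    """Generate randomized (non-canonical) SMILES for the same molecule.
    Averaging model predictions over such variants at inference time is
    the kind of internal consensus the abstract describes."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return []
    variants = {Chem.MolToSmiles(mol, doRandom=True) for _ in range(n_variants)}
    return sorted(variants)

# e.g. augment_smiles("c1ccccc1O") -> several equivalent spellings of phenol
```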


2021 ◽  
Author(s):  
Da-Ren Chen ◽  
Wei-Min Chiu

Machine learning techniques have been used to increase the detection accuracy of cracks in road surfaces. Most studies fail to consider variable illumination conditions on the target of interest (ToI) and focus only on detecting the presence or absence of road cracks. This paper proposes a new road crack detection method, IlumiCrack, which integrates Gaussian mixture models (GMMs) with CNN-based object detection models. This work provides the following contributions: 1) For the first time, a large-scale road crack image dataset covering a range of illumination conditions (e.g., day and night) is prepared using a dashcam. 2) Based on GMMs, experimental evaluations on two to four brightness levels are conducted to find the optimal classification (see the sketch below). 3) The IlumiCrack framework integrates state-of-the-art CNN-based object detection methods to classify road crack images into eight types with high accuracy. Experimental results show that IlumiCrack outperforms state-of-the-art R-CNN object detection frameworks.
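
As a hedged illustration of the GMM stage, the sketch below clusters frames into brightness levels with scikit-learn; the single mean-intensity feature, the component count, and the function name are assumptions, not the paper's pipeline.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_brightness_levels(images, n_levels=3):
    """Fit a GMM over per-frame mean intensity so each frame can be routed
    to a brightness level (e.g., day vs. night) before crack detection.
    `images` is a list of grayscale arrays."""
    feats = np.array([[img.mean()] for img in images])  # one feature per frame
    return GaussianMixture(n_components=n_levels, random_state=0).fit(feats)

# level = gmm.predict([[frame.mean()]])  # route a new frame to its level
```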


2020 ◽  
Author(s):  
Andrey De Aguiar Salvi ◽  
Rodrigo Coelho Barros

Recent research on convolutional neural networks focuses on how to create models with fewer parameters and a smaller storage size while preserving the model's ability to perform its task, allowing the best CNNs to be used for automating tasks on limited devices with constraints on processing power, memory, or energy consumption. There are many different approaches in the literature: removing parameters, reducing floating-point precision, creating smaller models that mimic larger models, neural architecture search (NAS), etc. With all these possibilities, it is challenging to say which approach provides the better trade-off between model reduction and performance, due to differences between the approaches, their respective models, the benchmark datasets, and variations in training details. This article therefore contributes to the literature by comparing three state-of-the-art model compression approaches applied to a well-known convolutional object detector, YOLOv3. Our experimental analysis shows that, by pruning parameters, it is possible to create a reduced version of YOLOv3 with 90% fewer parameters that still outperforms the original model (a generic pruning sketch follows below). We also create models that require only 0.43% of the original model's inference effort.
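
For readers unfamiliar with parameter removal, here is a minimal PyTorch sketch of L1 magnitude pruning; it is a generic stand-in for the compression approaches the article compares, not their exact setup, and the 90% amount merely mirrors the reduction reported above.

```python
import torch
import torch.nn.utils.prune as prune

def prune_conv_layers(model, amount=0.9):
    """Zero out the smallest-magnitude weights in every Conv2d layer.
    prune.remove() bakes the sparsity into the weight tensor so the
    pruning is permanent rather than mask-based."""
    for module in model.modules():
        if isinstance(module, torch.nn.Conv2d):
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")
    return model
```

Note that unstructured pruning shrinks the parameter count but needs sparse-aware hardware or structured variants to translate into real inference speedups.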


Author(s):  
Bo Li ◽  
Zhengxing Sun ◽  
Yuqi Guo

Image saliency detection has recently witnessed rapid progress due to deep neural networks. However, many important problems remain in existing deep learning-based methods. Pixel-wise convolutional neural network (CNN) methods suffer from blurry boundaries due to their convolutional and pooling operations, while region-based deep learning methods lack spatial consistency since they deal with each region independently. In this paper, we propose a novel salient object detection framework using a superpixel-wise variational autoencoder (SuperVAE) network. We first use a VAE to model the image background and then separate salient objects from the background through the reconstruction residuals (a minimal sketch of this idea follows below). To better capture semantic and spatial context information, we also propose a perceptual loss that takes advantage of deep pre-trained CNNs to train our SuperVAE network. Without the supervision of mask-level annotated data, our method generates high-quality saliency results that better preserve object boundaries and maintain spatial consistency. Extensive experiments on five widely used benchmark datasets show that the proposed method achieves superior or competitive performance compared to other algorithms, including very recent state-of-the-art supervised methods.
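
The reconstruction-residual idea can be sketched as follows. This is a simplified illustration, not the SuperVAE architecture: `reconstruct` stands in for a hypothetical trained background VAE, and `superpixels` is an integer label map such as one produced by SLIC.

```python
import numpy as np

def residual_saliency(image, reconstruct, superpixels):
    """A VAE trained on background reconstructs background regions well,
    so large reconstruction residuals mark salient objects. Averaging the
    residual within each superpixel enforces spatial consistency."""
    residual = np.abs(image.astype(float) - reconstruct(image)).mean(axis=-1)
    saliency = np.zeros_like(residual)
    for label in np.unique(superpixels):
        mask = superpixels == label
        saliency[mask] = residual[mask].mean()  # one score per superpixel
    lo, hi = saliency.min(), saliency.max()
    return (saliency - lo) / (hi - lo + 1e-8)   # normalize to [0, 1]
```

Scoring per superpixel rather than per pixel is what lets the map hug object boundaries instead of blurring across them.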

