Accurate hierarchical stereo matching based on 3D plane labeling of superpixel for stereo images from rovers

An accurate hierarchical stereo matching method is proposed based on continuous 3D plane labeling of superpixels for rover stereo images. By combining a slanted-patch matching strategy with coarse-to-fine constraints, the method infers the 3D plane label of each pixel, which makes it especially suitable for large-scale scene matching with low-texture or textureless regions. At every level, matching based on superpixel segmentation speeds up iteration convergence and avoids large amounts of redundant computation. In the coarse-to-fine matching scheme, we propose a disparity constraint and a 3D normal vector constraint between adjacent levels, through which the disparity map and 3D normal vector map at a coarser level restrict the search range of disparity and normal vector at the finer level. Experimental results on the Chang’e-3 rover dataset and the KITTI dataset show that the proposed method is efficient and accurate compared with the state-of-the-art 3D labeling algorithm, especially in low-texture or textureless regions. The method runs about five to six times faster than the state-of-the-art 3D labeling method, and its accuracy is better.
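
The disparity constraint between adjacent levels can be illustrated with a minimal sketch. It assumes integer disparities, a factor-of-two image pyramid, and a fixed search margin; the names `fine_search_range` and `upsample_ranges` and the parameters `scale` and `margin` are hypothetical and not taken from the paper, which works with continuous 3D plane labels and also constrains normal vectors.

```python
def fine_search_range(coarse_disp, scale=2, margin=1, d_min=0, d_max=255):
    """Return (lo, hi) disparity bounds at the fine level for one coarse disparity.

    Assumption: a disparity d at the coarse level maps to roughly d * scale
    at the fine level, so only a small window around that value is searched.
    """
    center = coarse_disp * scale
    lo = max(d_min, min(d_max, center - margin))
    hi = max(d_min, min(d_max, center + margin))
    return lo, hi


def upsample_ranges(coarse_map, scale=2, margin=1, d_min=0, d_max=255):
    """Nearest-neighbour upsampling of a coarse disparity map into
    per-pixel (lo, hi) search ranges at the fine level."""
    fine = []
    for row in coarse_map:
        fine_row = []
        for d in row:
            bounds = fine_search_range(d, scale, margin, d_min, d_max)
            fine_row.extend([bounds] * scale)   # repeat horizontally
        for _ in range(scale):                  # repeat vertically
            fine.append(list(fine_row))
    return fine
```

Under these assumptions, a coarse disparity of 10 limits the fine-level search to disparities 19 through 21 rather than the full range, which is where the reduction in redundant computation comes from.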

State measurement of isolating switch using cost fusion and smoothness prior based stereo matching

International Journal of Advanced Robotic Systems, 2020, Vol 17 (3), pp. 172988142092529
DOI: 10.1177/1729881420925299

Author(s): Jinxin Xu, Qingwu Li, Ying Luo, Yan Zhou, Jiayu Wang

Keyword(s): Stereo Matching, Three Dimensional, The State, Inspection System, Center Line, Disparity Map, Matching Method, Data Set, State Measurement, Distortion Rectification

To better monitor the state of isolating switches, an efficient binocular vision-based state measurement system is proposed in this article. Two optimal cameras are selected as the vision of our inspection system. First, stereo calibration and distortion rectification are performed on the acquired image pair. Second, to recover the three-dimensional information of the switch, we propose a semi-global stereo matching method that fuses data-driven and structure-driven cost volumes and then optimizes the raw disparity map with weighted and edge-discriminated smoothness priors. Gradient content is enforced on the weight to suppress the small-weight-accumulation problem in weakly textured regions. In addition, a Hough transform with feature constraints is implemented to remove chaotic lines and extract the center line of the switch arm. Finally, based on the center line and the corresponding disparity map of the switch arm, the triangulation principle is used to calculate the true angle between the switch arm and the insulator, so that it can be detected whether the isolating switch is fully closed. The experimental results demonstrate that the proposed stereo matching method achieves good performance on the Middlebury v.3 data set and on switch images, and that the system can precisely measure the state of switches.
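
The final triangulation step can be sketched as follows. This is a minimal illustration that assumes a rectified pinhole stereo pair with focal length `f` (pixels), baseline `baseline` (metres), and principal point `(cx, cy)`; the function names are hypothetical and the article's own implementation may differ.

```python
import math


def backproject(u, v, disparity, f, baseline, cx, cy):
    """Convert a pixel (u, v) with a disparity into a 3D point (x, y, z)
    using the standard rectified-stereo relation z = f * baseline / disparity."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = f * baseline / disparity
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return (x, y, z)


def angle_between(p0, p1, q0, q1):
    """Angle in degrees between the 3D line p0->p1 and the 3D line q0->q1."""
    a = [p1[i] - p0[i] for i in range(3)]
    b = [q1[i] - q0[i] for i in range(3)]
    dot = sum(a[i] * b[i] for i in range(3))
    norm_a = math.sqrt(sum(c * c for c in a))
    norm_b = math.sqrt(sum(c * c for c in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("line endpoints must be distinct")
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard rounding
    return math.degrees(math.acos(cos_angle))
```

In this sketch, two points on the switch arm's center line and two points on the insulator would each be back-projected with their disparities, and `angle_between` would then give the angle between the two 3D lines.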

An Optimized Mean Shift Filtering Technique to Image Representation Through Disparity Map for Large Scale Stereo Images

2017, Vol 2 (1), pp. 36-45
DOI: 10.22496/atct20170104126

Author(s): Kavitha, Balakrishnan

Keyword(s): Large Scale, Mean Shift, Image Representation, Stereo Images, Disparity Map, Filtering Technique

A Joint 2D-3D Complementary Network for Stereo Matching

Sensors, 2021, Vol 21 (4), pp. 1430
DOI: 10.3390/s21041430

Author(s): Xiaogang Jia, Wei Chen, Zhengfa Liang, Xin Luo, Mingfei Wu, ...

Keyword(s): Stereo Matching, Computational Cost, Research Field, Disparity Map, Improve Performance, Cost Aggregation, Disparity Range, Public Datasets, Coarse To Fine, Speed And Accuracy

Stereo matching is an important research field of computer vision. Due to the dimension of cost aggregation, current neural network-based stereo methods struggle to balance speed and accuracy. To this end, we integrate fast 2D stereo methods with accurate 3D networks to improve performance and reduce running time. We leverage a 2D encoder-decoder network to generate a rough disparity map and construct a disparity range to guide the 3D aggregation network, which significantly improves accuracy and reduces computational cost. We use a stacked hourglass structure to refine the disparity from coarse to fine. We evaluated our method on three public datasets. According to the results on the official KITTI website, our network can generate an accurate result in 80 ms on a modern GPU. Compared to other 2D stereo networks (AANet, DeepPruner, FADNet, etc.), our network is considerably more accurate. Meanwhile, it is significantly faster than other 3D stereo networks (5× faster than PSMNet, 7.5× faster than CSN, and 22.5× faster than GANet, etc.), demonstrating the effectiveness of our method.

Large-scale Semantic Parsing without Question-Answer Pairs

Transactions of the Association for Computational Linguistics, 2014, Vol 2, pp. 377-392
DOI: 10.1162/tacl_a_00190
Cited By ~ 40

Author(s): Siva Reddy, Mirella Lapata, Mark Steedman

Keyword(s): Natural Language, Large Scale, Graph Matching, State Of The Art, The State, Semantic Parsing, Matching Problem, Weak Supervision, Benchmark Datasets

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.

Disparity estimation using Graph cuts for road applications

E3S Web of Conferences, 2021, Vol 297, pp. 01055
DOI: 10.1051/e3sconf/202129701055

Author(s): Mohamed El Ansari, Ilyas El Jaafari, Lahcen Koutti

Keyword(s): Stereo Matching, Optimal Solution, Similarity Criterion, Graph Cuts, Search Space, Disparity Estimation, Stereo Images, Disparity Map, Current Frame, Edge Points

This paper proposes a new edge-based stereo matching approach for road applications. The approach consists of matching the edge points extracted from the input stereo images using temporal constraints. At the current frame, we propose to estimate a disparity range for each image line based on the disparity map of the preceding frame. The stereo images are divided into multiple parts according to the estimated disparity ranges. The optimal solution of each part is independently approximated via the state-of-the-art energy minimization approach, graph cuts. The disparity search space of each image part is very small compared to the global one, which improves the results and reduces the execution time. Furthermore, as a similarity criterion between corresponding edge points, we propose a new cost function based on the intensity, the gradient magnitude, and the gradient orientation. The proposed method has been tested on virtual stereo images and compared with a recently proposed method, with satisfactory results.
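
A rough sketch of the per-line disparity range estimate is given below. It assumes the range for each image line is taken from the minimum and maximum valid disparities on the same line of the preceding frame, widened by a fixed margin; the function name `row_disparity_ranges`, the `margin` value, and the `invalid` marker are hypothetical, and the paper's actual estimator may differ.

```python
def row_disparity_ranges(prev_disparity, margin=2, d_min=0, d_max=127, invalid=-1):
    """Estimate a (lo, hi) disparity search range for every image line.

    prev_disparity: list of rows from the preceding frame's disparity map,
    where pixels without a match (e.g. non-edge points) hold `invalid`.
    Lines with no valid disparity fall back to the full range.
    """
    ranges = []
    for row in prev_disparity:
        valid = [d for d in row if d != invalid]
        if not valid:
            ranges.append((d_min, d_max))
            continue
        lo = max(d_min, min(valid) - margin)
        hi = min(d_max, max(valid) + margin)
        ranges.append((lo, hi))
    return ranges
```

Consecutive lines with similar ranges could then be grouped into image parts, each optimized independently over its own small search space.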

A Comprehensive Taxonomy of Dynamic Texture Representation

ACM Computing Surveys, 2023, Vol 55 (1), pp. 1-39
DOI: 10.1145/3487892

Author(s): Thanh Tuan Nguyen, Thanh Phuong Nguyen

Keyword(s): Large Scale, Environmental Changes, State Of The Art, The State, Future Research, Research Activities, Potential Applications, Benchmark Datasets, Negative Impacts, Made In

Representing dynamic textures (DTs) plays an important role in many real-world computer vision applications. Due to the turbulent and non-directional motions of DTs, along with the negative impacts of various factors (e.g., environmental changes, noise, and illumination), efficiently analyzing DTs poses considerable challenges for state-of-the-art approaches. Over the past 20 years, many techniques have been introduced to address these issues and improve performance. These methods have made valuable contributions, but the problems remain only partly solved, particularly recognizing DTs on large-scale datasets. In this article, we present a comprehensive taxonomy of DT representation in order to give a thorough overview of existing methods along with overall evaluations of their performance. Accordingly, we arrange the methods into six canonical categories. Each category is briefly presented in terms of its principal methodology stream and related variants. The effectiveness of the state-of-the-art methods is then investigated and thoroughly discussed with respect to quantitative and qualitative evaluations in classifying DTs on benchmark datasets. Finally, we point out several potential applications and the remaining challenges that should be addressed in future work. In comparison with the two existing shallow DT surveys (the first is out of date, as it was made in 2005, while the newer one, published in 2016, is an inadequate overview), we believe that our comprehensive taxonomy not only provides a better view of DT representation for the target readers but also stimulates future research activities.

Precise No-Reference Image Quality Evaluation Based on Distortion Identification

ACM Transactions on Multimedia Computing Communications and Applications, 2021, Vol 17 (3s), pp. 1-21
DOI: 10.1145/3468872

Author(s): Chenggang Yan, Tong Teng, Yutao Liu, Yongbing Zhang, Haoqian Wang, ...

Keyword(s): Neural Network, Image Quality, Quality Assessment, Large Scale, Quality Evaluation, Image Quality Assessment, State Of The Art, Gaussian White Noise, The State, Reference Image

The difficulty of no-reference image quality assessment (NR IQA) often lies in the lack of knowledge about the distortion in the image, which makes quality assessment blind and thus inefficient. To tackle this issue, we propose a novel scheme for precise NR IQA that consists of two successive steps: distortion identification and targeted quality evaluation. In the first step, we employ the well-known Inception-ResNet-v2 neural network to train a classifier that assigns the distortion in the image to one of the four most common distortion types: Gaussian white noise (WN), Gaussian blur (GB), JPEG compression (JPEG), and JPEG2000 compression (JP2K). Specifically, the deep neural network is trained on the large-scale Waterloo Exploration database, which ensures the robustness and high performance of distortion classification. In the second step, after determining the distortion type of the image, we design a specific approach to quantify the image distortion level, which estimates the image quality in a targeted and more precise way. Extensive experiments performed on the LIVE, TID2013, CSIQ, and Waterloo Exploration databases demonstrate that (1) the accuracy of our distortion classification is higher than that of the state-of-the-art distortion classification methods, and (2) the proposed NR IQA method outperforms the state-of-the-art NR IQA methods in quantifying image quality.

Coupled CycleGAN: Unsupervised Hashing Network for Cross-Modal Retrieval

Proceedings of the AAAI Conference on Artificial Intelligence, 2019, Vol 33, pp. 176-183
DOI: 10.1609/aaai.v33i01.3301176
Cited By ~ 11

Author(s): Chao Li, Cheng Deng, Lei Wang, De Xie, Xianglong Liu

Keyword(s): Large Scale, State Of The Art, The State, Storage Cost, Common Representation, Benchmark Datasets, Query Efficiency, Hash Codes

In recent years, hashing has attracted more and more attention owing to its low storage cost and high query efficiency in large-scale cross-modal retrieval. Benefiting from deep learning, the cross-modal retrieval community has achieved increasingly compelling results. However, existing deep cross-modal hashing methods either rely on large amounts of labeled information or are unable to learn an accurate correlation between different modalities. In this paper, we propose Unsupervised coupled Cycle generative adversarial Hashing networks (UCH) for cross-modal retrieval, where an outer-cycle network is used to learn a powerful common representation and an inner-cycle network is employed to generate reliable hash codes. Specifically, our proposed UCH seamlessly couples these two networks with a generative adversarial mechanism, so that they can be optimized simultaneously to learn the representation and the hash codes. Extensive experiments on three popular benchmark datasets show that the proposed UCH outperforms the state-of-the-art unsupervised cross-modal hashing methods.
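
The storage and query advantages of hashing come from comparing short binary codes with Hamming distance. The sketch below illustrates only that retrieval step, not the UCH networks themselves; it assumes hash codes are already available and packed into Python integers, and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary hash codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query, database, top_k=3):
    """Return indices of the top_k database codes closest to the query code.

    Ties are broken by database index so the ranking is deterministic.
    """
    scored = sorted(
        enumerate(database),
        key=lambda item: (hamming(query, item[1]), item[0]),
    )
    return [idx for idx, _ in scored[:top_k]]
```

In a cross-modal setting, the query code might come from a text and the database codes from images (or the reverse), provided both modalities are mapped into the same Hamming space.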

RDFuzz: Accelerating Directed Fuzzing with Intertwined Schedule and Optimized Mutation

Mathematical Problems in Engineering, 2020, Vol 2020, pp. 1-12
DOI: 10.1155/2020/7698916

Author(s): Jiaxi Ye, Ruilin Li, Bin Zhang

Keyword(s): Large Scale, State Of The Art, The State, Experimental Results, Exploration And Exploitation, Balance Problem, Evaluation Strategy, Testing Schedule, Available Resources

Directed fuzzing is a practical technique that concentrates its testing energy on reaching the target code areas while spending little on unrelated components. It is a promising way to make better use of available resources, especially when testing large-scale programs. However, by observing the state-of-the-art directed fuzzing engine (AFLGo), we argue that there are two universal limitations: the balance problem between exploration and exploitation, and the blindness of mutation toward the target code areas. In this paper, we present a new prototype, RDFuzz, to address these two limitations. In RDFuzz, we first introduce a frequency-guided strategy in the exploration stage and improve its accuracy by adopting branch-level instead of path-level frequency. Then, we introduce an input-distance-based evaluation strategy in the exploitation stage and present an optimized mutation to distinguish and protect distance-sensitive input content. Moreover, an intertwined testing schedule is leveraged to perform exploration and exploitation in turn. We test RDFuzz on 7 benchmarks, and the experimental results demonstrate that RDFuzz is effective at driving the program toward the target code areas and is not easily stuck on the balance problem between exploration and exploitation.

Uniformity Attentive Learning-Based Siamese Network for Person Re-Identification

Sensors, 2020, Vol 20 (12), pp. 3603
DOI: 10.3390/s20123603

Author(s): Dasol Jeong, Hasil Park, Joongchol Shin, Donggoo Kang, Joonki Paik

Keyword(s): Large Scale, Body Shape, State Of The Art, The State, Whole Body, Distinctive Features, Common Features, Siamese Network, Art Methods, Triplet Loss

Person re-identification (Re-ID) suffers from problems that make learning difficult, such as misalignment and occlusion. To solve these problems, it is important to focus on features that are robust to intra-class variation. Existing attention-based Re-ID methods focus only on common features without considering distinctive features. In this paper, we present a novel attentive learning-based Siamese network for person Re-ID. Unlike existing methods, we design an attention module and an attention loss that use the properties of the Siamese network to concentrate attention on both common and distinctive features. The attention module consists of channel attention to select important channels and encoder-decoder attention to observe the whole body shape. We modify the triplet loss into an attention loss, called the uniformity loss. The uniformity loss generates a unique attention map, which focuses on both common and discriminative features. Extensive experiments show that the proposed network compares favorably with the state-of-the-art methods on three large-scale benchmarks: the Market-1501, CUHK03, and DukeMTMC-ReID datasets.
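
For reference, the standard triplet loss that the uniformity loss starts from can be written in a few lines. This sketch shows only the conventional formulation with Euclidean distance on plain feature vectors; it is not the paper's uniformity loss, and the `margin` default is an arbitrary assumption.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

Here the anchor and positive would be features of the same identity and the negative a feature of a different identity.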
