Street-Level Image Localization Based on Building-Aware Features via Patch-Region Retrieval under Metropolitan-Scale

2021 ◽  
Vol 13 (23) ◽  
pp. 4876
Author(s):  
Lanyue Zhi ◽  
Zhifeng Xiao ◽  
Yonggang Qiang ◽  
Linjun Qian

The aim of image-based localization (IBL) is to estimate the real location of a query image by matching it against GNSS-tagged reference images in a database. Popular IBL methods commonly use street-level images, which have high practical value. Using street-level images for IBL poses two primary challenges: existing works have not been specifically optimized for urban IBL tasks, and the matching result is over-reliant on the quality of image features. Methods must therefore demonstrate practicality and robustness in engineering applications at metropolitan scale. In response, this paper makes the following contributions. First, given the critical role of buildings in distinguishing urban scenes, we contribute a feature called the Building-Aware Feature (BAF). Second, in view of the negative influence of complex urban scenes on the retrieval process, we propose a retrieval method called Patch-Region Retrieval (PRR). To prove the effectiveness of BAF and PRR, we established an image-based localization experimental framework. Experiments show that BAF retains the feature points that fall on buildings and selectively discards those that fall elsewhere; this compresses the storage size of the feature index while also improving the recall of localization results. Implemented in the geometric verification stage, PRR compares the matching results of regional features and selects the best-ranked candidate as the final result, enhancing the effectiveness of patch-region features. In addition, we fully confirmed the superiority of the proposed methods on a metropolitan-scale street-level image dataset.
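As a rough illustration of the building-aware idea, the sketch below filters local feature points by a building segmentation mask, keeping all points on buildings and thinning out the rest. The names, the mask input, and the keep_ratio parameter are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of building-aware feature filtering, assuming `keypoints`
# is an (N, 2) array of (x, y) pixel coordinates, `descriptors` is (N, D),
# and `building_mask` is an H x W boolean array from any segmentation model.
import numpy as np

def building_aware_filter(keypoints, descriptors, building_mask, keep_ratio=0.1):
    """Keep all features on buildings; keep only a small subset elsewhere."""
    xs = keypoints[:, 0].astype(int)
    ys = keypoints[:, 1].astype(int)
    on_building = building_mask[ys, xs]

    # Selectively thin out non-building features instead of dropping them all.
    off_idx = np.flatnonzero(~on_building)
    rng = np.random.default_rng(0)
    kept_off = rng.choice(off_idx, size=int(len(off_idx) * keep_ratio), replace=False)

    keep = np.concatenate([np.flatnonzero(on_building), kept_off])
    return keypoints[keep], descriptors[keep]
```

Shrinking the non-building set is what compresses the feature index while preserving the building points that drive urban retrieval.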

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Wang Li ◽  
Zhang Yong ◽  
Yuan Wei ◽  
Shi Hongxing

Vehicle reidentification refers to the task of matching vehicles across nonoverlapping cameras, one of the critical problems in intelligent transportation systems. Because vehicles on the road often resemble one another in appearance, traditional methods do not perform well on vehicles with high similarity. In this paper, we use a hypergraph representation to integrate image features and tackle vehicle re-ID via hypergraph learning algorithms. A single feature descriptor can extract features from only one aspect; to merge multiple feature descriptors, an efficient and appropriate representation is necessary, and a hypergraph is naturally suited to modeling such high-order relationships. In addition, the spatiotemporal correlation of traffic status between cameras is a constraint beyond the image itself, which can greatly improve the re-ID accuracy for different vehicles with similar appearances. The proposed method uses hypergraph optimization to learn the similarity between the query image and the images in the library. By exploiting pairwise and higher-order relationships between query objects and the image library, the similarity measurement improves on direct matching. Experiments conducted on the image library constructed in this paper demonstrate the effectiveness of multifeature hypergraph fusion and the spatiotemporal correlation model in addressing vehicle reidentification.
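To make the hypergraph construction concrete, here is a minimal sketch under our own assumptions: one hyperedge per image per descriptor, grouping each image with its k nearest neighbors in that feature space, and a simple shared-hyperedge similarity in place of the paper's learning algorithm.

```python
# Illustrative multi-feature hypergraph construction; not the authors' exact
# optimization. `features_per_descriptor` is a list of (n, d_i) arrays, one
# per feature descriptor (e.g. color, texture, CNN embeddings).
import numpy as np
from scipy.spatial.distance import cdist

def hypergraph_incidence(features_per_descriptor, k=5):
    n = features_per_descriptor[0].shape[0]
    edges = []
    for feats in features_per_descriptor:
        dists = cdist(feats, feats)
        for i in range(n):
            nbrs = np.argsort(dists[i])[: k + 1]  # the image plus its k neighbors
            edges.append(nbrs.tolist())
    # Incidence matrix H: H[v, e] = 1 if vertex v belongs to hyperedge e.
    H = np.zeros((n, len(edges)))
    for e, members in enumerate(edges):
        H[members, e] = 1.0
    return H

def similarity(H, query_idx):
    """Score gallery images by how many hyperedges they share with the query."""
    return H @ H[query_idx]
```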


2021 ◽  
Vol 8 (7) ◽  
pp. 97-105
Author(s):  
Ali Ahmed ◽ 
Sara Mohamed

Content-Based Image Retrieval (CBIR) systems retrieve images from an image repository or database that are visually similar to the query image. CBIR plays an important role in various fields such as medical diagnosis, crime prevention, web-based searching, and architecture. CBIR consists mainly of two stages: feature extraction and similarity matching. There are several ways to improve the efficiency and performance of CBIR, such as segmentation, relevance feedback, query expansion, and fusion-based methods. The literature has suggested several methods for combining and fusing various image descriptors. In general, fusion strategies are divided into two groups, namely early and late fusion. Early fusion combines image features from more than one descriptor into a single vector before the similarity computation, while late fusion refers either to combining the outputs produced by various retrieval systems or to combining different similarity rankings. In this study, a group of color and texture features is proposed for both fusion strategies: first, eighteen color features and twelve texture features are combined into a single vector representation for early fusion; second, three of the most common distance measures are combined in the late fusion stage. Our experimental results on two common image datasets show that the proposed method achieves good retrieval results compared to the traditional use of a single feature descriptor, and acceptable retrieval performance compared to some state-of-the-art methods. The overall accuracy of our proposed method is 60.6% and 39.07% for the Corel-1K and GHIM-10K datasets, respectively.
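A minimal sketch of the two fusion strategies follows, with placeholder feature inputs and a simple mean-rank aggregation; the paper's exact 18-color/12-texture features and distance measures are not reproduced here.

```python
# Early fusion: concatenate descriptors into one vector before matching.
# Late fusion: aggregate the rankings produced by several distance measures.
import numpy as np

def early_fusion(color_feats, texture_feats):
    """Stack per-image color and texture descriptors into a single vector."""
    return np.concatenate([color_feats, texture_feats], axis=1)

def late_fusion_ranking(query, gallery, metrics):
    """metrics: distance callables such as scipy's euclidean, cityblock, cosine."""
    all_ranks = []
    for metric in metrics:
        d = np.array([metric(query, g) for g in gallery])
        all_ranks.append(d.argsort().argsort())   # distance -> rank position
    return np.mean(all_ranks, axis=0).argsort()   # lowest mean rank first
```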


2021 ◽  
Author(s):  
Rudy Venguswamy ◽  
Mike Levy ◽  
Anirudh Koul ◽  
Satyarth Praveen ◽  
Tarun Narayanan ◽  
...  

Machine learning modeling for Earth events at NASA is often limited by the availability of labeled examples. For example, training classifiers for forest fires or oil spills from satellite imagery requires curating a massive and diverse dataset of example forest fires, a tedious multi-month effort requiring careful review of over 196.9 million square miles of data per day for 20 years. While such images might exist in abundance within 40 petabytes of unlabeled satellite data, finding these positive examples to include in a training dataset for a machine learning model is extremely time-consuming and requires researchers to "hunt" for positive examples, like finding a needle in a haystack.

We present a no-code open-source tool, Curator, whose goal is to minimize the amount of manual image labeling needed to achieve a state-of-the-art classifier. The pipeline, purpose-built to take advantage of the massive amount of unlabeled images, consists of (1) self-supervised training to convert unlabeled images into meaningful representations, (2) search-by-example to collect a seed set of images, and (3) human-in-the-loop active learning to iteratively ask for labels on uncertain examples and train on them.

In step 1, a model capable of representing unlabeled images meaningfully is trained with a self-supervised algorithm (such as SimCLR) on a random subset of the dataset (one that conforms to researchers' specified "training budget"). Since real-world datasets are often imbalanced, leading to suboptimal models, the initial model is used to generate embeddings on the entire dataset; images with equidistant embeddings are then sampled. This iterative training and resampling strategy improves both the balance of the training data and the model at every iteration. In step 2, researchers supply an example image of interest, and its embedding is used to find other images whose embeddings lie nearby in Euclidean space (hence images that look similar to the query image). These candidate images contain a higher density of positive examples and are annotated manually as a seed set. In step 3, the seed labels are used to train a classifier that identifies further candidate images for human inspection via active learning. In each classification training loop, candidate images for labeling are sampled from the larger unlabeled dataset based on the images the model is most uncertain about (p ≈ 0.5).

Curator is released as an open-source package built on PyTorch-Lightning. The pipeline uses GPU-based transforms from the NVIDIA-Dali package for augmentation, leading to a 5-10x speed-up in self-supervised training, and is run from the command line.

By iteratively training a self-supervised model and a classifier in tandem with manual human annotation, this pipeline is able to unearth more positive examples from severely imbalanced datasets that were previously untrainable with self-supervision algorithms alone. In applications such as detecting wildfires or atmospheric dust, or turning outward to telescopic surveys, increasing the number of positive candidates presented to humans for manual inspection increases the efficacy of classifiers and multiplies the efficiency of researchers' data curation efforts.
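As one concrete piece of the pipeline, the step-3 uncertainty sampling (p ≈ 0.5) can be sketched in a few lines; the interface below is our assumption, not Curator's actual API.

```python
# Pick the unlabeled images whose predicted positive-class probability is
# closest to 0.5, i.e. the ones the classifier is least sure about.
import numpy as np

def most_uncertain(probs, budget=100):
    """probs: (N,) positive-class probabilities over the unlabeled pool."""
    uncertainty = np.abs(probs - 0.5)
    return np.argsort(uncertainty)[:budget]  # indices to send for human labeling
```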


Author(s):  
Siddhivinayak Kulkarni

Developments in technology and the Internet have led to an increase in the number of digital images and videos; thousands of images are added to the WWW every day. A Content-Based Image Retrieval (CBIR) system typically takes a query example image from the user as input, from which low-level image features are extracted. These low-level features are used to find the images in the database that are most similar to the query image, ranked according to their similarity. This chapter evaluates various CBIR techniques based on fuzzy logic and neural networks and proposes a novel fuzzy approach to classify colour images based on their content, to pose queries in terms of natural language, and to fuse the queries using neural networks for fast and efficient retrieval. A number of classification and retrieval experiments were conducted on sets of images, and promising results were obtained.
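As a sketch of the fuzzy classification idea, overlapping colour classes can be modeled with triangular membership functions over hue; the class centres below are illustrative assumptions, not the chapter's values.

```python
# Graded membership of a hue (in degrees) in overlapping colour classes.
def triangular(x, left, center, right):
    """Membership rises linearly to 1 at `center` and falls back to 0."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)

COLOR_CLASSES = {"orange": (0, 30, 60), "yellow": (30, 60, 90), "green": (60, 120, 180)}

def classify_hue(hue):
    return {name: triangular(hue, *params) for name, params in COLOR_CLASSES.items()}
```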


2020 ◽  
Vol 12 (23) ◽  
pp. 3978
Author(s):  
Tianyou Chu ◽  
Yumin Chen ◽  
Liheng Huang ◽  
Zhiqiang Xu ◽  
Huangyuan Tan

Street view image retrieval aims to estimate an image's location by querying the nearest-neighbor images of the same scene from a large-scale reference dataset. Query images usually carry no location information and are represented by features used to search for similar results. The deep local features (DELF) method shows great performance on the landmark retrieval task, but it extracts so many features that the feature file becomes too large to load into memory when training the feature index. Memory is limited, and simply removing part of the features causes a great loss in retrieval precision. Therefore, this paper proposes a grid feature-point selection method (GFS) that reduces the number of feature points in each image while minimizing the precision loss. Convolutional Neural Networks (CNNs) are constructed to extract dense features, and an attention module is embedded into the network to score them. GFS divides the image into a grid and selects the features with the highest scores in each local region. Product quantization and an inverted index are used to index the image features and improve retrieval efficiency. The retrieval performance of the method is tested on a large-scale Hong Kong street view dataset; the results show that GFS reduces the number of feature points by 32.27–77.09% compared with the raw features and achieves 5.27–23.59% higher precision than other methods.
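The grid selection step can be sketched as follows, assuming the attention scores are given; the grid size and per-cell quota are illustrative parameters, not the paper's settings.

```python
# Keep only the highest-scoring feature points in each grid cell.
import numpy as np

def grid_feature_selection(keypoints, scores, image_shape, grid=(8, 8), per_cell=2):
    """keypoints: (N, 2) array of (x, y); scores: (N,) attention scores."""
    h, w = image_shape
    cell_x = (keypoints[:, 0] * grid[1] // w).astype(int).clip(0, grid[1] - 1)
    cell_y = (keypoints[:, 1] * grid[0] // h).astype(int).clip(0, grid[0] - 1)
    cell_id = cell_y * grid[1] + cell_x

    keep = []
    for c in np.unique(cell_id):
        in_cell = np.flatnonzero(cell_id == c)
        # Highest-scored features within this cell, up to the per-cell quota.
        keep.extend(in_cell[np.argsort(scores[in_cell])[::-1][:per_cell]].tolist())
    return np.array(sorted(keep))
```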


2017 ◽  
Vol 865 ◽  
pp. 547-553 ◽  
Author(s):  
Ji Hun Park

This paper presents a new method for computing human joint angles. The human structure is modelled as articulated rigid-body kinematics in a single video stream, and every input image contains a rotating articulated segment at a different 3D angle. The joint angle is computed in several steps. First, we compute the internal and external parameters of the camera from feature points of the fixed environment using nonlinear programming. We set one image as the reference frame for 3D scene analysis of the rotating articulated segment. Then, for each input frame, we compute the angles of rotation and the center of rotation of the segment from corresponding feature points and the computed camera parameters, again using nonlinear programming. With the computed angles and center of rotation, we can perform volumetric reconstruction of the articulated human body in 3D. The basic idea is to reconstruct each articulated body segment separately: volume reconstruction of a rotating segment is done by modifying the world-to-camera transformation to compensate for the segment's angle of rotation, as if the segment had not rotated. Our experimental results for a single rotating segment show that the method works well.
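The paper solves the 3D case with nonlinear programming; as a simplified planar analogue, the rotation angle and center of rotation can be recovered from point correspondences in closed form, which illustrates the geometry involved.

```python
# 2D sketch: recover angle and center of a rotation from correspondences
# q_i = R(p_i - c) + c. Valid for theta != 0 (otherwise I - R is singular).
import numpy as np

def planar_rotation(p, q):
    """p, q: (N, 2) corresponding points before and after the rotation."""
    p_bar, q_bar = p.mean(axis=0), q.mean(axis=0)
    pc, qc = p - p_bar, q - q_bar
    # Least-squares rotation angle from the centered correspondences.
    theta = np.arctan2((pc[:, 0] * qc[:, 1] - pc[:, 1] * qc[:, 0]).sum(),
                       (pc * qc).sum())
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    # q_bar = R(p_bar - c) + c  =>  c = (I - R)^{-1}(q_bar - R p_bar).
    c = np.linalg.solve(np.eye(2) - R, q_bar - R @ p_bar)
    return theta, c
```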


2011 ◽  
Vol 37 (5) ◽  
pp. 744-756 ◽  
Author(s):  
Mohammad Bagher Akbari Haghighat ◽  
Ali Aghagolzadeh ◽  
Hadi Seyedarabi

2011 ◽  
Vol 121-126 ◽  
pp. 4630-4634
Author(s):  
Wen Yu Chen ◽  
Wen Zhi Xie ◽  
Yan Li Zhao ◽  
Zhong Bo Hao

Item detection and recognition have become a hotspot in computer vision research. Methods based on image features have the advantages of a low amount of information, fast running speed, and high precision, and the SIFT algorithm is one of them. However, the traditional SIFT algorithm involves a large amount of computation and takes a long time for item recognition. Therefore, this paper proposes an item-recognition method based on SURF. The article elaborates the basic principle of the SURF algorithm: first, SURF is used to extract feature points from the item image; second, the Euclidean distance is used to find corresponding interest points between images; and finally, the recognized item is obtained by combining the mapping relation of the item images using RANSAC (Random Sample Consensus). Experimental results show that the item-recognition system based on the SURF algorithm achieves better matching recognition, higher real-time performance, and better robustness.
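The described pipeline maps directly onto standard OpenCV calls; the sketch below assumes an OpenCV build with the contrib xfeatures2d module (SURF is in the non-free part), and the ratio-test threshold is a conventional choice rather than the paper's.

```python
import cv2
import numpy as np

def recognize_item(item_img, scene_img):
    # 1. SURF feature extraction on both images.
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    kp1, des1 = surf.detectAndCompute(item_img, None)
    kp2, des2 = surf.detectAndCompute(scene_img, None)

    # 2. Euclidean-distance matching with Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < 0.7 * n.distance]

    # 3. RANSAC estimates the item-to-scene mapping and rejects outliers.
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H, 0 if inliers is None else int(inliers.sum())
```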


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Zhong Qu ◽  
Si-Peng Lin ◽  
Fang-Rong Ju ◽  
Ling Liu

Traditional image stitching based on SIFT feature-point extraction suffers, to a certain extent, from distortion errors; in particular, the panorama becomes more seriously distorted when composited from a long image sequence. To create a high-quality panorama, this paper proposes an improved algorithm that alters the way the reference image is selected and puts forward a method to compute, for any image in the sequence, the transformation matrix that aligns it with the reference image in the same coordinate space. Additionally, the improved stitching method dynamically selects the next input image based on the number of SIFT matching points. Compared with the traditional stitching process, the improved method increases the number of matching feature points and reduces the SIFT feature-detection area of the reference image. The experimental results show that the improved method not only accelerates image stitching but also reduces panoramic distortion errors, finally yielding a pleasing panoramic result.
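The dynamic next-image selection can be sketched as follows; the helper names and the ratio-test threshold are ours, and the descriptors are assumed to come from any SIFT extractor.

```python
import cv2

def match_count(des_a, des_b, ratio=0.75):
    """Number of SIFT matches surviving Lowe's ratio test."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    return sum(1 for m, n in matcher.knnMatch(des_a, des_b, k=2)
               if m.distance < ratio * n.distance)

def next_image(reference_des, remaining):
    """remaining: list of (index, descriptors) for images not yet stitched."""
    return max(remaining, key=lambda item: match_count(reference_des, item[1]))[0]
```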

