Image-Based Localization Using Context

2015 ◽  
Vol 1 (1) ◽  
Author(s):  
Charbel Azzi ◽  
John Zelek ◽  
Daniel Asmar ◽  
Adel Fakih

The image-based localization problem consists of estimating the 6 DoF camera pose by matching the image to a 3D point cloud (or equivalent) representing a 3D environment. The robustness and accuracy of current solutions are not objectively quantified. We have completed a comparative analysis of the main state-of-the-art approaches, namely Brute Force Matching, Approximate Nearest Neighbour Matching, Embedded Ferns Classification, ACG Localizer (using a visual vocabulary) and the Keyframe Matching Approach. The results of the study revealed major deficiencies in each approach, mainly in search space reduction, clustering, feature matching and sensitivity to where the query image was taken. We then chose to focus on one common major problem: reducing the search space. We propose a new image-based localization approach that reduces the search space by using global descriptors to find candidate keyframes in the database, then searching only against the 3D points seen from these candidates, using local descriptors stored in a 3D cloud map.
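The two-stage search described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the data layout (`keyframes`, `points`), the function names, the Euclidean distance and the `max_dist` threshold are all assumptions made for the example, and the final pose estimation from the 2D-3D matches is left out.

```python
import math

def l2(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def candidate_keyframes(query_global, keyframes, k):
    """Stage 1: ids of the k keyframes whose global descriptor is closest to the query."""
    ranked = sorted(keyframes, key=lambda kf: (l2(query_global, keyframes[kf]["global"]), kf))
    return ranked[:k]

def restricted_matches(query_locals, keyframes, points, candidates, max_dist):
    """Stage 2: match local descriptors only against 3D points seen from the candidates."""
    visible = set()
    for kf in candidates:
        visible.update(keyframes[kf]["points"])
    visible = sorted(visible)  # fixed order so ties resolve deterministically
    matches = []
    for qi, q in enumerate(query_locals):
        if not visible:
            break
        best = min(visible, key=lambda pid: l2(q, points[pid]["desc"]))
        if l2(q, points[best]["desc"]) <= max_dist:
            matches.append((qi, best))  # (query feature index, 3D point id)
    return matches, len(visible)
```

With `k` much smaller than the number of keyframes, the second stage compares each query feature against only the points visible from the candidates instead of the whole cloud, which is the search-space reduction the abstract refers to.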

2020 ◽  
Vol 34 (07) ◽  
pp. 12717-12724
Author(s):  
Yang You ◽  
Yujing Lou ◽  
Qi Liu ◽  
Yu-Wing Tai ◽  
Lizhuang Ma ◽  
...  

Point cloud analysis without pose priors is very challenging in real applications, as the orientations of point clouds are often unknown. In this paper, we propose a brand new point-set learning framework PRIN, namely, Pointwise Rotation-Invariant Network, focusing on rotation-invariant feature extraction in point clouds analysis. We construct spherical signals by Density Aware Adaptive Sampling to deal with distorted point distributions in spherical space. In addition, we propose Spherical Voxel Convolution and Point Re-sampling to extract rotation-invariant features for each point. Our network can be applied to tasks ranging from object classification, part segmentation, to 3D feature matching and label alignment. We show that, on the dataset with randomly rotated point clouds, PRIN demonstrates better performance than state-of-the-art methods without any data augmentation. We also provide theoretical analysis for the rotation-invariance achieved by our methods.


Entropy ◽  
2020 ◽  
Vol 22 (8) ◽  
pp. 799
Author(s):  
Hafeez Anwar ◽  
Serwah Sabetghadam ◽  
Peter Bell

We propose an image-based class retrieval system for ancient Roman Republican coins that can be instrumental in various archaeological applications such as museums, numismatic studies, and even online auction websites. For such applications, the aim is not only the classification of a given coin, but also the retrieval of its information from a standard reference book. Such classification and information retrieval are performed by our proposed system via a user-friendly graphical user interface (GUI). The query coin image is matched with exemplar images of each coin class stored in the database. The retrieved coin classes are then displayed in the GUI along with their descriptions from a reference book. However, it is highly impractical to match a query image with each of the class exemplar images, as there are 10 exemplar images for each of the 60 coin classes. Similarly, displaying all the retrieved coin classes and their respective information in the GUI would inconvenience the user. Consequently, to avoid such brute-force matching, we incrementally vary the number of matches per class to find the fewest matches that attain the maximum classification accuracy. In a similar manner, we also extend the search space for the coin class to find the minimal number of retrieved classes that achieves maximum classification accuracy. On the current dataset, our system attains a classification accuracy of 99% with five matches per class when the top ten retrieved classes are considered. As a result, the computational complexity is reduced by matching the query image with only half of the exemplar images per class. In addition, displaying the top 10 retrieved classes is far more convenient than displaying all 60 classes.
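The trade-off between matches per class and retrieved classes can be illustrated with a small sketch. It assumes, purely for illustration, that each exemplar match yields a similarity score and that a class is scored by its best match; the abstract does not state how the system aggregates scores, and the function names are hypothetical.

```python
def rank_classes(similarities, matches_per_class, top_n):
    """Rank classes using only the first `matches_per_class` exemplars of each class.

    similarities: {class_name: [similarity to each exemplar, in stored order]}
    """
    scores = {}
    for cls, sims in similarities.items():
        used = sims[:matches_per_class]  # skip the remaining exemplars entirely
        scores[cls] = max(used)          # assumed aggregation: best match wins
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_n]

def matching_cost(num_classes, matches_per_class):
    """Number of image-to-image matches needed for one query."""
    return num_classes * matches_per_class
```

For the figures quoted in the abstract, `matching_cost(60, 10)` gives 600 matches for brute-force matching, while `matching_cost(60, 5)` gives 300, which is the halving the authors describe.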


2020 ◽  
Vol 34 (07) ◽  
pp. 10526-10533 ◽  
Author(s):  
Hanlin Chen ◽  
Li'an Zhuo ◽  
Baochang Zhang ◽  
Xiawu Zheng ◽  
Jianzhuang Liu ◽  
...  

Neural architecture search (NAS) can have a significant impact in computer vision by automatically designing optimal neural network architectures for various tasks. A variant, binarized neural architecture search (BNAS), with a search space of binarized convolutions, can produce extremely compressed models. Unfortunately, this area remains largely unexplored. BNAS is more challenging than NAS due to the learning inefficiency caused by optimization requirements and the huge architecture space. To address these issues, we introduce channel sampling and operation space reduction into a differentiable NAS to significantly reduce the cost of searching. This is accomplished through a performance-based strategy that abandons operations with less potential. Two optimization methods for binarized neural networks are used to validate the effectiveness of our BNAS. Extensive experiments demonstrate that the proposed BNAS achieves performance comparable to NAS on both the CIFAR and ImageNet databases. An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model and a search that is 40% faster than the state-of-the-art PC-DARTS.
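A performance-based reduction of the operation space might look roughly like the sketch below. It is only an illustration of the general idea of dropping the weakest candidate operation on each edge in every round; the `evaluate` callback, the edge and operation names, and the one-removal-per-round schedule are assumptions and do not reproduce the paper's sampling procedure.

```python
def reduce_operation_space(edges, evaluate, rounds):
    """Repeatedly drop the worst-scoring candidate operation on every edge.

    edges:    {edge_name: [operation names still under consideration]}
    evaluate: callable (edge_name, operation) -> performance score, higher is better
    rounds:   number of pruning rounds; each edge always keeps at least one operation
    """
    pruned = {edge: list(ops) for edge, ops in edges.items()}  # do not mutate the input
    for _ in range(rounds):
        for edge, ops in pruned.items():
            if len(ops) > 1:
                worst = min(ops, key=lambda op: (evaluate(edge, op), op))
                ops.remove(worst)
    return pruned

def search_space_size(edges):
    """Number of distinct architectures: product of remaining choices per edge."""
    size = 1
    for ops in edges.values():
        size *= len(ops)
    return size
```

Because the space size is a product over edges, removing even one operation per edge shrinks it multiplicatively, which is why this kind of pruning can cut search cost sharply.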


2021 ◽  
Vol 13 (10) ◽  
pp. 1985
Author(s):  
Emre Özdemir ◽  
Fabio Remondino ◽  
Alessandro Golkar

With recent advances in technology, deep learning is being applied to an increasing range of tasks. In particular, point cloud processing and classification have been studied for some time, and various methods have been developed. Some of the available classification approaches are based on a specific data source, such as LiDAR, while others focus on specific scenarios, such as indoor scenes. A major general issue is computational efficiency (in terms of power consumption, memory requirements, and training/inference time). In this study, we propose an efficient framework (named TONIC) that can work with any kind of aerial data source (LiDAR or photogrammetry) and does not require high computational power, while achieving accuracy on par with current state-of-the-art methods. We also test our framework for its generalization ability, showing that it can learn from one dataset and predict on unseen aerial scenarios.


2021 ◽  
Vol 5 (4) ◽  
pp. 783-793
Author(s):  
Muhammad Muttabi Hudaya ◽  
Siti Saadah ◽  
Hendy Irawan

needs a solid validation that has verification and matching uploaded images. To solve this problem, this paper implements a detection model using Faster R-CNN and a matching method using ORB (Oriented FAST and Rotated BRIEF) and KNN-BFM (K-Nearest Neighbor Brute Force Matcher). The goals of the implementation are to reach an 80% accuracy mark and to show that matching using ORB alone can replace the OCR technique. The detection model reaches a mAP (Mean Average Precision) of 94%, but the matching process achieves an accuracy of only 43.46%. Matching using only image features underperforms the previous OCR technique but improves processing time from 4510 ms to 60 ms. Image matching accuracy is shown to increase when a high-quality and high-quantity dataset is used and features are extracted from the important areas of EKTP card images.
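The k-nearest-neighbour brute-force matching step can be sketched in plain Python as below. This is a stand-in for illustration rather than the paper's pipeline: binary descriptors are represented as integers and compared with Hamming distance (the usual choice for ORB-style binary descriptors), and the ratio test with a 0.75 threshold is an added assumption that the abstract does not mention.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as integers."""
    return bin(a ^ b).count("1")

def knn_brute_force(query, train, k=2):
    """For each query descriptor, return its k nearest train descriptors.

    Every query is compared against every train descriptor (brute force).
    Result: one list per query of (distance, train_index), nearest first.
    """
    result = []
    for q in query:
        dists = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        result.append(dists[:k])
    return result

def ratio_test(knn, ratio=0.75):
    """Keep a match only if the best neighbour is clearly closer than the second best."""
    good = []
    for qi, neighbours in enumerate(knn):
        if len(neighbours) == 2 and neighbours[0][0] < ratio * neighbours[1][0]:
            good.append((qi, neighbours[0][1]))  # (query index, train index)
    return good
```

A match-based accuracy score could then be derived from how many query descriptors survive the ratio test, although the abstract does not say how the reported accuracy was computed.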


Author(s):  
Nur Ariffin Mohd Zin ◽  
Hishammuddin Asmuni ◽  
Haza Nuzly Abdul Hamed ◽  
Razib M. Othman ◽  
Shahreen Kasim ◽  
...  

Recent studies have shown that wearing soft lenses may degrade performance by increasing the false reject rate. However, detecting the presence of a soft lens is a non-trivial task, as its texture is almost indiscernible. In this work, we propose a classification method to identify the presence of a soft lens in an iris image. Our proposed method starts by segmenting the lens boundary on top of the sclera region. The segmented boundary is then used as the source of features, which are extracted by local descriptors. These features are then used to train and classify with Support Vector Machines. The method was tested on the Notre Dame Cosmetic Contact Lens 2013 database. Experiments showed that the proposed method performed better than state-of-the-art methods.


2022 ◽  
Vol 19 (1) ◽  
pp. 1-21
Author(s):  
Daeyeal Lee ◽  
Bill Lin ◽  
Chung-Kuan Cheng

SMART NoCs achieve ultra-low latency by enabling single-cycle multiple-hop transmission via bypass channels. However, contention along bypass channels can seriously degrade the performance of SMART NoCs by breaking the bypass paths. Therefore, contention-free task mapping and scheduling are essential for optimal system performance. In this article, we propose an SMT (Satisfiability Modulo Theories)-based framework to find optimal contention-free task mappings with minimum application schedule lengths on 2D/3D SMART NoCs with mixed dimension-order routing. On top of SMT’s fast reasoning capability for conditional constraints, we develop efficient search-space reduction techniques to achieve practical scalability. Experiments demonstrate that our SMT framework achieves 10× higher scalability than ILP (Integer Linear Programming), with average runtimes for finding optimum solutions that are 931.1× (ranging from 2.2× to 1532.1×) and 1237.1× (ranging from 4× to 4373.8×) faster on 2D and 3D SMART NoCs, respectively. Our 2D and 3D extensions of the SMT framework with mixed dimension-order routing also maintain the improved scalability with the extended and diversified routing paths, resulting in reduced application schedule lengths across various application benchmarks.

