Encoding Efficiency and Computational Cost Assessment of State-Of-The-Art Point Cloud Codecs

Author(s):  
Mateus Goncalves ◽  
Luciano Agostini ◽  
Daniel Palomino ◽  
Marcelo Porto ◽  
Guilherme Correa
2021 ◽  
Vol 13 (10) ◽  
pp. 1985
Author(s):  
Emre Özdemir ◽  
Fabio Remondino ◽  
Alessandro Golkar

With recent technological advances, deep learning is being applied to an ever wider range of tasks. In particular, point cloud processing and classification have been studied for some time, and various methods have been developed. Some of the available classification approaches are tied to a specific data source, such as LiDAR, while others target specific scenarios, such as indoor scenes. A major general issue is computational efficiency (in terms of power consumption, memory requirements, and training/inference time). In this study, we propose an efficient framework (named TONIC) that can work with any kind of aerial data source (LiDAR or photogrammetry) and does not require high computational power, while achieving accuracy on par with current state-of-the-art methods. We also test the generalization ability of our framework, showing that it can learn from one dataset and predict on unseen aerial scenarios.


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 511
Author(s):  
Syed Mohammad Minhaz Hossain ◽  
Kaushik Deb ◽  
Pranab Kumar Dhar ◽  
Takeshi Koshiba

Proper plant leaf disease (PLD) detection is challenging in complex backgrounds and under different capture conditions. For this reason, a modified adaptive centroid-based segmentation (ACS) is first used to trace the proper region of interest (ROI). Automatic initialization of the number of clusters (K) using modified ACS before recognition increases the scalability of ROI tracing, even for symmetrical features across various plants. Convolutional neural network (CNN)-based PLD recognition models achieve adequate accuracy to some extent; however, their memory requirements (large-scale parameters) and high computational cost are pressing issues for memory-restricted mobile and IoT devices. Therefore, after tracing ROIs, three proposed depth-wise separable convolutional PLD (DSCPLD) models, namely segmented modified DSCPLD (S-modified MobileNet), segmented reduced DSCPLD (S-reduced MobileNet), and segmented extended DSCPLD (S-extended MobileNet), are used to establish a constructive trade-off among accuracy, model size, and computational latency. Moreover, we compare our proposed DSCPLD recognition models with state-of-the-art models such as MobileNet, VGG16, VGG19, and AlexNet. Among the segmented DSCPLD models, S-modified MobileNet achieves the best accuracy of 99.55% and an F1-score of 97.07%. We also evaluate our DSCPLD models on both full and segmented plant leaf images and conclude that, after applying modified ACS, all models improve in accuracy and F1-score. Furthermore, a new plant leaf dataset containing 6580 images of eight plants is used to experiment with several depth-wise separable convolution models.
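
A depth-wise separable convolution factorizes a standard convolution into a per-channel spatial filter followed by a 1×1 channel mixer, which is what makes MobileNet-style PLD models small and fast. Below is a minimal PyTorch sketch of such a block; the layer sizes and normalization choices are illustrative, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # Depth-wise step: one 3x3 filter per input channel (groups=in_channels).
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        # Point-wise step: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.depthwise(x)))
        return self.relu(self.bn2(self.pointwise(x)))
```

For a 3×3 kernel, the factorization shrinks the parameter count from 9·C_in·C_out to 9·C_in + C_in·C_out, which is where the memory and latency savings for mobile and IoT devices come from.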


Author(s):  
Yizhen Chen ◽  
Haifeng Hu

Most existing segmentation networks are built upon a “U-shaped” encoder–decoder structure, where the multi-level features extracted by the encoder are gradually aggregated by the decoder. Although this structure has been proven effective in improving segmentation performance, it has two main drawbacks. On the one hand, the introduction of low-level features brings a significant increase in computation without an obvious performance gain. On the other hand, general feature-aggregation strategies such as addition and concatenation fuse features without considering the usefulness of each feature vector, mixing useful information with massive noise. In this article, we abandon the traditional “U-shaped” architecture and propose Y-Net, a dual-branch joint network for accurate semantic segmentation. Specifically, it aggregates only the low-resolution high-level features and uses the global context guidance generated by the first branch to refine the second branch. The dual branches are effectively connected through a Semantic Enhancing Module, which can be regarded as a combination of spatial attention and channel attention. We also design a novel Channel-Selective Decoder (CSD) to adaptively integrate features from different receptive fields by assigning specific channelwise weights, where the weights are input-dependent. Our Y-Net is capable of breaking through the limits of a single-branch network and attaining higher performance at a lower computational cost than the “U-shaped” structure. The proposed CSD can better integrate useful information and suppress interfering noise. Comprehensive experiments are carried out on three public datasets to evaluate the effectiveness of our method. Y-Net achieves state-of-the-art performance on the PASCAL VOC 2012, PASCAL Person-Part, and ADE20K datasets without pre-training on extra data.
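
The abstract does not give the internal details of the CSD, so the following PyTorch sketch only illustrates the general idea of input-dependent channelwise weighting, here following a squeeze-and-excitation-style pattern; all layer choices are assumptions.

```python
import torch
import torch.nn as nn

class ChannelSelect(nn.Module):
    """Blend two feature maps with input-dependent per-channel weights."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                      # squeeze spatial dims
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                 # weights in (0, 1)
        )

    def forward(self, feat_a, feat_b):
        w = self.gate(feat_a + feat_b)   # weights computed from the input itself
        # Channels judged useful are taken mostly from feat_a, the rest from
        # feat_b, instead of a blind addition or concatenation.
        return w * feat_a + (1 - w) * feat_b
```

The key contrast with plain addition or concatenation is that the blending weights are a function of the current input, so uninformative channels can be suppressed per image rather than fused unconditionally.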


Geophysics ◽  
2021 ◽  
pp. 1-86
Author(s):  
Wei Chen ◽  
Omar M. Saad ◽  
Yapo Abolé Serge Innocent Oboué ◽  
Liuqing Yang ◽  
Yangkang Chen

Most traditional seismic denoising algorithms damage the useful signals; the damage is visible in the removed noise profiles and is known as signal leakage. The local signal-and-noise orthogonalization method is effective for retrieving leaked signals from the removed noise. However, the trade-off between retrieving leaked signals and rejecting noise is governed by the smoothing radius parameter of the local orthogonalization method, which is inconvenient to adjust because it is a global parameter while seismic data are highly variable locally. To retrieve the leaked signals adaptively, we propose a new dictionary learning method. Because of its patch-based nature, dictionary learning can adapt to the local features of seismic data. We train a dictionary of atoms that represent the features of the useful signals from the initially denoised data. Based on the learned features, we retrieve the weak leaked signals from the noise via a sparse coding step. Considering the large computational cost of training a dictionary from high-dimensional seismic data, we leverage a fast dictionary-updating algorithm in which the singular value decomposition (SVD) is replaced by the algebraic mean when updating each dictionary atom. We test the performance of the proposed method on several synthetic and field data examples and compare it with that of the state-of-the-art local orthogonalization method.
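
To make the dictionary-update idea concrete, here is a minimal NumPy sketch of a K-SVD-style atom update in which the SVD step is replaced by an algebraic mean of the residuals, as the abstract describes; the data layout and normalization details are assumptions of this sketch.

```python
import numpy as np

def update_atoms_mean(D, X, patches):
    """D: (n_features, n_atoms) dictionary; X: (n_atoms, n_patches) sparse
    codes; patches: (n_features, n_patches) training patches."""
    for k in range(D.shape[1]):
        users = np.nonzero(X[k, :])[0]          # patches whose code uses atom k
        if users.size == 0:
            continue
        # Residual of those patches with atom k's own contribution removed.
        E = (patches[:, users] - D @ X[:, users]
             + np.outer(D[:, k], X[k, users]))
        # K-SVD would take the leading singular vector of E here; the fast
        # update replaces the SVD by a simple mean of the residual columns.
        atom = E.mean(axis=1)
        D[:, k] = atom / (np.linalg.norm(atom) + 1e-12)
    return D
```

Each training iteration then alternates this cheap update with a sparse coding step (e.g., orthogonal matching pursuit) that recomputes X against the refreshed dictionary.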


2022 ◽  
Vol 40 (2) ◽  
pp. 1-24
Author(s):  
Franco Maria Nardini ◽  
Roberto Trani ◽  
Rossano Venturini

Modern search services often provide multiple options to rank the search results, e.g., sort “by relevance”, “by price”, or “by discount” in e-commerce. While the traditional rank by relevance effectively places the relevant results in the top positions of the results list, the rank by attribute can place many marginally relevant results at the head of the list, leading to a poor user experience. In the past, this issue has been addressed by investigating the relevance-aware filtering problem, which asks for the subset of results that maximizes the relevance of the attribute-sorted list. Recently, an exact algorithm was proposed to solve this problem optimally. However, its high computational cost makes it impractical for the Web search scenario, which is characterized by huge result lists and strict time constraints. For this reason, the problem is often solved using efficient yet inaccurate heuristic algorithms. In this article, we first prove performance bounds for the existing heuristics. We then propose two efficient and effective algorithms to solve the relevance-aware filtering problem. First, we propose OPT-Filtering, a novel exact algorithm that is faster than the existing state-of-the-art optimal algorithm. Second, we propose an approximate and even more efficient algorithm, ϵ-Filtering, which, given an allowed approximation error ϵ, finds a (1-ϵ)-optimal filtering, i.e., the relevance of its solution is at least (1-ϵ) times the optimum. We conduct a comprehensive evaluation of the two proposed algorithms against state-of-the-art competitors on two real-world public datasets. Experimental results show that OPT-Filtering achieves a significant speedup of up to two orders of magnitude with respect to the existing optimal solution, while ϵ-Filtering further improves this result by trading effectiveness for efficiency. In particular, experiments show that ϵ-Filtering can achieve quasi-optimal solutions while being faster than all state-of-the-art competitors in most of the tested configurations.
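
To make the problem statement concrete, here is a toy brute-force illustration in Python: given (relevance, attribute) pairs, it keeps the k results whose attribute-sorted order has the highest list relevance. Scoring the list with DCG is an assumption of this sketch, and the exponential enumeration is emphatically not the authors' algorithm.

```python
import itertools
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevances))

def brute_force_filter(results, k):
    """results: list of (relevance, attribute) pairs; keep the best k."""
    best, best_score = None, float("-inf")
    for subset in itertools.combinations(results, k):
        ordered = sorted(subset, key=lambda ra: ra[1])   # sort by attribute
        score = dcg([rel for rel, _ in ordered])
        if score > best_score:
            best, best_score = ordered, score
    return best

# Example: rank by price (ascending) while keeping the list relevant.
items = [(0.9, 120), (0.2, 10), (0.8, 35), (0.1, 5), (0.7, 60)]
print(brute_force_filter(items, 3))   # [(0.8, 35), (0.7, 60), (0.9, 120)]
```

OPT-Filtering and ϵ-Filtering solve this same selection problem exactly and approximately, respectively, but in time suitable for Web-scale result lists.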


Author(s):  
Yutong Feng ◽  
Yifan Feng ◽  
Haoxuan You ◽  
Xibin Zhao ◽  
Yue Gao

Mesh is an important and powerful type of data for 3D shapes and has been widely studied in computer vision and computer graphics. For the task of 3D shape representation, extensive research efforts have concentrated on representing 3D shapes with volumetric grids, multi-view images, and point clouds. However, little effort has been devoted to mesh data in recent years, due to its complexity and irregularity. In this paper, we propose a mesh neural network, named MeshNet, to learn 3D shape representations from mesh data. In this method, face-unit and feature splitting are introduced, and a general architecture with effective blocks is proposed. In this way, MeshNet is able to handle the complexity and irregularity of mesh data and represent 3D shapes well. We apply the proposed MeshNet method to 3D shape classification and retrieval. Experimental results and comparisons with state-of-the-art methods demonstrate that MeshNet achieves satisfactory 3D shape classification and retrieval performance, which indicates the effectiveness of the proposed method for 3D shape representation.
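
The "face-unit" idea treats each triangular face, rather than each vertex or point, as the basic unit. The NumPy sketch below computes typical per-face inputs (center, unit normal, and corner vectors); MeshNet's actual feature splitting goes further, so this only shows the starting point.

```python
import numpy as np

def face_features(vertices, faces):
    """vertices: (V, 3) float array; faces: (F, 3) int array of vertex indices.
    Returns per-face centers, unit normals, and flattened corner vectors."""
    tri = vertices[faces]                          # (F, 3, 3) corner coordinates
    centers = tri.mean(axis=1)                     # per-face centroid
    normals = np.cross(tri[:, 1] - tri[:, 0],      # face normal from two edges
                       tri[:, 2] - tri[:, 0])
    normals /= np.linalg.norm(normals, axis=1, keepdims=True) + 1e-12
    corners = tri - centers[:, None, :]            # corners relative to center
    return centers, normals, corners.reshape(len(faces), -1)
```

Because every face yields a fixed-length feature vector regardless of how irregular the mesh connectivity is, a network can process faces much like points in a point cloud while still retaining surface information.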


2020 ◽  
Vol 31 (1-2) ◽  
Author(s):  
Francesco Verdoja ◽  
Marco Grangetto

Abstract The Reed–Xiaoli detector (RXD) is recognized as the benchmark algorithm for image anomaly detection; however, it has known limitations, namely its dependence on the assumption that the image follows a multivariate Gaussian model, the estimation and inversion of a high-dimensional covariance matrix, and the inability to effectively include spatial awareness in its evaluation. In this work, a novel graph-based solution to the image anomaly detection problem is proposed; leveraging the graph Fourier transform, we are able to overcome some of RXD’s limitations while simultaneously reducing the computational cost. Tests on both hyperspectral and medical images, using both synthetic and real anomalies, show that the proposed technique obtains significant performance gains over other state-of-the-art algorithms.
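
For reference, the classical RXD score that the graph-based method is compared against is simply the Mahalanobis distance of each pixel from the global background statistics, which is where the high-dimensional covariance estimation and inversion mentioned above enter. A minimal NumPy sketch:

```python
import numpy as np

def rx_scores(image):
    """image: (H, W, B) cube with B spectral bands; returns (H, W) anomaly scores."""
    h, w, b = image.shape
    pixels = image.reshape(-1, b).astype(np.float64)
    mu = pixels.mean(axis=0)                   # background mean
    cov = np.cov(pixels, rowvar=False)         # (B, B) covariance to estimate...
    cov_inv = np.linalg.pinv(cov)              # ...and invert (pinv for stability)
    centered = pixels - mu
    # Mahalanobis distance per pixel: (x - mu)^T C^{-1} (x - mu).
    scores = np.einsum('ij,jk,ik->i', centered, cov_inv, centered)
    return scores.reshape(h, w)
```

Note that the score uses no spatial neighborhood at all, which is exactly the third limitation the proposed graph-based approach targets.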


Author(s):  
Lifang Zhou ◽  
Hongmei Li ◽  
Weisheng Li ◽  
Bangjun Lei ◽  
Lu Wang

Accurate scale estimation of the target plays an important role in object tracking. Most state-of-the-art methods estimate the target size by employing an exhaustive scale search. These methods can achieve high accuracy but incur a large computational cost. In this paper, we first propose an adaptive scale search strategy based on a scale selection factor instead of an exhaustive scale search; this strategy reduces the computational cost through adaptive sampling. Furthermore, the boundary effects of correlation filters are suppressed by exploiting background information, so that the accuracy of the proposed tracker is boosted. Empirical evaluations on 61 challenging benchmark sequences demonstrate that the proposed tracker substantially improves overall tracking performance. Moreover, our method ranks first, outperforming 17 state-of-the-art trackers on OTB2013.
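
The abstract does not specify the adaptive sampling rule, so the sketch below only contrasts the two regimes: an exhaustive search evaluates a full fixed scale pyramid every frame, while an adaptive one uses a scale selection factor to decide how many candidate scales to sample. The update rule here is purely illustrative, not the paper's.

```python
import numpy as np

def exhaustive_scales(base_size, step=1.02, n=33):
    """Fixed pyramid: evaluate every one of n scales each frame."""
    return base_size * step ** (np.arange(n) - n // 2)

def adaptive_scales(base_size, selection_factor, step=1.02, n_min=2):
    """Sample more scales only when the selection factor signals that the
    target's scale is changing quickly (illustrative rule)."""
    half_span = max(n_min, int(round(n_min * selection_factor)))
    return base_size * step ** np.arange(-half_span, half_span + 1)
```

Since each candidate scale costs one correlation-filter evaluation, shrinking the candidate set from dozens to a handful is where the speedup comes from.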


Author(s):  
Evangelos Alexiou ◽  
Irene Viola ◽  
Tomás M. Borges ◽  
Tiago A. Fonseca ◽  
Ricardo L. de Queiroz ◽  
...  

Abstract Recent trends in multimedia technologies indicate the need for richer imaging modalities to increase user engagement with the content. Among other alternatives, point clouds denote a viable solution that offers an immersive content representation, as witnessed by current activities in the JPEG and MPEG standardization committees. As a result of such efforts, MPEG is at the final stages of drafting an emerging standard for point cloud compression, which we consider the state of the art. In this study, the entire set of encoders developed in the MPEG committee is assessed through an extensive and rigorous analysis of quality. We initially focus on the assessment of encoding configurations that have been defined by experts in MPEG for their core experiments. Then, two additional experiments are designed and carried out to address some of the identified limitations of the current approach. As part of the study, state-of-the-art objective quality metrics are benchmarked to assess their capability to predict the visual quality of point clouds under a wide range of radically different compression artifacts. To carry out the subjective evaluation experiments, a web-based renderer was developed and is described. The subjective and objective quality scores, along with the rendering software, are made publicly available to facilitate and promote research in the field.
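
As one concrete example of the objective metrics typically benchmarked in such studies, the point-to-point (D1) error measures nearest-neighbour geometric distortion between a reference and a degraded cloud. Below is a minimal sketch using SciPy; the choice of peak value is an assumption (in practice it is tied to the content's geometry precision).

```python
import numpy as np
from scipy.spatial import cKDTree

def d1_psnr(reference, degraded, peak):
    """reference, degraded: (N, 3) point arrays; peak: signal peak, e.g. the
    bounding-box diagonal of the reference (an assumption of this sketch)."""
    def mse(a, b):
        dists, _ = cKDTree(b).query(a)  # nearest neighbour in b for each point of a
        return np.mean(dists ** 2)
    # Symmetric error: take the worse of the two directions.
    sym_mse = max(mse(reference, degraded), mse(degraded, reference))
    return 10.0 * np.log10(peak ** 2 / sym_mse)
```

Benchmarking such metrics against subjective scores, as done in this study, reveals how well purely geometric distances track perceived quality across radically different artifact types.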


Author(s):  
Gopalendu Pal ◽  
Anquan Wang ◽  
Michael F. Modest

k-distribution-based approaches are promising models for radiation calculations in strongly nongray participating media. Advanced k-distribution methods have been found to achieve accuracy close to benchmark line-by-line (LBL) calculations for strongly inhomogeneous multi-phase media, at a computational cost several orders of magnitude smaller. In this paper, a k-distribution-based portable spectral module is developed, incorporating several state-of-the-art k-distribution methods along with compact, high-accuracy databases of k-distributions. The module is constructed flexibly: the user can choose among various k-distribution methods, with their relevant k-distribution databases, to carry out accurate radiation calculations. The spectral module is portable in the sense that it can be coupled to any flow solver code with its own grid structure, discretization scheme, and solver libraries. This open-source module is made freely available for all noncommercial purposes. This article outlines in detail the design and use of the spectral module. The k-distribution methods included in the module are briefly described, with a discussion of their advantages, disadvantages, and domains of applicability. Examples are provided for various sample radiation calculations in multi-phase mixtures using the new spectral module, and the results are compared with LBL calculations.
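
The core idea behind all k-distribution methods is reordering: a rapidly varying absorption spectrum κ(η) is resorted into a smooth, monotonic function k(g) of a cumulative variable g, so the spectral integral can be evaluated with a handful of quadrature points instead of millions of spectral lines. A minimal NumPy sketch, assuming uniform weighting over the wavenumber grid:

```python
import numpy as np

def k_distribution(kappa, n_g=16):
    """kappa: line-by-line absorption coefficients on a uniform wavenumber
    grid; returns k evaluated at n_g interior points of the cumulative g-axis."""
    k_sorted = np.sort(kappa)                        # reorder kappa(eta) -> k(g)
    g = (np.arange(kappa.size) + 0.5) / kappa.size   # cumulative distribution axis
    g_quad = np.linspace(0.0, 1.0, n_g + 2)[1:-1]    # interior quadrature nodes
    return g_quad, np.interp(g_quad, g, k_sorted)
```

A host flow solver coupled to the module would then solve one radiative transfer equation per quadrature point and accumulate the weighted results, which is what keeps the cost orders of magnitude below LBL.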

