Fast and Robust Dictionary-based Classification for Image Data

2021 ◽  
Vol 15 (6) ◽  
pp. 1-22
Author(s):  
Shaoning Zeng ◽  
Bob Zhang ◽  
Jianping Gou ◽  
Yong Xu ◽  
Wei Huang

Dictionary-based classification has been promising for knowledge discovery from image data, due to its good performance and interpretable theoretical foundation. Dictionary learning effectively supports both small- and large-scale datasets, while its robustness and performance depend, most of the time, on the atoms of the dictionary. Empirically, using a large number of atoms helps obtain a robust classification, while robustness cannot be ensured with a small number of atoms. However, learning a huge dictionary dramatically slows down classification, which is especially problematic on large-scale datasets. To address this problem, we propose a Fast and Robust Dictionary-based Classification (FRDC) framework, which fully utilizes the learned dictionary for classification by staging l1- and l2-norms to obtain a robust sparse representation. The new objective function, on the one hand, introduces an additional l2-norm term upon the conventional l1-norm optimization, which yields a more robust classification. On the other hand, the optimization based on both l1- and l2-norms is solved in two stages, which is much easier and faster than current solutions. In this way, even when using a dictionary of limited size, which ensures the classification runs very fast, it can still achieve higher robustness for multiple types of image data. The optimization is then theoretically analyzed in a new formulation, close to but distinct from the elastic net, to show that it is crucial for improving performance under the premise of robustness. According to our extensive experiments on four image datasets for face and object classification, FRDC keeps generating a robust classification no matter whether a small or a large number of atoms is used. This guarantees fast and robust dictionary-based image classification. Furthermore, when simply using deep features extracted via popular pre-trained neural networks, it outperforms many state-of-the-art methods on these datasets.
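The abstract does not state the objective function explicitly; given the elastic-net comparison, a plausible reading is a sparse-coding problem of roughly the following form (a sketch only, with y the query sample, D the learned dictionary, x the representation, and λ1, λ2 the weights of the two norm terms; the exact staging in FRDC may differ):

\min_{x} \; \| y - D x \|_2^2 + \lambda_1 \| x \|_1 + \lambda_2 \| x \|_2^2

The two-stage solution described above would then amount to first solving the conventional l1-regularized subproblem and afterwards refining the representation with the l2 (ridge) term, rather than minimizing the combined objective jointly as the elastic net does.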

2014 ◽  
Vol 18 (1) ◽  
pp. 22-35 ◽  
Author(s):  
Domenico Celenza ◽  
Fabrizio Rossi

Purpose – The aim of this paper is to investigate the relationship between corporate performance and Value Added Intellectual Coefficient (VAIC™) on the one hand, and the relationship between the variations in market value and the variations in VAIC on the other hand. Design/methodology/approach – Starting from the VAIC model, 23 Italian listed companies were examined with the aim of investigating the relationship between VAIC and the performance of the firms in the sample. The analysis was divided into two stages. In the first stage, eight models of linear regression were estimated to verify the presence of a positive and statistically significant relationship between M/BV and VAIC and between accounting performance indicators (ROE, ROI, ROS) and the VAIC. In the second stage, six other models were tested, considering as an independent variable the variations in VAIC and the variations in profitability indicators. Findings – The outcomes of the application stress the importance of VAIC in the explanation of the variations in MV and its role as “additional coefficient” in the analysis of equity performance. Originality/value – This methodology highlights some very interesting aspects. In particular, whereas the relationship between M/BV and VAIC and between profitability indicators (ROI, ROE, ROS) and VAIC is statistically insignificant, the subsequent analysis highlights the importance of VAIC as a variable capable of increasing the explanatory power of the regression in a cross-sectional perspective.
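For reference, the abstract presupposes Pulic's VAIC model, which is commonly stated as follows (VA: value added; HC: human capital; CE: capital employed; this is the standard formulation from the VAIC literature, not taken from the paper itself):

VA = OUT - IN, \qquad HCE = \frac{VA}{HC}, \qquad SCE = \frac{VA - HC}{VA}, \qquad CEE = \frac{VA}{CE}

VAIC = HCE + SCE + CEE

The first-stage regressions described above are then presumably of the form M/BV_i = \alpha + \beta \, VAIC_i + \varepsilon_i, with ROE, ROI and ROS substituted for M/BV in the accounting-performance models.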


2020 ◽  
Vol 34 (04) ◽  
pp. 6917-6924 ◽  
Author(s):  
Ya Zhao ◽  
Rui Xu ◽  
Xinchao Wang ◽  
Peng Hou ◽  
Haihong Tang ◽  
...  

Lip reading has witnessed unparalleled development in recent years thanks to deep learning and the availability of large-scale datasets. Despite the encouraging results achieved, the performance of lip reading unfortunately remains inferior to that of its counterpart, speech recognition, due to the ambiguous nature of lip actuations, which makes it challenging to extract discriminant features from lip movement videos. In this paper, we propose a new method, termed Lip by Speech (LIBS), whose goal is to strengthen lip reading by learning from speech recognizers. The rationale behind our approach is that the features extracted by speech recognizers may provide complementary and discriminant clues, which are hard to obtain from the subtle movements of the lips, and consequently facilitate the training of lip readers. This is achieved, specifically, by distilling multi-granularity knowledge from speech recognizers to lip readers. To conduct this cross-modal knowledge distillation, we utilize an efficacious alignment scheme to handle the inconsistent lengths of the audios and videos, as well as an innovative filtering strategy to refine the speech recognizer's predictions. The proposed method achieves new state-of-the-art performance on the CMLR and LRS2 datasets, outperforming the baseline by margins of 7.66% and 2.75% in character error rate, respectively.
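As a minimal sketch of the cross-modal distillation idea (all function and tensor names here are hypothetical; LIBS's actual multi-granularity losses, alignment scheme and filtering strategy differ in detail), the teacher's audio features can be interpolated to the video length and matched at both sequence and frame granularity:

import torch
import torch.nn.functional as F

def distillation_loss(video_feats, audio_feats, logits, targets, alpha=0.5):
    # video_feats: (T_v, d) student lip-reader features
    # audio_feats: (T_a, d) teacher speech-recognizer features
    # Align teacher features to the video length by linear interpolation.
    aligned = F.interpolate(audio_feats.t().unsqueeze(0), size=video_feats.shape[0],
                            mode='linear', align_corners=False).squeeze(0).t()
    # Sequence-level term: match mean-pooled utterance embeddings.
    seq_term = F.mse_loss(video_feats.mean(0), aligned.mean(0))
    # Frame-level term: match per-frame features after alignment.
    frame_term = F.mse_loss(video_feats, aligned)
    # Task loss: standard cross-entropy on character predictions.
    ce = F.cross_entropy(logits, targets)
    return ce + alpha * (seq_term + frame_term)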


2014 ◽  
Vol 71 (7) ◽  
pp. 2415-2429 ◽  
Author(s):  
Jacob P. Edman ◽  
David M. Romps

Abstract A new formulation of the weak pressure gradient approximation (WPG) is introduced for parameterizing large-scale dynamics in limited-domain atmospheric models. This new WPG is developed in the context of the one-dimensional, linearized, damped, shallow-water equations and then extended to Boussinesq and compressible fluids. Unlike previous supradomain-scale parameterizations, this formulation of WPG correctly reproduces both steady-state solutions and first baroclinic gravity waves. In so doing, this scheme eliminates the undesirable gravity wave resonance in previous versions of WPG. In addition, this scheme can be extended to accurately model the emission of gravity waves with arbitrary vertical wavenumber.
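For context, the one-dimensional, linearized, damped, shallow-water equations referred to above take the standard form (with Rayleigh damping rate α, gravitational acceleration g, and mean depth H; the paper's exact damping configuration may differ):

\partial_t u = -g \, \partial_x h - \alpha u, \qquad \partial_t h = -H \, \partial_x u - \alpha h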


2021 ◽  
Author(s):  
Manuel Fritz ◽  
Michael Behringer ◽  
Dennis Tschechlov ◽  
Holger Schwarz

Abstract: Clustering is a fundamental primitive in manifold applications. In order to achieve valuable results in exploratory clustering analyses, parameters of the clustering algorithm have to be set appropriately, which is a tremendous pitfall. We observe multiple challenges for large-scale exploration processes. On the one hand, they require specific methods to efficiently explore large parameter search spaces. On the other hand, they often exhibit large runtimes, in particular when large datasets are analyzed using clustering algorithms with super-polynomial runtimes, which repeatedly need to be executed within exploratory clustering analyses. We address these challenges as follows: First, we present LOG-Means and show that it provides estimates for the number of clusters in sublinear time regarding the defined search space, i.e., provably requiring fewer executions of a clustering algorithm than existing methods. Second, we demonstrate how to exploit fundamental characteristics of exploratory clustering analyses in order to significantly accelerate the (repetitive) execution of clustering algorithms on large datasets. Third, we show how these challenges can be tackled at the same time. To the best of our knowledge, this is the first work to address the above-mentioned challenges simultaneously. In our comprehensive evaluation, we show that our proposed methods significantly outperform state-of-the-art methods, thus especially supporting novice analysts in exploratory clustering analyses within large-scale exploration processes.
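As an illustration of the idea behind sublinear-time estimation of the number of clusters (a simplified stand-in, not the published LOG-Means algorithm), one can binary-search the k range and execute k-means at only a logarithmic number of candidate values:

from sklearn.cluster import KMeans

def estimate_k(X, k_min=2, k_max=256):
    # Evaluate k-means at a logarithmic number of candidate k values,
    # moving toward the half of the range with the steeper drop in SSE
    # per added cluster. Simplified illustration only.
    def sse(k):
        return KMeans(n_clusters=k, n_init=3, random_state=0).fit(X).inertia_
    lo, hi = k_min, k_max
    cache = {lo: sse(lo), hi: sse(hi)}
    while hi - lo > 1:
        mid = (lo + hi) // 2
        cache[mid] = sse(mid)
        left_gain = (cache[lo] - cache[mid]) / (mid - lo)
        right_gain = (cache[mid] - cache[hi]) / (hi - mid)
        if left_gain >= right_gain:
            hi = mid
        else:
            lo = mid
    return lo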


Author(s):  
Surabhi Kumari

Abstract: MPC (multi-party computation) is a comprehensive cryptographic concept that can be used to perform computations while preserving privacy. MPC allows a group of parties to jointly compute a function without revealing the true inputs or outputs in plaintext. Privacy-preserving voting, arithmetic calculation, and large-scale data processing are just a few of the applications of MPC. From a system perspective, each MPC party can run on a single computing node. The computing nodes of multiple parties can be homogeneous or heterogeneous; nevertheless, the distributed workloads of MPC protocols are always homogeneous (symmetric). In this paper, we investigate the system performance of a representative MPC framework and a collection of MPC applications. On homogeneous and heterogeneous compute nodes, we describe the complete online calculation workflow of a state-of-the-art MPC protocol and examine the fundamental causes of its stall time and performance limitations. Keywords: Cloud Computing, IoT, MPC, Amazon Service, Virtualization.
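The abstract does not name a specific protocol; as a minimal illustration of the MPC idea, additive secret sharing lets parties compute a sum without any party seeing the others' inputs in plaintext (a sketch only; practical MPC protocols add authentication, multiplication triples, and more):

import secrets

PRIME = 2**61 - 1  # field modulus, chosen here purely for illustration

def share(x, n=3):
    # Split secret x into n additive shares that sum to x mod PRIME.
    shares = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

# Each party adds its two shares locally; the recombined sums reveal
# x + y without exposing x or y individually.
xs, ys = share(42), share(100)
zs = [(a + b) % PRIME for a, b in zip(xs, ys)]
assert reconstruct(zs) == 142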


Author(s):  
G. Aparna et al.

In this paper, an approach for secure digital image transmission with watermarking is proposed. The tremendous growth in technology for various applications demands secured communication across wireless channels. Secure image transmission is one of the prominent processes in digital communication applications. A watermark is embedded into the image data that is to be protected from unauthorized users. The cryptographic algorithms chosen for secure transmission led to the need for hardware implementation. In the proposed approach, a turbo encoder is used for error correction during secure image transmission. The approach is realized in hardware, and the digital logic size, area and power consumption are evaluated using Xilinx 14.2 software. The Verilog code is synthesized and implemented on the target device xc6slx150-2fgg484 to obtain timing constraints, device utilization and performance details.
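The abstract does not specify the embedding scheme; as a minimal illustration of watermark embedding (hypothetical, and without the cryptographic and turbo-coding layers described above), least-significant-bit (LSB) insertion in a grayscale image looks like this:

import numpy as np

def embed_lsb(image, watermark_bits):
    # Embed a bit sequence into the least-significant bits of a
    # grayscale uint8 image. Illustrative sketch only.
    flat = image.flatten()  # flatten() returns a copy
    if len(watermark_bits) > flat.size:
        raise ValueError("watermark too large for cover image")
    for i, bit in enumerate(watermark_bits):
        flat[i] = (flat[i] & 0xFE) | bit
    return flat.reshape(image.shape)

def extract_lsb(image, n_bits):
    return [int(p & 1) for p in image.flatten()[:n_bits]]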


2021 ◽  
Author(s):  
Daniel Padilla ◽  
Hatem A. Rashwan ◽  
Domènec Savi Puig

Deep learning (DL) networks have proven to be crucial in commercial solutions to computer vision challenges due to their ability to extract high-level abstractions from image data and their capability to be easily adapted to many applications. As a result, DL methodologies have become a de facto standard for computer vision problems, yielding many new kinds of research, approaches and applications. Recently, the commercial sector has also been driving the use of embedded systems able to execute DL models, which has caused an important change in the DL panorama and in embedded systems themselves. Consequently, in this paper, we study the state of the art of embedded systems, such as GPUs, FPGAs and Mobile SoCs, that are able to run DL techniques, in order to update stakeholders on the new systems available on the market. We also aim to help them determine which of these systems can be beneficial and suitable for their applications in terms of upgradeability, price, deployment and performance.


2013 ◽  
Vol 2013 ◽  
pp. 1-11 ◽  
Author(s):  
Lukasz Zwolinski ◽  
Marta Kozak ◽  
Karol Kozak

Technological advancements are constantly increasing the size and complexity of data resulting from large-scale RNA interference screens. This fact has led biologists to ask complex questions, which the existing, fully automated analyses are often not adequate to answer. We present the concept of 1Click1View (1C1V) as a methodology for interactive analytic software tools. 1C1V can be applied for two-dimensional visualization of image-based screening data sets from High Content Screening (HCS). Through an easy-to-use interface, a one-click, one-view concept, and a workflow-based architecture, the visualization method facilitates the linking of image data with numeric data. The method utilizes state-of-the-art interactive visualization tools optimized for fast visualization of large-scale image data sets. We demonstrate our method on an HCS dataset consisting of multiple cell features from two screening assays.


AI ◽  
2021 ◽  
Vol 2 (4) ◽  
pp. 684-704
Author(s):  
Karen Panetta ◽  
Landry Kezebou ◽  
Victor Oludare ◽  
James Intriligator ◽  
Sos Agaian

The concept of searching for and localizing vehicles in live traffic videos based on descriptive textual input has yet to be explored in the scholarly literature. Endowing Intelligent Transportation Systems (ITS) with such a capability could help solve crimes on roadways. One major impediment to the advancement of fine-grain vehicle recognition models is the lack of video testbench datasets with annotated ground truth data. Additionally, to the best of our knowledge, no metrics currently exist for evaluating the robustness and performance efficiency of a vehicle recognition model on live videos, and even less so for vehicle search and localization models. In this paper, we address these challenges by proposing V-Localize, a novel artificial intelligence framework for vehicle search and continuous localization in live traffic videos based on input textual descriptions. An efficient hashgraph algorithm is introduced to compute valid target information from textual input. This work further introduces two novel datasets to advance AI research in these challenging areas. These datasets include (a) the most diverse and large-scale Vehicle Color Recognition (VCoR) dataset with 15 color classes—twice as many as the number of color classes in the largest existing such dataset—to facilitate finer-grain recognition with color information; and (b) a Vehicle Recognition in Video (VRiV) dataset, a first-of-its-kind video testbench dataset for evaluating the performance of vehicle recognition models on live videos rather than still image data. The VRiV dataset will open new avenues for AI researchers to investigate innovative approaches that were previously intractable due to the lack of annotated traffic vehicle recognition video testbench datasets. Finally, to address this gap in the field, five novel metrics are introduced in this paper for adequately assessing the performance of vehicle recognition models on live videos. Ultimately, the proposed metrics could also prove intuitively effective for quantitative model evaluation in other video recognition applications. One major advantage of the proposed vehicle search and continuous localization framework is that it could be integrated into ITS software solutions to aid law enforcement, especially in critical cases such as Amber Alerts or hit-and-run incidents.
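The hashgraph algorithm itself is not detailed in the abstract; purely as an illustration of mapping free-text descriptions to structured target attributes via constant-time lookups, a naive version might look like this (all vocabularies below are hypothetical):

# Hypothetical attribute vocabularies; V-Localize's actual hashgraph
# construction is not specified in the abstract.
COLORS = {"red", "blue", "white", "black", "silver"}
TYPES = {"sedan", "suv", "truck", "van"}
MAKES = {"toyota", "honda", "ford"}

def parse_query(text):
    # Map a free-text description to target attributes with one
    # hash lookup per token.
    target = {"color": None, "type": None, "make": None}
    for token in text.lower().split():
        if token in COLORS:
            target["color"] = token
        elif token in TYPES:
            target["type"] = token
        elif token in MAKES:
            target["make"] = token
    return target

# parse_query("red toyota sedan heading north")
#   -> {'color': 'red', 'type': 'sedan', 'make': 'toyota'}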


Sensors ◽  
2019 ◽  
Vol 19 (14) ◽  
pp. 3215 ◽  
Author(s):  
Ana Stojkovic ◽  
Ivana Shopovska ◽  
Hiep Luong ◽  
Jan Aelterman ◽  
Ljubomir Jovanov ◽  
...  

Interpolation from a Color Filter Array (CFA) is the most common method for obtaining full color image data. Its success relies on the smart combination of a CFA and a demosaicing algorithm. Demosaicing, on the one hand, has been extensively studied. Algorithmic development in the past 20 years ranges from simple linear interpolation to modern neural-network-based (NN) approaches that encode the prior knowledge of millions of training images to fill in missing data in an inconspicuous way. CFA design, on the other hand, is less well studied, although it is still recognized to strongly impact demosaicing performance. This is because demosaicing algorithms are typically limited to one particular CFA pattern, impeding straightforward CFA comparison. This is starting to change with newer classes of demosaicing that may be considered generic or CFA-agnostic. In this study, by comparing the performance of two state-of-the-art generic algorithms, we evaluate the potential of modern CFA-demosaicing. We test the hypothesis that, with the increasing power of NN-based demosaicing, the influence of optimal CFA design on system performance decreases. This hypothesis is supported by the experimental results. Such a finding would herald the possibility of relaxing CFA requirements, providing more freedom in the choice of CFA design and producing high-quality cameras.
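The simple linear interpolation baseline mentioned above can be sketched for the common Bayer RGGB pattern as follows (a minimal illustration, not one of the two generic algorithms compared in the study):

import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear(raw):
    # Bilinear demosaicing of a Bayer RGGB mosaic (H x W float array).
    H, W = raw.shape
    r_mask = np.zeros((H, W)); r_mask[0::2, 0::2] = 1
    b_mask = np.zeros((H, W)); b_mask[1::2, 1::2] = 1
    g_mask = 1 - r_mask - b_mask
    # Interpolation kernels: green is sampled on a checkerboard,
    # red and blue on quarter-resolution grids.
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
    def interp(mask, kernel):
        return convolve(raw * mask, kernel, mode='mirror')
    return np.dstack([interp(r_mask, k_rb), interp(g_mask, k_g), interp(b_mask, k_rb)])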

