Large-Scale Multi-modal Distance Metric Learning with Application to Content-Based Information Retrieval and Image Classification

Author(s):  
Ali Salim Rasheed ◽  
Davood Zabihzadeh ◽  
Sumia Abdulhussien Razooqi Al-Obaidi

Metric learning algorithms aim to bring conceptually related data items closer together while keeping dissimilar ones apart. The most common approach to metric learning is the Mahalanobis method. Despite its success, this method is limited to learning a linear projection and also suffers from poor scalability with respect to both the dimensionality and the size of the input data. To address these problems, this paper presents a new scalable metric learning algorithm for multi-modal data. Our method learns an optimal metric for each feature set of the multi-modal data in an online fashion. We also combine the learned metrics with a novel Passive/Aggressive (PA)-based algorithm, which yields a higher convergence rate than state-of-the-art methods. To address scalability with respect to dimensionality, Dual Random Projection (DRP) is adopted. The proposed method is evaluated on several challenging machine vision datasets for image classification and Content-Based Information Retrieval (CBIR) tasks. The experimental results confirm that it significantly surpasses other state-of-the-art metric learning methods on most of these datasets in terms of both accuracy and efficiency.
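
As a rough illustration of the PA-based online update idea described above, the following sketch performs one Passive/Aggressive step on a Mahalanobis matrix for a labeled pair; the unit margin, the step-size cap, and the PSD projection are illustrative assumptions, not the paper's exact rule.

import numpy as np

def pa_metric_update(M, x1, x2, y, C=1.0):
    """One Passive/Aggressive-style update of a Mahalanobis matrix M.

    y = +1 for a similar pair, -1 for a dissimilar pair.
    Illustrative sketch only, not the authors' exact update rule.
    """
    z = x1 - x2
    d = z @ M @ z                                     # squared Mahalanobis distance
    loss = max(0.0, y * (d - 1.0) + 1.0)              # hinge loss around a unit margin
    if loss == 0.0:
        return M                                      # passive step: constraint satisfied
    tau = min(C, loss / (np.dot(z, z) ** 2 + 1e-12))  # PA-I step size
    M = M - tau * y * np.outer(z, z)                  # aggressive step along the (sub)gradient
    # project back onto the PSD cone so M stays a valid metric
    w, V = np.linalg.eigh(M)
    return (V * np.clip(w, 0.0, None)) @ V.T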

2021 ◽  
Vol 15 (3) ◽  
pp. 1-28
Author(s):  
Xueyan Liu ◽  
Bo Yang ◽  
Hechang Chen ◽  
Katarzyna Musial ◽  
Hongxu Chen ◽  
...  

The stochastic blockmodel (SBM) is a widely used statistical network representation model with good interpretability, expressiveness, generalization, and flexibility, and it has become prevalent and important in the field of network science in recent years. However, learning an optimal SBM for a given network is an NP-hard problem. This results in significant limitations for applications of SBMs to large-scale networks, because of the substantial computational overhead of existing SBM models and their learning methods. Reducing the cost of SBM learning and making it scalable to large-scale networks, while maintaining the good theoretical properties of the SBM, remains an unresolved problem. In this work, we address this challenging task from the novel perspective of model redefinition. We propose a redefined SBM with Poisson distribution and a block-wise learning algorithm that can efficiently analyse large-scale networks. Extensive validation on both artificial and real-world data shows that our proposed method significantly outperforms state-of-the-art methods in terms of a reasonable trade-off between accuracy and scalability.
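
For intuition on what a Poisson-based SBM optimises, here is a minimal sketch of its log-likelihood given a block assignment; the paper's block-wise learning algorithm optimises an equivalent objective far more efficiently, and the formulation below is an illustrative assumption rather than the authors' code.

import numpy as np

def poisson_sbm_loglik(A, z, K):
    """Log-likelihood of a Poisson SBM: A[i, j] ~ Poisson(lam[z[i], z[j]]).

    A is a (possibly weighted) adjacency matrix, z the block assignment
    vector, K the number of blocks. Illustrative sketch only.
    """
    Z = np.eye(K)[z]                        # n x K one-hot block assignments
    edges = Z.T @ A @ Z                     # observed edge mass between blocks
    pairs = np.outer(Z.sum(0), Z.sum(0))    # number of node pairs between blocks
    lam = edges / np.maximum(pairs, 1)      # maximum-likelihood block rates
    # Poisson log-likelihood up to a constant (ignoring the log A_ij! terms)
    return np.sum(edges * np.log(np.maximum(lam, 1e-12)) - pairs * lam)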


Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4975
Author(s):  
Fangyu Shi ◽  
Zhaodi Wang ◽  
Menghan Hu ◽  
Guangtao Zhai

Relying on large-scale labeled datasets, deep learning has achieved good performance in image classification tasks. In agricultural and biological engineering, however, image annotation is time-consuming and expensive, and it requires annotators with technical skills in specific areas, which makes obtaining ground truth difficult and costly. In addition, images in these areas are usually stored as multichannel images, such as computed tomography (CT) images, magnetic resonance images (MRI), and hyperspectral images (HSI). In this paper, we present a framework combining active learning and deep learning for multichannel image classification. We use three active learning algorithms, namely least confidence, margin sampling, and entropy, as the selection criteria. Based on this framework, we further introduce an "image pool" to take full advantage of the images generated by data augmentation. To demonstrate the applicability of the proposed framework, we present a case study on agricultural hyperspectral image classification. The results show that the proposed framework achieves better performance than the plain deep learning model. Manual annotation of the entire training set achieves an encouraging accuracy; in comparison, the entropy-based active learning criterion together with the image pool achieves similar accuracy with only part of the training set manually annotated. In practical applications, the proposed framework can remarkably reduce the labeling effort during model development and updating, and can be applied to multichannel image classification in agricultural and biological engineering.
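
The three selection criteria named above have standard formulations; the sketch below computes each of them from softmax outputs, with the function name and interface being our own illustrative choices rather than the paper's code.

import numpy as np

def acquisition_scores(probs, strategy="entropy"):
    """Rank unlabeled samples by predictive uncertainty.

    probs: (n_samples, n_classes) softmax outputs of the current model.
    Returns one score per sample; higher means "query this sample next".
    """
    if strategy == "least_confidence":
        return 1.0 - probs.max(axis=1)                 # low top-class confidence
    if strategy == "margin":
        top2 = np.sort(probs, axis=1)[:, -2:]
        return -(top2[:, 1] - top2[:, 0])              # small gap between top two classes
    if strategy == "entropy":
        return -(probs * np.log(probs + 1e-12)).sum(axis=1)
    raise ValueError(f"unknown strategy: {strategy}")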


2013 ◽  
Vol 21 (1) ◽  
pp. 3-47 ◽  
Author(s):  
IDAN SZPEKTOR ◽  
HRISTO TANEV ◽  
IDO DAGAN ◽  
BONAVENTURA COPPOLA ◽  
MILEN KOUYLEKOV

Entailment recognition is a primary generic task in natural language inference, whose focus is to detect whether the meaning of one expression can be inferred from the meaning of another. Accordingly, many NLP applications would benefit from high-coverage knowledgebases of paraphrases and entailment rules. To this end, learning such knowledgebases from the Web is especially appealing due to its huge size as well as its highly heterogeneous content, allowing for more scalable rule extraction across various domains. However, the scalability of state-of-the-art entailment rule acquisition approaches from the Web is still limited. We present a fully unsupervised learning algorithm for Web-based extraction of entailment relations. We focus on increased scalability and generality with respect to prior work, with the potential of building a large-scale Web-based knowledgebase. Our algorithm takes as its input a lexical–syntactic template and searches the Web for syntactic templates that participate in an entailment relation with the input template. Experiments show promising results, achieving performance similar to a state-of-the-art unsupervised algorithm operating over an offline corpus, but with the benefit of learning rules for different domains with no additional effort.
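
A rough sketch of the anchor-based harvesting idea described above might look as follows; web_search, extract_pairs, and extract_paths are hypothetical placeholders standing in for a Web search API, an argument extractor, and a dependency-path extractor, and the frequency threshold is an arbitrary illustrative choice.

from collections import Counter

def harvest_candidate_templates(input_template, web_search, extract_pairs, extract_paths):
    """Toy sketch of Web-based entailment-rule harvesting for a template
    such as "X acquire Y". Not the paper's actual algorithm or code.
    """
    # 1. find anchor pairs (X, Y) that instantiate the input template on the Web
    anchors = set()
    for sentence in web_search(input_template):
        anchors.update(extract_pairs(sentence, input_template))
    # 2. look for other syntactic templates connecting the same anchor pairs
    counts = Counter()
    for x, y in anchors:
        for sentence in web_search(f'"{x}" "{y}"'):
            counts.update(extract_paths(sentence, x, y))
    # 3. templates that co-occur with many anchor pairs become candidate rules
    return [template for template, c in counts.most_common() if c >= 3]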


Author(s):  
Zhizheng Zhang ◽  
Cuiling Lan ◽  
Wenjun Zeng ◽  
Zhibo Chen ◽  
Shih-Fu Chang

Few-shot image classification learns to recognize new categories from limited labelled data. Metric learning based approaches have been widely investigated, where a query sample is classified by finding the nearest prototype in the support set based on feature similarities. A neural network has different degrees of uncertainty in the similarities it computes for different pairs. Understanding and modeling this uncertainty on the similarity could promote the exploitation of limited samples in few-shot optimization. In this work, we propose an Uncertainty-Aware Few-Shot framework for image classification that models the uncertainty of the similarities of query-support pairs and performs uncertainty-aware optimization. In particular, we exploit this uncertainty by converting observed similarities into probabilistic representations and incorporating them into the loss for more effective optimization. To jointly consider the similarities between a query and the prototypes in a support set, a graph-based model is utilized to estimate the uncertainty of the pairs. Extensive experiments show that our proposed method brings significant improvements on top of a strong baseline and achieves state-of-the-art performance.
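
One plausible (though not necessarily the authors') way to turn similarity uncertainty into an uncertainty-aware loss is to predict a log-variance for each query-prototype pair and down-weight uncertain pairs, as in the hedged sketch below.

import torch
import torch.nn.functional as F

def uncertainty_weighted_loss(similarities, log_vars, targets):
    """Sketch of an uncertainty-aware few-shot classification loss.

    similarities: (n_query, n_prototypes) query-prototype similarities.
    log_vars:     (n_query, n_prototypes) predicted log-variance of each
                  similarity (e.g. produced by a graph module over the episode).
    targets:      (n_query,) index of the correct prototype per query.
    Mirrors the idea in the abstract, not the authors' exact formulation.
    """
    precision = torch.exp(-log_vars)        # confidence of each pair
    logits = precision * similarities       # uncertain pairs contribute less
    ce = F.cross_entropy(logits, targets)
    return ce + 0.5 * log_vars.mean()       # discourage claiming high uncertainty everywhere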


Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 466 ◽  
Author(s):  
Yan Hua ◽  
Yingyun Yang ◽  
Jianhe Du

Multi-modal retrieval is challenging due to the heterogeneous gap and the complex semantic relationships between data of different modalities. Typical approaches map different modalities into a common subspace using a one-to-one correspondence or a similarity/dissimilarity relationship between inter-modal data, in which the distances between heterogeneous data can be compared directly; inter-modal retrieval can then be achieved by nearest neighbor search. However, most of them ignore intra-modal relations and the complicated semantics between multi-modal data. In this paper, we propose a deep multi-modal metric learning method with multi-scale semantic correlation for retrieval tasks between the image and text modalities. A deep model with two branches is designed to nonlinearly map raw heterogeneous data into comparable representations. In contrast to binary similarity, we formulate the semantic relationship as a multi-scale similarity to learn fine-grained multi-modal distances. Inter-modal and intra-modal correlations constructed on the multi-scale semantic similarity are incorporated to train the deep model in an end-to-end manner. Experiments validate the effectiveness of the proposed method on multi-modal retrieval tasks, and it outperforms state-of-the-art methods on the NUS-WIDE, MIR Flickr, and Wikipedia datasets.
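
To make the two-branch, multi-scale idea concrete, the following sketch maps image and text features into a shared space and regresses their cosine similarities onto a graded (label-overlap) target instead of a binary one; the layer sizes, the Jaccard-style target, and the squared-error loss are illustrative assumptions rather than the paper's exact design.

import torch
import torch.nn as nn

class TwoBranchEmbedding(nn.Module):
    """Minimal two-branch model for image-text metric learning (sketch only)."""
    def __init__(self, img_dim=4096, txt_dim=300, emb_dim=256):
        super().__init__()
        self.img_net = nn.Sequential(nn.Linear(img_dim, 1024), nn.ReLU(), nn.Linear(1024, emb_dim))
        self.txt_net = nn.Sequential(nn.Linear(txt_dim, 1024), nn.ReLU(), nn.Linear(1024, emb_dim))

    def forward(self, img_feat, txt_feat):
        return self.img_net(img_feat), self.txt_net(txt_feat)

def multiscale_similarity(labels_a, labels_b):
    """Graded semantic similarity between two multi-label sets,
    approximated here by label overlap (Jaccard) instead of a 0/1 value."""
    inter = (labels_a.unsqueeze(1) * labels_b.unsqueeze(0)).sum(-1)
    union = ((labels_a.unsqueeze(1) + labels_b.unsqueeze(0)) > 0).float().sum(-1)
    return inter / union.clamp(min=1)

def inter_modal_loss(img_emb, txt_emb, target_sim):
    """Regress cosine similarities of the embeddings onto the graded targets."""
    img_n = nn.functional.normalize(img_emb, dim=1)
    txt_n = nn.functional.normalize(txt_emb, dim=1)
    return ((img_n @ txt_n.t() - target_sim) ** 2).mean()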


1999 ◽  
Vol 09 (06) ◽  
pp. 1041-1074 ◽  
Author(s):  
TAO YANG ◽  
LEON O. CHUA

In a programmable (multistage) cellular neural network (CNN) structure, the CPU is a CNN universal chip that supports massively parallel computation on patterns and images, including videos. In this paper, we decompose the structure of a class of simultaneous recurrent networks (SRN) into a CNN program and run it on a von Neumann-like stored-program CNN structure. To train the SRN, we map the back-propagation-through-time (BTT) learning algorithm into a sequence of CNN subroutines to achieve real-time performance via a CNN universal chip. By computing in parallel, the CNN universal chip can be programmed to implement in real time the BTT learning algorithm, which has a very high time complexity. An estimate of the time complexity of the BTT learning algorithm based on the CNN universal chip is presented. For small-scale problems, our simulation results show that a CNN implementation of the BTT learning algorithm for a two-dimensional SRN is at least 10,000 times faster than one based on state-of-the-art sequential workstations. For the few large-scale problems we have simulated so far, the CNN-implemented BTT learning algorithm maintained virtually the same time complexity, with a learning time of a few seconds, while implementations on state-of-the-art sequential workstations dramatically increased their running time, often requiring several days. Several examples are presented to demonstrate how efficiently a CNN universal chip can speed up the learning algorithm for both off-line and on-line applications.
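
For reference, the sequential algorithm being parallelised is ordinary back-propagation-through-time; the sketch below shows one training step for a tiny recurrent network with tanh units and squared error, purely as a generic illustration of BTT rather than the SRN or its CNN-chip mapping.

import numpy as np

def btt_step(x_seq, targets, W_in, W_rec, W_out, lr=0.01):
    """One back-propagation-through-time step for a small recurrent net.
    Shapes, nonlinearity, and loss are illustrative assumptions.
    """
    T = len(x_seq)
    h = [np.zeros(W_rec.shape[0])]
    # forward pass through time
    for t in range(T):
        h.append(np.tanh(W_in @ x_seq[t] + W_rec @ h[-1]))
    y = [W_out @ h[t + 1] for t in range(T)]
    # backward pass through time, accumulating gradients
    dW_in = np.zeros_like(W_in)
    dW_rec = np.zeros_like(W_rec)
    dW_out = np.zeros_like(W_out)
    dh_next = np.zeros(W_rec.shape[0])
    for t in reversed(range(T)):
        dy = y[t] - targets[t]                 # squared-error gradient
        dW_out += np.outer(dy, h[t + 1])
        dh = W_out.T @ dy + dh_next
        dz = dh * (1.0 - h[t + 1] ** 2)        # tanh derivative
        dW_in += np.outer(dz, x_seq[t])
        dW_rec += np.outer(dz, h[t])
        dh_next = W_rec.T @ dz
    # gradient-descent update of all weight matrices
    return W_in - lr * dW_in, W_rec - lr * dW_rec, W_out - lr * dW_out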


2019 ◽  
Vol 53 (2) ◽  
pp. 104-105
Author(s):  
Hamed Zamani

Recent developments in machine learning models, and in particular deep neural networks, have yielded significant improvements on several computer vision, natural language processing, and speech recognition tasks. Progress on information retrieval (IR) tasks has been slower, however, due to the lack of large-scale training data as well as neural network models specifically designed for effective information retrieval [9]. In this dissertation, we address these two issues by introducing task-specific neural network architectures for a set of IR tasks and proposing novel unsupervised or weakly supervised solutions for training the models. The proposed learning solutions do not require labeled training data. Instead, in our weak supervision approach, neural models are trained on a large set of noisy and biased training data obtained from external resources, existing models, or heuristics. We first introduce relevance-based embedding models [3] that learn distributed representations for words and queries. We show that the learned representations can be effectively employed for a set of IR tasks, including query expansion, pseudo-relevance feedback, and query classification [1, 2]. We further propose a standalone learning-to-rank model based on deep neural networks [5, 8]. Our model learns a sparse representation for queries and documents. This enables us to perform efficient retrieval by constructing an inverted index in the learned semantic space. Our model outperforms state-of-the-art retrieval models, while performing as efficiently as term matching retrieval models. We additionally propose a neural network framework for predicting the performance of a retrieval model for a given query [7]. Inspired by existing query performance prediction models, our framework integrates several information sources, such as the retrieval score distribution and the term distribution in the top retrieved documents. This leads to state-of-the-art results for the performance prediction task on various standard collections. We finally bridge the gap between retrieval and recommendation models, as the two key components of most information systems. Search and recommendation often share the same goal: helping people get the information they need at the right time. Therefore, joint modeling and optimization of search engines and recommender systems could potentially benefit both systems [4]. In more detail, we introduce a retrieval model that is trained using user-item interactions (e.g., recommendation data), with no need for query-document relevance information during training [6]. Our solutions and findings in this dissertation smooth the path towards learning efficient and effective models for various information retrieval and related tasks, especially when large-scale training data is not available.
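
The sparse-representation retrieval idea in [5, 8] can be pictured as building an inverted index over the non-zero latent dimensions of the learned representations; the sketch below is a conceptual illustration with plain dictionaries standing in for the actual data structures, not the dissertation's implementation.

from collections import defaultdict

def build_inverted_index(doc_sparse_vecs):
    """Index documents by the non-zero dimensions of their learned sparse
    representations, so retrieval works like term matching in a latent space.
    doc_sparse_vecs: {doc_id: {latent_dim: weight, ...}, ...}
    """
    index = defaultdict(list)               # latent dim -> [(doc_id, weight), ...]
    for doc_id, vec in doc_sparse_vecs.items():
        for dim, weight in vec.items():
            if weight > 0:
                index[dim].append((doc_id, weight))
    return index

def retrieve(query_sparse_vec, index, k=10):
    """Score documents by the sparse dot product, touching only the postings
    for the query's active latent dimensions."""
    scores = defaultdict(float)
    for dim, q_w in query_sparse_vec.items():
        for doc_id, d_w in index.get(dim, []):
            scores[doc_id] += q_w * d_w
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]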


Author(s):  
Emanuele Fumeo ◽  
Luca Oneto ◽  
Giorgio Clerico ◽  
Renzo Canepa ◽  
Federico Papa ◽  
...  

Current Train Delay Prediction Systems (TDPSs) do not take advantage of state-of-the-art tools and techniques for extracting useful insights from the large amounts of historical data collected by railway information systems. Instead, these systems rely on static rules, based on classical univariate statistics, built by experts of the railway infrastructure. The purpose of this book chapter is to build a data-driven TDPS for large-scale railway networks that exploits the most recent big data technologies, learning algorithms, and statistical tools. In particular, we propose a fast learning algorithm for Shallow and Deep Extreme Learning Machines that fully exploits recent in-memory large-scale data processing technologies for predicting train delays. The proposal has been compared with current state-of-the-art TDPSs. Results on real-world data from the Italian railway network show that our proposal is able to improve over the current state-of-the-art TDPSs.
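
As a reminder of the building block being scaled up, a single-hidden-layer Extreme Learning Machine fixes random hidden weights and solves a ridge regression for the output weights; the sketch below is a minimal illustration, with the hidden size and regularisation chosen arbitrarily and no relation to the chapter's in-memory implementation.

import numpy as np

def train_elm(X, y, n_hidden=512, reg=1e-2, seed=0):
    """Train a basic Extreme Learning Machine: random fixed hidden layer,
    closed-form ridge-regression solve for the output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                              # random hidden features
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta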

