EPOC: Efficient Perception via Optimal Communication

Masoumeh Heidari Kapourchali; Bonny Banerjee

doi:10.1609/aaai.v34i04.5830

EPOC: Efficient Perception via Optimal Communication

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5830 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4107-4114 ◽

Cited By ~ 1

Author(s):

Masoumeh Heidari Kapourchali ◽

Bonny Banerjee

Keyword(s):

Recognition Accuracy ◽

State Of The Art ◽

The State ◽

Belief State ◽

Sensor Data ◽

Communication Policy ◽

Agent Model ◽

Proposed Model ◽

Benchmark Datasets ◽

Communication Policies

We propose an agent model capable of actively and selectively communicating with other agents to predict its environmental state efficiently. Selecting whom to communicate with is a challenge when the internal model of other agents is unobservable. Our agent learns a communication policy as a mapping from its belief state to with whom to communicate in an online and unsupervised manner, without any reinforcement. Human activity recognition from multimodal, multisource and heterogeneous sensor data is used as a testbed to evaluate the proposed model where each sensor is assumed to be monitored by an agent. The recognition accuracy on benchmark datasets is comparable to the state-of-the-art even though our model uses significantly fewer parameters and infers the state in a localized manner. The learned policy reduces number of communications. The agent is tolerant to communication failures and can recognize unreliable agents through their communication messages. To the best of our knowledge, this is the first work on learning communication policies by an agent for predicting its environmental state.

Download Full-text

Representation Learning for Fine-Grained Change Detection

Sensors ◽

10.3390/s21134486 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4486

Author(s):

Niall O’Mahony ◽

Sean Campbell ◽

Lenka Krpalkova ◽

Anderson Carvalho ◽

Joseph Walsh ◽

...

Keyword(s):

Deep Learning ◽

Change Detection ◽

Model Calibration ◽

State Of The Art ◽

Representation Learning ◽

Machine Intelligence ◽

The State ◽

Sensor Data ◽

Fine Grained ◽

Learning Techniques

Fine-grained change detection in sensor data is very challenging for artificial intelligence though it is critically important in practice. It is the process of identifying differences in the state of an object or phenomenon where the differences are class-specific and are difficult to generalise. As a result, many recent technologies that leverage big data and deep learning struggle with this task. This review focuses on the state-of-the-art methods, applications, and challenges of representation learning for fine-grained change detection. Our research focuses on methods of harnessing the latent metric space of representation learning techniques as an interim output for hybrid human-machine intelligence. We review methods for transforming and projecting embedding space such that significant changes can be communicated more effectively and a more comprehensive interpretation of underlying relationships in sensor data is facilitated. We conduct this research in our work towards developing a method for aligning the axes of latent embedding space with meaningful real-world metrics so that the reasoning behind the detection of change in relation to past observations may be revealed and adjusted. This is an important topic in many fields concerned with producing more meaningful and explainable outputs from deep learning and also for providing means for knowledge injection and model calibration in order to maintain user confidence.

Download Full-text

Named Entity Recognition and Relation Extraction

ACM Computing Surveys ◽

10.1145/3445965 ◽

2021 ◽

Vol 54 (1) ◽

pp. 1-39

Author(s):

Zara Nasar ◽

Syed Waqar Jaffry ◽

Muhammad Kamran Malik

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Named Entity Recognition ◽

Relation Extraction ◽

The State ◽

Entity Recognition ◽

Joint Models ◽

Named Entity ◽

Textual Data ◽

Benchmark Datasets

With the advent of Web 2.0, there exist many online platforms that result in massive textual-data production. With ever-increasing textual data at hand, it is of immense importance to extract information nuggets from this data. One approach towards effective harnessing of this unstructured textual data could be its transformation into structured text. Hence, this study aims to present an overview of approaches that can be applied to extract key insights from textual data in a structured way. For this, Named Entity Recognition and Relation Extraction are being majorly addressed in this review study. The former deals with identification of named entities, and the latter deals with problem of extracting relation between set of entities. This study covers early approaches as well as the developments made up till now using machine learning models. Survey findings conclude that deep-learning-based hybrid and joint models are currently governing the state-of-the-art. It is also observed that annotated benchmark datasets for various textual-data generators such as Twitter and other social forums are not available. This scarcity of dataset has resulted into relatively less progress in these domains. Additionally, the majority of the state-of-the-art techniques are offline and computationally expensive. Last, with increasing focus on deep-learning frameworks, there is need to understand and explain the under-going processes in deep architectures.

Download Full-text

Large-scale Semantic Parsing without Question-Answer Pairs

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00190 ◽

2014 ◽

Vol 2 ◽

pp. 377-392 ◽

Cited By ~ 40

Author(s):

Siva Reddy ◽

Mirella Lapata ◽

Mark Steedman

Keyword(s):

Natural Language ◽

Large Scale ◽

Graph Matching ◽

State Of The Art ◽

The State ◽

Semantic Parsing ◽

Matching Problem ◽

Weak Supervision ◽

Benchmark Datasets

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.

Download Full-text

Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6284 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7797-7804

Author(s):

Goran Glavašš ◽

Swapna Somasundaran

Keyword(s):

State Of The Art ◽

Language Transfer ◽

Text Segmentation ◽

Word Embeddings ◽

Neural Architecture ◽

Text Coherence ◽

Sentence Level ◽

Proposed Model ◽

Benchmark Datasets ◽

Cross Lingual

Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval. Starting from an apparent link between text coherence and segmentation, we introduce a novel supervised model for text segmentation with simple but explicit coherence modeling. Our model – a neural architecture consisting of two hierarchically connected Transformer networks – is a multi-task learning model that couples the sentence-level segmentation objective with the coherence objective that differentiates correct sequences of sentences from corrupt ones. The proposed model, dubbed Coherence-Aware Text Segmentation (CATS), yields state-of-the-art segmentation performance on a collection of benchmark datasets. Furthermore, by coupling CATS with cross-lingual word embeddings, we demonstrate its effectiveness in zero-shot language transfer: it can successfully segment texts in languages unseen in training.

Download Full-text

A Comprehensive Taxonomy of Dynamic Texture Representation

ACM Computing Surveys ◽

10.1145/3487892 ◽

2023 ◽

Vol 55 (1) ◽

pp. 1-39

Author(s):

Thanh Tuan Nguyen ◽

Thanh Phuong Nguyen

Keyword(s):

Large Scale ◽

Environmental Changes ◽

State Of The Art ◽

The State ◽

Future Research ◽

Research Activities ◽

Potential Applications ◽

Benchmark Datasets ◽

Negative Impacts ◽

Made In

Representing dynamic textures (DTs) plays an important role in many real implementations in the computer vision community. Due to the turbulent and non-directional motions of DTs along with the negative impacts of different factors (e.g., environmental changes, noise, illumination, etc.), efficiently analyzing DTs has raised considerable challenges for the state-of-the-art approaches. For 20 years, many different techniques have been introduced to handle the above well-known issues for enhancing the performance. Those methods have shown valuable contributions, but the problems have been incompletely dealt with, particularly recognizing DTs on large-scale datasets. In this article, we present a comprehensive taxonomy of DT representation in order to purposefully give a thorough overview of the existing methods along with overall evaluations of their obtained performances. Accordingly, we arrange the methods into six canonical categories. Each of them is then taken in a brief presentation of its principal methodology stream and various related variants. The effectiveness levels of the state-of-the-art methods are then investigated and thoroughly discussed with respect to quantitative and qualitative evaluations in classifying DTs on benchmark datasets. Finally, we point out several potential applications and the remaining challenges that should be addressed in further directions. In comparison with two existing shallow DT surveys (i.e., the first one is out of date as it was made in 2005, while the newer one (published in 2016) is an inadequate overview), we believe that our proposed comprehensive taxonomy not only provides a better view of DT representation for the target readers but also stimulates future research activities.

Download Full-text

SANTM: Efficient Self-attention-driven Network for Text Matching

ACM Transactions on Internet Technology ◽

10.1145/3426971 ◽

2022 ◽

Vol 22 (3) ◽

pp. 1-21

Author(s):

Prayag Tiwari ◽

Amit Kumar Jaiswal ◽

Sahil Garg ◽

Ilsun You

Keyword(s):

Natural Language ◽

State Of The Art ◽

The State ◽

Attention Mechanism ◽

Matching Problems ◽

Attention Model ◽

Extra Information ◽

Textual Entailment ◽

Benchmark Datasets ◽

Text Matching

Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. Self-attention model takes only one sentence as an input with no extra information, i.e., one can utilize the final hidden state or pooling. However, text-matching problems can be interpreted either in symmetrical or asymmetrical scopes. For instance, paraphrase detection is an asymmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage attractive properties of self-attention mechanism and proposes an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features while updating the rest of the components. Our model follows evaluation on two benchmark datasets cover tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with much fewer parameters.

Download Full-text

Coupled CycleGAN: Unsupervised Hashing Network for Cross-Modal Retrieval

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.3301176 ◽

2019 ◽

Vol 33 ◽

pp. 176-183 ◽

Cited By ~ 11

Author(s):

Chao Li ◽

Cheng Deng ◽

Lei Wang ◽

De Xie ◽

Xianglong Liu

Keyword(s):

Large Scale ◽

State Of The Art ◽

The State ◽

Storage Cost ◽

Common Representation ◽

Benchmark Datasets ◽

Query Efficiency ◽

Hash Codes

In recent years, hashing has attracted more and more attention owing to its superior capacity of low storage cost and high query efficiency in large-scale cross-modal retrieval. Benefiting from deep leaning, continuously compelling results in cross-modal retrieval community have been achieved. However, existing deep cross-modal hashing methods either rely on amounts of labeled information or have no ability to learn an accuracy correlation between different modalities. In this paper, we proposed Unsupervised coupled Cycle generative adversarial Hashing networks (UCH), for cross-modal retrieval, where outer-cycle network is used to learn powerful common representation, and inner-cycle network is explained to generate reliable hash codes. Specifically, our proposed UCH seamlessly couples these two networks with generative adversarial mechanism, which can be optimized simultaneously to learn representation and hash codes. Extensive experiments on three popular benchmark datasets show that the proposed UCH outperforms the state-of-the-art unsupervised cross-modal hashing methods.

Download Full-text

Encoding Time Series as Multi-Scale Signed Recurrence Plots for Classification Using Fully Convolutional Networks

Sensors ◽

10.3390/s20143818 ◽

2020 ◽

Vol 20 (14) ◽

pp. 3818

Author(s):

Ye Zhang ◽

Yi Hou ◽

Shilin Zhou ◽

Kewei Ouyang

Keyword(s):

Time Series ◽

State Of The Art ◽

Recurrence Plot ◽

The State ◽

Recurrence Plots ◽

Convolutional Networks ◽

Multi Scale ◽

Fully Convolutional Networks ◽

Benchmark Datasets ◽

Visualization Evaluation

Recent advances in time series classification (TSC) have exploited deep neural networks (DNN) to improve the performance. One promising approach encodes time series as recurrence plot (RP) images for the sake of leveraging the state-of-the-art DNN to achieve accuracy. Such an approach has been shown to achieve impressive results, raising the interest of the community in it. However, it remains unsolved how to handle not only the variability in the distinctive region scale and the length of sequences but also the tendency confusion problem. In this paper, we tackle the problem using Multi-scale Signed Recurrence Plots (MS-RP), an improvement of RP, and propose a novel method based on MS-RP images and Fully Convolutional Networks (FCN) for TSC. This method first introduces phase space dimension and time delay embedding of RP to produce multi-scale RP images; then, with the use of asymmetrical structure, constructed RP images can represent very long sequences (>700 points). Next, MS-RP images are obtained by multiplying designed sign masks in order to remove the tendency confusion. Finally, FCN is trained with MS-RP images to perform classification. Experimental results on 45 benchmark datasets demonstrate that our method improves the state-of-the-art in terms of classification accuracy and visualization evaluation.

Download Full-text

Appearance and Motion Enhancement for Video-Based Person Re-Identification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6802 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11394-11401

Author(s):

Shuzhao Li ◽

Huimin Yu ◽

Haoji Hu

Keyword(s):

Semantic Information ◽

State Of The Art ◽

Complex Model ◽

The State ◽

Final Model ◽

Backbone Network ◽

Proposed Model ◽

Art Performance ◽

Attribute Recognition

In this paper, we propose an Appearance and Motion Enhancement Model (AMEM) for video-based person re-identification to enrich the two kinds of information contained in the backbone network in a more interpretable way. Concretely, human attribute recognition under the supervision of pseudo labels is exploited in an Appearance Enhancement Module (AEM) to help enrich the appearance and semantic information. A Motion Enhancement Module (MEM) is designed to capture the identity-discriminative walking patterns through predicting future frames. Despite a complex model with several auxiliary modules during training, only the backbone model plus two small branches are kept for similarity evaluation which constitute a simple but effective final model. Extensive experiments conducted on three popular video-based person ReID benchmarks demonstrate the effectiveness of our proposed model and the state-of-the-art performance compared with existing methods.

Download Full-text

From Data Acquisition to Data Fusion: A Comprehensive Review and a Roadmap for the Identification of Activities of Daily Living Using Mobile Devices

Sensors ◽

10.3390/s16020184 ◽

2016 ◽

Vol 16 (2) ◽

pp. 184 ◽

Cited By ~ 77

Author(s):

Ivan Pires ◽

Nuno Garcia ◽

Nuno Pombo ◽

Francisco Flórez-Revuelta

Keyword(s):

Data Fusion ◽

Activities Of Daily Living ◽

Mobile Devices ◽

Daily Living ◽

State Of The Art ◽

The State ◽

Sensor Data ◽

Battery Life ◽

Sensor Data Fusion ◽

Processing Power

This paper focuses on the research on the state of the art for sensor fusion techniques, applied to the sensors embedded in mobile devices, as a means to help identify the mobile device user’s daily activities. Sensor data fusion techniques are used to consolidate the data collected from several sensors, increasing the reliability of the algorithms for the identification of the different activities. However, mobile devices have several constraints, e.g., low memory, low battery life and low processing power, and some data fusion techniques are not suited to this scenario. The main purpose of this paper is to present an overview of the state of the art to identify examples of sensor data fusion techniques that can be applied to the sensors available in mobile devices aiming to identify activities of daily living (ADLs).

Download Full-text