MS-NET: modular selective network

Author(s):  
Intisar Md Chowdhury ◽  
Kai Su ◽  
Qiangfu Zhao

We propose a modular Deep Neural Network (DNN) architecture for multi-class classification tasks. The architecture consists of two parts, a router network and a set of expert networks. For a C-class classification problem, the architecture has exactly C experts. The backbone network for these experts and the router is a simple, identical DNN architecture. For each class, the modular network has a certain number $$\rho$$ of expert networks specializing in that particular class, where $$\rho$$ is called the redundancy rate in this study. We demonstrate that $$\rho$$ plays a vital role in the performance of the network. Although these experts are lightweight and weak learners on their own, together they match the performance of more complex DNNs. We train the network in two phases: first the router is trained on the whole set of training data; then each expert network is trained with a new stochastic objective function that alternates between a small subset of expert data and the whole data set. This alternating training provides an additional form of regularization and keeps the expert networks from over-fitting to their subsets. During the testing phase, the router dynamically selects a fixed number of experts for further evaluation of the input datum. The modular nature and low parameter count of the network make it well suited to distributed and low-resource computational environments. Extensive empirical study and theoretical analysis on CIFAR-10, CIFAR-100 and F-MNIST substantiate the effectiveness and efficiency of our proposed modular network.
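
The router-and-experts inference step lends itself to a compact sketch. The following is a minimal PyTorch-style illustration, not the authors' code: the backbone, the number of experts consulted (k), and the averaging of expert outputs are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): the router picks the top-k experts,
# only those experts evaluate the input, and their logits are averaged.
import torch
import torch.nn as nn

class SmallBackbone(nn.Module):
    """Identical lightweight backbone used for the router and every expert."""
    def __init__(self, num_outputs: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, num_outputs)

    def forward(self, x):
        return self.head(self.features(x))

class MSNet(nn.Module):
    def __init__(self, num_classes: int, k: int = 2):
        super().__init__()
        self.router = SmallBackbone(num_classes)            # one router
        self.experts = nn.ModuleList(
            [SmallBackbone(num_classes) for _ in range(num_classes)]
        )                                                    # one expert per class
        self.k = k                                           # experts consulted at test time

    def forward(self, x):
        route_logits = self.router(x)                        # (B, C)
        topk = route_logits.topk(self.k, dim=1).indices      # indices of selected experts
        outputs = []
        for b in range(x.size(0)):                           # per-sample expert evaluation
            expert_logits = torch.stack(
                [self.experts[i](x[b:b + 1]).squeeze(0) for i in topk[b].tolist()]
            )
            outputs.append(expert_logits.mean(dim=0))        # fuse the selected experts
        return torch.stack(outputs)

logits = MSNet(num_classes=10, k=2)(torch.randn(4, 3, 32, 32))
```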

2021 ◽  
Vol 11 (7) ◽  
pp. 885
Author(s):  
Maher Abujelala ◽  
Rohith Karthikeyan ◽  
Oshin Tyagi ◽  
Jing Du ◽  
Ranjana K. Mehta

The nature of firefighters' duties requires them to work for long periods under unfavorable conditions. To perform their jobs effectively, they are required to endure long hours of extensive, stressful training. Creating such training environments is very expensive, and it is difficult to guarantee trainees' safety. In this study, firefighters are trained in a virtual environment that includes virtual perturbations such as fires, alarms, and smoke. The objective of this paper is to use machine learning methods to discern encoding and retrieval states in firefighters during a visuospatial episodic memory task and to explore which regions of the brain provide signals suitable for solving this classification problem. Our results show that the Random Forest algorithm can distinguish between information encoding and retrieval using features extracted from fNIRS data. Our algorithm achieved an F1-score of 0.844 and an accuracy of 79.10% when the training and testing data were obtained under similar environmental conditions. However, the algorithm's performance dropped to an F1-score of 0.723 and an accuracy of 60.61% when evaluated on data collected under different environmental conditions than the training data. We also found that when the training and evaluation data were recorded under the same environmental conditions, the RPM, LDLPFC, and RDLPFC were the most relevant brain regions under non-stressful, stressful, and mixed stressful and non-stressful conditions, respectively.
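
As a rough illustration of the classification setup, the sketch below trains a scikit-learn Random Forest to separate encoding from retrieval epochs and evaluates it on data from a different condition. The feature matrix is synthetic and the per-region features are assumptions, not the authors' pipeline.

```python
# Hedged sketch: a Random Forest separating encoding vs. retrieval epochs
# from fNIRS-derived features. The feature extraction and columns are
# illustrative assumptions, not the authors' processing pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)

# X: one row per epoch; columns could be, e.g., mean/slope of HbO per region
# (LDLPFC, RDLPFC, RPM, ...). y: 1 = encoding, 0 = retrieval.
X_train = rng.normal(size=(200, 12))
y_train = rng.integers(0, 2, size=200)
X_test = rng.normal(size=(60, 12))      # e.g. recorded under a different condition
y_test = rng.integers(0, 2, size=60)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)

print("accuracy:", accuracy_score(y_test, pred))
print("F1:", f1_score(y_test, pred))
# Region relevance can be inspected via clf.feature_importances_.
```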


Author(s):  
Hengyi Cai ◽  
Hongshen Chen ◽  
Yonghao Song ◽  
Xiaofang Zhao ◽  
Dawei Yin

Humans benefit from previous experiences when taking actions. Similarly, related examples from the training data provide exemplary information for neural dialogue models when responding to a given input message. However, effectively fusing such exemplary information into dialogue generation is non-trivial: useful exemplars must be not only literally similar but also topically related to the given context. Noisy exemplars impair the neural dialogue model's understanding of the conversation topics and can even corrupt response generation. To address these issues, we propose an exemplar-guided neural dialogue generation model in which exemplar responses are retrieved in terms of both text similarity and topic proximity through a two-stage exemplar retrieval model. In the first stage, a small subset of conversations is retrieved from the training set given a dialogue context. These candidate exemplars are then finely ranked by topical proximity to choose the best-matched exemplar response. To further induce the neural dialogue generation model to consult the exemplar response and the conversation topics more faithfully, we introduce a multi-source sampling mechanism that provides the dialogue model with both local exemplary semantics and global topical guidance during decoding. Empirical evaluations on a large-scale conversation dataset show that the proposed approach significantly outperforms the state-of-the-art in terms of both quantitative metrics and human evaluations.
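
The two-stage retrieval can be sketched as a coarse lexical search followed by topical re-ranking. In the toy example below, TF-IDF cosine similarity and an LDA topic model stand in for whatever similarity and topic representations the paper actually uses; the corpus and function names are invented for illustration.

```python
# Hedged sketch of two-stage exemplar retrieval: coarse lexical retrieval,
# then re-ranking of the candidates by topic proximity.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

contexts = ["how do i reset my password", "what time does the store open",
            "my package never arrived", "can i change my delivery address"]
responses = ["click forgot password on the login page", "we open at 9 am",
             "sorry about that, let me track it", "yes, update it in your account"]

tfidf = TfidfVectorizer().fit(contexts)
ctx_vecs = tfidf.transform(contexts)
lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(ctx_vecs)

def retrieve_exemplar(query: str, k: int = 3) -> str:
    q_vec = tfidf.transform([query])
    # Stage 1: retrieve k candidates by lexical (text) similarity.
    lexical = cosine_similarity(q_vec, ctx_vecs)[0]
    candidates = lexical.argsort()[::-1][:k]
    # Stage 2: re-rank the candidates by topic proximity.
    q_topic = lda.transform(q_vec)
    cand_topics = lda.transform(ctx_vecs[candidates])
    topical = cosine_similarity(q_topic, cand_topics)[0]
    best = candidates[int(topical.argmax())]
    return responses[best]

print(retrieve_exemplar("i forgot my password, how do i reset it"))
```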


2021 ◽  
Author(s):  
Yash Chauhan ◽  
Prateek Singh

Coin recognition systems have numerous applications, from vending and slot machines to banking and management firms, which has motivated a large volume of research on methods for such classification. In recent years, academic research has shifted towards computer vision approaches to sorting coins due to advances in deep learning. However, most of the documented work relies on 'transfer learning', in which a pre-trained model of a fixed architecture is reused as the starting point for training. While such an approach saves considerable time and effort, the generic nature of the pre-trained model can become a bottleneck for performance on a specialized problem such as coin classification. This study develops a convolutional neural network (CNN) model from scratch and tests it against a widely used general-purpose architecture, GoogLeNet. By comparing the performance of our model with that of GoogLeNet (documented in various previous studies), we show that a simpler, specialized architecture is better suited to the coin classification problem than a more complex general architecture. The model developed in this study is trained and tested on 720 and 180 images of Indian coins of different denominations, respectively. The final accuracy of the model is 91.62% on the training data and 90.55% on the validation data.
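
A from-scratch CNN of the kind described can be sketched in a few lines. The layer sizes, input resolution, and optimizer below are illustrative assumptions, not the published architecture.

```python
# Hedged sketch: a small task-specific CNN built from scratch, of the kind the
# study compares against a large general-purpose network.
import torch
import torch.nn as nn

class CoinCNN(nn.Module):
    def __init__(self, num_denominations: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(128, num_denominations)

    def forward(self, x):
        return self.classifier(self.features(x))

model = CoinCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on dummy data (the study uses 720/180 real images).
images, labels = torch.randn(8, 3, 128, 128), torch.randint(0, 5, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```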


2020 ◽  
Vol 30 (1) ◽  
Author(s):  
Michael O. Olusola ◽  
Sydney I. Onyeagu

This paper is centred on a binary classification problem in which a new object with multivariate features is to be assigned to one of two distinct populations, based on historical samples from the two populations. A linear discriminant analysis framework, called the minimised sum of deviations by proportion (MSDP), is proposed to model the binary classification problem. In the MSDP formulation, the sum of the proportions of exterior deviations is minimised subject to the group separation constraints, the normalisation constraint, upper-bound constraints on the proportions of exterior deviations, and the sign unrestriction vis-à-vis the non-negativity constraints. The two-phase method of linear programming is adopted as the solution technique for generating the discriminant function. The decision rule for group-membership prediction is constructed using the apparent error rate. The performance of the MSDP is compared with some existing linear discriminant models using a previously published dataset on road casualties. The MSDP model was more promising and well suited for the imbalanced dataset on road casualties.
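
A simplified member of this family of LP discriminants can be written down directly. The sketch below minimises a proportion-weighted sum of exterior deviations with scipy's linprog; it omits the MSDP's deviation upper bounds and uses a generic normalisation constraint, so it should be read as an illustration of the formulation rather than the exact model.

```python
# Hedged sketch of an LP-based discriminant in the minimised-sum-of-deviations
# family: deviations d_i absorb points on the wrong side of the hyperplane,
# and their proportion-weighted sum is minimised.
import numpy as np
from scipy.optimize import linprog

def msd_discriminant(X_a: np.ndarray, X_b: np.ndarray):
    """Return weights w and cutoff c such that, ideally, w@x <= c for group A
    and w@x >= c for group B."""
    n_a, p = X_a.shape
    n_b, _ = X_b.shape
    n = n_a + n_b
    # Decision vector: [w (p), c (1), d (n)]
    cost = np.concatenate([np.zeros(p + 1),
                           np.full(n_a, 1.0 / n_a),        # proportion weights, group A
                           np.full(n_b, 1.0 / n_b)])       # proportion weights, group B
    # Group A: w@x_i - c - d_i <= 0 ; Group B: -w@x_i + c - d_i <= 0
    A_ub = np.zeros((n, p + 1 + n))
    A_ub[:n_a, :p] = X_a
    A_ub[:n_a, p] = -1.0
    A_ub[n_a:, :p] = -X_b
    A_ub[n_a:, p] = 1.0
    A_ub[:, p + 1:] = -np.eye(n)
    b_ub = np.zeros(n)
    # Normalisation: w @ (mean_B - mean_A) = 1 rules out the trivial w = 0.
    A_eq = np.concatenate([X_b.mean(0) - X_a.mean(0), [0.0], np.zeros(n)])[None, :]
    b_eq = [1.0]
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n    # w, c free; d >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:p], res.x[p]

rng = np.random.default_rng(0)
w, c = msd_discriminant(rng.normal(0, 1, (30, 2)), rng.normal(2, 1, (30, 2)))
print("assign to group B when w @ x >= c:", w, c)
```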


2021 ◽  
Author(s):  
Toshitaka Hayashi ◽  
Hamido Fujita

One-class classification (OCC) is a classification problem in which the training data contain only one class. In such a problem, two types of classes exist, the seen class and unseen classes, and distinguishing between them is the challenge. The One-class Image Transformation Network (OCITN) is an OCC algorithm for image data in which an image transformation network (ITN) is trained to transform every input image into a single image, called the goal image. The model error of the ITN is computed as a distance metric between the ITN output and the goal image. OCITN accuracy depends on the goal image, and finding an appropriate goal image is challenging. In this paper, 234 goal images are evaluated with OCITN on the CIFAR10 dataset. The experimental results are analyzed with three image metrics: image entropy, similarity with seen images, and image derivatives.
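
The goal-image idea can be sketched briefly: train a transformation network to map seen-class images onto a fixed goal image and score test images by their distance to that goal. The tiny network and random goal image below are placeholders, not the OCITN configuration.

```python
# Hedged sketch of the OCITN scoring idea: the ITN is trained on seen-class
# images only, and the distance of its output to the goal image acts as the
# (un)seen score at test time.
import torch
import torch.nn as nn

itn = nn.Sequential(                      # tiny image-to-image network (placeholder)
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid(),
)
goal = torch.rand(3, 32, 32)              # one candidate goal image (234 were tried)

def train_step(batch, optimizer):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(itn(batch), goal.expand(batch.size(0), -1, -1, -1))
    loss.backward()
    optimizer.step()
    return loss.item()

def anomaly_score(image):
    """A large distance to the goal image suggests the unseen class."""
    with torch.no_grad():
        return nn.functional.mse_loss(itn(image[None]), goal[None]).item()

opt = torch.optim.Adam(itn.parameters(), lr=1e-3)
train_step(torch.rand(8, 3, 32, 32), opt)          # seen-class images only
print(anomaly_score(torch.rand(3, 32, 32)))
```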


2020 ◽  
Vol 34 (03) ◽  
pp. 2645-2652 ◽  
Author(s):  
Yaman Kumar ◽  
Dhruva Sahrawat ◽  
Shubham Maheshwari ◽  
Debanjan Mahata ◽  
Amanda Stent ◽  
...  

Visual Speech Recognition (VSR) is the process of recognizing or interpreting speech by watching the lip movements of the speaker. Recent machine-learning-based approaches model VSR as a classification problem; however, the scarcity of training data leads to error-prone systems with very low accuracies in predicting unseen classes. To solve this problem, we present a novel approach to zero-shot learning by generating new classes using Generative Adversarial Networks (GANs), and show how the addition of unseen class samples increases the accuracy of a VSR system by a significant margin of 27% and allows it to handle speaker-independent out-of-vocabulary phrases. We also show that our models are language agnostic and therefore capable of seamlessly generating, using English training data, videos for a new language (Hindi). To the best of our knowledge, this is the first work to show empirical evidence of the use of GANs for generating training samples of unseen classes in the domain of VSR, hence facilitating zero-shot learning. We make the added videos for new classes publicly available along with our code.
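
The augmentation idea, generating samples for unseen classes from class-level side information, can be sketched with a small conditional GAN. The feature dimensions, the use of word-embedding-style class vectors, and the losses below are assumptions for illustration, not the paper's architecture.

```python
# Hedged sketch: a conditional GAN generates feature vectors for unseen word
# classes from class (e.g. word-embedding) vectors; the synthetic samples are
# then added to the VSR classifier's training set.
import torch
import torch.nn as nn

feat_dim, embed_dim, z_dim = 256, 50, 32

G = nn.Sequential(nn.Linear(z_dim + embed_dim, 128), nn.ReLU(),
                  nn.Linear(128, feat_dim))                  # generator
D = nn.Sequential(nn.Linear(feat_dim + embed_dim, 128), nn.ReLU(),
                  nn.Linear(128, 1))                         # discriminator
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def gan_step(real_feats, class_emb):
    n = real_feats.size(0)
    z = torch.randn(n, z_dim)
    fake = G(torch.cat([z, class_emb], dim=1))
    # Discriminator: real vs. generated, both conditioned on the class vector.
    d_real = bce(D(torch.cat([real_feats, class_emb], 1)), torch.ones(n, 1))
    d_fake = bce(D(torch.cat([fake.detach(), class_emb], 1)), torch.zeros(n, 1))
    opt_d.zero_grad()
    (d_real + d_fake).backward()
    opt_d.step()
    # Generator: fool the discriminator (only G's parameters are stepped here).
    g_loss = bce(D(torch.cat([fake, class_emb], 1)), torch.ones(n, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

def synthesize_unseen(class_emb, n=32):
    """Generate n samples for an unseen class to extend the classifier's data."""
    z = torch.randn(n, z_dim)
    return G(torch.cat([z, class_emb.expand(n, -1)], dim=1)).detach()

gan_step(torch.randn(16, feat_dim), torch.randn(16, embed_dim))
print(synthesize_unseen(torch.randn(1, embed_dim)).shape)    # (32, 256)
```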


Author(s):  
Zhonghua Liu ◽  
Jiexin Pu ◽  
Yong Qiu ◽  
Moli Zhang ◽  
Xiaoli Zhang ◽  
...  

Sparse representation has become a popular technique in recent years. The two-phase test sample sparse representation method (TPTSSR) achieves excellent performance in face recognition. In this paper, a kernel two-phase test sample sparse representation method (KTPTSSR) is proposed. First, the input data are mapped into an implicit high-dimensional feature space by a nonlinear mapping function. Second, the data are analyzed by means of the TPTSSR method in the feature space. If an appropriate kernel function and the corresponding kernel parameter are selected, a test sample can be accurately represented as a linear combination of the training data that share the test sample's label information. The proposed method can therefore achieve better recognition performance than TPTSSR. Experiments on face databases demonstrate the effectiveness of our method.
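
A kernelised two-phase representation can be sketched directly from this description: phase one represents the test sample with all training samples in a kernel-induced feature space and keeps the M most useful ones; phase two re-represents it with those M samples and assigns the class with the smallest residual. The RBF kernel, the regulariser mu, and M below are illustrative choices.

```python
# Hedged sketch of a kernel two-phase test-sample sparse representation.
import numpy as np

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def ktptssr_predict(X, y_labels, x_test, M=10, mu=1e-2, gamma=0.5):
    K = rbf(X, X, gamma)                          # kernel among training samples
    k_t = rbf(X, x_test[None], gamma).ravel()     # kernel to the test sample
    k_tt = 1.0                                    # rbf(x, x) = 1
    n = len(X)
    # Phase 1: alpha = (K + mu I)^(-1) k_t, then score each training sample by
    # how well alpha_i * phi(x_i) alone approximates phi(x_test).
    alpha = np.linalg.solve(K + mu * np.eye(n), k_t)
    e = k_tt - 2 * alpha * k_t + (alpha ** 2) * np.diag(K)
    keep = np.argsort(e)[:M]
    # Phase 2: re-solve the representation using only the M kept samples.
    K_m = K[np.ix_(keep, keep)]
    beta = np.linalg.solve(K_m + mu * np.eye(M), k_t[keep])
    best, best_res = None, np.inf
    for c in np.unique(y_labels[keep]):
        idx = np.where(y_labels[keep] == c)[0]
        res = (k_tt - 2 * beta[idx] @ k_t[keep][idx]
               + beta[idx] @ K_m[np.ix_(idx, idx)] @ beta[idx])
        if res < best_res:
            best, best_res = c, res               # class with the smallest residual
    return best

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
print(ktptssr_predict(X, y, rng.normal(3, 1, 5)))
```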


Author(s):  
Pei Zhang ◽  
Ying Li ◽  
Dong Wang ◽  
Yunpeng Bai

CNN-based methods have dominated the field of aerial scene classification for the past few years. While achieving remarkable success, CNN-based methods suffer from excessive parameters and notoriously rely on large amounts of training data. In this work, we introduce few-shot learning to the aerial scene classification problem. Few-shot learning aims to learn a model on a base set that can quickly adapt to unseen categories in a novel set, using only a few labeled samples. To this end, we propose a meta-learning method for few-shot classification of aerial scene images. First, we train a feature extractor on all base categories to learn a representation of the inputs. Then, in the meta-training stage, the classifier is optimized in the metric space by cosine distance with a learnable scale parameter. Finally, in the meta-testing stage, the query sample from an unseen category is predicted by the adapted classifier given a few support samples. We conduct extensive experiments on two challenging datasets: NWPU-RESISC45 and RSD46-WHU. The experimental results show that our method outperforms three state-of-the-art few-shot algorithms and one typical CNN-based method, D-CNN. Furthermore, several ablation experiments are conducted to investigate the effects of dataset scale and support shots; the results confirm that our model is particularly effective in few-shot settings.
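
The metric-space classifier can be sketched as class prototypes compared to query embeddings by cosine similarity with a learnable scale. The placeholder features and the 5-way 1-shot episode below are assumptions for illustration; the actual feature extractor is trained on the base categories as described.

```python
# Hedged sketch of the metric-space classifier: class prototypes from a few
# support embeddings, and cosine similarity scaled by a learnable parameter.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.tensor(10.0))    # learnable scale

    def forward(self, query_feats, prototypes):
        q = F.normalize(query_feats, dim=-1)
        p = F.normalize(prototypes, dim=-1)
        return self.scale * q @ p.t()                    # (n_query, n_way) logits

def prototypes_from_support(support_feats, support_labels, n_way):
    return torch.stack([support_feats[support_labels == c].mean(0)
                        for c in range(n_way)])

# 5-way 1-shot episode with dummy 64-d features standing in for extractor outputs.
n_way, feat_dim = 5, 64
support = torch.randn(n_way, feat_dim)
labels = torch.arange(n_way)
query = torch.randn(15, feat_dim)
clf = CosineClassifier()
logits = clf(query, prototypes_from_support(support, labels, n_way))
print(logits.argmax(dim=1))                              # predicted classes
```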


Author(s):  
Minh Pham ◽  
Craig A. Knoblock ◽  
Muhao Chen ◽  
Binh Vu ◽  
Jay Pujara

Error detection is one of the most important steps in data cleaning and usually requires extensive human interaction to ensure quality. Existing supervised methods for error detection require a significant amount of training data, while unsupervised methods rely on fixed inductive biases that are usually hard to generalize. In this paper, we present SPADE, a novel semi-supervised probabilistic approach to error detection. SPADE introduces a novel probabilistic active learning model in which the system suggests examples to be labeled based on the agreement between user labels and indicative signals, which are designed to capture potential errors. SPADE uses a two-phase data augmentation process to enrich a dataset before training a deep learning classifier to detect unlabeled errors. In our evaluation, SPADE achieves an average F1-score of 0.91 over five datasets and yields a 10% improvement over state-of-the-art systems.
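
A toy version of agreement-based example selection is sketched below: a few hand-written indicative signals vote on whether each value looks erroneous, and the cells where those votes disagree most with the current model's error probabilities are proposed for labeling. The signals and the selection rule are illustrative, not SPADE's actual components.

```python
# Hedged sketch of agreement-based active learning for error detection.
import numpy as np

values = ["2021-05-03", "N/A", "20210503", "", "2021-13-40", "2021-06-11"]

signals = [
    lambda v: len(v) == 0,                        # empty value
    lambda v: v.upper() in {"N/A", "NULL", "-"},  # explicit missing marker
    lambda v: not (len(v) == 10 and v[4] == "-" and v[7] == "-"),  # crude date-format check
]

def select_for_labeling(values, model_scores, k=2):
    """Pick the k cells where signal votes and model scores disagree most."""
    votes = np.array([[float(sig(v)) for sig in signals] for v in values])
    signal_score = votes.mean(axis=1)             # fraction of signals flagging an error
    disagreement = np.abs(signal_score - model_scores)
    return np.argsort(-disagreement)[:k]

model_scores = np.array([0.1, 0.2, 0.1, 0.9, 0.2, 0.05])  # current error probabilities
print(select_for_labeling(values, model_scores))           # indices to show the user
```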


2019 ◽  
Vol 15 (1) ◽  
pp. 155014771882052 ◽  
Author(s):  
Bowen Qin ◽  
Fuyuan Xiao

Due to its efficiency in handling uncertain information, Dempster–Shafer evidence theory has become the most important tool in many information fusion systems. However, how to determine the basic probability assignment (BPA), which is the first step in evidence theory, is still an open issue. In this article, a new method integrating interval number theory and the k-means++ clustering method is proposed to determine the basic probability assignment. First, k-means++ clustering is used to calculate the lower and upper bound values of interval numbers from the training data. Then, a differentiation degree based on the distance and similarity of interval numbers between the test sample and the constructed models is defined to generate the basic probability assignments. Finally, Dempster's combination rule is used to combine the multiple basic probability assignments into a final basic probability assignment. Experiments on the Iris dataset, which is widely used in classification problems, show that the proposed method is effective for determining basic probability assignments and for classification, reaching a classification accuracy of 96.7%.
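
The pipeline can be sketched end to end: k-means++ cluster centres give per-class interval bounds, the distance from a test value to each interval yields a normalised BPA, and attribute-wise BPAs are fused with Dempster's rule. The similarity function and normalisation below are illustrative choices rather than the paper's exact definitions.

```python
# Hedged sketch: interval models from k-means++ centres, distance-based BPAs,
# and fusion of the BPAs with Dempster's combination rule.
import numpy as np
from sklearn.cluster import KMeans

def interval_model(values, k=3):
    """Lower/upper bound of an attribute for one class via k-means++ centres."""
    centres = KMeans(n_clusters=k, init="k-means++", n_init=10,
                     random_state=0).fit(values.reshape(-1, 1)).cluster_centers_
    return float(centres.min()), float(centres.max())

def bpa_from_value(x, intervals):
    """Distance of x to each class interval -> normalised masses on singletons."""
    sims = {}
    for cls, (lo, hi) in intervals.items():
        d = 0.0 if lo <= x <= hi else min(abs(x - lo), abs(x - hi))
        sims[cls] = 1.0 / (1.0 + d)
    total = sum(sims.values())
    return {frozenset([c]): s / total for c, s in sims.items()}

def dempster(m1, m2):
    """Dempster's combination rule for two mass functions on frozenset focal elements."""
    combined, conflict = {}, 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + pa * pb
            else:
                conflict += pa * pb
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

rng = np.random.default_rng(0)
# One attribute, two classes: build interval models from training data.
intervals = {"setosa": interval_model(rng.normal(5.0, 0.3, 50)),
             "versicolor": interval_model(rng.normal(6.0, 0.4, 50))}
m_a = bpa_from_value(5.1, intervals)   # BPA from one attribute
m_b = bpa_from_value(5.0, intervals)   # BPA from a second attribute (model reused for brevity)
fused = dempster(m_a, m_b)
print(max(fused, key=fused.get))       # predicted class
```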

