MS-NET: modular selective network

Author(s):  
Intisar Md Chowdhury ◽  
Kai Su ◽  
Qiangfu Zhao

We propose a modular Deep Neural Network (DNN) architecture for multi-class classification tasks. The architecture consists of two parts, a router network and a set of expert networks. For a C-class classification problem, the architecture has exactly C experts. The backbone network for these experts and the router is a simple, identical DNN architecture. For each class, the modular network has a certain number $$\rho$$ of expert networks specializing in that particular class, where $$\rho$$ is called the redundancy rate in this study. We demonstrate that $$\rho$$ plays a vital role in the performance of the network. Although these experts are lightweight and weak learners on their own, together they match the performance of more complex DNNs. We train the network in two phases: first the router is trained on the whole set of training data; then each expert network is trained with a new stochastic objective function that alternates between a small subset of expert data and the whole data set. This alternating training provides an additional form of regularization and keeps the expert networks from over-fitting to their subsets. During the testing phase, the router dynamically selects a fixed number of experts for further evaluation of the input datum. The modular nature and low parameter count of the network make it well suited to distributed and low-resource computational environments. Extensive empirical study and theoretical analysis on CIFAR-10, CIFAR-100 and F-MNIST substantiate the effectiveness and efficiency of our proposed modular network.
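
The router-and-experts inference step lends itself to a compact sketch. The following is a minimal PyTorch-style illustration, not the authors' code: the backbone, the number of experts consulted (k), and the averaging of expert outputs are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): the router picks the top-k experts,
# only those experts evaluate the input, and their logits are averaged.
import torch
import torch.nn as nn

class SmallBackbone(nn.Module):
    """Identical lightweight backbone used for the router and every expert."""
    def __init__(self, num_outputs: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, num_outputs)

    def forward(self, x):
        return self.head(self.features(x))

class MSNet(nn.Module):
    def __init__(self, num_classes: int, k: int = 2):
        super().__init__()
        self.router = SmallBackbone(num_classes)            # one router
        self.experts = nn.ModuleList(
            [SmallBackbone(num_classes) for _ in range(num_classes)]
        )                                                    # one expert per class
        self.k = k                                           # experts consulted at test time

    def forward(self, x):
        route_logits = self.router(x)                        # (B, C)
        topk = route_logits.topk(self.k, dim=1).indices      # indices of selected experts
        outputs = []
        for b in range(x.size(0)):                           # per-sample expert evaluation
            expert_logits = torch.stack(
                [self.experts[i](x[b:b + 1]).squeeze(0) for i in topk[b].tolist()]
            )
            outputs.append(expert_logits.mean(dim=0))        # fuse the selected experts
        return torch.stack(outputs)

logits = MSNet(num_classes=10, k=2)(torch.randn(4, 3, 32, 32))
```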

2021 ◽  
Vol 11 (7) ◽  
pp. 885
Author(s):  
Maher Abujelala ◽  
Rohith Karthikeyan ◽  
Oshin Tyagi ◽  
Jing Du ◽  
Ranjana K. Mehta

The nature of firefighters' duties requires them to work for long periods under unfavorable conditions. To perform their jobs effectively, they are required to endure long hours of extensive, stressful training. Creating such training environments is very expensive, and it is difficult to guarantee trainees' safety. In this study, firefighters are trained in a virtual environment that includes virtual perturbations such as fires, alarms, and smoke. The objective of this paper is to use machine learning methods to discern encoding and retrieval states in firefighters during a visuospatial episodic memory task and to explore which regions of the brain provide signals suitable for solving this classification problem. Our results show that the Random Forest algorithm can distinguish between information encoding and retrieval using features extracted from fNIRS data. Our algorithm achieved an F1-score of 0.844 and an accuracy of 79.10% when the training and testing data were obtained under similar environmental conditions. However, the algorithm's performance dropped to an F1-score of 0.723 and an accuracy of 60.61% when evaluated on data collected under different environmental conditions than the training data. We also found that when the training and evaluation data were recorded under the same environmental conditions, the RPM, LDLPFC, and RDLPFC were the most relevant brain regions under non-stressful, stressful, and mixed stressful and non-stressful conditions, respectively.
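
As a rough illustration of the classification setup, the sketch below trains a scikit-learn Random Forest to separate encoding from retrieval epochs and evaluates it on data from a different condition. The feature matrix is synthetic and the per-region features are assumptions, not the authors' pipeline.

```python
# Hedged sketch: a Random Forest separating encoding vs. retrieval epochs
# from fNIRS-derived features. The feature extraction and columns are
# illustrative assumptions, not the authors' processing pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)

# X: one row per epoch; columns could be, e.g., mean/slope of HbO per region
# (LDLPFC, RDLPFC, RPM, ...). y: 1 = encoding, 0 = retrieval.
X_train = rng.normal(size=(200, 12))
y_train = rng.integers(0, 2, size=200)
X_test = rng.normal(size=(60, 12))      # e.g. recorded under a different condition
y_test = rng.integers(0, 2, size=60)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)

print("accuracy:", accuracy_score(y_test, pred))
print("F1:", f1_score(y_test, pred))
# Region relevance can be inspected via clf.feature_importances_.
```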


Author(s):  
Hengyi Cai ◽  
Hongshen Chen ◽  
Yonghao Song ◽  
Xiaofang Zhao ◽  
Dawei Yin

Humans benefit from previous experiences when taking actions. Similarly, related examples from the training data provide exemplary information for neural dialogue models when responding to a given input message. However, effectively fusing such exemplary information into dialogue generation is non-trivial: useful exemplars must be not only literally similar but also topically related to the given context. Noisy exemplars impair the neural dialogue model's understanding of the conversation topics and can even corrupt response generation. To address these issues, we propose an exemplar-guided neural dialogue generation model in which exemplar responses are retrieved in terms of both text similarity and topic proximity through a two-stage exemplar retrieval model. In the first stage, a small subset of conversations is retrieved from the training set given a dialogue context. These candidate exemplars are then finely ranked by topical proximity to choose the best-matched exemplar response. To further induce the neural dialogue generation model to consult the exemplar response and the conversation topics more faithfully, we introduce a multi-source sampling mechanism that provides the dialogue model with both local exemplary semantics and global topical guidance during decoding. Empirical evaluations on a large-scale conversation dataset show that the proposed approach significantly outperforms the state-of-the-art in terms of both quantitative metrics and human evaluations.
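
The two-stage retrieval can be sketched as a coarse lexical search followed by topical re-ranking. In the toy example below, TF-IDF cosine similarity and an LDA topic model stand in for whatever similarity and topic representations the paper actually uses; the corpus and function names are invented for illustration.

```python
# Hedged sketch of two-stage exemplar retrieval: coarse lexical retrieval,
# then re-ranking of the candidates by topic proximity.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

contexts = ["how do i reset my password", "what time does the store open",
            "my package never arrived", "can i change my delivery address"]
responses = ["click forgot password on the login page", "we open at 9 am",
             "sorry about that, let me track it", "yes, update it in your account"]

tfidf = TfidfVectorizer().fit(contexts)
ctx_vecs = tfidf.transform(contexts)
lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(ctx_vecs)

def retrieve_exemplar(query: str, k: int = 3) -> str:
    q_vec = tfidf.transform([query])
    # Stage 1: retrieve k candidates by lexical (text) similarity.
    lexical = cosine_similarity(q_vec, ctx_vecs)[0]
    candidates = lexical.argsort()[::-1][:k]
    # Stage 2: re-rank the candidates by topic proximity.
    q_topic = lda.transform(q_vec)
    cand_topics = lda.transform(ctx_vecs[candidates])
    topical = cosine_similarity(q_topic, cand_topics)[0]
    best = candidates[int(topical.argmax())]
    return responses[best]

print(retrieve_exemplar("i forgot my password, how do i reset it"))
```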


2021 ◽  
Author(s):  
Yash Chauhan ◽  
Prateek Singh

Coin recognition systems have numerous applications, from vending and slot machines to banking and management firms, which has motivated a large volume of research on methods for such classification. In recent years, academic research has shifted towards computer vision approaches to sorting coins due to advances in deep learning. However, most of the documented work relies on 'transfer learning', in which a pre-trained model of a fixed architecture is reused as the starting point for training. While such an approach saves considerable time and effort, the generic nature of the pre-trained model can become a bottleneck for performance on a specialized problem such as coin classification. This study develops a convolutional neural network (CNN) model from scratch and tests it against a widely used general-purpose architecture, GoogLeNet. By comparing the performance of our model with that of GoogLeNet (documented in various previous studies), we show that a simpler, specialized architecture is better suited to the coin classification problem than a more complex general architecture. The model developed in this study is trained and tested on 720 and 180 images of Indian coins of different denominations, respectively. The final accuracy of the model is 91.62% on the training data and 90.55% on the validation data.
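
A from-scratch CNN of the kind described can be sketched in a few lines. The layer sizes, input resolution, and optimizer below are illustrative assumptions, not the published architecture.

```python
# Hedged sketch: a small task-specific CNN built from scratch, of the kind the
# study compares against a large general-purpose network.
import torch
import torch.nn as nn

class CoinCNN(nn.Module):
    def __init__(self, num_denominations: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(128, num_denominations)

    def forward(self, x):
        return self.classifier(self.features(x))

model = CoinCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on dummy data (the study uses 720/180 real images).
images, labels = torch.randn(8, 3, 128, 128), torch.randint(0, 5, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```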


2020 ◽  
Vol 30 (1) ◽  
Author(s):  
Michael O. Olusola ◽  
Sydney I. Onyeagu

This paper is centred on a binary classification problem in which a new object with multivariate features is to be assigned to one of two distinct populations, based on historical samples from the two populations. A linear discriminant analysis framework, called the minimised sum of deviations by proportion (MSDP), is proposed to model the binary classification problem. In the MSDP formulation, the sum of the proportions of exterior deviations is minimised subject to the group separation constraints, the normalisation constraint, upper-bound constraints on the proportions of exterior deviations, and the sign unrestriction vis-à-vis the non-negativity constraints. The two-phase method of linear programming is adopted as the solution technique for generating the discriminant function. The decision rule for group-membership prediction is constructed using the apparent error rate. The performance of the MSDP is compared with some existing linear discriminant models using a previously published dataset on road casualties. The MSDP model was more promising and well suited for the imbalanced dataset on road casualties.
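
A simplified member of this family of LP discriminants can be written down directly. The sketch below minimises a proportion-weighted sum of exterior deviations with scipy's linprog; it omits the MSDP's deviation upper bounds and uses a generic normalisation constraint, so it should be read as an illustration of the formulation rather than the exact model.

```python
# Hedged sketch of an LP-based discriminant in the minimised-sum-of-deviations
# family: deviations d_i absorb points on the wrong side of the hyperplane,
# and their proportion-weighted sum is minimised.
import numpy as np
from scipy.optimize import linprog

def msd_discriminant(X_a: np.ndarray, X_b: np.ndarray):
    """Return weights w and cutoff c such that, ideally, w@x <= c for group A
    and w@x >= c for group B."""
    n_a, p = X_a.shape
    n_b, _ = X_b.shape
    n = n_a + n_b
    # Decision vector: [w (p), c (1), d (n)]
    cost = np.concatenate([np.zeros(p + 1),
                           np.full(n_a, 1.0 / n_a),        # proportion weights, group A
                           np.full(n_b, 1.0 / n_b)])       # proportion weights, group B
    # Group A: w@x_i - c - d_i <= 0 ; Group B: -w@x_i + c - d_i <= 0
    A_ub = np.zeros((n, p + 1 + n))
    A_ub[:n_a, :p] = X_a
    A_ub[:n_a, p] = -1.0
    A_ub[n_a:, :p] = -X_b
    A_ub[n_a:, p] = 1.0
    A_ub[:, p + 1:] = -np.eye(n)
    b_ub = np.zeros(n)
    # Normalisation: w @ (mean_B - mean_A) = 1 rules out the trivial w = 0.
    A_eq = np.concatenate([X_b.mean(0) - X_a.mean(0), [0.0], np.zeros(n)])[None, :]
    b_eq = [1.0]
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n    # w, c free; d >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:p], res.x[p]

rng = np.random.default_rng(0)
w, c = msd_discriminant(rng.normal(0, 1, (30, 2)), rng.normal(2, 1, (30, 2)))
print("assign to group B when w @ x >= c:", w, c)
```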


2021 ◽  
Author(s):  
Toshitaka Hayashi ◽  
Hamido Fujita

One-class classification (OCC) is a classification problem in which the training data contain only one class. In such a problem, two types of classes exist, the seen class and unseen classes, and distinguishing between them is the challenge. The One-class Image Transformation Network (OCITN) is an OCC algorithm for image data in which an image transformation network (ITN) is trained to transform every input image into a single image, called the goal image. The model error of the ITN is computed as a distance metric between the ITN output and the goal image. OCITN accuracy depends on the goal image, and finding an appropriate goal image is challenging. In this paper, 234 goal images are evaluated with OCITN on the CIFAR10 dataset. The experimental results are analyzed with three image metrics: image entropy, similarity with seen images, and image derivatives.
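
The goal-image idea can be sketched briefly: train a transformation network to map seen-class images onto a fixed goal image and score test images by their distance to that goal. The tiny network and random goal image below are placeholders, not the OCITN configuration.

```python
# Hedged sketch of the OCITN scoring idea: the ITN is trained on seen-class
# images only, and the distance of its output to the goal image acts as the
# (un)seen score at test time.
import torch
import torch.nn as nn

itn = nn.Sequential(                      # tiny image-to-image network (placeholder)
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid(),
)
goal = torch.rand(3, 32, 32)              # one candidate goal image (234 were tried)

def train_step(batch, optimizer):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(itn(batch), goal.expand(batch.size(0), -1, -1, -1))
    loss.backward()
    optimizer.step()
    return loss.item()

def anomaly_score(image):
    """A large distance to the goal image suggests the unseen class."""
    with torch.no_grad():
        return nn.functional.mse_loss(itn(image[None]), goal[None]).item()

opt = torch.optim.Adam(itn.parameters(), lr=1e-3)
train_step(torch.rand(8, 3, 32, 32), opt)          # seen-class images only
print(anomaly_score(torch.rand(3, 32, 32)))
```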


2020 ◽  
Vol 34 (03) ◽  
pp. 2645-2652 ◽  
Author(s):  
Yaman Kumar ◽  
Dhruva Sahrawat ◽  
Shubham Maheshwari ◽  
Debanjan Mahata ◽  
Amanda Stent ◽  
...  

Visual Speech Recognition (VSR) is the process of recognizing or interpreting speech by watching the lip movements of the speaker. Recent machine-learning-based approaches model VSR as a classification problem; however, the scarcity of training data leads to error-prone systems with very low accuracies in predicting unseen classes. To solve this problem, we present a novel approach to zero-shot learning by generating new classes using Generative Adversarial Networks (GANs), and show how the addition of unseen class samples increases the accuracy of a VSR system by a significant margin of 27% and allows it to handle speaker-independent out-of-vocabulary phrases. We also show that our models are language agnostic and therefore capable of seamlessly generating, using English training data, videos for a new language (Hindi). To the best of our knowledge, this is the first work to show empirical evidence of the use of GANs for generating training samples of unseen classes in the domain of VSR, hence facilitating zero-shot learning. We make the added videos for new classes publicly available along with our code.
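
The augmentation idea, generating samples for unseen classes from class-level side information, can be sketched with a small conditional GAN. The feature dimensions, the use of word-embedding-style class vectors, and the losses below are assumptions for illustration, not the paper's architecture.

```python
# Hedged sketch: a conditional GAN generates feature vectors for unseen word
# classes from class (e.g. word-embedding) vectors; the synthetic samples are
# then added to the VSR classifier's training set.
import torch
import torch.nn as nn

feat_dim, embed_dim, z_dim = 256, 50, 32

G = nn.Sequential(nn.Linear(z_dim + embed_dim, 128), nn.ReLU(),
                  nn.Linear(128, feat_dim))                  # generator
D = nn.Sequential(nn.Linear(feat_dim + embed_dim, 128), nn.ReLU(),
                  nn.Linear(128, 1))                         # discriminator
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def gan_step(real_feats, class_emb):
    n = real_feats.size(0)
    z = torch.randn(n, z_dim)
    fake = G(torch.cat([z, class_emb], dim=1))
    # Discriminator: real vs. generated, both conditioned on the class vector.
    d_real = bce(D(torch.cat([real_feats, class_emb], 1)), torch.ones(n, 1))
    d_fake = bce(D(torch.cat([fake.detach(), class_emb], 1)), torch.zeros(n, 1))
    opt_d.zero_grad()
    (d_real + d_fake).backward()
    opt_d.step()
    # Generator: fool the discriminator (only G's parameters are stepped here).
    g_loss = bce(D(torch.cat([fake, class_emb], 1)), torch.ones(n, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

def synthesize_unseen(class_emb, n=32):
    """Generate n samples for an unseen class to extend the classifier's data."""
    z = torch.randn(n, z_dim)
    return G(torch.cat([z, class_emb.expand(n, -1)], dim=1)).detach()

gan_step(torch.randn(16, feat_dim), torch.randn(16, embed_dim))
print(synthesize_unseen(torch.randn(1, embed_dim)).shape)    # (32, 256)
```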


Author(s):  
Zhonghua Liu ◽  
Jiexin Pu ◽  
Yong Qiu ◽  
Moli Zhang ◽  
Xiaoli Zhang ◽  
...  

Sparse representation has become a popular technique in recent years. The two-phase test sample sparse representation method (TPTSSR) achieves excellent performance in face recognition. In this paper, a kernel two-phase test sample sparse representation method (KTPTSSR) is proposed. First, the input data are mapped into an implicit high-dimensional feature space by a nonlinear mapping function. Second, the data are analyzed by means of the TPTSSR method in the feature space. If an appropriate kernel function and the corresponding kernel parameter are selected, a test sample can be accurately represented as a linear combination of the training data that share the test sample's label information. The proposed method can therefore achieve better recognition performance than TPTSSR. Experiments on face databases demonstrate the effectiveness of our method.
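
A kernelised two-phase representation can be sketched directly from this description: phase one represents the test sample with all training samples in a kernel-induced feature space and keeps the M most useful ones; phase two re-represents it with those M samples and assigns the class with the smallest residual. The RBF kernel, the regulariser mu, and M below are illustrative choices.

```python
# Hedged sketch of a kernel two-phase test-sample sparse representation.
import numpy as np

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def ktptssr_predict(X, y_labels, x_test, M=10, mu=1e-2, gamma=0.5):
    K = rbf(X, X, gamma)                          # kernel among training samples
    k_t = rbf(X, x_test[None], gamma).ravel()     # kernel to the test sample
    k_tt = 1.0                                    # rbf(x, x) = 1
    n = len(X)
    # Phase 1: alpha = (K + mu I)^(-1) k_t, then score each training sample by
    # how well alpha_i * phi(x_i) alone approximates phi(x_test).
    alpha = np.linalg.solve(K + mu * np.eye(n), k_t)
    e = k_tt - 2 * alpha * k_t + (alpha ** 2) * np.diag(K)
    keep = np.argsort(e)[:M]
    # Phase 2: re-solve the representation using only the M kept samples.
    K_m = K[np.ix_(keep, keep)]
    beta = np.linalg.solve(K_m + mu * np.eye(M), k_t[keep])
    best, best_res = None, np.inf
    for c in np.unique(y_labels[keep]):
        idx = np.where(y_labels[keep] == c)[0]
        res = (k_tt - 2 * beta[idx] @ k_t[keep][idx]
               + beta[idx] @ K_m[np.ix_(idx, idx)] @ beta[idx])
        if res < best_res:
            best, best_res = c, res               # class with the smallest residual
    return best

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
print(ktptssr_predict(X, y, rng.normal(3, 1, 5)))
```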


Author(s):  
Pei Zhang ◽  
Ying Li ◽  
Dong Wang ◽  
Yunpeng Bai

CNN-based methods have dominated the field of aerial scene classification for the past few years. While achieving remarkable success, CNN-based methods suffer from excessive parameters and notoriously rely on large amounts of training data. In this work, we introduce few-shot learning to the aerial scene classification problem. Few-shot learning aims to learn a model on a base set that can quickly adapt to unseen categories in a novel set, using only a few labeled samples. To this end, we propose a meta-learning method for few-shot classification of aerial scene images. First, we train a feature extractor on all base categories to learn a representation of the inputs. Then, in the meta-training stage, the classifier is optimized in the metric space by cosine distance with a learnable scale parameter. Finally, in the meta-testing stage, the query sample from an unseen category is predicted by the adapted classifier given a few support samples. We conduct extensive experiments on two challenging datasets: NWPU-RESISC45 and RSD46-WHU. The experimental results show that our method outperforms three state-of-the-art few-shot algorithms and one typical CNN-based method, D-CNN. Furthermore, several ablation experiments are conducted to investigate the effects of dataset scale and support shots; the results confirm that our model is particularly effective in few-shot settings.
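
The metric-space classifier can be sketched as class prototypes compared to query embeddings by cosine similarity with a learnable scale. The placeholder features and the 5-way 1-shot episode below are assumptions for illustration; the actual feature extractor is trained on the base categories as described.

```python
# Hedged sketch of the metric-space classifier: class prototypes from a few
# support embeddings, and cosine similarity scaled by a learnable parameter.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.tensor(10.0))    # learnable scale

    def forward(self, query_feats, prototypes):
        q = F.normalize(query_feats, dim=-1)
        p = F.normalize(prototypes, dim=-1)
        return self.scale * q @ p.t()                    # (n_query, n_way) logits

def prototypes_from_support(support_feats, support_labels, n_way):
    return torch.stack([support_feats[support_labels == c].mean(0)
                        for c in range(n_way)])

# 5-way 1-shot episode with dummy 64-d features standing in for extractor outputs.
n_way, feat_dim = 5, 64
support = torch.randn(n_way, feat_dim)
labels = torch.arange(n_way)
query = torch.randn(15, feat_dim)
clf = CosineClassifier()
logits = clf(query, prototypes_from_support(support, labels, n_way))
print(logits.argmax(dim=1))                              # predicted classes
```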


Author(s):  
Minh Pham ◽  
Craig A. Knoblock ◽  
Muhao Chen ◽  
Binh Vu ◽  
Jay Pujara

Error detection is one of the most important steps in data cleaning and usually requires extensive human interaction to ensure quality. Existing supervised methods for error detection require a significant amount of training data, while unsupervised methods rely on fixed inductive biases that are usually hard to generalize. In this paper, we present SPADE, a novel semi-supervised probabilistic approach to error detection. SPADE introduces a novel probabilistic active learning model in which the system suggests examples to be labeled based on the agreement between user labels and indicative signals, which are designed to capture potential errors. SPADE uses a two-phase data augmentation process to enrich a dataset before training a deep learning classifier to detect unlabeled errors. In our evaluation, SPADE achieves an average F1-score of 0.91 over five datasets and yields a 10% improvement over state-of-the-art systems.
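
A toy version of agreement-based example selection is sketched below: a few hand-written indicative signals vote on whether each value looks erroneous, and the cells where those votes disagree most with the current model's error probabilities are proposed for labeling. The signals and the selection rule are illustrative, not SPADE's actual components.

```python
# Hedged sketch of agreement-based active learning for error detection.
import numpy as np

values = ["2021-05-03", "N/A", "20210503", "", "2021-13-40", "2021-06-11"]

signals = [
    lambda v: len(v) == 0,                        # empty value
    lambda v: v.upper() in {"N/A", "NULL", "-"},  # explicit missing marker
    lambda v: not (len(v) == 10 and v[4] == "-" and v[7] == "-"),  # crude date-format check
]

def select_for_labeling(values, model_scores, k=2):
    """Pick the k cells where signal votes and model scores disagree most."""
    votes = np.array([[float(sig(v)) for sig in signals] for v in values])
    signal_score = votes.mean(axis=1)             # fraction of signals flagging an error
    disagreement = np.abs(signal_score - model_scores)
    return np.argsort(-disagreement)[:k]

model_scores = np.array([0.1, 0.2, 0.1, 0.9, 0.2, 0.05])  # current error probabilities
print(select_for_labeling(values, model_scores))           # indices to show the user
```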


2019 ◽  
Vol 15 (1) ◽  
pp. 155014771882052 ◽  
Author(s):  
Bowen Qin ◽  
Fuyuan Xiao

Due to its efficiency in handling uncertain information, Dempster–Shafer evidence theory has become the most important tool in many information fusion systems. However, how to determine the basic probability assignment (BPA), which is the first step in evidence theory, is still an open issue. In this article, a new method integrating interval number theory and the k-means++ clustering method is proposed to determine the basic probability assignment. First, k-means++ clustering is used to calculate the lower and upper bound values of interval numbers from the training data. Then, a differentiation degree based on the distance and similarity of interval numbers between the test sample and the constructed models is defined to generate the basic probability assignments. Finally, Dempster's combination rule is used to combine the multiple basic probability assignments into a final basic probability assignment. Experiments on the Iris dataset, which is widely used in classification problems, show that the proposed method is effective for determining basic probability assignments and for classification, reaching a classification accuracy of 96.7%.
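
The pipeline can be sketched end to end: k-means++ cluster centres give per-class interval bounds, the distance from a test value to each interval yields a normalised BPA, and attribute-wise BPAs are fused with Dempster's rule. The similarity function and normalisation below are illustrative choices rather than the paper's exact definitions.

```python
# Hedged sketch: interval models from k-means++ centres, distance-based BPAs,
# and fusion of the BPAs with Dempster's combination rule.
import numpy as np
from sklearn.cluster import KMeans

def interval_model(values, k=3):
    """Lower/upper bound of an attribute for one class via k-means++ centres."""
    centres = KMeans(n_clusters=k, init="k-means++", n_init=10,
                     random_state=0).fit(values.reshape(-1, 1)).cluster_centers_
    return float(centres.min()), float(centres.max())

def bpa_from_value(x, intervals):
    """Distance of x to each class interval -> normalised masses on singletons."""
    sims = {}
    for cls, (lo, hi) in intervals.items():
        d = 0.0 if lo <= x <= hi else min(abs(x - lo), abs(x - hi))
        sims[cls] = 1.0 / (1.0 + d)
    total = sum(sims.values())
    return {frozenset([c]): s / total for c, s in sims.items()}

def dempster(m1, m2):
    """Dempster's combination rule for two mass functions on frozenset focal elements."""
    combined, conflict = {}, 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + pa * pb
            else:
                conflict += pa * pb
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

rng = np.random.default_rng(0)
# One attribute, two classes: build interval models from training data.
intervals = {"setosa": interval_model(rng.normal(5.0, 0.3, 50)),
             "versicolor": interval_model(rng.normal(6.0, 0.4, 50))}
m_a = bpa_from_value(5.1, intervals)   # BPA from one attribute
m_b = bpa_from_value(5.0, intervals)   # BPA from a second attribute (model reused for brevity)
fused = dempster(m_a, m_b)
print(max(fused, key=fused.get))       # predicted class
```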

