Gravity Control-Based Data Augmentation Technique for Improving VR User Activity Recognition

Neural-network-based human activity recognition (HAR) is increasingly used to recognize the activities of virtual reality (VR) users. A major issue with such techniques is the collection of large-scale training datasets, which are key to deriving a robust recognition model. However, collecting large-scale data is a costly and time-consuming process. Furthermore, increasing the number of activities to be classified requires a much larger amount of training data. Since a sparse dataset provides only limited features to the recognition model, training on it can cause problems such as overfitting and suboptimal results. In this paper, we present a data augmentation technique named gravity control-based augmentation (GCDA) that alleviates the sparse data problem by generating new training data from the existing data. Exploiting the symmetrical structure of the data increases the amount of data while preserving its properties. The core concept of GCDA is two-fold: (1) decomposing the acceleration data obtained from the inertial measurement unit (IMU) into zero-gravity acceleration and gravitational acceleration, and augmenting them separately, and (2) exploiting gravity as a directional feature and controlling it to augment training datasets. Through comparative evaluations, we validated that applying GCDA to the training datasets yielded a higher classification accuracy (96.39%) than applying typical data augmentation methods (92.29%) or no augmentation (85.21%).
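
The two-step idea in this abstract (split acceleration into a gravity part and a zero-gravity part, then manipulate the gravity direction) can be illustrated with a minimal sketch. The abstract does not say how the decomposition or the gravity control is performed, so everything below is an assumption for illustration: gravity is estimated with a first-order low-pass filter, "control" is modeled as a rotation about the x axis, and the names `split_gravity`, `rotate_about_x`, and `augment_with_gravity_rotation` are hypothetical.

```python
import math

def split_gravity(samples, alpha=0.9):
    """Split raw (x, y, z) accelerometer samples into gravity and zero-gravity parts.

    Assumption: gravity is approximated by a first-order low-pass filter;
    the abstract does not specify the decomposition method.
    """
    gravity, linear = [], []
    g = samples[0]  # seed the filter with the first sample
    for s in samples:
        g = tuple(alpha * gi + (1 - alpha) * si for gi, si in zip(g, s))
        gravity.append(g)
        linear.append(tuple(si - gi for si, gi in zip(s, g)))
    return gravity, linear

def rotate_about_x(v, angle_rad):
    """Rotate a 3D vector about the x axis by angle_rad."""
    x, y, z = v
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (x, c * y - s * z, s * y + c * z)

def augment_with_gravity_rotation(samples, angle_rad, alpha=0.9):
    """Rotate only the gravity component, then add the zero-gravity part back."""
    gravity, linear = split_gravity(samples, alpha)
    return [
        tuple(gi + li for gi, li in zip(rotate_about_x(g, angle_rad), l))
        for g, l in zip(gravity, linear)
    ]
```

With an angle of zero the function returns the input up to floating-point error, which is a quick sanity check that the decomposition and recombination are consistent.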

Rethinking the Random Cropping Data Augmentation Method Used in the Training of CNN-Based SAR Image Ship Detector

Remote Sensing ◽

10.3390/rs13010034 ◽

2020 ◽

Vol 13 (1) ◽

pp. 34

Author(s):

Rong Yang ◽

Robert Wang ◽

Yunkai Deng ◽

Xiaoxue Jia ◽

Heng Zhang

Keyword(s):

Neural Network ◽

Data Augmentation ◽

Back Propagation ◽

Detection Performance ◽

Training Data ◽

Sar Image ◽

Optical Images ◽

The Neural Network ◽

Effective Training ◽

Standard Configuration

The random cropping data augmentation method is widely used to train convolutional neural network (CNN)-based target detectors to detect targets in optical images (e.g., COCO datasets). It can expand the scale of the dataset dozens of times while consuming only a small amount of calculations when training the neural network detector. In addition, random cropping can also greatly enhance the spatial robustness of the model, because it can make the same target appear in different positions of the sample image. Nowadays, random cropping and random flipping have become the standard configuration for those tasks with limited training data, which makes it natural to introduce them into the training of CNN-based synthetic aperture radar (SAR) image ship detectors. However, in this paper, we show that the introduction of traditional random cropping methods directly in the training of the CNN-based SAR image ship detector may generate a lot of noise in the gradient during back propagation, which hurts the detection performance. In order to eliminate the noise in the training gradient, a simple and effective training method based on feature map mask is proposed. Experiments prove that the proposed method can effectively eliminate the gradient noise introduced by random cropping and significantly improve the detection performance under a variety of evaluation indicators without increasing inference cost.
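
For readers unfamiliar with the baseline being questioned here, the following is a minimal sketch of conventional random cropping for detection data: pick a random window, then clip each bounding box to it. It does not implement the feature-map-mask method proposed in the paper, whose details are not given in the abstract. The function name `random_crop` and the pixel-coordinate box format `(x_min, y_min, x_max, y_max)` are assumptions.

```python
import random

def random_crop(width, height, boxes, crop_w, crop_h, rng=random):
    """Pick a random crop window and return it with the boxes clipped to it.

    boxes are (x_min, y_min, x_max, y_max) in pixels. Boxes that fall
    entirely outside the window are dropped; the rest are clipped and
    shifted into the window's coordinate frame.
    """
    x0 = rng.randint(0, width - crop_w)
    y0 = rng.randint(0, height - crop_h)
    kept = []
    for bx0, by0, bx1, by1 in boxes:
        cx0, cy0 = max(bx0, x0), max(by0, y0)
        cx1, cy1 = min(bx1, x0 + crop_w), min(by1, y0 + crop_h)
        if cx0 < cx1 and cy0 < cy1:  # box still overlaps the window
            kept.append((cx0 - x0, cy0 - y0, cx1 - x0, cy1 - y0))
    return (x0, y0, crop_w, crop_h), kept
```

Note that a target straddling the window edge survives only as a truncated box, so the same target can appear whole in one sample and partial in another.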

Attention-Aware Adversarial Network for Person Re-Identification

Applied Sciences ◽

10.3390/app9081550 ◽

2019 ◽

Vol 9 (8) ◽

pp. 1550 ◽

Cited By ~ 1

Author(s):

Aihong Shen ◽

Huasheng Wang ◽

Junjie Wang ◽

Hongchen Tan ◽

Xiuping Liu ◽

...

Keyword(s):

High Performance ◽

Large Scale ◽

Data Augmentation ◽

Fundamental Problem ◽

Training Data ◽

Specific Data ◽

Training Strategy ◽

Adversarial Network ◽

Benchmark Datasets ◽

Adversarial Training

Person re-identification (re-ID) is a fundamental problem in the field of computer vision. The performance of deep learning-based person re-ID models suffers from a lack of training data. In this work, we introduce a novel image-specific data augmentation method on the feature map level to enforce feature diversity in the network. Furthermore, an attention assignment mechanism is proposed to enforce that the person re-ID classifier focuses on nearly all important regions of the input person image. To achieve this, a three-stage framework is proposed. First, a baseline classification network is trained for person re-ID. Second, an attention assignment network is proposed based on the baseline network, in which the attention module learns to suppress the response of the current detected regions and re-assign attentions to other important locations. By this means, multiple important regions for classification are highlighted by the attention map. Finally, the attention map is integrated in the attention-aware adversarial network (AAA-Net), which generates high-performance classification results with an adversarial training strategy. We evaluate the proposed method on two large-scale benchmark datasets, including Market1501 and DukeMTMC-reID. Experimental results show that our algorithm performs favorably against the state-of-the-art methods.

Learning Nonlinear Brain Dynamics: van der Pol Meets LSTM

10.1101/330548 ◽

2018 ◽

Cited By ~ 1

Author(s):

Germán Abrevaya ◽

Aleksandr Aravkin ◽

Guillermo Cecchi ◽

Irina Rish ◽

Pablo Polosecki ◽

...

Keyword(s):

Large Scale ◽

Data Augmentation ◽

Predictive Accuracy ◽

Training Data ◽

Van Der Pol Oscillator ◽

Brain Dynamics ◽

Optimization Approach ◽

Imaging Data ◽

Van Der Pol ◽

Temporal Models

Many real-world data sets, especially in biology, are produced by highly multivariate and nonlinear complex dynamical systems. In this paper, we focus on brain imaging data, including both calcium imaging and functional MRI data. Standard vector-autoregressive models are limited by their linearity assumptions, while nonlinear general-purpose, large-scale temporal models, such as LSTM networks, typically require large amounts of training data, not always readily available in biological applications; furthermore, such models have limited interpretability. We introduce here a novel approach for learning a nonlinear differential equation model aimed at capturing brain dynamics. Specifically, we propose a variable-projection optimization approach to estimate the parameters of the multivariate (coupled) van der Pol oscillator, and demonstrate that such a model can accurately represent nonlinear dynamics of the brain data. Furthermore, in order to improve the predictive accuracy when forecasting future brain-activity time series, we use this analytical model as an unlimited source of simulated data for pretraining LSTM; this model-specific data augmentation approach consistently improves LSTM performance on both calcium and fMRI imaging data.
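
To make "an analytical model as a source of simulated data" concrete, here is a minimal sketch that generates a trajectory from a single, uncoupled van der Pol oscillator, written as the first-order system x' = y, y' = mu (1 - x^2) y - x and integrated with a classical fourth-order Runge-Kutta step. The paper fits a multivariate coupled version to data; the single-oscillator form, the parameter values, and the names `van_der_pol_step` and `simulate` are assumptions for illustration only.

```python
def van_der_pol_step(x, y, mu, dt):
    """Advance one RK4 step of x' = y, y' = mu * (1 - x^2) * y - x."""
    def f(x, y):
        return y, mu * (1 - x * x) * y - x

    k1 = f(x, y)
    k2 = f(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = f(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = f(x + dt * k3[0], y + dt * k3[1])
    x_new = x + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    y_new = y + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x_new, y_new

def simulate(x0, y0, mu=1.0, dt=0.01, steps=1000):
    """Return a list of (x, y) states, including the initial state."""
    trajectory = [(x0, y0)]
    x, y = x0, y0
    for _ in range(steps):
        x, y = van_der_pol_step(x, y, mu, dt)
        trajectory.append((x, y))
    return trajectory
```

Varying the initial state or `mu` yields as many distinct synthetic time series as needed, which is the sense in which such a model can serve as a pretraining data source.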

Data set entity recognition based on distant supervision

The Electronic Library ◽

10.1108/el-10-2020-0301 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Pengcheng Li ◽

Qikai Liu ◽

Qikai Cheng ◽

Wei Lu

Keyword(s):

Supervised Learning ◽

Large Scale ◽

Data Augmentation ◽

Scientific Literature ◽

Neural Model ◽

Training Data ◽

Entity Recognition ◽

Data Set ◽

Content Type ◽

Augmentation Techniques

Purpose: This paper aims to identify data set entities in scientific literature. To address poor recognition caused by a lack of training corpora in existing studies, a distant supervised learning-based approach is proposed to identify data set entities automatically from large-scale scientific literature in an open domain. Design/methodology/approach: Firstly, the authors use a dictionary combined with a bootstrapping strategy to create a labelled corpus to apply supervised learning. Secondly, a bidirectional encoder representation from transformers (BERT)-based neural model was applied to identify data set entities in the scientific literature automatically. Finally, two data augmentation techniques, entity replacement and entity masking, were introduced to enhance the model generalisability and improve the recognition of data set entities. Findings: In the absence of training data, the proposed method can effectively identify data set entities in large-scale scientific papers. The BERT-based vectorised representation and data augmentation techniques enable significant improvements in the generality and robustness of named entity recognition models, especially in long-tailed data set entity recognition. Originality/value: This paper provides a practical research method for automatically recognising data set entities in scientific literature. To the best of the authors’ knowledge, this is the first attempt to apply distant learning to the study of data set entity recognition. The authors introduce a robust vectorised representation and two data augmentation strategies (entity replacement and entity masking) to address the problem inherent in distant supervised learning methods, which the existing research has mostly ignored. The experimental results demonstrate that our approach effectively improves the recognition of data set entities, especially long-tailed data set entities.
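
The two augmentation strategies named above can be sketched on a tokenised, tagged sentence. The abstract gives no implementation details, so the following assumes a simple B/I/O tagging scheme, a `[MASK]` placeholder token, and a plain list of alternative data set names; the function names are hypothetical.

```python
import random

def entity_spans(tags):
    """Return (start, end) index pairs for each entity in a B/I/O tag list."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans

def replace_entities(tokens, tags, dictionary, rng=random):
    """Entity replacement: swap each entity for a random dictionary entry."""
    out_tokens, out_tags, prev = [], [], 0
    for start, end in entity_spans(tags):
        out_tokens += tokens[prev:start]
        out_tags += tags[prev:start]
        new_entity = rng.choice(dictionary).split()
        out_tokens += new_entity
        out_tags += ["B"] + ["I"] * (len(new_entity) - 1)
        prev = end
    out_tokens += tokens[prev:]
    out_tags += tags[prev:]
    return out_tokens, out_tags

def mask_entities(tokens, tags, mask_token="[MASK]"):
    """Entity masking: hide entity tokens so only the context remains."""
    masked = [mask_token if tag != "O" else tok for tok, tag in zip(tokens, tags)]
    return masked, list(tags)
```

Replacement varies the entity surface form while keeping the context, and masking does the opposite by removing the surface form, so the two are complementary.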

MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning

Computational Intelligence and Neuroscience ◽

10.1155/2015/297672 ◽

2015 ◽

Vol 2015 ◽

pp. 1-13 ◽

Cited By ~ 22

Author(s):

Yang Liu ◽

Jie Yang ◽

Yuan Huang ◽

Lixiong Xu ◽

Siguang Li ◽

...

Keyword(s):

Neural Networks ◽

Big Data ◽

Large Scale ◽

Training Data ◽

Computer Cluster ◽

Data Intensive ◽

Big Data Applications ◽

The Neural Network ◽

Computation Process ◽

Data Intensive Applications

Artificial neural networks (ANNs) have been widely used in pattern recognition and classification applications. However, ANNs are notably slow in computation, especially when the size of the data is large. Big data has recently gained momentum in both industry and academia. To realize the potential of ANNs for big data applications, the computation process must be sped up. For this purpose, this paper parallelizes neural networks based on MapReduce, which has become a major computing model for data-intensive applications. Three data-intensive scenarios are considered in the parallelization process, in terms of the volume of classification data, the size of the training data, and the number of neurons in the neural network. The performance of the parallelized neural networks is evaluated in an experimental MapReduce computer cluster with respect to classification accuracy and computational efficiency.

Data Driven Prognostics With Lack of Training Data Sets

Volume 2A: 41st Design Automation Conference ◽

10.1115/detc2015-46932 ◽

2015 ◽

Cited By ~ 2

Author(s):

Zhimin Xi ◽

Xiangxue Zhao

Keyword(s):

Lithium Ion ◽

Remaining Useful Life ◽

Training Data ◽

Data Driven ◽

Data Sets ◽

The Neural Network ◽

Typical Data ◽

Capacity Degradation ◽

Battery Capacity ◽

Network Similarity

Data-driven prognostics typically requires sufficient offline training data sets for accurate remaining useful life (RUL) prediction of engineering products. This paper investigates the performance of typical data-driven methodologies when the amount of training data is insufficient, with the aim of better understanding how they behave in that setting. The neural network, similarity-based approach, and copula-based sampling approach were investigated when only three run-to-failure training units were available. Lithium-ion (Li-ion) battery capacity degradation was used as the demonstration example.

Gait Activity Classification on Unbalanced Data from Inertial Sensors Using Shallow and Deep Learning

Sensors ◽

10.3390/s20174756 ◽

2020 ◽

Vol 20 (17) ◽

pp. 4756

Author(s):

Irvin Hussein Lopez-Nava ◽

Luis M. Valentín-Coronado ◽

Matias Garcia-Constantino ◽

Jesus Favela

Keyword(s):

Deep Learning ◽

Activity Recognition ◽

Large Scale ◽

Data Augmentation ◽

Inertial Sensors ◽

Synthetic Data ◽

Classification Performance ◽

Unbalanced Data ◽

Learning Approach ◽

Sampled Data

Activity recognition is one of the most active areas of research in ubiquitous computing. In particular, gait activity recognition is useful to identify various risk factors in people’s health that are directly related to their physical activity. One of the issues in activity recognition, and gait in particular, is that often datasets are unbalanced (i.e., the distribution of classes is not uniform), and due to this disparity, the models tend to categorize into the class with more instances. In the present study, two methods for classifying gait activities using accelerometer and gyroscope data from a large-scale public dataset were evaluated and compared. The gait activities in this dataset are: (i) going down an incline, (ii) going up an incline, (iii) walking on level ground, (iv) going down stairs, and (v) going up stairs. The proposed methods are based on conventional (shallow) and deep learning techniques. In addition, data were evaluated from three data treatments: original unbalanced data, sampled data, and augmented data. The latter was based on the generation of synthetic data according to segmented gait data. The best results were obtained with classifiers built with augmented data, with F-measure results of 0.812 (σ = 0.078) for the shallow learning approach, and of 0.927 (σ = 0.033) for the deep learning approach. In addition, the data augmentation strategy proposed to deal with the unbalanced problem resulted in increased classification performance using both techniques.

Deep Metallic Surface Defect Detection: The New Benchmark and Detection Network

Sensors ◽

10.3390/s20061562 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1562 ◽

Cited By ~ 1

Author(s):

Xiaoming Lv ◽

Fajie Duan ◽

Jia-jia Jiang ◽

Xiao Fu ◽

Lin Gan

Keyword(s):

Defect Detection ◽

Large Scale ◽

Surface Defect ◽

Data Augmentation ◽

Training Data ◽

Metallic Surface ◽

Single Shot ◽

Limited Data ◽

Detection Model ◽

Surface Defect Detection

Metallic surface defect detection is an essential process for controlling the quality of industrial products. However, due to their limited data scale and defect categories, existing defect datasets are generally inadequate for deploying a detection model. To address this problem, we contribute a new dataset called GC10-DET for large-scale metallic surface defect detection. The GC10-DET dataset poses significant challenges in terms of defect categories, image number, and data scale. In addition, traditional detection approaches perform poorly in both efficiency and accuracy in complex real-world environments. Thus, we also propose a novel end-to-end defect detection network (EDDN) based on the Single Shot MultiBox Detector. The EDDN model can handle defects of different scales. Furthermore, a hard negative mining method is designed to alleviate the problem of data imbalance, while several data augmentation methods are adopted to enrich the training data, since data collection is expensive. Finally, extensive experiments on two datasets demonstrate that the proposed method is robust and can meet accuracy requirements for metallic defect detection.

Translating Videos into Synthetic Training Data for Wearable Sensor-Based Activity Recognition Systems Using Residual Deep Convolutional Networks

Applied Sciences ◽

10.3390/app11073094 ◽

2021 ◽

Vol 11 (7) ◽

pp. 3094

Author(s):

Vitor Fortes Rey ◽

Kamalveer Kaur Garewal ◽

Paul Lukowicz

Keyword(s):

Computer Vision ◽

Regression Model ◽

Activity Recognition ◽

Language Processing ◽

Large Scale ◽

Simulated Data ◽

Training Data ◽

Sensor Data ◽

Activity Data ◽

Data Set

Human activity recognition (HAR) using wearable sensors has benefited much less from recent advances in Deep Learning than fields such as computer vision and natural language processing. This is, to a large extent, due to the lack of large-scale (as compared to computer vision) repositories of labeled training data for sensor-based HAR tasks. Thus, for example, ImageNet has images for around 100,000 categories (based on WordNet) with on average 1000 images per category (therefore up to 100,000,000 samples). The Kinetics-700 video activity data set has 650,000 video clips covering 700 different human activities (in total over 1800 h). By contrast, the total length of all sensor-based HAR data sets in the popular UCI machine learning repository is less than 63 h, with around 38 of those consisting of simple modes of locomotion such as walking, standing or cycling. In our research we aim to facilitate the use of online videos, which exist in ample quantities for most activities and are much easier to label than sensor data, to simulate labeled wearable motion sensor data. In previous work we already demonstrated some preliminary results in this direction, focusing on very simple, activity-specific simulation models and a single sensor modality (acceleration norm). In this paper, we show how we can train a regression model on generic motions for both accelerometer and gyro signals and then apply it to videos of the target activities to generate synthetic Inertial Measurement Unit (IMU) data (acceleration and gyro norms) that can be used to train and/or improve HAR models. We demonstrate that systems trained on simulated data generated by our regression model can come to within around 10% of the mean F1 score of a system trained on real sensor data. Furthermore, we show that by either including a small amount of real sensor data for model calibration or simply leveraging the fact that (in general) we can easily generate much more simulated data from video than we can collect its real version, the advantage of the latter can eventually be equalized.

Improving Object Tracking by Added Noise and Channel Attention

Sensors ◽

10.3390/s20133780 ◽

2020 ◽

Vol 20 (13) ◽

pp. 3780 ◽

Cited By ~ 2

Author(s):

Mustansar Fiaz ◽

Arif Mahmood ◽

Ki Yeol Baek ◽

Sehar Shahzad Farooq ◽

Soon Ki Jung

Keyword(s):

Large Scale ◽

Data Augmentation ◽

Feature Fusion ◽

State Of The Art ◽

Computational Cost ◽

Training Data ◽

Superior Performance ◽

Input Noise ◽

Offline Learning ◽

Benchmark Datasets

CNN-based trackers, especially those based on Siamese networks, have recently attracted considerable attention because of their relatively good performance and low computational cost. For many Siamese trackers, learning a generic object model from a large-scale dataset is still a challenging task. In the current study, we introduce input noise as regularization in the training data to improve generalization of the learned model. We propose an Input-Regularized Channel Attentional Siamese (IRCA-Siam) tracker which exhibits improved generalization compared to the current state-of-the-art trackers. In particular, we exploit offline learning by introducing additive noise for input data augmentation to mitigate the overfitting problem. We propose feature fusion from noisy and clean input channels, which improves target localization. Channel attention integrated with our framework helps find more useful target features, resulting in further performance improvement. Our proposed IRCA-Siam enhances the discrimination of the tracker/background and improves fault tolerance and generalization. An extensive experimental evaluation on six benchmark datasets, including OTB2013, OTB2015, TC128, UAV123, VOT2016 and VOT2017, demonstrates the superior performance of the proposed IRCA-Siam tracker compared to 30 existing state-of-the-art trackers.
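
The input-noise augmentation described here can be illustrated with a short sketch that produces a clean and a noisy copy of the same input, as would be needed to feed separate clean and noisy channels. The abstract does not state the noise distribution or strength, so zero-mean Gaussian noise, the `sigma` value, the flat list-of-floats image representation, and the function names are all assumptions.

```python
import random

def add_gaussian_noise(pixels, sigma=0.05, rng=random):
    """Return a noisy copy of pixels (floats in [0, 1]), clipped back to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

def make_clean_noisy_pair(pixels, sigma=0.05, rng=random):
    """Return (clean, noisy) copies of the same input for a two-branch model."""
    return list(pixels), add_gaussian_noise(pixels, sigma, rng)
```

Because fresh noise is drawn on every call, the noisy branch sees a different perturbation of the same image each epoch.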
