scholarly journals Insufficient Data Can Also Rock! Learning to Converse Using Smaller Data with Augmentation

Author(s):  
Juntao Li ◽  
Lisong Qiu ◽  
Bo Tang ◽  
Dongmin Chen ◽  
Dongyan Zhao ◽  
...  

Recent successes of open-domain dialogue generation mainly rely on the advances of deep neural networks. The effectiveness of deep neural network models depends on the amount of training data. As it is laboursome and expensive to acquire a huge amount of data in most scenarios, how to effectively utilize existing data is the crux of this issue. In this paper, we use data augmentation techniques to improve the performance of neural dialogue models on the condition of insufficient data. Specifically, we propose a novel generative model to augment existing data, where the conditional variational autoencoder (CVAE) is employed as the generator to output more training data with diversified expressions. To improve the correlation of each augmented training pair, we design a discriminator with adversarial training to supervise the augmentation process. Moreover, we thoroughly investigate various data augmentation schemes for neural dialogue system with generative models, both GAN and CVAE. Experimental results on two open corpora, Weibo and Twitter, demonstrate the superiority of our proposed data augmentation model.

2021 ◽  
Author(s):  
Saman Motamed ◽  
Patrik Rogalla ◽  
Farzad Khalvati

Abstract Successful training of convolutional neural networks (CNNs) requires a substantial amount of data. With small datasets networks generalize poorly. Data Augmentation techniques improve the generalizability of neural networks by using existing training data more effectively. Standard data augmentation methods, however, produce limited plausible alternative data. Generative Adversarial Networks (GANs) have been utilized to generate new data and improve the performance of CNNs. Nevertheless, data augmentation techniques for training GANs are under-explored compared to CNNs. In this work, we propose a new GAN architecture for augmentation of chest X-rays for semi-supervised detection of pneumonia and COVID-19 using generative models. We show that the proposed GAN can be used to effectively augment data and improve classification accuracy of disease in chest X-rays for pneumonia and COVID-19. We compare our augmentation GAN model with Deep Convolutional GAN and traditional augmentation methods (rotate, zoom, etc) on two different X-ray datasets and show our GAN-based augmentation method surpasses other augmentation methods for training a GAN in detecting anomalies in X-ray images.


2020 ◽  
Vol 10 (3) ◽  
pp. 62
Author(s):  
Tittaya Mairittha ◽  
Nattaya Mairittha ◽  
Sozo Inoue

The integration of digital voice assistants in nursing residences is becoming increasingly important to facilitate nursing productivity with documentation. A key idea behind this system is training natural language understanding (NLU) modules that enable the machine to classify the purpose of the user utterance (intent) and extract pieces of valuable information present in the utterance (entity). One of the main obstacles when creating robust NLU is the lack of sufficient labeled data, which generally relies on human labeling. This process is cost-intensive and time-consuming, particularly in the high-level nursing care domain, which requires abstract knowledge. In this paper, we propose an automatic dialogue labeling framework of NLU tasks, specifically for nursing record systems. First, we apply data augmentation techniques to create a collection of variant sample utterances. The individual evaluation result strongly shows a stratification rate, with regard to both fluency and accuracy in utterances. We also investigate the possibility of applying deep generative models for our augmented dataset. The preliminary character-based model based on long short-term memory (LSTM) obtains an accuracy of 90% and generates various reasonable texts with BLEU scores of 0.76. Secondly, we introduce an idea for intent and entity labeling by using feature embeddings and semantic similarity-based clustering. We also empirically evaluate different embedding methods for learning good representations that are most suitable to use with our data and clustering tasks. Experimental results show that fastText embeddings produce strong performances both for intent labeling and on entity labeling, which achieves an accuracy level of 0.79 and 0.78 f1-scores and 0.67 and 0.61 silhouette scores, respectively.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Huu-Thanh Duong ◽  
Tram-Anh Nguyen-Thi

AbstractIn literature, the machine learning-based studies of sentiment analysis are usually supervised learning which must have pre-labeled datasets to be large enough in certain domains. Obviously, this task is tedious, expensive and time-consuming to build, and hard to handle unseen data. This paper has approached semi-supervised learning for Vietnamese sentiment analysis which has limited datasets. We have summarized many preprocessing techniques which were performed to clean and normalize data, negation handling, intensification handling to improve the performances. Moreover, data augmentation techniques, which generate new data from the original data to enrich training data without user intervention, have also been presented. In experiments, we have performed various aspects and obtained competitive results which may motivate the next propositions.


2022 ◽  
Vol 18 (1) ◽  
pp. 1-24
Author(s):  
Yi Zhang ◽  
Yue Zheng ◽  
Guidong Zhang ◽  
Kun Qian ◽  
Chen Qian ◽  
...  

Gait, the walking manner of a person, has been perceived as a physical and behavioral trait for human identification. Compared with cameras and wearable sensors, Wi-Fi-based gait recognition is more attractive because Wi-Fi infrastructure is almost available everywhere and is able to sense passively without the requirement of on-body devices. However, existing Wi-Fi sensing approaches impose strong assumptions of fixed user walking trajectories, sufficient training data, and identification of already known users. In this article, we present GaitSense , a Wi-Fi-based human identification system, to overcome the above unrealistic assumptions. To deal with various walking trajectories and speeds, GaitSense first extracts target specific features that best characterize gait patterns and applies novel normalization algorithms to eliminate gait irrelevant perturbation in signals. On this basis, GaitSense reduces the training efforts in new deployment scenarios by transfer learning and data augmentation techniques. GaitSense also enables a distinct feature of illegal user identification by anomaly detection, making the system readily available for real-world deployment. Our implementation and evaluation with commodity Wi-Fi devices demonstrate a consistent identification accuracy across various deployment scenarios with little training samples, pushing the limit of gait recognition with Wi-Fi signals.


2000 ◽  
Author(s):  
Arturo Pacheco-Vega ◽  
Mihir Sen ◽  
Rodney L. McClain

Abstract In the current study we consider the problem of accuracy in heat rate estimations from artificial neural network models of heat exchangers used for refrigeration applications. The network configuration is of the feedforward type with a sigmoid activation function and a backpropagation algorithm. Limited experimental measurements from a manufacturer are used to show the capability of the neural network technique in modeling the heat transfer in these systems. Results from this exercise show that a well-trained network correlates the data with errors of the same order as the uncertainty of the measurements. It is also shown that the number and distribution of the training data are linked to the performance of the network when estimating the heat rates under different operating conditions, and that networks trained from few tests may give large errors. A methodology based on the cross-validation technique is presented to find regions where not enough data are available to construct a reliable neural network. The results from three tests show that the proposed methodology gives an upper bound of the estimated error in the heat rates.


2019 ◽  
Vol 9 (13) ◽  
pp. 2683 ◽  
Author(s):  
Sang-Ki Ko ◽  
Chang Jo Kim ◽  
Hyedong Jung ◽  
Choongsang Cho

We propose a sign language translation system based on human keypoint estimation. It is well-known that many problems in the field of computer vision require a massive dataset to train deep neural network models. The situation is even worse when it comes to the sign language translation problem as it is far more difficult to collect high-quality training data. In this paper, we introduce the KETI (Korea Electronics Technology Institute) sign language dataset, which consists of 14,672 videos of high resolution and quality. Considering the fact that each country has a different and unique sign language, the KETI sign language dataset can be the starting point for further research on the Korean sign language translation. Using the KETI sign language dataset, we develop a neural network model for translating sign videos into natural language sentences by utilizing the human keypoints extracted from the face, hands, and body parts. The obtained human keypoint vector is normalized by the mean and standard deviation of the keypoints and used as input to our translation model based on the sequence-to-sequence architecture. As a result, we show that our approach is robust even when the size of the training data is not sufficient. Our translation model achieved 93.28% (55.28%, respectively) translation accuracy on the validation set (test set, respectively) for 105 sentences that can be used in emergency situations. We compared several types of our neural sign translation models based on different attention mechanisms in terms of classical metrics for measuring the translation performance.


2020 ◽  
Vol 10 (22) ◽  
pp. 8079
Author(s):  
Sanglee Park ◽  
Jungmin So

State-of-the-art neural network models are actively used in various fields, but it is well-known that they are vulnerable to adversarial example attacks. Throughout the efforts to make the models robust against adversarial example attacks, it has been found to be a very difficult task. While many defense approaches were shown to be not effective, adversarial training remains as one of the promising methods. In adversarial training, the training data are augmented by “adversarial” samples generated using an attack algorithm. If the attacker uses a similar attack algorithm to generate adversarial examples, the adversarially trained network can be quite robust to the attack. However, there are numerous ways of creating adversarial examples, and the defender does not know what algorithm the attacker may use. A natural question is: Can we use adversarial training to train a model robust to multiple types of attack? Previous work have shown that, when a network is trained with adversarial examples generated from multiple attack methods, the network is still vulnerable to white-box attacks where the attacker has complete access to the model parameters. In this paper, we study this question in the context of black-box attacks, which can be a more realistic assumption for practical applications. Experiments with the MNIST dataset show that adversarially training a network with an attack method helps defending against that particular attack method, but has limited effect for other attack methods. In addition, even if the defender trains a network with multiple types of adversarial examples and the attacker attacks with one of the methods, the network could lose accuracy to the attack if the attacker uses a different data augmentation strategy on the target network. These results show that it is very difficult to make a robust network using adversarial training, even for black-box settings where the attacker has restricted information on the target network.


2019 ◽  
Vol 5 (1) ◽  
pp. 239-244
Author(s):  
Jingrui Yu ◽  
Roman Seidel ◽  
Gangolf Hirtz

AbstractWe propose a one-step person detector for topview omnidirectional indoor scenes based on convolutional neural networks (CNNs). While state of the art person detectors reach competitive results on perspective images, missing CNN architectures as well as training data that follows the distortion of omnidirectional images makes current approaches not applicable to our data. The method predicts bounding boxes of multiple persons directly in omnidirectional images without perspective transformation, which reduces overhead of pre- and post-processing and enables realtime performance. The basic idea is to utilize transfer learning to fine-tune CNNs trained on perspective images with data augmentation techniques for detection in omnidirectional images. We fine-tune two variants of Single Shot MultiBox detectors (SSDs). The first one uses Mobilenet v1 FPN as feature extractor (moSSD). The second one uses ResNet50 v1 FPN (resSSD). Both models are pre-trained on Microsoft Common Objects in Context (COCO) dataset. We fine-tune both models on PASCAL VOC07 and VOC12 datasets, specifically on class person. Random 90-degree rotation and random vertical flipping are used for data augmentation in addition to the methods proposed by original SSD. We reach an average precision (AP) of 67.3%with moSSD and 74.9%with resSSD on the evaluation dataset. To enhance the fine-tuning process, we add a subset of HDA Person dataset and a subset of PIROPO database and reduce the number of perspective images to PASCAL VOC07. The AP rises to 83.2% for moSSD and 86.3% for resSSD, respectively. The average inference speed is 28 ms per image for moSSD and 38 ms per image for resSSD using Nvidia Quadro P6000. Our method is applicable to other CNN-based object detectors and can potentially generalize for detecting other objects in omnidirectional images.


2020 ◽  
Vol 34 (04) ◽  
pp. 4264-4271
Author(s):  
Siddhartha Jain ◽  
Ge Liu ◽  
Jonas Mueller ◽  
David Gifford

The inaccuracy of neural network models on inputs that do not stem from the distribution underlying the training data is problematic and at times unrecognized. Uncertainty estimates of model predictions are often based on the variation in predictions produced by a diverse ensemble of models applied to the same input. Here we describe Maximize Overall Diversity (MOD), an approach to improve ensemble-based uncertainty estimates by encouraging larger overall diversity in ensemble predictions across all possible inputs. We apply MOD to regression tasks including 38 Protein-DNA binding datasets, 9 UCI datasets, and the IMDB-Wiki image dataset. We also explore variants that utilize adversarial training techniques and data density estimation. For out-of-distribution test examples, MOD significantly improves predictive performance and uncertainty calibration without sacrificing performance on test data drawn from same distribution as the training data. We also find that in Bayesian optimization tasks, the performance of UCB acquisition is improved via MOD uncertainty estimates.


2022 ◽  
Vol 2161 (1) ◽  
pp. 012005
Author(s):  
C R Karthik ◽  
Raghunandan ◽  
B Ashwath Rao ◽  
N V Subba Reddy

Abstract A time series is an order of observations engaged serially in time. The prime objective of time series analysis is to build mathematical models that provide reasonable descriptions from training data. The goal of time series analysis is to forecast the forthcoming values of a series based on the history of the same series. Forecasting of stock markets is a thought-provoking problem because of the number of possible variables as well as volatile noise that may contribute to the prices of the stock. However, the capability to analyze stock market leanings could be vital to investors, traders and researchers, hence has been of continued interest. Plentiful arithmetical and machine learning practices have been discovered for stock analysis and forecasting/prediction. In this paper, we perform a comparative study on two very capable artificial neural network models i) Deep Neural Network (DNN) and ii) Long Short-Term Memory (LSTM) a type of recurrent neural network (RNN) in predicting the daily variance of NIFTYIT in BSE (Bombay Stock Exchange) and NSE (National Stock Exchange) markets. DNN was chosen due to its capability to handle complex data with substantial performance and better generalization without being saturated. LSTM model was decided, as it contains intermediary memory which can hold the historic patterns and occurrence of the next prediction depends on the values that preceded it. With both networks, measures were taken to reduce overfitting. Daily predictions of the NIFTYIT index were made to test the generalizability of the models. Both networks performed well at making daily predictions, and both generalized admirably to make daily predictions of the NiftyIT data. The LSTM-RNN outpaced the DNN in terms of forecasting and thus, grips more potential for making longer-term estimates.


Sign in / Sign up

Export Citation Format

Share Document