Learning and Evaluation of Dialogue Strategies for New Applications: Empirical Methods for Optimization from Small Data Sets

2011, Vol 37 (1), pp. 153-196
Author(s): Verena Rieser, Oliver Lemon

We present a new data-driven methodology for simulation-based dialogue strategy learning, which allows us to address several problems in the field of automatic optimization of dialogue strategies: learning effective dialogue strategies when no initial data or system exists, and determining a data-driven reward function. In addition, we evaluate the result with real users, and explore how results transfer between simulated and real interactions. We use Reinforcement Learning (RL) to learn multimodal dialogue strategies by interaction with a simulated environment which is “bootstrapped” from small amounts of Wizard-of-Oz (WOZ) data. This use of WOZ data allows data-driven development of optimal strategies for domains where no working prototype is available. Using simulation-based RL allows us to find optimal policies which are not (necessarily) present in the original data. Our results show that simulation-based RL significantly outperforms the average (human wizard) strategy as learned from the data by using Supervised Learning. The bootstrapped RL-based policy gains on average 50 times more reward when tested in simulation, and almost 18 times more reward when interacting with real users. Users also subjectively rate the RL-based policy on average 10% higher. We also show that results from simulated interaction do transfer to interaction with real users, and we explicitly evaluate the stability of the data-driven reward function.
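
A minimal sketch of the simulation-based learning loop described above is given below: a tabular Q-learning agent optimises a toy slot-filling dialogue against a simulated user. The state space, the user model, and the hand-coded reward are hypothetical stand-ins for illustration only, not the authors' WOZ-bootstrapped simulated environment or their data-driven reward function.

```python
import random

# Toy slot-filling dialogue MDP (hypothetical): the state is the number of
# filled slots; the agent either asks for another slot or presents results.
N_SLOTS, ACTIONS = 3, ("ask", "present")
ALPHA, GAMMA, EPS, EPISODES = 0.1, 0.95, 0.2, 5000

Q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}

def simulated_user_step(state, action):
    """Stand-in user simulation: an 'ask' succeeds with 80% probability."""
    if action == "ask" and state < N_SLOTS:
        next_state = state + 1 if random.random() < 0.8 else state
        return next_state, -1.0, False                   # per-turn cost
    if action == "present":
        reward = 20.0 if state == N_SLOTS else -5.0      # hand-coded reward, not the paper's data-driven one
        return state, reward, True
    return state, -1.0, False

for _ in range(EPISODES):
    state, done = 0, False
    while not done:
        action = (random.choice(ACTIONS) if random.random() < EPS
                  else max(ACTIONS, key=lambda a: Q[(state, a)]))
        nxt, r, done = simulated_user_step(state, action)
        target = r if done else r + GAMMA * max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        state = nxt

# The learned policy should ask until all slots are filled, then present.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_SLOTS + 1)})
```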

2020, Vol 54 (4/2020), pp. 217-229
Author(s): Chang Che-Jung, Li Guiping, Guo Jianhong, Yu Kun-Peng

2007, Vol 26 (2)
Author(s): Jason Baldridge, Nicholas Asher, Julie Hunter

Abstract: Predicting discourse structure on naturally occurring texts and dialogs is challenging and computationally intensive. Attempts to construct hand-built systems have run into problems both in how to specify the required knowledge and in how to perform the necessary computations efficiently. Data-driven approaches have recently been shown to handle challenging aspects of discourse successfully without relying on large amounts of fine-grained semantic detail, but they require annotated material for training. We describe our effort to annotate Segmented Discourse Representation Structures on Wall Street Journal texts, arguing that graph-based representations are necessary for adequately capturing the dependencies found in the data. We then explore two data-driven parsing strategies for recovering discourse structures. We show that the generative PCFG model of Baldridge & Lascarides (2005b) is inherently limited by its inability to incorporate new features when learning from small data sets, and we show how recent developments in dependency parsing and discriminative learning can be used to get around this problem and thereby improve parsing accuracy. Results from exploratory experiments on Verbmobil dialogs and our annotated newswire texts are given; they suggest that these methods do indeed enhance performance and have the potential for significant further improvements through richer feature sets.
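
The contrast drawn above between a generative PCFG and feature-rich discriminative learning can be made concrete with a toy attachment classifier: in a discriminative model, new features can be added to the instances without redesigning the model. The feature names and training pairs below are invented for illustration and are not the authors' feature set or the Baldridge & Lascarides model.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training instances: should discourse unit pair (u1, u2) attach?
# Features (distance, shared cue word, same speaker) are invented for illustration.
examples = [
    ({"dist": 1, "cue_because": 1, "same_speaker": 1}, 1),
    ({"dist": 1, "cue_because": 0, "same_speaker": 1}, 1),
    ({"dist": 4, "cue_because": 0, "same_speaker": 0}, 0),
    ({"dist": 6, "cue_because": 0, "same_speaker": 1}, 0),
    ({"dist": 2, "cue_because": 1, "same_speaker": 0}, 1),
    ({"dist": 5, "cue_because": 0, "same_speaker": 0}, 0),
]
X_dicts, y = zip(*examples)

vec = DictVectorizer()
X = vec.fit_transform(X_dicts)
clf = LogisticRegression().fit(X, y)

# Richer feature sets can be added to the dicts without changing the learner,
# which is the flexibility the discriminative approach buys over a PCFG.
print(clf.predict_proba(vec.transform([{"dist": 2, "cue_because": 1, "same_speaker": 1}])))
```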


2021, Vol 20 (1), pp. 43-54
Author(s): J. Perl, J. Imkamp, D. Memmert

Abstract: Introduction: Recognizing and optimizing strategies in sports games is difficult, particularly in team games, where a number of players act “independently” of each other. One way to improve the situation is to cluster the teams into a small number of tactical groups and to analyze the interaction of those groups. The aim of this study is to evaluate the applicability of SOCCER© simulation to professional soccer by analyzing and simulating the interaction of tactical groups. Methods: The positions of the players in a tactical group can be mapped to formation patterns, which reflect strategic behaviour and interaction. Based on this information, Monte Carlo simulation can generate strategies that, at least from a mathematical point of view, are optimal. In practice, behaviour can be oriented towards these optimal strategies but normally changes depending on the opposing team’s activities. Analyzing a game in terms of such simulated strategies reveals how strictly or flexibly a team follows or varies its strategic patterns. Approach: A simulation and validation study based on 40 position data sets from the 2014/15 German Bundesliga was conducted to analyze and optimize such strategic team behaviour in professional soccer. Results: The validation study demonstrated the applicability of our tactical model. The simulation study revealed that offensive player groups need less tactical strictness in order to gain successful ball possession, whereas defensive player groups need tactical strictness to do so. Conclusion: The recognized strategic behaviour served as the basis for optimization analysis: offensive players should play with a more flexible tactical orientation to stay in possession of the ball, whereas defensive players should play with a more planned orientation in order to be successful. The strategic behaviour of tactical groups can thus be recognized and optimized using Monte Carlo-based analysis, a new approach to quantifying tactical performance in soccer.
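
As a rough illustration of the Monte Carlo idea summarised above, the sketch below varies a "strictness" parameter for an offensive and a defensive group and scores simulated possessions. The possession model and its parameters are made-up stand-ins, not the SOCCER© simulation or the Bundesliga data.

```python
import random

def possession_success(strictness, offensive, trials=20000):
    """Toy Monte Carlo estimate of the successful-possession rate.

    'strictness' in [0, 1] mixes planned positioning with spontaneous movement;
    the success model below is a hypothetical stand-in for illustration only.
    """
    wins = 0
    for _ in range(trials):
        noise = random.gauss(0.0, 0.1)
        # Assumed shape: defenders benefit from plan adherence, attackers from flexibility.
        p = 0.35 + (0.3 * (1.0 - strictness) if offensive else 0.3 * strictness) + noise
        wins += random.random() < max(0.0, min(1.0, p))
    return wins / trials

grid = [i / 10 for i in range(11)]
best_off = max(grid, key=lambda s: possession_success(s, offensive=True))
best_def = max(grid, key=lambda s: possession_success(s, offensive=False))
print(f"best strictness: offensive group {best_off:.1f}, defensive group {best_def:.1f}")
```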


Author(s): Hana Šinkovec, Angelika Geroldinger, Georg Heinze

The parameters of logistic regression models are usually obtained by the method of maximum likelihood (ML). However, in analyses of small data sets or data sets with unbalanced outcomes or exposures, ML parameter estimates may not exist. This situation has been termed “separation” as the two outcome groups are separated by the values of a covariate or a linear combination of covariates. To overcome the problem of non-existing ML parameter estimates, applying Firth’s correction (FC) was proposed. In practice, however, a principal investigator might be advised to “bring more data” in order to solve a separation issue. We illustrate the problem by means of examples from colorectal cancer screening and ornithology. It is unclear whether such an increasing sample size (ISS) strategy, which keeps sampling new observations until separation is removed, improves estimation compared to applying FC to the original data set. We performed an extensive simulation study whose main focus was to estimate the cost-adjusted relative efficiency of ML combined with ISS compared to FC. FC yielded reasonably small root mean squared errors and proved to be the more efficient estimator. Given our findings, we propose not to adapt the sample size when separation is encountered but to use FC as the default method of analysis whenever the number of observations or outcome events is critically low.
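
For illustration, the sketch below builds a tiny, completely separated data set and runs Newton iterations on Firth's modified score. The data are made up, the loop omits the step-halving safeguards of production implementations, and in practice an established implementation (such as R's logistf) should be preferred.

```python
import numpy as np

# Tiny made-up data set with complete separation: x >= 3 perfectly predicts y = 1.
x = np.array([0., 1., 2., 2.5, 3., 4., 5., 6.])
y = np.array([0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.])
X = np.column_stack([np.ones_like(x), x])          # intercept + covariate

def firth_logistic(X, y, n_iter=50, tol=1e-8):
    """Newton iterations on Firth's modified score (illustrative sketch only)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = pi * (1.0 - pi)
        XWX = X.T @ (X * W[:, None])               # Fisher information
        # hat-matrix diagonal: h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = W * np.einsum("ij,jk,ik->i", X, np.linalg.inv(XWX), X)
        # Firth's modified score: X'(y - pi + h * (0.5 - pi))
        U = X.T @ (y - pi + h * (0.5 - pi))
        step = np.linalg.solve(XWX, U)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Plain ML would diverge here (the slope grows without bound under separation);
# the Firth-penalized estimates stay finite.
print("Firth estimates (intercept, slope):", firth_logistic(X, y))
```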


2021, Vol 72 (2), pp. 603-617
Author(s): Moulay Zaidan Lahjouji-Seppälä, Achim Rabus

Abstract: Quantitative, corpus-based research on spontaneously spoken Carpathian Rusyn can cause several data-related problems: speakers use ambivalent forms in different quantities, resulting in a biased data set, while a stricter data-cleaning process would lead to large-scale data loss. On top of that, polytomous categorical dependent variables are hard to analyze due to methodological limitations. This paper provides several approaches for dealing with unbalanced and biased data sets containing variation in the conjugational forms of the verbs maty ‘to have’ and (po-)znaty ‘to know’ in Carpathian Rusyn. Using resampling-based methods such as cross-validation, bootstrapping and random forests, we provide a strategy for circumventing possible methodological pitfalls and gaining the most information from our limited data, without p-hacking the results. By calculating the predictive power of several sociolinguistic factors on linguistic variation, we can make valid statements about the (sociolinguistic) status of Rusyn and the stability of the old dialect continuum of Rusyn varieties.
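
As a rough illustration of the resampling strategy described above, the sketch below cross-validates a random forest that predicts a (here binary, purely hypothetical) variant choice from sociolinguistic factors and reports factor importances. The data, factor names, and class balance are synthetic and do not come from the Rusyn corpus.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
n = 300

# Hypothetical speaker-level predictors: age group, country of residence, formality.
X = np.column_stack([
    rng.integers(0, 3, n),      # age group
    rng.integers(0, 4, n),      # country of residence
    rng.integers(0, 2, n),      # formal vs informal setting
])
# Unbalanced, noisy variant choice loosely tied to country (entirely synthetic).
y = (X[:, 1] >= 2) & (rng.random(n) < 0.7)

clf = RandomForestClassifier(n_estimators=500, class_weight="balanced", random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="balanced_accuracy")
print("balanced accuracy per fold:", np.round(scores, 3))

clf.fit(X, y)
print("factor importances (age, country, formality):", np.round(clf.feature_importances_, 3))
```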


2021, Vol 11 (4), pp. 1829
Author(s): Davide Grande, Catherine A. Harris, Giles Thomas, Enrico Anderlini

Recurrent Neural Networks (RNNs) are increasingly being used for model identification, forecasting and control. When identifying physical models of systems whose mathematical description is unknown, Nonlinear AutoRegressive models with eXogenous inputs (NARX) or Nonlinear AutoRegressive Moving-Average models with eXogenous inputs (NARMAX) are typically used. In the context of data-driven control, machine learning algorithms have been shown to perform comparably to advanced control techniques, but they lack the guarantees of traditional stability theory. This paper illustrates a method to prove the stability of a generic neural network a posteriori, showing its application to a state-of-the-art RNN architecture. The presented method relies on identifying the poles associated with the network identified from the input/output data. A framework that guarantees the stability of any neural network architecture, combined with generalisability and applicability to different fields, can significantly broaden the use of such networks in dynamic systems modelling and control.
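
In the linearised case, the a posteriori check described above amounts to verifying that the poles of the identified recurrent dynamics lie inside the unit circle. The sketch below applies this to a randomly initialised vanilla RNN cell by linearising around the origin; it is a simplified stand-in, not the paper's procedure for identified networks.

```python
import numpy as np

rng = np.random.default_rng(1)
n_hidden = 8

# Recurrent weight matrix of a vanilla RNN: h[k+1] = tanh(W_h h[k] + W_x u[k]).
W_h = rng.normal(scale=0.3, size=(n_hidden, n_hidden))

# Linearisation around h = 0: the Jacobian of tanh is the identity there,
# so the local poles are simply the eigenvalues of W_h.
poles = np.linalg.eigvals(W_h)
spectral_radius = float(np.max(np.abs(poles)))

print("poles:", np.round(poles, 3))
print("spectral radius:", round(spectral_radius, 3))
print("locally stable (all poles inside the unit circle):", spectral_radius < 1.0)
```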


Author(s): Jianping Ju, Hong Zheng, Xiaohang Xu, Zhongyuan Guo, Zhaohui Zheng, et al.

Abstract: Although convolutional neural networks have achieved success in the field of image classification, there are still challenges in agricultural product quality sorting, such as machine vision-based jujube defect detection. The performance of jujube defect detection mainly depends on the feature extraction and the classifier used. Due to the diversity of the jujube materials and the variability of the testing environment, the traditional method of manually extracting features often fails to meet the requirements of practical application. In this paper, a jujube sorting model for small data sets, based on a convolutional neural network and transfer learning, is proposed to meet the practical demands of jujube defect detection. First, the original images collected from an actual jujube sorting production line were pre-processed, and the data were augmented to establish a data set with five categories of jujube defects. The original CNN model is then improved by embedding an SE module and by replacing the softmax loss function with the triplet loss function and the center loss function. Finally, a model pre-trained on the ImageNet data set was fine-tuned on the jujube defect data set, so that the pre-trained parameters could adapt to the distribution of the jujube defect images, completing the transfer of the model and realizing the detection and classification of jujube defects. The classification results are visualized by heatmaps, and classification accuracy and confusion matrices are analyzed against the comparison models. The experimental results show that the SE-ResNet50-CL model optimizes the fine-grained classification problem of jujube defect recognition, with test accuracy reaching 94.15%. The model has good stability and high recognition accuracy in complex environments.
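
A condensed PyTorch sketch of the transfer-learning setup described above (an ImageNet-pretrained ResNet-50 backbone, a squeeze-and-excitation block, and a new five-class head) is shown below. Layer placement, sizes, and the omission of the triplet and center losses are simplifications for illustration, not the authors' SE-ResNet50-CL configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

class SEBlock(nn.Module):
    """Squeeze-and-excitation: channel-wise reweighting of feature maps."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = x.mean(dim=(2, 3))                       # squeeze: global average pool
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)
        return x * w                                 # excite: rescale channels

class JujubeSorter(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool + fc
        self.se = SEBlock(2048)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(2048, n_classes)       # new head for the five defect classes

    def forward(self, x):
        x = self.se(self.features(x))
        return self.head(self.pool(x).flatten(1))

model = JujubeSorter()
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)                                  # torch.Size([2, 5])
```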


Symmetry, 2021, Vol 13 (4), pp. 525
Author(s): Mehdi Keshavarz-Ghorabaee, Maghsoud Amiri, Edmundas Kazimieras Zavadskas, Zenonas Turskis, Jurgita Antucheviciene

The weights of criteria in multi-criteria decision-making (MCDM) problems are essential elements that can significantly affect the results. Accordingly, researchers have developed and presented several methods to determine criteria weights. Weighting methods can be objective, subjective, or integrated. This study introduces a new method, called MEREC (MEthod based on the Removal Effects of Criteria), to determine the objective weights of criteria. This method uses a novel idea for weighting criteria. After systematically introducing the method, we present some computational analyses to confirm the efficiency of MEREC. First, an illustrative example demonstrates the procedure of MEREC for calculating criteria weights. Second, a comparative analysis is presented through an example to validate the introduced method’s results. Additionally, we perform a simulation-based analysis to verify the reliability of MEREC and the stability of its results. The data of the MCDM problems generated for this analysis follow a common symmetric distribution (the normal distribution). We compare the results of MEREC with those of some other objective weighting methods in this analysis, and the analysis of means (ANOM) for variances shows the stability of its results. The conducted analyses demonstrate that MEREC is efficient at determining objective weights of criteria.
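
The removal-effect idea can be sketched as follows: score each alternative over all criteria, re-score it with one criterion left out, and take the size of the change as that criterion's weight. The normalisation and the logarithmic aggregate in the sketch below follow our reading of the published MEREC description and should be checked against the original paper before use; the decision matrix is a made-up example.

```python
import numpy as np

def merec_weights(X, benefit):
    """Removal-effect criteria weights (sketch following our reading of MEREC).

    X: (alternatives x criteria) decision matrix with positive entries.
    benefit: boolean array, True for benefit criteria, False for cost criteria.
    """
    X = np.asarray(X, dtype=float)
    n_alt, n_crit = X.shape
    # Normalisation so that smaller normalised values indicate better performance.
    N = np.where(benefit, X.min(axis=0) / X, X / X.max(axis=0))
    logs = np.abs(np.log(N))
    # Overall performance of each alternative, and with each criterion removed in turn.
    S = np.log(1.0 + logs.mean(axis=1))
    S_removed = np.log(1.0 + (logs.sum(axis=1, keepdims=True) - logs) / n_crit)
    # Removal effect of each criterion, normalised to weights.
    E = np.abs(S_removed - S[:, None]).sum(axis=0)
    return E / E.sum()

# Tiny hypothetical example: 4 alternatives, 3 criteria (first two benefit, last cost).
X = [[450, 8, 54], [10, 9, 78], [100, 7, 85], [220, 9, 65]]
print(np.round(merec_weights(X, np.array([True, True, False])), 3))
```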

