Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining

There is an increasing focus on model-based dialog evaluation metrics such as ADEM, RUBER, and the more recent BERT-based metrics. These models aim to assign a high score to all relevant responses and a low score to all irrelevant responses. Ideally, such models should be trained using multiple relevant and irrelevant responses for any given context. However, no such data is publicly available, and hence existing models are usually trained using a single relevant response and multiple randomly selected responses from other contexts (random negatives). To allow for better training and robust evaluation of model-based metrics, we introduce the DailyDialog++ dataset, consisting of (i) five relevant responses for each context and (ii) five adversarially crafted irrelevant responses for each context. Using this dataset, we first show that even in the presence of multiple correct references, n-gram based metrics and embedding based metrics do not perform well at separating relevant responses from even random negatives. While model-based metrics perform better than n-gram and embedding based metrics on random negatives, their performance drops substantially when evaluated on adversarial examples. To check if large scale pretraining could help, we propose a new BERT-based evaluation metric called DEB, which is pretrained on 727M Reddit conversations and then finetuned on our dataset. DEB significantly outperforms existing models, showing better correlation with human judgments and better performance on random negatives (88.27% accuracy). However, its performance again drops substantially when evaluated on adversarial responses, thereby highlighting that even large-scale pretrained evaluation models are not robust to the adversarial examples in our dataset. The dataset and code are publicly available.
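The kind of comparison described in the abstract (accuracy on random negatives versus adversarial negatives) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the 0.5 decision threshold, the function name `classification_accuracy`, and all scores are made up and do not come from DEB or DailyDialog++.

```python
def classification_accuracy(scores_relevant, scores_irrelevant, threshold=0.5):
    """Fraction of responses that a metric places on the correct side of a threshold.

    A relevant response counts as correct when its score is >= threshold,
    an irrelevant response when its score is < threshold.
    """
    correct = sum(s >= threshold for s in scores_relevant)
    correct += sum(s < threshold for s in scores_irrelevant)
    return correct / (len(scores_relevant) + len(scores_irrelevant))


# Made-up scores from a hypothetical model-based metric (not real data).
relevant = [0.9, 0.8, 0.7, 0.6, 0.95]
random_negatives = [0.1, 0.2, 0.05, 0.3, 0.15]
adversarial_negatives = [0.7, 0.4, 0.8, 0.2, 0.6]

acc_random = classification_accuracy(relevant, random_negatives)
acc_adversarial = classification_accuracy(relevant, adversarial_negatives)
print(f"random negatives: {acc_random:.2f}, adversarial negatives: {acc_adversarial:.2f}")
```

In this toy setup the metric separates random negatives perfectly but misclassifies several adversarial negatives, which mirrors the qualitative pattern the abstract reports without reproducing its numbers.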


Customer Churn Analysis Model in Manufacturing Industry

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.69-70.675 ◽ 2009 ◽ Vol 69-70 ◽ pp. 675-679

Author(s): D.S. Liu ◽ Chun Hua Ju

Keyword(s): Large Scale ◽ Manufacturing Industry ◽ Principal Component ◽ Support Vector ◽ Redundant Information ◽ Analysis Model ◽ Customer Churn ◽ Model Based ◽ Churn Analysis ◽ Better Than

To address customer churn in customer relationship management (CRM) in the manufacturing industry, this paper proposes a prediction model based on a support vector machine (SVM). Because the churn data are large-scale and imbalanced, principal component analysis (PCA) is adopted to reduce dimensionality and eliminate redundant information, which makes the sample space for the SVM more compact and reasonable. PCA is first applied to the 17-dimensional feature vectors of the customer churn data, and an improved SVM then predicts customer churn. An application in the manufacturing industry verifies that the model based on both PCA and SVM performs better than the model based on SVM only and other traditional models.
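The dimensionality-reduction step can be sketched in plain Python. This is only an illustration of PCA removing redundant information, not the paper's implementation: the toy data, the helper names, and the use of power iteration to find the leading component are all assumptions, and the SVM stage is omitted.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_component(cov, iters=200):
    """Approximate the top eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, v):
    """One-dimensional PCA scores: dot product of each row with v."""
    return [sum(a * b for a, b in zip(r, v)) for r in centered]


# Toy data (hypothetical): feature 2 is exactly twice feature 1, so it is redundant.
rows = [[1.0, 2.0, 0.5], [2.0, 4.0, 0.4], [3.0, 6.0, 0.6], [4.0, 8.0, 0.5]]
centered = center(rows)
pc1 = leading_component(covariance(centered))
scores = project(centered, pc1)
print(pc1, scores)
```

The leading component weights the two redundant features in a fixed 1:2 ratio, so a single score per sample carries the information that both columns held. In a full pipeline those scores (or the first few components) would be the input to the classifier.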


Application of Morphosyntactic and Class-Based Language Models in Automatic Speech Recognition of Polish

International Journal of Artificial Intelligence Tools ◽ 10.1142/s0218213016500068 ◽ 2016 ◽ Vol 25 (02) ◽ pp. 1650006

Author(s): Aleksander Smywinski-Pohl ◽ Bartosz Ziółko

Keyword(s): Speech Recognition ◽ Automatic Speech Recognition ◽ Language Model ◽ Language Models ◽ Clustering Method ◽ Training Corpus ◽ Model Based ◽ N Gram ◽ Better Than

In this paper we investigate the usefulness of morphosyntactic information and of clustering in modeling Polish for automatic speech recognition. Because Polish is an inflectional language, we investigate an N-gram model based on morphosyntactic features. We show how individual types of features influence the model and which types are best suited for building a language model for automatic speech recognition. We compare these results with a class-based model that is automatically derived from the training corpus. We show that our approach to clustering performs significantly better than the frequently used SRI LM clustering method. However, this difference is apparent only for smaller corpora.
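A class-based bigram model of the general kind mentioned above is commonly factored as P(w2 | w1) ≈ P(c2 | c1) · P(w2 | c2), where c1 and c2 are the classes of the two words. The sketch below shows that factorization with unsmoothed maximum-likelihood counts. The English toy corpus, the hand-written class map, and the function name `train_class_bigram` are assumptions for illustration; the paper works on Polish with morphosyntactic features and automatically derived classes.

```python
from collections import Counter


def train_class_bigram(sentences, word_class):
    """Return prob(w1, w2) ~ P(class(w2) | class(w1)) * P(w2 | class(w2))."""
    class_bigrams = Counter()   # counts of (c1, c2) class pairs
    class_context = Counter()   # how often a class occurs as a bigram history
    word_counts = Counter()
    class_counts = Counter()

    for sent in sentences:
        classes = [word_class[w] for w in sent]
        for w, c in zip(sent, classes):
            word_counts[w] += 1
            class_counts[c] += 1
        for c1, c2 in zip(classes, classes[1:]):
            class_bigrams[(c1, c2)] += 1
            class_context[c1] += 1

    def prob(w1, w2):
        c1, c2 = word_class[w1], word_class[w2]
        if class_context[c1] == 0:
            return 0.0  # class never seen as a history; no smoothing in this sketch
        p_class = class_bigrams[(c1, c2)] / class_context[c1]
        p_word = word_counts[w2] / class_counts[c2]
        return p_class * p_word

    return prob


# Hypothetical toy corpus and class map.
sentences = [["the", "cat", "sleeps"], ["the", "dog", "sleeps"], ["a", "cat", "runs"]]
word_class = {"the": "DET", "a": "DET", "cat": "NOUN", "dog": "NOUN",
              "sleeps": "VERB", "runs": "VERB"}

prob = train_class_bigram(sentences, word_class)
print(prob("a", "dog"))  # the word pair "a dog" never occurs in the corpus
```

The pair "a dog" is unseen as a word bigram, yet it receives non-zero probability because the class pair DET followed by NOUN was observed. This sharing of statistics across words in a class is the usual motivation for class-based models on sparse, highly inflected data.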


Model-based Control Techniques for Large-Scale High-Precision Stage

IEEJ Transactions on Industry Applications ◽ 10.1541/ieejias.140.272 ◽ 2020 ◽ Vol 140 (4) ◽ pp. 272-280

Author(s): Wataru Ohnishi ◽ Hiroshi Fujimoto ◽ Koichi Sakata

Keyword(s): High Precision ◽ Large Scale ◽ Model Based Control ◽ Precision Stage ◽ Control Techniques ◽ Model Based


Model-based Identification, Estimation, and Control for Large-scale Urban Road Networks

2020 European Control Conference (ECC) ◽ 10.23919/ecc51009.2020.9143995 ◽ 2020

Author(s): Isik Ilber Sirmatel ◽ Nikolas Geroliminis

Keyword(s): Large Scale ◽ Road Networks ◽ Urban Road ◽ Model Based ◽ Estimation And Control ◽ And Control


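PENGEMBANGAN MEDIA BIG BOOK UNTUK MENINGKATKAN HASIL BELAJAR PECAHAN SENILAI SISWA SD (Development of Big Book Media to Improve Elementary School Students' Learning Outcomes on Equivalent Fractions)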

Jurnal Litbang Provinsi Jawa Tengah ◽ 10.36762/litbangjateng.v16i1.751 ◽ 2018 ◽ Vol 16 (1) ◽ pp. 67-76

Author(s): Disyacitta Neolia Firdana ◽ Trimurtini Trimurtini

Keyword(s): Learning Outcomes ◽ Large Scale ◽ Fourth Grade ◽ Learning Activities ◽ Equivalent Fractions ◽ Fourth Grade Students ◽ Teacher Needs ◽ The Media ◽ Test Result ◽ Better Than

This research aimed to determine the suitability and effectiveness of big book media for teaching equivalent fractions to fourth-grade students. The research method was Research and Development (R&D). The study was conducted in the fourth grade of SDN Karanganyar 02 Kota Semarang. Data came from media validation, material validation, learning outcomes, and teacher and student responses to the developed media. The study used a pre-experimental, one-group pretest-posttest design. The big book consists of equivalent-fractions material, student learning activity sheets with rectangular and circular pictures, and questions about equivalent fractions. It was developed based on student and teacher needs. The big book received a media validity score of 3.75 (very good criteria) and a score of 3 from material experts (good criteria). In the large-scale trial, the students' posttest showed a learning-outcome completeness of 82.14%. The N-gain of 0.55 falls in the "medium" criterion. The t-test result of 9.6320 > 2.0484 means that the average posttest outcome is better than the average pretest outcome. Based on these data, the study produced big book media that are suitable and effective for teaching equivalent fractions to fourth-grade elementary school students.
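The N-gain mentioned above is commonly computed as the normalized gain g = (post − pre) / (max − pre) and read against the widely used bands low (g < 0.3), medium (0.3 ≤ g < 0.7), and high (g ≥ 0.7). The sketch below assumes that definition and those bands. The pretest and posttest means are made-up values chosen only so that the result equals the reported 0.55; the abstract does not state the underlying scores.

```python
def normalized_gain(pre, post, max_score=100.0):
    """Normalized gain: achieved improvement divided by the maximum possible improvement."""
    if max_score <= pre:
        raise ValueError("pretest score must be below the maximum score")
    return (post - pre) / (max_score - pre)


def gain_category(g):
    """Map a normalized gain onto the conventional low/medium/high bands."""
    if g >= 0.7:
        return "high"
    if g >= 0.3:
        return "medium"
    return "low"


# Hypothetical class means (not from the paper).
pre_mean, post_mean = 40.0, 73.0
g = normalized_gain(pre_mean, post_mean)
category = gain_category(g)
print(f"N-gain = {g:.2f} ({category})")
```

With these assumed means the class closed 33 of the 60 points it could still gain, which gives 0.55 and lands in the "medium" band.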


A Model-Based Real-Time Intrusion Detection System for Large Scale Heterogeneous Networks

10.21236/ada420824 ◽ 2003 ◽ Cited By ~ 1

Author(s): Richard A. Kemmer ◽ Giovanni Vigna

Keyword(s): Intrusion Detection ◽ Real Time ◽ Heterogeneous Networks ◽ Intrusion Detection System ◽ Large Scale ◽ Detection System ◽ Model Based


DEVELOPMENT AND VALIDATION OF A LARGE-SCALE GLACIER MODEL BASED ON AN ENERGY BALANCE APPROACH OVER CENTRAL EUROPE

Journal of Japan Society of Civil Engineers Ser B1 (Hydraulic Engineering) ◽ 10.2208/jscejhe.75.2_i_919 ◽ 2019 ◽ Vol 75 (2) ◽ pp. I_919-I_924

Author(s): Orie SASAKI ◽ Koji FUJITA ◽ Akiko SAKAI ◽ Yukiko HIRABAYASHI ◽ Shinjiro KANAE

Keyword(s): Energy Balance ◽ Central Europe ◽ Large Scale ◽ Model Based ◽ Balance Approach ◽ Development And Validation ◽ Energy Balance Approach


Model-based Evaluations Combining Autonomous Cars and a Large-scale Passenger Drone Service: The Bavarian Case Study

2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) ◽ 10.1109/itsc45102.2020.9294183 ◽ 2020

Author(s): Christoph Maget ◽ Sebastian Gutmann ◽ Klaus Bogenberger

Keyword(s): Large Scale ◽ Model Based ◽ Autonomous Cars


A Combined Approach for Model-Based PV Power Plant Failure Detection and Diagnostic

Energies ◽ 10.3390/en14051261 ◽ 2021 ◽ Vol 14 (5) ◽ pp. 1261

Author(s): Christopher Gradwohl ◽ Vesna Dimitrievska ◽ Federico Pittino ◽ Wolfgang Muehleisen ◽ András Montvay ◽ ...

Keyword(s): Power Plants ◽ Large Scale ◽ Failure Detection ◽ Energy Yield ◽ Combined Approach ◽ Levelized Cost Of Electricity ◽ Term Operation ◽ Model Based

Photovoltaic (PV) technology allows large-scale investments in a renewable power-generating system at a competitive levelized cost of electricity (LCOE) and with a low environmental impact. Large-scale PV installations operate in a highly competitive market environment where even small performance losses have a high impact on profit margins. Therefore, operation at maximum performance is the key for long-term profitability. This can be achieved by advanced performance monitoring and instant or gradual failure detection methodologies. We present in this paper a combined approach on model-based fault detection by means of physical and statistical models and failure diagnosis based on physics of failure. Both approaches contribute to optimized PV plant operation and maintenance based on typically available supervisory control and data acquisition (SCADA) data. The failure detection and diagnosis capabilities were demonstrated in a case study based on six years of SCADA data from a PV plant in Slovenia. In this case study, underperforming values of the inverters of the PV plant were reliably detected and possible root causes were identified. Our work has led us to conclude that the combined approach can contribute to an efficient and long-term operation of photovoltaic power plants with a maximum energy yield and can be applied to the monitoring of photovoltaic plants.
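One simple form of model-based detection is to compare measured inverter output against a model prediction and flag samples that fall well below it. The sketch below assumes that approach with a deliberately crude physical model (rated power scaled by irradiance and a fixed performance ratio). The model, the 0.8 performance ratio, the 0.9 alarm ratio, the function names, and the sample data are all hypothetical and are not taken from the paper or its Slovenian case study.

```python
def expected_power_kw(irradiance_w_m2, rated_kw, performance_ratio=0.8):
    """Crude expected AC power: rated power scaled by irradiance relative to 1000 W/m^2."""
    return rated_kw * (irradiance_w_m2 / 1000.0) * performance_ratio


def flag_underperformance(samples, rated_kw, min_ratio=0.9):
    """Return one flag per (irradiance, measured_kw) sample.

    A sample is flagged when measured power is below min_ratio of the model's
    expectation. Samples with no expected production (e.g. night) are never flagged.
    """
    flags = []
    for irradiance, measured_kw in samples:
        expected = expected_power_kw(irradiance, rated_kw)
        if expected <= 0:
            flags.append(False)
            continue
        flags.append(measured_kw / expected < min_ratio)
    return flags


# Made-up SCADA-like samples: (plane-of-array irradiance in W/m^2, inverter power in kW).
samples = [(800, 63.0), (1000, 79.0), (600, 35.0), (0, 0.0)]
flags = flag_underperformance(samples, rated_kw=100.0)
print(flags)
```

Only the third sample is flagged: the model expects 48 kW at 600 W/m² and the inverter delivers 35 kW. A real system would use a calibrated physical or statistical model and would follow detection with a diagnosis step, as the abstract describes.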


Assessment of Numerical Methods for Plunging Breaking Wave Predictions

Journal of Marine Science and Engineering ◽ 10.3390/jmse9030264 ◽ 2021 ◽ Vol 9 (3) ◽ pp. 264

Author(s): Shanti Bhushan ◽ Oumnia El Fajri ◽ Graham Hubbard ◽ Bradley Chambers ◽ Christopher Kees

Keyword(s): Solitary Wave ◽ Wave Breaking ◽ Large Scale ◽ Turbulence Models ◽ Navier Stokes ◽ Wave Crest ◽ Breaking Wave ◽ Rans Turbulence Models ◽ Run Up ◽ Better Than

This study evaluates the capability of Navier–Stokes solvers in predicting forward and backward plunging breaking, including an assessment of the effect of grid resolution, turbulence model, and the VoF and CLSVoF interface models on the predictions. For this purpose, 2D simulations are performed for four test cases: dam break, solitary wave run-up on a slope, flow over a submerged bump, and solitary wave over a submerged rectangular obstacle. Plunging wave breaking involves a high wave crest, plunger formation, and splash-up, followed by a second plunger and chaotic water motions. Coarser grids reasonably predict the wave breaking features, but finer grids are required for accurate prediction of the splash-up events. However, instabilities are triggered at the air–water interface (primarily for the air flow) on very fine grids, which induce surface peel-off or kinks and roll-up of the plunger tips. Reynolds-averaged Navier–Stokes (RANS) turbulence models result in high eddy viscosity in the air–water region, which decays the fluid momentum and adversely affects the predictions. Both VoF and CLSVoF methods predict the large-scale plunging breaking characteristics well; however, they vary in the prediction of the finer details. The CLSVoF solver predicts the splash-up event and secondary plunger better than the VoF solver; however, the latter predicts the plunger shape better than the former for the solitary wave run-up on a slope case.
