Learning with mitigating random consistency from the accuracy measure

Jieting Wang; Yuhua Qian; Feijiang Li

doi:10.1007/s10994-020-05914-3

Machine Learning ◽

10.1007/s10994-020-05914-3 ◽

2020 ◽

Vol 109 (12) ◽

pp. 2247-2281

Author(s):

Jieting Wang ◽

Yuhua Qian ◽

Feijiang Li

Keyword(s):

Performance Measure ◽

Data Sets ◽

Human Beings ◽

Bayes Risk ◽

Generalization Bounds ◽

Benchmark Data ◽

Accuracy Measure ◽

Machine Learning ◽

Comparable Performance ◽

Better Than

Human beings may make random guesses in decision-making, and occasionally those guesses agree with the real situation. This kind of agreement is termed random consistency. In machine learning, randomness is unavoidable and ubiquitous in learning algorithms. However, accuracy (A), a fundamental performance measure for machine learning, does not account for random consistency. As a result, classifiers learned by optimizing A contain random consistency, which may cause unreliable evaluation and harm generalization performance. To solve this problem, the pure accuracy (PA) is defined by eliminating the random consistency from A. In this paper, we mainly study the necessity, learning consistency, and learning method of the PA. We show that the PA is insensitive to the class distribution of the classifier and is fairer to the majority and minority classes than A. Subsequently, some novel generalization bounds on the PA and A are given. Furthermore, we show that the PA is Bayes-risk consistent in finite and infinite hypothesis spaces. We design a plug-in rule that maximizes the PA, and experiments on twenty benchmark data sets demonstrate that the proposed method performs statistically better than kernel logistic regression in terms of PA and comparably in terms of A. Compared with the other plug-in rules, the proposed method obtains much better performance.
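The gap between A and PA can be illustrated with a small sketch. The abstract does not give the formula for PA, so the code below rests on an assumption: random consistency is estimated as the chance agreement implied by the label and prediction counts (the same term that Cohen's kappa subtracts). The function names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def random_consistency(y_true, y_pred):
    """Assumed estimate: expected agreement if the predictions were randomly shuffled."""
    n = len(y_true)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Counter returns 0 for classes that never appear among the predictions.
    return sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)


def pure_accuracy(y_true, y_pred):
    """Accuracy with the assumed random-consistency term subtracted."""
    return accuracy(y_true, y_pred) - random_consistency(y_true, y_pred)


# Imbalanced toy data: nine majority-class samples, one minority-class sample.
y_true = [0] * 9 + [1]
always_majority = [0] * 10   # ignores the input entirely
perfect = list(y_true)

print(accuracy(y_true, always_majority), pure_accuracy(y_true, always_majority))
print(accuracy(y_true, perfect), pure_accuracy(y_true, perfect))
```

Under this assumed definition, a classifier that always predicts the majority class keeps a high A, but its PA drops to zero because all of its agreement is attributable to chance.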

Position Regularized Core Vector Machines

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.574.728 ◽

2014 ◽

Vol 574 ◽

pp. 728-733

Author(s):

Shu Xia Lu ◽

Cai Hong Jiao ◽

Le Tong ◽

Yang Fan Zhou

Keyword(s):

Large Data ◽

Experimental Results ◽

Large Data Sets ◽

Data Sets ◽

Benchmark Data ◽

Vector Machines ◽

Data Points ◽

Minimum Enclosing Ball ◽

Better Than

The Core Vector Machine (CVM) can handle large data sets by finding the minimum enclosing ball (MEB), but one drawback is that the CVM is very sensitive to outliers. To tackle this problem, we propose a novel Position Regularized Core Vector Machine (PCVM). In the proposed PCVM, the data points are regularized by assigning position-based weights. Experimental results on several benchmark data sets show that the PCVM performs much better than the CVM.
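The outlier sensitivity of the MEB can be seen in a few lines. This is a minimal sketch using the simple Badoiu-Clarkson iteration for an approximate enclosing ball. It is not the CVM solver from the paper, and the position-based weighting of the PCVM is not reproduced because the abstract does not specify it. The function name and the toy points are hypothetical.

```python
import math


def approx_min_enclosing_ball(points, iterations=1000):
    """Approximate the minimum enclosing ball by repeatedly stepping toward the farthest point."""
    center = list(points[0])
    for i in range(1, iterations + 1):
        farthest = max(points, key=lambda p: math.dist(p, center))
        # The step size shrinks as 1/(i+1), so the center settles down.
        center = [c + (f - c) / (i + 1) for c, f in zip(center, farthest)]
    radius = max(math.dist(p, center) for p in points)
    return center, radius


cluster = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
with_outlier = cluster + [(10.0, 0.0)]

center_clean, radius_clean = approx_min_enclosing_ball(cluster)
center_out, radius_out = approx_min_enclosing_ball(with_outlier)
print(center_clean, radius_clean)
print(center_out, radius_out)
```

A single distant point drags the center away from the cluster and inflates the radius several times over, which is the behavior the PCVM is meant to dampen.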

Towards a Universal Semantic Dictionary

Applied Sciences ◽

10.3390/app9194060 ◽

2019 ◽

Vol 9 (19) ◽

pp. 4060 ◽

Cited By ~ 1

Author(s):

Maria Jose Castro-Bleda ◽

Eszter Iklódi ◽

Gábor Recski ◽

Gábor Borbély

Keyword(s):

Arbitrary Number ◽

Training Data ◽

Word Embeddings ◽

Italian Translation ◽

Benchmark Data ◽

First Case ◽

Comparable Performance ◽

Baseline System ◽

Novel Method ◽

Better Than

A novel method for finding linear mappings among word embeddings for several languages, taking as pivot a shared, multilingual embedding space, is proposed in this paper. Previous approaches learned translation matrices between two specific languages, while this method learns translation matrices between a given language and a shared, multilingual space. The system was first trained on bilingual, and later on multilingual corpora as well. In the first case, two different training data sets were used: Dinu’s English–Italian benchmark data, and English–Italian translation pairs extracted from the PanLex database. In the second case, only the PanLex database was used. With the best setting, the system performs significantly better on the English–Italian language pair than the baseline system given by Mikolov, and it provides performance comparable to that of more sophisticated systems. Exploiting the richness of the PanLex database, the proposed method makes it possible to learn linear mappings among an arbitrary number of languages.
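The idea of a translation matrix can be sketched as a least-squares fit of a matrix W so that source-language vectors multiplied by W land near the vectors of their translations. This is an illustration only: the abstract does not say how the matrices are optimized, so plain gradient descent on squared error is used here as a stand-in, the target vectors are treated as fixed, and the two-dimensional "embeddings" are made-up toy values.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fit_linear_map(src, tgt, lr=0.5, steps=200):
    """Fit W minimizing the mean squared error between src @ W and tgt."""
    n, d_src, d_tgt = len(src), len(src[0]), len(tgt[0])
    w = [[0.0] * d_tgt for _ in range(d_src)]
    src_t = [list(col) for col in zip(*src)]
    for _ in range(steps):
        pred = matmul(src, w)
        resid = [[p - t for p, t in zip(p_row, t_row)] for p_row, t_row in zip(pred, tgt)]
        grad = matmul(src_t, resid)  # shape: d_src x d_tgt
        w = [[w_ij - lr * 2.0 / n * g_ij for w_ij, g_ij in zip(w_row, g_row)]
             for w_row, g_row in zip(w, grad)]
    return w


# Toy translation pairs: each row of src is a word vector, and the matching
# row of tgt is the vector of its translation (here a 90-degree rotation).
src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tgt = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]

w = fit_linear_map(src, tgt)
print(w)
print(matmul([[2.0, 1.0]], w))  # maps an unseen source vector into the target space
```

The learning rate and step count are tuned to this toy data; real embeddings would normally be fitted with a closed-form or library least-squares solver.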

Towards a Universal Semantic Dictionary

10.20944/preprints201907.0336.v1 ◽

2019 ◽

Author(s):

María José Castro-Bleda ◽

Eszter Iklodi ◽

Gabor Recski ◽

Gabor Borbely

Keyword(s):

Training Data ◽

Italian Translation ◽

Benchmark Data ◽

First Case ◽

Universal Space ◽

Comparable Performance ◽

Baseline System ◽

Novel Method ◽

Universal Embedding ◽

Better Than

A novel method for finding linear mappings among word embeddings for several languages, taking as pivot a shared, universal embedding space, is proposed in this paper. Previous approaches learn translation matrices between two specific languages, but this method learns translation matrices between a given language and a shared, universal space. The system was first trained on bilingual, and later on multilingual corpora as well. In the first case, two different training data sets were used: Dinu’s English-Italian benchmark data, and English-Italian translation pairs extracted from the PanLex database. In the second case, only the PanLex database was used. With the best setting, the system performs significantly better on the English-Italian language pair than the baseline system of Mikolov et al. [1], and it provides performance comparable to that of the more sophisticated systems of Faruqui and Dyer [2] and Dinu et al. [3]. Exploiting the richness of the PanLex database, the proposed method makes it possible to learn linear mappings among an arbitrary number of languages.

Reduction from Cost-Sensitive Ordinal Ranking to Weighted Binary Classification

Neural Computation ◽

10.1162/neco_a_00265 ◽

2012 ◽

Vol 24 (5) ◽

pp. 1329-1367 ◽

Cited By ~ 55

Author(s):

Hsuan-Tien Lin ◽

Ling Li

Keyword(s):

Binary Classification ◽

Upper Bounds ◽

Support Vector ◽

Data Sets ◽

Binary Classifier ◽

Generalization Bounds ◽

Ordinal Ranking ◽

Benchmark Data ◽

Ranking Algorithms ◽

Ranking Performance

We present a reduction framework from ordinal ranking to binary classification. The framework consists of three steps: extracting extended examples from the original examples, learning a binary classifier on the extended examples with any binary classification algorithm, and constructing a ranker from the binary classifier. Based on the framework, we show that a weighted 0/1 loss of the binary classifier upper-bounds the mislabeling cost of the ranker, both error-wise and regret-wise. Our framework allows not only the design of good ordinal ranking algorithms based on well-tuned binary classification approaches, but also the derivation of new generalization bounds for ordinal ranking from known bounds for binary classification. In addition, our framework unifies many existing ordinal ranking algorithms, such as perceptron ranking and support vector ordinal regression. When compared empirically on benchmark data sets, some of our newly designed algorithms enjoy advantages in terms of both training speed and generalization performance over existing algorithms. In addition, the newly designed algorithms lead to better cost-sensitive ordinal ranking performance, as well as improved listwise ranking performance.
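The first and third steps of the framework can be sketched as follows. The abstract does not spell out the label and weight conventions, so this sketch makes assumptions: each example with K possible ranks yields K-1 binary questions of the form "is the true rank greater than k?", and each question is weighted by the change in cost across threshold k. The function names and the toy threshold classifier are hypothetical.

```python
def extend_examples(x, rank, costs):
    """Turn one ordinal example into K-1 weighted binary examples.

    costs[k - 1] is the cost of predicting rank k for this example.
    """
    num_ranks = len(costs)
    extended = []
    for k in range(1, num_ranks):
        label = 1 if rank > k else -1           # is the true rank greater than k?
        weight = abs(costs[k - 1] - costs[k])   # cost change when crossing threshold k
        extended.append(((x, k), label, weight))
    return extended


def make_ranker(binary_classifier, num_ranks):
    """Build a ranker that counts how many thresholds the binary classifier says are exceeded."""
    def ranker(x):
        return 1 + sum(1 for k in range(1, num_ranks) if binary_classifier(x, k) > 0)
    return ranker


# One example with true rank 3 out of 4, under absolute cost |rank - k|.
examples = extend_examples(x=0.7, rank=3, costs=[2, 1, 0, 1])
print(examples)

# Stand-in for a trained binary classifier: compares x with a fixed threshold per k.
thresholds = [0.25, 0.5, 0.75]
toy_classifier = lambda x, k: x - thresholds[k - 1]
ranker = make_ranker(toy_classifier, num_ranks=4)
print(ranker(0.7))
```

In practice the second step would train any weighted binary classifier on the extended examples; the fixed thresholds above only stand in for that trained model.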

Faculty Opinions recommendation of Benchmark data sets for structure-based computational target prediction.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718516631.793500133 ◽

2014 ◽

Author(s):

Vytas Bankaitis ◽

Ashutosh Tripathi

Keyword(s):

Target Prediction ◽

Data Sets ◽

Benchmark Data

Non-Covalent Interactions Atlas Benchmark Data Sets 3: Repulsive Contacts

Journal of Chemical Theory and Computation ◽

10.1021/acs.jctc.0c01341 ◽

2021 ◽

Vol 17 (3) ◽

pp. 1548-1561

Author(s):

Kristian Kříž ◽

Martin Nováček ◽

Jan Řezáč

Keyword(s):

Data Sets ◽

Benchmark Data ◽

Non Covalent Interactions ◽

Covalent Interactions

A Novel LSTM Model with Interaction Dual Attention for Radar Echo Extrapolation

Remote Sensing ◽

10.3390/rs13020164 ◽

2021 ◽

Vol 13 (2) ◽

pp. 164

Author(s):

Chuyao Luo ◽

Xutao Li ◽

Yongliang Wen ◽

Yunming Ye ◽

Xiaofeng Zhang

Keyword(s):

Short Term Memory ◽

Weather Forecast ◽

Vital Role ◽

Data Sets ◽

Short Term ◽

Learning Techniques ◽

Radar Echo ◽

Hidden States ◽

Better Than

Precipitation nowcasting is an important task in operational weather forecasting, and radar echo map extrapolation plays a vital role in it. Recently, deep learning techniques such as Convolutional Recurrent Neural Network (ConvRNN) models have been designed to solve the task. These models, although they perform much better than conventional optical-flow-based approaches, share a common problem: they underestimate the high echo value parts. This drawback is fatal to precipitation nowcasting, as those parts often correspond to heavy rains that may cause natural disasters. In this paper, we propose a novel interaction dual attention long short-term memory (IDA-LSTM) model to address the drawback. In the method, an interaction framework is developed for the ConvRNN unit to fully exploit the short-term context information by constructing a series of coupled convolutions on the input and hidden states. Moreover, a dual attention mechanism on channels and positions is developed to recall information forgotten in the long term. Comprehensive experiments have been conducted on the CIKM AnalytiCup 2017 data sets, and the results show the effectiveness of the IDA-LSTM in addressing the underestimation drawback. The extrapolation performance of IDA-LSTM is superior to that of the state-of-the-art methods.

Bayesian Trigonometric Support Vector Classifier

Neural Computation ◽

10.1162/089976603322297368 ◽

2003 ◽

Vol 15 (9) ◽

pp. 2227-2254 ◽

Cited By ~ 20

Author(s):

Wei Chu ◽

S. Sathiya Keerthi ◽

Chong Jin Ong

Keyword(s):

Loss Function ◽

Gaussian Processes ◽

Likelihood Function ◽

Support Vector ◽

Data Sets ◽

Model Adaptation ◽

Bayesian Techniques ◽

Benchmark Data ◽

Support Vector Classifier ◽

Set Up

This letter describes Bayesian techniques for support vector classification. In particular, we propose a novel differentiable loss function, called the trigonometric loss function, which has the desirable characteristic of natural normalization in the likelihood function, and then follow standard gaussian processes techniques to set up a Bayesian framework. In this framework, Bayesian inference is used to implement model adaptation, while keeping the merits of support vector classifier, such as sparseness and convex programming. This differs from standard gaussian processes for classification. Moreover, we put forward class probability in making predictions. Experimental results on benchmark data sets indicate the usefulness of this approach.
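The "natural normalization" property can be checked numerically. The abstract does not state the loss formula, so the piecewise form below is an assumption to be verified against the letter: infinite loss for margins at or below -1, zero loss for margins at or above 1, and 2 ln sec(π(1 - margin)/4) in between. With this form, the likelihoods exp(-loss) of the two possible labels sum to one without an extra normalizing constant.

```python
import math


def trigonometric_loss(margin):
    """Assumed piecewise form of the trigonometric loss (margin = label * latent value)."""
    if margin <= -1.0:
        return math.inf
    if margin >= 1.0:
        return 0.0
    # 2 * ln(sec(a)) written as -2 * ln(cos(a)), with a in (0, pi/2).
    return -2.0 * math.log(math.cos(math.pi / 4.0 * (1.0 - margin)))


def likelihood(margin):
    """Likelihood induced by the loss: exp(-loss)."""
    return math.exp(-trigonometric_loss(margin))


for m in (-2.0, -0.5, 0.0, 0.3, 1.0, 2.5):
    total = likelihood(m) + likelihood(-m)
    print(m, likelihood(m), total)
```

The sum equals one because, inside the interval, the two likelihoods reduce to cos² and sin² of the same angle.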

AA-HMM: An Anti-Adversarial Hidden Markov Model for Network-Based Intrusion Detection

Applied Sciences ◽

10.3390/app8122421 ◽

2018 ◽

Vol 8 (12) ◽

pp. 2421 ◽

Cited By ~ 1

Author(s):

Chongya Song ◽

Alexander Pons ◽

Kang Yen

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Performance Metrics ◽

Hidden Markov ◽

Data Sets ◽

Learning Abilities ◽

Benchmark Data ◽

Malicious Behavior ◽

Network Intrusion ◽

The Common

In the field of network intrusion, malware usually evades anomaly detection by disguising malicious behavior as legitimate access. Therefore, detecting these attacks from network traffic has become a challenge in this adversarial setting. In this paper, an enhanced Hidden Markov Model, called the Anti-Adversarial Hidden Markov Model (AA-HMM), is proposed to effectively detect evasion patterns, using the Dynamic Window and Threshold techniques to achieve adaptive, anti-adversarial, and online-learning abilities. In addition, a concept called Pattern Entropy is defined and acts as the foundation of AA-HMM. We evaluate the effectiveness of our approach on two well-known benchmark data sets, NSL-KDD and CTU-13, in terms of common performance metrics and the algorithm’s adaptation and anti-adversarial abilities.

Quantifying paleogeography using biogeography: a test case for the Ordovician and Silurian of Avalonia based on brachiopods and trilobites

Paleobiology ◽

10.1666/0094-8373(2002)028<0343:qpubat>2.0.co;2 ◽

2002 ◽

Vol 28 (3) ◽

pp. 343-363 ◽

Cited By ~ 33

Author(s):

David C. Lees ◽

Richard A. Fortey ◽

L. Robin M. Cocks

Keyword(s):

Goodness Of Fit ◽

Total Evidence ◽

Pairwise Distance ◽

Test Case ◽

Data Sets ◽

Plate Tectonic ◽

Plate Model ◽

Optimal Arrangement ◽

Evidence Analysis ◽

Better Than

Despite substantial advances in plate tectonic modeling in the last three decades, the postulated position of terranes in the Paleozoic has seldom been validated by faunal data. Fewer studies still have attempted a quantitative approach to distance based on explicit data sets. As a test case, we examine the position of Avalonia in the Ordovician (Arenig, Llanvirn, early Caradoc, and Ashgill) to mid-Silurian (Wenlock) with respect to Laurentia, Baltica, and West Gondwana. Using synoptic lists of 623 trilobite genera and 622 brachiopod genera for these four plates, summarized as Venn diagrams, we have devised proportional indices of mean endemism (ME, normalized by individual plate faunas to eliminate area biogeographic effects) and complementarity (C) for objective paleobiogeographic comparisons. These can discriminate the relative position of Avalonia by assessing the optimal arrangement of inter-centroid distances (measured as great circles) between relevant pairs of continental masses. The proportional indices are used to estimate the “goodness-of-fit” of the faunal data to two widely used dynamic plate tectonic models for these time slices, those of Smith and Rush (1998) and Ross and Scotese (1997). Our faunal data are more consistent with the latter model, which we use to suggest relationships between faunal indices for the five time slices and new rescaled inter-centroid distances between all six plate pairs. We have examined linear and exponential models in relation to continental separation for these indices. For our generic data, the linear model fits distinctly better overall. The fits of indices generated by using independent trilobite and brachiopod lists are mostly similar to each other at each time slice and for a given plate, reflecting a common biogeographic signal; however, the indices vary across the time slices. Combining groups into the same matrix in a “total evidence” analysis performs better still as a measure of distance for mean endemism in the “Scotese” plate model. Four-plate mean endemism performs much better than complementarity as an indicator of pairwise distance for either plate model in the test case.
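The inter-centroid distances mentioned in the abstract are great-circle distances, which can be computed with the haversine formula. This is a minimal sketch: the coordinates are hypothetical placeholders rather than paleo-coordinates from either plate model, and the mean Earth radius of 6371 km is an assumed constant.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in degrees (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the valid asin range.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


# Hypothetical centroids (latitude, longitude) for two continental masses.
centroid_a = (-60.0, 10.0)
centroid_b = (-20.0, -30.0)
print(great_circle_km(*centroid_a, *centroid_b))
```

Pairwise distances computed this way could then be compared against a faunal index such as mean endemism to assess the fit of a plate model.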