Closing the BIG-LID: An Effective Local Intrinsic Dimensionality Defense for Nonlinear Regression Poisoning

Author(s):  
Sandamal Weerasinghe ◽  
Tamas Abraham ◽  
Tansu Alpcan ◽  
Sarah M. Erfani ◽  
Christopher Leckie ◽  
...  

Nonlinear regression, although widely used in engineering, financial, and security applications for automated decision making, is known to be vulnerable to training data poisoning. Targeted poisoning attacks may cause learning algorithms to fit decision functions with poor predictive performance. This paper presents a new analysis of the local intrinsic dimensionality (LID) of nonlinear regression under such poisoning attacks within a Stackelberg game, leading to a practical defense. After adapting to nonlinear settings a gradient-based attack on linear regression that significantly impairs prediction capabilities, we consider a multi-step unsupervised black-box defense. The first step identifies the samples that have the greatest influence on the learner's validation error; we then use the theory of local intrinsic dimensionality, which quantifies how much of an outlier each data sample is, to iteratively identify poisoned samples via a generative probabilistic model and suppress their influence on the prediction function. Empirical validation demonstrates superior performance compared to a range of recent defenses.
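
Concretely, the LID score that this style of defense builds on can be estimated from nearest-neighbor distances with the standard maximum-likelihood (Hill-type) estimator. The sketch below is illustrative only: the neighborhood size k, the synthetic data, and the flagging heuristic in the comments are assumptions, not the paper's actual configuration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lid_mle(X, k=20):
    """Maximum-likelihood LID estimate for every row of X."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dists, _ = nn.kneighbors(X)              # column 0 is the point itself
    r = np.maximum(dists[:, 1:], 1e-12)      # k neighbor distances, zero-guarded
    # LID(x) = -((1/k) * sum_i log(r_i / r_k))^{-1}
    return -1.0 / np.mean(np.log(r / r[:, [-1]]), axis=1)

# Poisoned points tend to sit off the clean data manifold, so unusually high
# LID scores relative to the bulk of the training set flag outlier candidates.
X = np.random.randn(500, 10)
print(lid_mle(X)[:5])
```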

2020 ◽  
Vol 10 (5) ◽  
pp. 1557
Author(s):  
Weijia Feng ◽  
Xiaohui Li

Ultra-dense and highly heterogeneous network (HetNet) deployments make the allocation of limited wireless resources among ubiquitous Internet of Things (IoT) devices an unprecedented challenge in 5G and beyond (B5G) networks. The interactions between mobile users and HetNets remain to be analyzed: mobile users choose the optimal networks to access, while the HetNets adopt appropriate methods for allocating their own network resources. Existing works require complete information to be shared among mobile users and HetNets, which is impractical in realistic settings where important individual information is protected and not made public. This paper proposes a distributed pricing and resource allocation scheme based on a Stackelberg game with incomplete information. The proposed model is more practical because it works even when important information about either the mobile users or the HetNets is difficult to acquire during the resource allocation process. To handle unknown channel gain information, the follower game among users is modeled as an incomplete-information game in which the channel gain is treated as each player's type. Given the pricing strategies of the networks, users adjust their bandwidth-requesting strategies to maximize their expected utility; based on the sub-equilibrium obtained in the follower game, the networks in turn update their pricing strategies to be optimal. The existence and uniqueness of the Bayesian Nash equilibrium are proved. A probabilistic prediction method makes the incomplete-information game feasible, and backward induction is used to obtain the game equilibrium. Simulation results show the superior performance of the proposed method.
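
As a toy illustration of the leader-follower structure (our construction, not the paper's model), the sketch below assumes a simple logarithmic user utility with a type-averaged channel gain, derives each follower's best response in closed form, and lets the leader price against those responses by backward induction. The utility form, gains, and price grid are all hypothetical.

```python
import numpy as np

# Hypothetical follower utility: user i requests bandwidth b to maximize
# E[g_i] * log(1 + b) - p * b, where the channel gain g_i is known to others
# only through its distribution (the player's "type").
expected_gain = np.array([2.0, 3.5, 1.2, 4.0])   # E[g_i] under the type prior

def best_response(p):
    # First-order condition: E[g] / (1 + b) = p  =>  b* = E[g]/p - 1 (clipped).
    return np.maximum(expected_gain / p - 1.0, 0.0)

def leader_revenue(p):
    return p * best_response(p).sum()

# Backward induction: the network anticipates the followers' equilibrium
# responses and searches its pricing strategy against them.
prices = np.linspace(0.1, 5.0, 500)
p_star = prices[np.argmax([leader_revenue(p) for p in prices])]
print(p_star, best_response(p_star))
```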


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5966
Author(s):  
Ke Wang ◽  
Gong Zhang

The challenge of small data has emerged in synthetic aperture radar automatic target recognition (SAR-ATR) problems. Most SAR-ATR methods are data-driven and require large amounts of training data that are expensive to collect. To address this challenge, we propose a recognition model that incorporates meta-learning and amortized variational inference (AVI). Specifically, the model consists of global parameters and task-specific parameters. The global parameters, trained by meta-learning, construct a common feature extractor shared between all recognition tasks. The task-specific parameters, modeled by probability distributions, can adapt to new tasks with a small amount of training data. To reduce computation and storage costs, the task-specific parameters are inferred by AVI implemented with set-to-set functions. Extensive experiments were conducted on a real SAR dataset to evaluate the effectiveness of the model. Compared with the latest SAR-ATR methods, the proposed approach shows superior performance, especially on recognition tasks with limited data.
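
A minimal sketch of the global/task-specific split, assuming PyTorch and toy layer sizes: a shared encoder plays the role of the meta-learned feature extractor, and a crude stand-in for the set-to-set amortization mean-pools support embeddings into the mean and log-variance of per-class weight distributions. None of the names or sizes come from the paper.

```python
import torch
import torch.nn as nn

class AmortizedHead(nn.Module):
    """Infers task-specific classifier weights from a support set."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.to_mu = nn.Linear(feat_dim, feat_dim)
        self.to_logvar = nn.Linear(feat_dim, feat_dim)

    def forward(self, support_feats, support_labels, n_classes):
        mus, logvars = [], []
        for c in range(n_classes):
            pooled = support_feats[support_labels == c].mean(dim=0)
            mus.append(self.to_mu(pooled))
            logvars.append(self.to_logvar(pooled))
        mu, logvar = torch.stack(mus), torch.stack(logvars)
        # Reparameterized sample of per-class weights from q(w | support set).
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 64), nn.ReLU())
head = AmortizedHead()
support = torch.randn(10, 1, 32, 32)          # tiny fake support set
labels = torch.arange(10) % 5                 # 5 classes, 2 shots each
w = head(encoder(support), labels, n_classes=5)
logits = encoder(torch.randn(3, 1, 32, 32)) @ w.t()   # classify queries
print(logits.shape)
```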


2006 ◽  
Vol 18 (10) ◽  
pp. 2509-2528 ◽  
Author(s):  
Yoshua Bengio ◽  
Martin Monperrus ◽  
Hugo Larochelle

We claim and present arguments to the effect that a large class of manifold learning algorithms that are essentially local and can be framed as kernel learning algorithms will suffer from the curse of dimensionality, at the dimension of the true underlying manifold. This observation invites an exploration of nonlocal manifold learning algorithms that attempt to discover shared structure in the tangent planes at different positions. A training criterion for such an algorithm is proposed, and experiments estimating a tangent plane prediction function are presented, showing its advantages with respect to local manifold learning algorithms: it is able to generalize very far from training data (on learning handwritten character image rotations), where local nonparametric methods fail.
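
To make the idea concrete, here is a toy sketch (our construction, not the paper's experiments): a small network maps each point to a predicted tangent basis, and the training criterion penalizes the residual of neighbor offsets after projection onto that predicted plane, so tangent structure is shared non-locally through the network's parameters.

```python
import torch
import torch.nn as nn

# A network predicts a tangent basis F(x) at every point and is trained so
# that offsets to neighboring points are well reconstructed by their
# projection onto span(F(x)).
D, d = 2, 1                                    # ambient / manifold dimension
net = nn.Sequential(nn.Linear(D, 32), nn.Tanh(), nn.Linear(32, D * d))

t = torch.linspace(0, 6.28, 200).unsqueeze(1)
X = torch.cat([t.cos(), t.sin()], dim=1)       # points on a circle
delta = (X.roll(-1, 0) - X).unsqueeze(2)       # offset to a neighbor, (N, D, 1)

opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(200):
    F = net(X).view(-1, D, d)                  # predicted tangent basis
    Ft = F.transpose(1, 2)
    coef = torch.linalg.solve(Ft @ F, Ft @ delta)  # least-squares coefficients
    loss = (delta - F @ coef).pow(2).mean()        # projection residual
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```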


2011 ◽  
Vol 2011 ◽  
pp. 1-28 ◽  
Author(s):  
Zhongqiang Chen ◽  
Zhanyan Liang ◽  
Yuan Zhang ◽  
Zhongrong Chen

Grayware encyclopedias collect known species to provide information for incident analysis; however, their lack of categorization and generalization capability renders them ineffective in the development of defense strategies against clustered strains. A grayware categorization framework is therefore proposed here to not only classify grayware according to diverse taxonomic features but also facilitate evaluation of the risk grayware poses to cyberspace. Armed with Support Vector Machines, the framework builds learning models from training data extracted automatically from grayware encyclopedias and visualizes categorization results with Self-Organizing Maps. The features used in the learning models are selected by information gain, and the high dimensionality of the feature space is reduced by word stemming and stopword removal. The grayware categorizations over diversified features reveal that grayware typically attempts to improve its penetration rate by resorting to multiple installation mechanisms and reduced code footprints. The framework also shows that grayware evades detection by attacking victims' security applications and resists removal by strengthening its hold on infected hosts. Our analysis further points out that species in the categories Spyware and Adware continue to dominate the grayware landscape and pose extremely critical threats to the Internet ecosystem.
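
The learning pipeline described (information-gain feature scoring feeding an SVM) can be sketched with off-the-shelf components. Below, mutual information stands in for information gain, stemming and the Self-Organizing Map visualization are omitted, and the four-document corpus is invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative corpus; the real framework extracts its training data
# automatically from grayware encyclopedias.
docs = ["adware bundle installs toolbar", "spyware logs keystrokes silently",
        "adware shows popup banners", "spyware exfiltrates browser history"]
labels = ["Adware", "Spyware", "Adware", "Spyware"]

model = make_pipeline(
    TfidfVectorizer(stop_words="english"),        # stopword removal
    SelectKBest(mutual_info_classif, k=5),        # feature scoring/selection
    LinearSVC(),                                  # SVM categorizer
)
model.fit(docs, labels)
print(model.predict(["popup toolbar banners"]))
```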


2021 ◽  
Author(s):  
Eva van der Kooij ◽  
Marc Schleiss ◽  
Riccardo Taormina ◽  
Francesco Fioranelli ◽  
Dorien Lugt ◽  
...  

Accurate short-term forecasts, also known as nowcasts, of heavy precipitation are desirable for creating early-warning systems for extreme weather and its consequences, e.g. urban flooding. In this research, we explore the use of machine learning for short-term prediction of heavy rainfall showers in the Netherlands.

We assess the performance of a recurrent, convolutional neural network (TrajGRU) with lead times of 0 to 2 hours. The network is trained on a 13-year archive of radar images with 5-min temporal and 1-km spatial resolution from the precipitation radars of the Royal Netherlands Meteorological Institute (KNMI). We aim to train the model to predict the formation and dissipation of dynamic, heavy, localized rain events, a task for which traditional Lagrangian nowcasting methods still come up short.

We report on different ways to optimize predictive performance for heavy rainfall intensities through several experiments. The large dataset available provides many possible configurations for training. To focus on heavy rainfall intensities, we train on different subsets of this dataset, using different conditions for event selection, varying the ratio of light to heavy precipitation events in the training set, and changing the loss function used to train the model.

To assess the performance of the model, we compare our method to current state-of-the-art Lagrangian nowcasting systems from the pySTEPS library, such as S-PROG, a deterministic approximation of an ensemble-mean forecast. The results of the experiments are used to discuss the pros and cons of machine-learning-based methods for precipitation nowcasting and possible ways to further increase performance.
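
One of the levers mentioned, changing the loss function to emphasize heavy rain, is often implemented as a threshold-weighted MSE; the sketch below follows that balanced-MSE idea with invented thresholds and weights, not the configuration used in this work.

```python
import torch

def balanced_mse(pred, target):
    """MSE whose per-pixel weight grows with the observed rain rate."""
    w = torch.ones_like(target)
    for thresh, weight in [(2.0, 2.0), (5.0, 5.0), (10.0, 10.0), (30.0, 30.0)]:
        w = torch.where(target >= thresh, torch.full_like(target, weight), w)
    return (w * (pred - target) ** 2).mean()

# Fake mm/h radar fields, batch x channel x height x width.
pred = torch.rand(4, 1, 64, 64) * 40
target = torch.rand(4, 1, 64, 64) * 40
print(balanced_mse(pred, target).item())
```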


2021 ◽  
Author(s):  
Konstantinos Slavakis ◽  
Gaurav Shetty ◽  
Loris Cannelli ◽  
Gesualdo Scutari ◽  
Ukash Nakarmi ◽  
...  

This paper introduces a non-parametric kernel-based modeling framework for imputation by regression on data that are assumed to lie close to a smooth manifold, unknown to the user, in a Euclidean space. The proposed framework, coined kernel regression imputation in manifolds (KRIM), needs no training data to operate. Aiming at computationally efficient solutions, KRIM utilizes a small number of "landmark" data points to extract geometric information from the measured data via parsimonious affine combinations ("linear patches"), which mimic the concept of tangent spaces to smooth manifolds and take place in functional approximation spaces, namely reproducing kernel Hilbert spaces (RKHSs). Multiple complex RKHSs are combined in a data-driven way to surmount the obstacle of pinpointing the "optimal" parameters of a single kernel through cross-validation. The extracted geometric information is incorporated into the design via a novel bilinear data-approximation model, and the imputation-by-regression task takes the form of an inverse problem that is solved by an iterative algorithm with guaranteed convergence to a stationary point of the non-convex loss function. To showcase the modular character and wide applicability of KRIM, this paper highlights its application to dynamic magnetic resonance imaging (dMRI), where reconstruction of high-resolution images from severely under-sampled dMRI data is desired. Extensive numerical tests on synthetic and real dMRI data demonstrate the superior performance of KRIM over state-of-the-art approaches under several metrics and with a small computational footprint.
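
As a heavily simplified stand-in for the landmark ingredient (KRIM's bilinear RKHS model is far richer), the sketch below imputes each sample's missing coordinates by a Gaussian-kernel-weighted combination of a handful of landmark points, with similarity computed on the observed coordinates only; all sizes and the kernel width are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))
mask = rng.random(X.shape) > 0.3                      # True = observed entry

landmarks = X[rng.choice(len(X), 20, replace=False)]  # anchor points

def impute(x, obs, sigma=1.0):
    # Kernel weights from the observed coordinates only.
    d2 = ((landmarks[:, obs] - x[obs]) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    x_hat = x.copy()
    x_hat[~obs] = w @ landmarks[:, ~obs] / w.sum()    # fill missing entries
    return x_hat

X_hat = np.array([impute(x, m) for x, m in zip(X, mask)])
print(np.abs((X_hat - X)[~mask]).mean())              # error on missing entries
```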


2021 ◽  
Vol 61 (SA) ◽  
pp. SA1011
Author(s):  
Akira Kusaba ◽  
Tetsuji Kuboyama ◽  
Kilho Shin ◽  
Makoto Sasaki ◽  
Shigeru Inagaki

A new combined use of dynamic mode decomposition algorithms is proposed, suitable for the analysis of spatiotemporal data from experiments with few observation points, unlike computational fluid dynamics with many observation points. The method was applied to data from a plasma turbulence experiment. As a result, we succeeded in constructing a highly accurate model of our training data, and the method improved predictive performance as well. In addition, modal patterns from the longer-term analysis help clarify the underlying mechanism, which is demonstrated in the case of plasma streamer structure. This method is expected to be a powerful tool for the data-driven construction of reduced-order models and predictors in plasma turbulence research, and in nonlinear-dynamics research in other fields of applied physics.
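
For reference, the exact-DMD building block that such combined analyses start from fits a linear one-step operator to time-shifted snapshot pairs; the sketch below uses an invented three-probe toy signal and a full-rank truncation, and does not reproduce the paper's combined algorithm.

```python
import numpy as np

t = np.linspace(0, 10, 400)
data = np.stack([np.sin(2 * t), np.cos(2 * t), np.sin(4 * t)])  # (probes, time)

# Exact DMD: find A with X' ~ A X from snapshot pairs, via a truncated SVD.
X, Xp = data[:, :-1], data[:, 1:]
U, s, Vh = np.linalg.svd(X, full_matrices=False)
r = 3                                               # truncation rank
U, s, Vh = U[:, :r], s[:r], Vh[:r]
A_tilde = U.conj().T @ Xp @ Vh.conj().T / s         # reduced operator
eigvals, W = np.linalg.eig(A_tilde)                 # frequencies/growth rates
modes = Xp @ Vh.conj().T @ np.diag(1 / s) @ W       # DMD modes

# One-step prediction from the last snapshot with the reduced model.
x_next = U @ A_tilde @ U.conj().T @ data[:, -1]
print(eigvals)
```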


2015 ◽  
Vol 3 ◽  
pp. 461-473 ◽  
Author(s):  
Daniel Beck ◽  
Trevor Cohn ◽  
Christian Hardmeier ◽  
Lucia Specia

Structural kernels are a flexible learning paradigm that has been widely used in Natural Language Processing. However, the problem of model selection in kernel-based methods is usually overlooked. Previous approaches mostly rely on setting default values for kernel hyperparameters or using grid search, which is slow and coarse-grained. In contrast, Bayesian methods allow efficient model selection by maximizing the evidence on the training data through gradient-based methods. In this paper we show how to perform this in the context of structural kernels by using Gaussian Processes. Experimental results on tree kernels show that this procedure results in better prediction performance compared to hyperparameter optimization via grid search. The framework proposed in this paper can be adapted to other structures besides trees, e.g., strings and graphs, thereby extending the utility of kernel-based methods.
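
The model-selection mechanism itself is easy to demonstrate: scikit-learn's GP regressor sets kernel hyperparameters by gradient-based maximization of the log marginal likelihood (the evidence). Tree kernels are not available there, so an RBF kernel on toy data stands in; only the selection mechanism, not the structural kernel, is being illustrated.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (40, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(40)

kernel = ConstantKernel() * RBF(length_scale=1.0)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5)
gp.fit(X, y)   # hyperparameters chosen by maximizing the evidence, not grid search
print(gp.kernel_, gp.log_marginal_likelihood_value_)
```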


2020 ◽  
Vol 2020 ◽  
pp. 1-7 ◽  
Author(s):  
Aboubakar Nasser Samatin Njikam ◽  
Huan Zhao

This paper introduces an extremely lightweight (just over two hundred thousand parameters) and computationally efficient CNN architecture, named CharTeC-Net (Character-based Text Classification Network), for character-based text classification problems. This new architecture is composed of four building blocks for feature extraction. Each of these building blocks, except the last one, uses 1 × 1 pointwise convolutional layers to add more nonlinearity to the network and to increase the dimensions within each building block. In addition, shortcut connections are used in each building block to facilitate the flow of gradients through the network and, more importantly, to ensure that the original signal present in the training data is shared across each building block. Experiments on eight standard large-scale text classification and sentiment analysis datasets demonstrate that CharTeC-Net outperforms baseline methods and achieves accuracy competitive with state-of-the-art methods, although it has only between 181,427 and 225,323 parameters and weighs less than 1 megabyte.
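
A sketch of the described building-block pattern, assuming PyTorch: 1 × 1 pointwise convolutions add nonlinearity and widen the channel dimension, while a shortcut carries the block's input forward. Channel counts, the projection on the shortcut, and the input shape are illustrative guesses, not the published architecture.

```python
import torch
import torch.nn as nn

class CharBlock(nn.Module):
    """One building block: pointwise convs plus a shortcut connection."""
    def __init__(self, c_in, c_mid, c_out):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(c_in, c_mid, kernel_size=1), nn.ReLU(),   # widen, nonlinearity
            nn.Conv1d(c_mid, c_out, kernel_size=1), nn.ReLU(),
        )
        self.skip = nn.Conv1d(c_in, c_out, kernel_size=1)       # match channels

    def forward(self, x):
        return self.body(x) + self.skip(x)                      # shortcut

x = torch.randn(2, 70, 128)             # batch, character channels, sequence length
print(CharBlock(70, 96, 96)(x).shape)   # torch.Size([2, 96, 128])
```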


2020 ◽  
Vol 8 ◽  
Author(s):  
Daniel Claudino ◽  
Jerimiah Wright ◽  
Alexander J. McCaskey ◽  
Travis S. Humble

By design, the variational quantum eigensolver (VQE) strives to recover the lowest-energy eigenvalue of a given Hamiltonian by preparing quantum states guided by the variational principle. In practice, the prepared quantum state is indirectly assessed by the value of the associated energy. Novel adaptive derivative-assembled pseudo-Trotter (ADAPT) ansatz approaches and recent formal advances now establish a clear connection between the theory of quantum chemistry and the quantum state ansatz used to solve the electronic structure problem. Here we benchmark the accuracy of VQE and ADAPT-VQE in calculating the electronic ground states and potential energy curves for a few selected diatomic molecules, namely H2, NaH, and KH. Using numerical simulation, we find that both methods provide good estimates of the energy and ground state, but only ADAPT-VQE proves robust to particularities of the optimization methods. Another relevant finding is that gradient-based optimization is overall more economical and delivers better performance than analogous simulations carried out with gradient-free optimizers. The results also identify small errors in the prepared-state fidelity, which show an increasing trend with molecular size.
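
To show the variational loop in miniature (a toy, not the paper's simulations): a one-parameter Ry ansatz is optimized with a gradient-based optimizer until the expected energy of an invented one-qubit Hamiltonian reaches its lowest eigenvalue.

```python
import numpy as np
from scipy.optimize import minimize

# Toy one-qubit Hamiltonian (illustrative, not a molecular Hamiltonian).
H = np.array([[1.0, 0.5],
              [0.5, -1.0]])

def energy(theta):
    # Ansatz |psi(theta)> = Ry(theta)|0> = [cos(theta/2), sin(theta/2)].
    psi = np.array([np.cos(theta[0] / 2), np.sin(theta[0] / 2)])
    return psi @ H @ psi          # variational energy <psi|H|psi>

res = minimize(energy, x0=[0.1], method="BFGS")   # gradient-based optimizer
print(res.fun, np.linalg.eigvalsh(H)[0])          # VQE energy vs exact ground state
```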

