Computing Robust Principal Components by A* Search

2018 ◽  
Vol 27 (07) ◽  
pp. 1860013 ◽  
Author(s):  
Swair Shah ◽  
Baokun He ◽  
Crystal Maung ◽  
Haim Schweitzer

Principal Component Analysis (PCA) is a classical dimensionality reduction technique that computes a low-rank representation of the data. Recent studies have shown how to compute this low-rank representation from most of the data, excluding a small amount of outlier data. We show how to convert this problem into graph search, and describe an algorithm that solves it optimally by applying a variant of the A* algorithm to search for the outliers. The results obtained by our algorithm are optimal in terms of accuracy, and are more accurate than those of the current state-of-the-art algorithms, which we show are not optimal. This comes at the cost of running time, which is typically slower than the current state of the art. We also describe a related variant of the A* algorithm that runs much faster than the optimal variant and produces a solution that is guaranteed to be near-optimal. This variant is shown experimentally to be more accurate than the current state of the art, with a comparable running time.
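The search space can be sketched concretely: a state is the set of rows discarded so far, and the cost of a goal state is the PCA residual of the remaining rows. A minimal brute-force enumeration of that space (the admissible heuristic that lets A* prune it is the paper's contribution and is omitted here) might look like:

```python
from itertools import combinations

import numpy as np

def pca_residual(X, r):
    # Energy left over after the best rank-r approximation:
    # the sum of the trailing squared singular values.
    s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
    return float(np.sum(s[r:] ** 2))

def optimal_outliers(X, r, k):
    # Enumerate every way to discard k rows and keep the best.
    # A* searches this same space but prunes it with an admissible
    # heuristic; exhaustive enumeration is only viable for tiny data.
    candidates = combinations(range(X.shape[0]), k)
    best = min(candidates,
               key=lambda out: pca_residual(np.delete(X, out, axis=0), r))
    return best, pca_residual(np.delete(X, best, axis=0), r)
```

On a toy set of collinear points plus one gross outlier, the search discards exactly the outlier, leaving a near-zero rank-1 residual.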

2021 ◽  
Vol 15 (6) ◽  
pp. 1-27
Author(s):  
Marco Bressan ◽  
Stefano Leucci ◽  
Alessandro Panconesi

We address the problem of computing the distribution of induced connected subgraphs, also known as graphlets or motifs, in large graphs. The current state-of-the-art algorithms estimate the motif counts via uniform sampling by leveraging the color coding technique of Alon, Yuster, and Zwick. In this work, we extend the applicability of this approach by introducing a set of algorithmic optimizations and techniques that reduce the running time and space usage of color coding and improve the accuracy of the counts. To this end, we first show how to optimize color coding to efficiently build a compact table of a representative subsample of all graphlets in the input graph. For 8-node motifs, we can build such a table in one hour for a graph with 65M nodes and 1.8B edges, far larger than what the state of the art can handle. We then introduce a novel adaptive sampling scheme that breaks the “additive error barrier” of uniform sampling, guaranteeing multiplicative approximations instead of just additive ones. This allows us to count not only the most frequent motifs, but also extremely rare ones. For instance, on one graph we accurately count nearly 10,000 distinct 8-node motifs whose relative frequency is so small that uniform sampling would literally take centuries to find them. Our results show that color coding is still the most promising approach to scalable motif counting.
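The color coding primitive itself is compact. A single trial for the simplest motif shape, simple paths on k vertices, can be sketched as below; the coloring, the dynamic program over (endpoint, color set), and the k!/k^k rescaling argument follow Alon, Yuster, and Zwick, while the table compaction and adaptive sampling of the paper are omitted:

```python
import random

def colorful_path_count(adj, k, colors=None, seed=0):
    """One color-coding trial: color vertices with k colors, then count
    k-vertex simple paths whose vertices receive pairwise distinct colors,
    via DP over (endpoint, set of colors used). A fixed path is colorful
    with probability k!/k**k, so averaging over trials and rescaling by
    the inverse of that probability estimates the total path count."""
    n = len(adj)
    if colors is None:
        rng = random.Random(seed)
        colors = [rng.randrange(k) for _ in range(n)]
    # dp maps (endpoint v, frozenset of colors) -> number of colorful paths
    dp = {(v, frozenset([colors[v]])): 1 for v in range(n)}
    for _ in range(k - 1):
        nxt = {}
        for (v, used), cnt in dp.items():
            for u in adj[v]:
                if colors[u] not in used:  # extend only if still colorful
                    key = (u, used | {colors[u]})
                    nxt[key] = nxt.get(key, 0) + cnt
        dp = nxt
    # every undirected path is found once from each of its two endpoints
    return sum(dp.values()) // 2
```

With a fixed coloring on the path graph 0-1-2-3, the trial counts exactly the paths that happen to be colorful; randomizing the colors over many trials turns this into an unbiased estimator.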


2019 ◽  
Vol 11 (8) ◽  
pp. 911 ◽  
Author(s):  
Yong Ma ◽  
Qiwen Jin ◽  
Xiaoguang Mei ◽  
Xiaobing Dai ◽  
Fan Fan ◽  
...  

The Gaussian mixture model (GMM) has been one of the most representative models for hyperspectral unmixing that accounts for endmember variability. However, GMM unmixing models only place smoothness and sparsity prior constraints on the abundances and thus do not take possible local spatial correlation into account. For pixels that lie on the boundaries of different materials or in inhomogeneous regions, the abundances of neighboring pixels do not obey those prior constraints. We therefore propose a novel GMM unmixing method based on superpixel segmentation (SS) and low-rank representation (LRR), called GMM-SS-LRR. We apply SS to the first principal component of the HSI to obtain homogeneous regions. The HSI to be unmixed is thus partitioned into regions where the abundance coefficients have an underlying low-rank property. Then, to further exploit the spatial data structure, we formulate the unmixing problem with a GMM under the Bayesian framework, incorporate the low-rank property into the objective function as prior knowledge, and use generalized expectation maximization to solve the objective function. Experiments on synthetic datasets and real HSIs demonstrate that the proposed GMM-SS-LRR is effective compared with other current popular methods.


10.29007/73n4 ◽  
2018 ◽  
Author(s):  
Martin Aigner ◽  
Armin Biere ◽  
Christoph Kirsch ◽  
Aina Niemetz ◽  
Mathias Preiner

Effectively parallelizing SAT solving is an open and important issue. The current state of the art is based on parallel portfolios. This technique relies on running multiple solvers on the same instance in parallel; as soon as one instance finishes, the entire run stops. Several successful systems even use a plain parallel portfolio (PPP), where the individual solvers do not exchange any information. This paper contains a thorough experimental evaluation which shows that PPP can improve wall-clock running time because memory access is still local, and the memory system can hide the latency of memory access. In particular, there does not seem to be as much cache congestion as one might imagine. We also present some limits on the scalability of PPP. Thus this paper gives one argument why PPP solvers are a good fit for today's multi-core architectures.
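The PPP scheme itself is simple to state in code: launch one independently configured solver per worker and return as soon as the first one finishes, with no clause or information exchange. A minimal sketch with a toy enumeration solver (the solver and its "configurations" are stand-ins, not a real SAT engine):

```python
import concurrent.futures as cf
from itertools import product

def toy_solver(clauses, variables):
    """Toy complete SAT solver: enumerate assignments over `variables`
    in the given order. Clauses are lists of signed ints, DIMACS-style.
    The variable order stands in for a solver configuration."""
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assignment
    return None

def plain_parallel_portfolio(clauses, orders):
    # One solver per configuration, no information exchange; the run is
    # over as soon as the first solver returns. (A real PPP system would
    # also terminate the losers; this sketch just stops waiting for them.)
    with cf.ThreadPoolExecutor(max_workers=len(orders)) as pool:
        futures = [pool.submit(toy_solver, clauses, order) for order in orders]
        done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        return next(iter(done)).result()
```

The memory-locality argument of the paper is about exactly this setup: each worker touches only its own solver's data structures, so caches stay warm per core.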


2019 ◽  
Vol 64 ◽  
pp. 197-242 ◽  
Author(s):  
Peta Masters ◽  
Sebastian Sardina

Goal recognition is the problem of determining an agent's intent by observing her behaviour. Contemporary solutions for general task-planning relate the probability of a goal to the cost of reaching it. We adapt this approach to goal recognition in the strict context of path-planning. We show (1) that a simpler formula provides an identical result to the current state of the art in less than half the time under all but one set of conditions. Further, we prove (2) that the probability distribution based on this technique is independent of an agent's past behaviour, and present a revised formula that achieves goal recognition by reference to the agent's starting point and current location only. Building on this, we demonstrate (3) that a Radius of Maximum Probability (i.e., the distance from a goal within which that goal is guaranteed to be the most probable) can be calculated from the relative cost-distances between the candidate goals and a start location, without needing to calculate any actual probabilities. In this extended version of earlier work, we generalise our framework to the continuous domain and discuss our results, including the conditions under which our findings can be generalised back to goal recognition in general task-planning.
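The cost-based idea can be sketched in a few lines: score each candidate goal by the detour its pursuit would imply given only the start and the current location, then normalize. The Manhattan metric and the soft-max temperature beta below are illustrative choices, not the paper's formula:

```python
import math

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def goal_posterior(start, current, goals, beta=1.0, dist=manhattan):
    """Cost-difference goal recognition on a grid: score each candidate
    goal by how much the observed route (start -> current -> goal) would
    exceed the direct cost (start -> goal), then soft-max the scores.
    Only the start and the current location are used, mirroring the
    independence-from-past-behaviour result."""
    deltas = [dist(start, current) + dist(current, g) - dist(start, g)
              for g in goals]
    weights = [math.exp(-beta * d) for d in deltas]
    total = sum(weights)
    return [w / total for w in weights]
```

An agent that has walked from (0,0) to (5,0) has taken no detour with respect to a goal at (10,0) but a large one with respect to (0,10), so the posterior concentrates on the former.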


Author(s):  
Syeda Warda Zahra

In this review, we summarize the current “state of the art” of carbapenem antibiotics and their role in our antimicrobial armamentarium. Among the beta-lactams currently available, carbapenems are unique because they are relatively resistant to hydrolysis by most beta-lactamases. Herein, we describe the cost-effectiveness, safety, and advantages of carbapenems compared to other antibiotics. We also highlight important features of the carbapenems presently in clinical use: imipenem-cilastatin, meropenem, ertapenem, doripenem, panipenem-betamipron, and biapenem. In closing, we emphasize some major challenges related to the oral formulation of carbapenems and different strategies to overcome these challenges.


2021 ◽  
Vol 2022 (1) ◽  
pp. 148-165
Author(s):  
Thomas Cilloni ◽  
Wei Wang ◽  
Charles Walter ◽  
Charles Fleming

Facial recognition tools are becoming exceptionally accurate at identifying people from images. However, this comes at the cost of privacy for users of online services with photo management (e.g. social media platforms). Particularly troubling is the ability to leverage unsupervised learning to recognize faces even when the user has not labeled their images. In this paper we propose Ulixes, a strategy to generate visually non-invasive facial noise masks that yield adversarial examples, preventing the formation of identifiable user clusters in the embedding space of facial encoders. This is applicable even when a user is unmasked and labeled images are available online. We demonstrate the effectiveness of Ulixes by showing that various classification and clustering methods cannot reliably label the adversarial examples we generate. We also study the effects of Ulixes in various black-box settings and compare it to the current state of the art in adversarial machine learning. Finally, we challenge the effectiveness of Ulixes against adversarially trained models and show that it is robust to countermeasures.
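The embedding-space objective can be illustrated with a PGD-style sketch: perturb an image vector, within a small L-infinity ball, so that its embedding moves away from the user's cluster centre. Ulixes targets deep facial encoders; the linear encoder W and the hyper-parameters below are stand-ins for illustration, not the paper's construction:

```python
import numpy as np

def cloak(x, W, center, eps=0.1, lr=0.05, steps=20):
    """Push the embedding W @ x of image vector x away from the user's
    cluster centre, keeping the perturbation inside an L-infinity ball of
    radius eps so the change stays visually non-invasive."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        # Gradient of ||W(x + delta) - center||^2 with respect to delta.
        grad = 2.0 * W.T @ (W @ (x + delta) - center)
        # Ascend (push the embedding away), then project back into the ball.
        delta = np.clip(delta + lr * np.sign(grad), -eps, eps)
    return x + delta
```

After the attack, the perturbation respects the budget while the embedding has moved strictly farther from the cluster centre, which is what frustrates clustering-based identification.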


2020 ◽  
Vol 2020 (3) ◽  
pp. 42-61
Author(s):  
Hayim Shaul ◽  
Dan Feldman ◽  
Daniela Rus

The k-nearest neighbors (kNN) classifier predicts the class of a query, q, by taking the majority class of its k neighbors in an existing (already classified) database, S. In secure kNN, q and S are owned by two different parties, and q is classified without sharing data. In this work we present a classifier based on kNN that is more efficient to implement with homomorphic encryption (HE). The efficiency of our classifier comes from a relaxation: we consider the κ nearest neighbors for κ ≈ k, with a probability that increases as the statistical distance between a Gaussian and the distribution of the distances from q to S decreases. We call our classifier k-ish Nearest Neighbors (k-ish NN). For the implementation we introduce a double-blinded coin toss, where the bias and output of the toss are encrypted. We use it to approximate the average and variance of the distances from q to S in a scalable circuit whose depth is independent of |S|. We believe these to be of independent interest. We implemented our classifier in an open-source library based on HElib and tested it on a breast tumor database. Our classifier has accuracy and running time comparable to the current state-of-the-art (non-HE) MPC solutions, which have better running time but worse communication complexity. It also has communication complexity similar to naive HE implementations, which have worse running time.
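The relaxation is easy to state in the clear: instead of selecting the exact k nearest points (expensive under HE), model the distances from the query as roughly Gaussian, pick the threshold whose Gaussian CDF equals k/n, and majority-vote over everything closer than it; the number of voters is then only approximately k. A plaintext sketch (the paper estimates the mean and variance under encryption via the double-blinded coin toss, which is not reproduced here):

```python
from collections import Counter
from statistics import NormalDist, fmean, pstdev

def k_ish_nn(query, data, labels, k):
    """k-ish Nearest Neighbors in the clear: threshold the distances at
    the k/n quantile of a Gaussian fitted to them, then take the majority
    label among the roughly-k points below the threshold."""
    dists = [sum((a - b) ** 2 for a, b in zip(query, x)) ** 0.5
             for x in data]
    mu, sigma = fmean(dists), pstdev(dists)
    t = NormalDist(mu, sigma).inv_cdf(k / len(data))
    votes = [lab for d, lab in zip(dists, labels) if d <= t]
    return Counter(votes).most_common(1)[0][0] if votes else None
```

Because only the mean and variance of the distances are needed, the corresponding circuit depth does not grow with |S|, which is the point of the relaxation.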


Author(s):  
Kan Xie ◽  
Wei Liu ◽  
Yue Lai ◽  
Weijun Li

Subspace learning has been widely used to extract discriminative features for classification tasks such as face recognition, even when facial images are occluded or corrupted. However, the performance of most existing methods degrades significantly when the data are contaminated with severe noise, especially when the magnitude of the gross corruption can be arbitrarily large. To this end, in this paper, a novel discriminative subspace learning method is proposed based on the well-known low-rank representation (LRR). Specifically, a discriminant low-rank representation and the projecting subspace are learned simultaneously, in a supervised way. To avoid deviating from the original solution through relaxation, we adopt the Schatten [Formula: see text]-norm and [Formula: see text]-norm instead of the nuclear norm and [Formula: see text]-norm, respectively. Experimental results on two well-known databases, i.e. PIE and ORL, demonstrate that the proposed method achieves better classification scores than the state-of-the-art approaches.


Author(s):  
Aritra Dutta ◽  
Filip Hanzely ◽  
Peter Richtárik

Robust principal component analysis (RPCA) is a well-studied problem whose goal is to decompose a matrix into the sum of low-rank and sparse components. In this paper, we propose a nonconvex feasibility reformulation of the RPCA problem and apply an alternating projection method to solve it. To the best of our knowledge, this is the first paper proposing a method that solves the RPCA problem without considering any objective function, convex relaxation, or surrogate convex constraints. We demonstrate through extensive numerical experiments on a variety of applications, including shadow removal, background estimation, face detection, and galaxy evolution, that our approach matches and often significantly outperforms the current state of the art.


2021 ◽  
pp. 1-15
Author(s):  
Zhixuan Xu ◽  
Caikou Chen ◽  
Guojiang Han ◽  
Jun Gao

As a successful improvement on Low Rank Representation (LRR), Latent Low Rank Representation (LatLRR) has been one of the state-of-the-art models for subspace clustering, owing to its ability to discover the low-dimensional subspace structures of data, especially when the data samples are insufficient and/or extremely corrupted. However, LatLRR does not consider the nonlinear geometric structures within the data, which leads to the loss of locality information in the learning phase. Moreover, the coefficients of the learnt representation matrix can be negative, which lacks interpretability. To address these drawbacks, this paper introduces Laplacian, sparsity and non-negativity constraints into the LatLRR model and proposes a novel subspace clustering method, termed latent low-rank representation with non-negative, sparse and Laplacian constraints (NNSLLatLRR), in which we jointly take into account the non-negativity, sparsity and Laplacian properties of the learnt representation. As a result, NNSLLatLRR can not only capture the global low-dimensional structure and intrinsic nonlinear geometric information of the data, but also enhance the interpretability of the learnt representation. Extensive experiments on two face benchmark datasets and a handwritten digit dataset show that our proposed method outperforms existing state-of-the-art subspace clustering methods.
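The Laplacian ingredient that restores locality can be shown in isolation: build a k-nearest-neighbour graph over the samples and penalize tr(Z L Zᵀ), which is small exactly when neighbouring samples receive similar representation columns. A sketch of that regularizer alone (binary symmetrised affinities are an illustrative choice; heat-kernel weights are equally common, and the full NNSLLatLRR optimization is not reproduced here):

```python
import numpy as np

def knn_laplacian(X, k=2):
    """Graph Laplacian L = D - W of a k-nearest-neighbour affinity graph
    over the samples (rows of X), with binary symmetrised affinities."""
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(sq[i])[1:k + 1]:  # skip the point itself
            W[i, j] = 1.0
    W = np.maximum(W, W.T)  # symmetrise
    return np.diag(W.sum(axis=1)) - W

def smoothness(Z, L):
    # tr(Z L Z^T) = 0.5 * sum_ij W_ij * ||z_i - z_j||^2, z_i = columns of Z
    return float(np.trace(Z @ L @ Z.T))
```

Identical representation columns incur zero penalty, and the penalty is never negative, which is why adding this term pulls the learnt codes toward the data manifold without fighting the other constraints.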

