On the Parameterized Complexity of Clustering Incomplete Data into Subspaces of Small Rank

2020 ◽  
Vol 34 (04) ◽  
pp. 3906-3913
Author(s):  
Robert Ganian ◽  
Iyad Kanj ◽  
Sebastian Ordyniak ◽  
Stefan Szeider

We consider a fundamental matrix completion problem where we are given an incomplete matrix and a set of constraints modeled as a CSP instance. The goal is to complete the matrix subject to the input constraints and in such a way that the complete matrix can be clustered into few subspaces with low rank. This problem generalizes several problems in data mining and machine learning, including the problem of completing a matrix into one with minimum rank. In addition to its ubiquitous applications in machine learning, the problem has strong connections to information theory, related to binary linear codes, and variants of it have been extensively studied from that perspective. We formalize the problem mentioned above and study its classical and parameterized complexity. We draw a detailed landscape of the complexity and parameterized complexity of the problem with respect to several natural parameters that are desirably small and with respect to several well-studied CSP fragments.

2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Wendong Wang ◽  
Jianjun Wang

In this paper, we propose a new method for the matrix completion problem. Unlike most existing matrix completion methods, which pursue only the low rank of the underlying matrices, the proposed method simultaneously optimizes their low rank and smoothness, so that the two priors reinforce each other and yield better performance. In particular, with the introduction of a modified second-order total variation, the proposed method is very competitive even with recent matrix completion methods that also combine low-rank and smoothness priors. An efficient algorithm is developed to solve the induced optimization problem. Extensive experiments further confirm the superior performance of the proposed method over many state-of-the-art methods.


2013 ◽  
Vol 756-759 ◽  
pp. 3977-3981 ◽  
Author(s):  
Hua Xing Yu ◽  
Xiao Fei Zhang ◽  
Jian Feng Li ◽  
De Ben

In this paper, we address the angle estimation problem in a linear array with some ill sensors (partially-well sensors), which work well only at random times. The output of the array will therefore miss some values, and recovering it can be regarded as a low-rank matrix completion problem because the number of sources is smaller than the total number of sensors. The output of the array, which is corrupted by the missing values and by noise, can be completed via the OptSpace method, and the angles can then be estimated from the completed output. The proposed algorithm works well for an array with some ill sensors; moreover, it is suitable for non-uniform linear arrays. Simulation results illustrate the performance of the algorithm.
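As an illustration of what "low-rank matrix completion" means in the abstract above, here is a minimal sketch in plain Python. It is not the OptSpace method named in the abstract; it uses rank-1 alternating least squares on a small noiseless toy matrix instead. The function name `complete_rank_one`, the toy data, and the number of sweeps are assumptions made for this example.

```python
def complete_rank_one(observed, n_rows, n_cols, sweeps=100):
    """Fit M ~ a b^T to the observed entries by alternating least squares.

    `observed` maps (row, col) -> value. Returns the full rank-1 estimate
    as a list of rows. Illustrative only: assumes noiseless rank-1 data.
    """
    a = [1.0] * n_rows
    b = [1.0] * n_cols
    for _ in range(sweeps):
        # Update each row factor with the column factors held fixed.
        for i in range(n_rows):
            num = sum(v * b[j] for (r, j), v in observed.items() if r == i)
            den = sum(b[j] ** 2 for (r, j) in observed if r == i)
            if den > 0:
                a[i] = num / den
        # Update each column factor with the row factors held fixed.
        for j in range(n_cols):
            num = sum(v * a[i] for (i, c), v in observed.items() if c == j)
            den = sum(a[i] ** 2 for (i, c) in observed if c == j)
            if den > 0:
                b[j] = num / den
    return [[a[i] * b[j] for j in range(n_cols)] for i in range(n_rows)]


# Toy rank-1 matrix [1, 2, 3]^T [1, 2, 4] with three entries hidden:
# (0, 2) = 4, (1, 0) = 2 and (2, 1) = 6 are not observed.
observed = {
    (0, 0): 1.0, (0, 1): 2.0,
    (1, 1): 4.0, (1, 2): 8.0,
    (2, 0): 3.0, (2, 2): 12.0,
}
estimate = complete_rank_one(observed, n_rows=3, n_cols=3)
```

In this toy case the observed entries link every row and column, so the hidden entries are determined by the rank-1 assumption. With noisy data or higher rank, a dedicated solver would be the more appropriate choice.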


2017 ◽  
Vol 21 (2) ◽  
Author(s):  
Tatiana Gelvez ◽  
Hoover Rueda ◽  
Henry Arguello

Spectral imaging aims to capture and process a 3-dimensional spectral image with a large amount of spectral information for each spatial location. Compressive spectral imaging (CSI) techniques increase the sensing speed and reduce the amount of collected data compared to traditional spectral imaging methods. The coded aperture snapshot spectral imager (CASSI) is an optical architecture that senses a spectral image in a single 2D coded projection by applying CSI. Typically, the 3D scene is recovered by solving an L1-based optimization problem that assumes the scene is sparse in some known orthonormal basis. In contrast, the matrix completion (MC) technique recovers the scene without such prior knowledge. MC reconstruction algorithms rely on a low-rank structure of the scene. Moreover, the CASSI system uses coded aperture patterns that determine the quality of the estimated scene. Therefore, this paper proposes the design of an optimal coded aperture set for the MC methodology. The designed set is attained by maximizing the distance between the translucent elements in the coded aperture. Visualization of the recovered spectral signals and simulations over different databases show that the designed coded aperture set yields an average improvement of 1-3 dB over the complementary coded aperture set and of 3-9 dB over the conventional random coded aperture set.


Author(s):  
Andrew D McRae ◽  
Mark A Davenport

This paper considers the problem of estimating a low-rank matrix from the observation of all or a subset of its entries in the presence of Poisson noise. When we observe all entries, this is a problem of matrix denoising; when we observe only a subset of the entries, this is a problem of matrix completion. In both cases, we exploit an assumption that the underlying matrix is low-rank. Specifically, we analyse several estimators, including a constrained nuclear-norm minimization program, nuclear-norm regularized least squares and a non-convex constrained low-rank optimization problem. We show that for all three estimators, with high probability, we have an upper error bound (in the Frobenius norm error metric) that depends on the matrix rank, the fraction of the elements observed and the maximal row and column sums of the true matrix. We furthermore show that the above results are minimax optimal (within a universal constant) in classes of matrices with low-rank and bounded row and column sums. We also extend these results to handle the case of matrix multinomial denoising and completion.


2020 ◽  
Author(s):  
Aanchal Mongia ◽  
Emilie Chouzenoux ◽  
Angshul Majumdar

Motivation: Investigation of existing drugs is an effective alternative to the discovery of new drugs for treating diseases. This task of drug re-positioning can be assisted by various kinds of computational methods that predict the best indication for a drug from open-source biological datasets. Because similar drugs tend to have common pathways and disease indications, the association matrix is assumed to have a low-rank structure. Hence, the problem of drug-disease association prediction can be modelled as a low-rank matrix completion problem.

Results: In this work, we propose a novel matrix completion framework, graph regularized 1-bit matrix completion (GR1BMC), which uses the side information associated with drugs and diseases, modelled as a neighborhood graph, to predict drug-disease indications. The algorithm is specially designed for binary data and uses a parallel proximal algorithm to solve the resulting minimization problem, taking into account all the constraints, including the incorporation of the neighborhood graph and the restriction of predicted scores to the specified range. The results of the proposed algorithm have been validated on two standard drug-disease association databases (Fdataset and Cdataset) by evaluating the AUC across 10-fold cross-validation splits. The method is also evaluated through a case study in which the top 5 indications are predicted for novel drugs and diseases and then verified against the CTD database. The results of these experiments demonstrate the practical usefulness and superiority of the proposed approach over the benchmark methods.


2017 ◽  
Vol 5 (1) ◽  
pp. 73-81
Author(s):  
Konstantin Fackeldey ◽  
Amir Niknejad ◽  
Marcus Weber

In order to fully characterize the state-transition behaviour of finite Markov chains, one needs to provide the corresponding transition matrix P. In many applications such as molecular simulation and drug design, the entries of the transition matrix P are estimated by generating realizations of the Markov chain and determining the one-step conditional probability Pij for a transition from state i to state j. This sampling can be computationally very demanding, so it is worthwhile to reduce the sampling effort. The main purpose of this paper is to design a sampling strategy that provides a partial sampling of only a subset of the rows of such a matrix P. Our proposed approach is well suited to stochastic processes stemming from the simulation of molecular systems or random walks on graphs, and it differs from matrix completion approaches, which try to approximate the transition matrix by using a low-rank assumption. It will be shown how Markov chains can be analyzed on the basis of a partial sampling. More precisely: first, we will estimate the stationary distribution from a partially given matrix P. Second, we will estimate the infinitesimal generator Q of P on the basis of this stationary distribution. Third, from the generator we will compute the leading invariant subspace, which should be identical to the leading invariant subspace of P. Fourth, we will apply Robust Perron Cluster Analysis (PCCA+) in order to identify metastabilities using this subspace.
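For readers unfamiliar with the stationary distribution mentioned in the first step, the sketch below computes it for a fully known transition matrix by power iteration. It is a textbook baseline, not the paper's partial-sampling estimator, and the function name `stationary_distribution`, the 2-state example matrix, and the iteration count are assumptions made for this illustration.

```python
def stationary_distribution(P, iterations=200):
    """Approximate pi with pi = pi P for a row-stochastic matrix P.

    Uses plain power iteration from the uniform distribution. This assumes
    the chain is irreducible and aperiodic so that the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        # One step of the chain applied to the current distribution.
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Renormalize to guard against floating-point drift.
        total = sum(pi)
        pi = [p / total for p in pi]
    return pi


# Hypothetical 2-state chain; its exact stationary distribution is (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
```

The exact answer for this example follows from the balance condition 0.1 * pi[0] = 0.5 * pi[1] together with pi[0] + pi[1] = 1.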


Author(s):  
Antonio Agudo ◽  
Vincent Lepetit ◽  
Francesc Moreno-Noguer

Given an unordered list of 2D or 3D point trajectories corrupted by noise and partial observations, in this paper we introduce a framework to simultaneously recover the incomplete motion tracks and group the points into spatially and temporally coherent clusters. This advances existing work, which addresses only partial problems and does not consider a unified, unsupervised solution. We cast this problem as a matrix completion one, in which point tracks are arranged into a matrix with the missing entries set to zero. In order to perform the double clustering, the measurement matrix is assumed to be drawn from a dual union of spatiotemporal subspaces. The bases and the dimensionality for these subspaces, the affinity matrices used to encode the temporal and spatial clusters to which each point belongs, and the non-visible tracks are then jointly estimated via augmented Lagrange multipliers in polynomial time. A thorough evaluation on incomplete motion tracks for multiple-object typologies shows that the accuracy of the matrix we recover compares favorably to that obtained with existing low-rank matrix completion methods, especially under noisy measurements. In addition to recovering the incomplete tracks, the method directly groups the point trajectories into different object instances and automatically discovers a number of semantically meaningful temporal primitive actions.


Author(s):  
Jean Walrand

Online learning algorithms update their estimates as additional observations are made. Section 12.1 explains a simple example: online linear regression. The stochastic gradient projection algorithm is a general technique to update estimates based on additional observations; it is widely used in machine learning. Section 12.2 presents the theory behind that algorithm. When analyzing large amounts of data, one faces the problems of identifying the most relevant data and of using the available data efficiently. Section 12.3 explains three examples of how these questions are addressed: the LASSO algorithm, compressed sensing, and the matrix completion problem. Section 12.4 discusses deep neural networks, for which the stochastic gradient projection algorithm is easy to implement.
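The idea of online linear regression with a projected stochastic gradient update can be sketched in a few lines of plain Python. This is a minimal illustration, not the chapter's own formulation: the squared-error loss, the constant step size, the L2-ball constraint set, and the names `project_l2_ball` and `online_linear_regression` are all assumptions, and the data are synthetic and noiseless.

```python
import math
import random


def project_l2_ball(theta, radius):
    """Project theta onto the Euclidean ball of the given radius."""
    norm = math.sqrt(sum(t * t for t in theta))
    if norm <= radius:
        return theta
    return [t * radius / norm for t in theta]


def online_linear_regression(stream, dim, step=0.1, radius=10.0):
    """Process (x, y) pairs one at a time and return the final estimate."""
    theta = [0.0] * dim
    for x, y in stream:
        # Prediction error on the newly observed sample.
        err = sum(t * xi for t, xi in zip(theta, x)) - y
        # Gradient step on the loss 0.5 * err**2, whose gradient is err * x.
        theta = [t - step * err * xi for t, xi in zip(theta, x)]
        # Projection keeps the estimate inside the constraint set.
        theta = project_l2_ball(theta, radius)
    return theta


# Synthetic noiseless stream from y = 2 x + 1, with a constant feature
# appended so that the intercept is learned as a second weight.
rng = random.Random(0)
samples = []
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0)
    samples.append(([x, 1.0], 2.0 * x + 1.0))

theta_hat = online_linear_regression(samples, dim=2)
```

Because each update uses only the newest sample, the estimate can be refreshed as data arrive without storing or refitting the full dataset. With noisy data, a decreasing step size is the usual choice.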

