GNS: Forge High Anonymity Graph by Nonlinear Scaling Spectrum

Generating random graphs that retain specific structural properties of real graphs is important for anonymizing graphs and for building targeted graph data sets. The state-of-the-art method, spectral graph forge (SGF), was proposed at INFOCOM 2018. It computes a low-rank approximation of the matrix by discarding part of its spectrum, which provides privacy protection for released graphs while preserving data availability to a certain extent. As shown for SGF, at least 20% of the spectrum must be discarded to defend against de-anonymization attacks. However, data availability decreases significantly as more of the spectrum is discarded. Is there a way to generate a graph that retains as much of the spectrum as possible and guarantees anonymity at the same time? To answer this question, this paper proposes graph nonlinear scaling (GNS). We prove that GNS preserves all eigenvectors while providing high anonymity for the forged graph. Specifically, GNS scales the eigenvalues of the original spectrum and constructs the forged graph from the scaled eigenvalues and the original eigenvectors. This approach maximizes the preservation of spectral information to guarantee data availability, and it is highly robust against de-anonymization attacks. The experimental results show that when SGF discards only 10% of the spectrum, the forged graph has high data availability, but a distance-vector de-anonymization attack on it can identify almost 100% of the nodes. At the same availability, only about 20% of the nodes in the forged graph produced by GNS can be identified. Moreover, our method captures the real graph's structure better than SGF in terms of modularity, the number of partitions, and average clustering.
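The contrast between discarding part of the spectrum and rescaling all of it can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the abstract does not give the scaling function or the rounding step, so the sign-preserving power `sign(λ)|λ|^alpha`, the 0.5 threshold, and all function names below are hypothetical choices, not the authors' method.

```python
import numpy as np

def sgf_style_low_rank(adj, keep_fraction):
    """Low-rank approximation: keep only the largest-magnitude eigenpairs."""
    vals, vecs = np.linalg.eigh(adj)  # adj is assumed symmetric
    k = max(1, int(round(keep_fraction * len(vals))))
    idx = np.argsort(np.abs(vals))[::-1][:k]
    return (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T

def scaled_spectrum(adj, alpha=0.8):
    """Keep every eigenvector; rescale eigenvalues with an assumed nonlinear map."""
    vals, vecs = np.linalg.eigh(adj)
    scaled = np.sign(vals) * np.abs(vals) ** alpha  # hypothetical scaling function
    return (vecs * scaled) @ vecs.T

def to_graph(matrix, threshold=0.5):
    """Binarize into a symmetric 0/1 adjacency matrix without self-loops
    (an assumed rounding step, chosen for simplicity)."""
    upper = np.triu((matrix >= threshold).astype(int), 1)
    return upper + upper.T
```

Because the rescaled matrix is a function of the original adjacency matrix, it shares all of its eigenvectors, which is the property the abstract emphasizes; the low-rank variant keeps only a subset of them.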


A Low-Rank Approximation for Computing the Matrix Exponential Norm

SIAM Journal on Matrix Analysis and Applications ◽

10.1137/100789774 ◽

2011 ◽

Vol 32 (2) ◽

pp. 349-363 ◽

Cited By ~ 11

Author(s):

Yuri M. Nechepurenko ◽

Miloud Sadkane

Keyword(s):

Matrix Exponential ◽

Low Rank ◽

Low Rank Approximation ◽

The Matrix ◽

Rank Approximation


A parallel implementation of the matrix cross approximation method

Numerical Methods and Programming (Vychislitel'nye Metody i Programmirovanie) ◽

10.26089/nummet.v16r336 ◽

2015 ◽

pp. 369-375

Author(s):

Д.А. Желтков ◽

Е.Е. Тыртышников

Keyword(s):

Approximation Method ◽

Parallel Implementation ◽

Computational Cost ◽

Low Rank ◽

Fast Method ◽

Low Rank Approximation ◽

Rank Matrix ◽

The Matrix ◽

Rank Approximation ◽

Low Rank Matrix

The matrix cross approximation method is a fast method for approximating matrices by low-rank matrices, with a complexity of $O((m+n)r^2)$ arithmetic operations. Its main feature is the following: if a matrix is given not as an array stored in memory but as a function of two integer arguments, then the method can compute a low-rank approximation of the matrix by evaluating only $O((m+n)r)$ values of this function. However, if the matrix is extremely large or the evaluation of its elements is computationally expensive, then such an approximation becomes time-consuming. For such cases, the method can be accelerated via parallelization. In this paper we propose an efficient parallel algorithm for the case of an equal computational cost for the evaluation of each matrix element.
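A minimal serial sketch of a cross approximation with partial pivoting may help make the entry-count claim concrete. It is not the parallel algorithm of the paper; the function name `cross_approximation`, the pivoting rule, and the stopping tolerance are assumptions made for illustration. Each step evaluates one row and one column of the implicit matrix, so `rank` steps evaluate at most `rank * (m + n)` entries.

```python
import numpy as np

def cross_approximation(entry, m, n, rank, tol=1e-12):
    """Approximate an implicit m-by-n matrix as U @ V.

    `entry(i, j)` returns one matrix element; the matrix is never stored.
    """
    U = np.zeros((m, 0))
    V = np.zeros((0, n))
    used_rows = set()
    i = 0  # first pivot row (arbitrary choice)
    for _ in range(rank):
        used_rows.add(i)
        # Residual of row i: n entry evaluations.
        row = np.array([entry(i, j) for j in range(n)], dtype=float) - U[i, :] @ V
        j = int(np.argmax(np.abs(row)))
        if abs(row[j]) <= tol:
            break  # simple heuristic stop: this residual row is numerically zero
        # Residual of column j: m entry evaluations.
        col = np.array([entry(k, j) for k in range(m)], dtype=float) - U @ V[:, j]
        U = np.column_stack([U, col / row[j]])
        V = np.vstack([V, row])
        # Next pivot row: largest residual entry in the column among unused rows.
        candidates = np.abs(col)
        candidates[list(used_rows)] = -1.0
        i = int(np.argmax(candidates))
    return U, V
```

For example, the matrix with entries `i + j` has rank 2, and two steps of the sketch recover it exactly while touching only two rows and two columns.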


PLANC

ACM Transactions on Mathematical Software ◽

10.1145/3432185 ◽

2021 ◽

Vol 47 (3) ◽

pp. 1-37

Author(s):

Srinivas Eswar ◽

Koby Hayashi ◽

Grey Ballard ◽

Ramakrishnan Kannan ◽

Michael A. Matheson ◽

...

Keyword(s):

Sparse Matrices ◽

Computation Time ◽

Low Rank ◽

Data Sets ◽

Low Rank Approximation ◽

Real World Data ◽

Alternating Direction ◽

Nonnegativity Constraints ◽

Rank Approximation ◽

Tensor Data

We consider the problem of low-rank approximation of massive dense nonnegative tensor data, for example, to discover latent patterns in video and imaging applications. As the size of data sets grows, single workstations are hitting bottlenecks in both computation time and available memory. We propose a distributed-memory parallel computing solution to handle massive data sets, loading the input data across the memories of multiple nodes, and performing efficient and scalable parallel algorithms to compute the low-rank approximation. We present a software package called Parallel Low-rank Approximation with Nonnegativity Constraints, which implements our solution and allows for extension in terms of data (dense or sparse, matrices or tensors of any order), algorithm (e.g., from multiplicative updating techniques to alternating direction method of multipliers), and architecture (we exploit GPUs to accelerate the computation in this work). We describe our parallel distributions and algorithms, which are careful to avoid unnecessary communication and computation, show how to extend the software to include new algorithms and/or constraints, and report efficiency and scalability results for both synthetic and real-world data sets.
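The multiplicative updating technique mentioned in the abstract can be illustrated on a single machine for the matrix case. The sketch below uses the classic multiplicative update rules for the Frobenius-norm objective; it is not PLANC's distributed implementation, and the function name `nmf_multiplicative`, the random initialization, and the small `eps` safeguard are assumptions.

```python
import numpy as np

def nmf_multiplicative(X, rank, iters=200, eps=1e-10, seed=0):
    """Factor a nonnegative matrix X as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        # Elementwise multiplicative updates keep both factors nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H
```

The updates only multiply by nonnegative ratios, so nonnegativity is preserved without any projection step, which is one reason this family of methods is a common baseline.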


An unconventional robust integrator for dynamical low-rank approximation

BIT Numerical Mathematics ◽

10.1007/s10543-021-00873-0 ◽

2021 ◽

Author(s):

Gianluca Ceruti ◽

Christian Lubich

Keyword(s):

Differential Equation ◽

Time Integration ◽

Singular Values ◽

Time Dependent ◽

Low Rank ◽

Low Rank Approximation ◽

Numerical Integrator ◽

The Matrix ◽

Rank Approximation ◽

Matrix Differential

We propose and analyse a numerical integrator that computes a low-rank approximation to large time-dependent matrices that are either given explicitly via their increments or are the unknown solution to a matrix differential equation. Furthermore, the integrator is extended to the approximation of time-dependent tensors by Tucker tensors of fixed multilinear rank. The proposed low-rank integrator is different from the known projector-splitting integrator for dynamical low-rank approximation, but it retains the important robustness to small singular values that has so far been known only for the projector-splitting integrator. The new integrator also offers some potential advantages over the projector-splitting integrator: It avoids the backward time integration substep of the projector-splitting integrator, which is a potentially unstable substep for dissipative problems. It offers more parallelism, and it preserves symmetry or anti-symmetry of the matrix or tensor when the differential equation does. Numerical experiments illustrate the behaviour of the proposed integrator.


Missing-Data Handling Methods for Lifelogs-Based Wellness Index Estimation: Comparative Analysis With Panel Data

JMIR Medical Informatics ◽

10.2196/20597 ◽

2020 ◽

Vol 8 (12) ◽

pp. e20597

Author(s):

Ki-Hun Kim ◽

Kwang-Jae Kim

Keyword(s):

Missing Data ◽

Panel Data ◽

Health Behavior ◽

Multiple Imputation ◽

Low Rank ◽

Data Sets ◽

Data Handling ◽

Low Rank Approximation ◽

Data Set ◽

Rank Approximation

Background: A lifelogs-based wellness index (LWI) is a function for calculating wellness scores based on health behavior lifelogs (eg, daily walking steps and sleep times collected via a smartwatch). A wellness score intuitively shows the users of smart wellness services the overall condition of their health behaviors. LWI development includes estimation (ie, estimating coefficients in LWI with data). A panel data set comprising health behavior lifelogs allows LWI estimation to control for unobserved variables, thereby resulting in less bias. However, these data sets typically have missing data due to events that occur in daily life (eg, smart devices stop collecting data when batteries are depleted), which can introduce biases into LWI coefficients. Thus, the appropriate choice of method to handle missing data is important for reducing biases in LWI estimations with panel data. However, there is a lack of research in this area.

Objective: This study aims to identify a suitable missing-data handling method for LWI estimation with panel data.

Methods: Listwise deletion, mean imputation, expectation maximization–based multiple imputation, predictive-mean matching–based multiple imputation, k-nearest neighbors–based imputation, and low-rank approximation–based imputation were comparatively evaluated by simulating an existing case of LWI development. A panel data set comprising health behavior lifelogs of 41 college students over 4 weeks was transformed into a reference data set without any missing data. Then, 200 simulated data sets were generated by randomly introducing missing data at proportions from 1% to 80%. The missing-data handling methods were each applied to transform the simulated data sets into complete data sets, and coefficients in a linear LWI were estimated for each complete data set. For each proportion for each method, a bias measure was calculated by comparing the estimated coefficient values with values estimated from the reference data set.

Results: Methods performed differently depending on the proportion of missing data. For 1% to 30% proportions, low-rank approximation–based imputation, predictive-mean matching–based multiple imputation, and expectation maximization–based multiple imputation were superior. For 31% to 60% proportions, low-rank approximation–based imputation and predictive-mean matching–based multiple imputation performed best. For over 60% proportions, only low-rank approximation–based imputation performed acceptably.

Conclusions: Low-rank approximation–based imputation was the best of the 6 data-handling methods regardless of the proportion of missing data. This superiority is generalizable to other panel data sets comprising health behavior lifelogs given their verified low-rank nature, for which low-rank approximation–based imputation is known to perform effectively. This result will guide missing-data handling in reducing coefficient biases in new development cases of linear LWIs with panel data.
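One common way to realize low-rank approximation–based imputation is to alternate between a truncated SVD and refilling the missing cells. The sketch below illustrates that general idea only; the abstract does not specify the exact algorithm the study used, so the function name `low_rank_impute`, the column-mean initialization, and the fixed iteration count are assumptions.

```python
import numpy as np

def low_rank_impute(data, rank, iters=100):
    """Fill NaN cells by repeatedly projecting onto rank-`rank` matrices.

    Assumes every column has at least one observed value.
    """
    missing = np.isnan(data)
    # Start from the column means of the observed values.
    col_means = np.nanmean(data, axis=0)
    filled = np.where(missing, col_means, data)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Observed cells are always restored; only missing cells change.
        filled = np.where(missing, approx, data)
    return filled
```

In a panel of lifelogs, rows could be person-days and columns behavior variables; the observed measurements are left untouched and only the gaps receive values from the low-rank reconstruction.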


Simulation of Nonlinear Magnetic Systems by the Finite Element Method Using BLR-Factorization

Известия высших учебных заведений Электромеханика ◽

10.17213/0136-3360-2021-4-5-14-19 ◽

2021 ◽

Vol 64 (4-5) ◽

pp. 14-19

Author(s):

Artem Khoroshev ◽

Keyword(s):

Finite Element ◽

Electromagnetic Field ◽

Finite Element Modeling ◽

Low Rank ◽

Low Rank Approximation ◽

Magnetic Systems ◽

Numerical Problem ◽

Element Modeling ◽

The Matrix ◽

Rank Approximation

The possibility of practical application of BLR-factorization (low-rank approximation of the matrix of unknowns of a system of linear equations) for finite element modeling of the electromagnetic field topology of nonlinear magnetic systems is considered. A method for estimating the accuracy of the computed solution of the system of linear algebraic equations (SLAE) is shown, along with the influence of the prescribed accuracy of the low-rank approximation of the matrix of unknowns on the upper limit of the relative forward error of the computed solution of the SLAE. Using a model problem as an example, the dependence of the accuracy of calculating the integral characteristics of an electromechanical apparatus on the tolerance of the low-rank approximation of the matrix of unknowns is shown, as well as its effect on the convergence of the process of solving a nonlinear numerical problem. A quantitative assessment is made of the reduction in the computational complexity of solving the numerical problem and in the amount of computer memory required for solving the SLAE. The applicability of BLR-factorization for finite element modeling of the topology of the electromagnetic field without the use of Krylov subspace numerical methods is estimated.


A Scalable Kernel-Based Semisupervised Metric Learning Algorithm with Out-of-Sample Generalization Ability

Neural Computation ◽

10.1162/neco.2008.05-07-528 ◽

2008 ◽

Vol 20 (11) ◽

pp. 2839-2861 ◽

Cited By ~ 11

Author(s):

Dit-Yan Yeung ◽

Hong Chang ◽

Guang Dai

Keyword(s):

Learning Algorithm ◽

Metric Learning ◽

Low Rank ◽

Data Sets ◽

Low Rank Approximation ◽

Real World Data ◽

Data Set ◽

Out Of Sample ◽

Rank Approximation ◽

Kernel Approach

In recent years, metric learning in the semisupervised setting has aroused a lot of research interest. One type of semisupervised metric learning utilizes supervisory information in the form of pairwise similarity or dissimilarity constraints. However, most methods proposed so far are either limited to linear metric learning or unable to scale well with the data set size. In this letter, we propose a nonlinear metric learning method based on the kernel approach. By applying low-rank approximation to the kernel matrix, our method can handle significantly larger data sets. Moreover, our low-rank approximation scheme can naturally lead to out-of-sample generalization. Experiments performed on both artificial and real-world data show very promising results.


Detection of core–periphery structure in networks using spectral methods and geodesic paths

European Journal of Applied Mathematics ◽

10.1017/s095679251600022x ◽

2016 ◽

Vol 27 (6) ◽

pp. 846-887 ◽

Cited By ~ 26

Author(s):

MIHAI CUCURINGU ◽

PUCK ROMBACH ◽

SANG HOON LEE ◽

MASON A. PORTER

Keyword(s):

Goodness Of Fit ◽

Low Rank ◽

Data Sets ◽

Computationally Efficient ◽

Low Rank Approximation ◽

Real World Data ◽

Mesoscale Structure ◽

Product Matrix ◽

Geodesic Paths ◽

Rank Approximation

We introduce several novel and computationally efficient methods for detecting “core–periphery structure” in networks. Core–periphery structure is a type of mesoscale structure that consists of densely connected core vertices and sparsely connected peripheral vertices. Core vertices tend to be well-connected both among themselves and to peripheral vertices, which tend not to be well-connected to other vertices. Our first method, which is based on transportation in networks, aggregates information from many geodesic paths in a network and yields a score for each vertex that reflects the likelihood that that vertex is a core vertex. Our second method is based on a low-rank approximation of a network's adjacency matrix, which we express as a perturbation of a tensor-product matrix. Our third approach uses the bottom eigenvector of the random-walk Laplacian to infer a coreness score and a classification into core and peripheral vertices. We also design an objective function to (1) help classify vertices into core or peripheral vertices and (2) provide a goodness-of-fit criterion for classifications into core versus peripheral vertices. To examine the performance of our methods, we apply our algorithms to both synthetically generated networks and a variety of networks constructed from real-world data sets.
