matrix sketching
Recently Published Documents

TOTAL DOCUMENTS: 19 (FIVE YEARS: 8)
H-INDEX: 6 (FIVE YEARS: 0)

2022 ◽  
Vol 16 (5) ◽  
Author(s):  
Cheng Chen ◽  
Weinan Zhang ◽  
Yong Yu

Author(s):  
Roberta Falcone ◽  
Laura Anderlucci ◽  
Angela Montanari

Abstract: The presence of imbalanced classes is increasingly common in practical applications and is known to heavily compromise the learning process. In this paper we propose a new method aimed at addressing this issue in binary supervised classification. Re-balancing the class sizes has proven to be a fruitful strategy for overcoming this problem; our proposal performs the re-balancing through matrix sketching. Matrix sketching is a recently developed data compression technique characterized by the property of preserving most of the linear information present in the data. This property is guaranteed by the Johnson-Lindenstrauss lemma (1984) and allows an n-dimensional space to be embedded into a reduced one without distorting, within an $\epsilon$-size interval, the distances between any pair of points. We propose matrix sketching as an alternative to the standard re-balancing strategies based on randomly under-sampling the majority class or randomly over-sampling the minority one. We assess the properties of our method when combined with linear discriminant analysis (LDA), classification trees (C4.5), and support vector machines (SVM) on simulated and real data. Results show that sketching can be a sound alternative to the most widely used re-balancing methods.
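A minimal illustration of the idea (not the authors' exact procedure): a Gaussian Johnson-Lindenstrauss sketch compresses the majority class down to the size of the minority class while approximately preserving its covariance structure. All names and sizes below are made up for the demo.

```python
import numpy as np

def sketch_majority(X_maj, k, seed=None):
    """Compress the n x d majority-class matrix into k synthetic rows
    with a Gaussian Johnson-Lindenstrauss projection; with the 1/sqrt(k)
    scaling, (S @ X)^T (S @ X) approximates X^T X in expectation."""
    rng = np.random.default_rng(seed)
    n = X_maj.shape[0]
    S = rng.standard_normal((k, n)) / np.sqrt(k)   # k x n sketching matrix
    return S @ X_maj                               # k x d sketched class

# Re-balance by sketching the majority class down to the minority size,
# then train any classifier (e.g. LDA, C4.5, SVM) on the result.
rng = np.random.default_rng(0)
X_maj = rng.standard_normal((2000, 10))            # majority class
X_min = rng.standard_normal((100, 10)) + 1.0       # shifted minority class
X_bal = np.vstack([sketch_majority(X_maj, k=100, seed=1), X_min])
y_bal = np.r_[np.zeros(100), np.ones(100)]         # re-balanced labels
```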


2021 ◽  
Author(s):  
Fumito Tagashira ◽  
Tomoyuki Obuchi ◽  
Toshiyuki Tanaka
Keyword(s):  
Rank One ◽  

2021 ◽  
Vol 14 (6) ◽  
pp. 1102-1110
Author(s):  
Anton Tsitsulin ◽  
Marina Munkhoeva ◽  
Davide Mottin ◽  
Panagiotis Karras ◽  
Ivan Oseledets ◽  
...  

Low-dimensional representations, or embeddings, of a graph's nodes facilitate several practical data science and data engineering tasks. As such embeddings rely, explicitly or implicitly, on a similarity measure among nodes, they require the computation of a quadratic similarity matrix, inducing a tradeoff between space complexity and embedding quality. To date, no graph embedding work combines (i) linear space complexity, (ii) a nonlinear transform as its basis, and (iii) nontrivial quality guarantees. In this paper we introduce FREDE (FREquent Directions Embedding), a graph embedding based on matrix sketching that combines all three desiderata. Starting from the observation that embedding methods aim to preserve the covariance among the rows of a similarity matrix, FREDE iteratively improves in quality while individually processing rows of a nonlinearly transformed PPR similarity matrix derived from a state-of-the-art graph embedding method, and provides, at any iteration, column-covariance approximation guarantees that become almost indistinguishable from those of the optimal SVD approximation. Our experimental evaluation on networks of varying size shows that FREDE performs almost as well as SVD and competitively against state-of-the-art embedding methods on diverse data science tasks, even when it is based on as little as 10% of node similarities.
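A minimal sketch of the core loop, assuming plain Frequent Directions as the sketcher and a random matrix standing in for the nonlinearly transformed PPR similarities that FREDE actually consumes; the anytime embedding is read off the SVD of the current sketch.

```python
import numpy as np

def fd_append(B, row, ell):
    """One Frequent Directions step: append a similarity row to the
    sketch and, once it holds ell rows, shrink every squared singular
    value by the smallest one, zeroing out (and dropping) one row."""
    B = np.vstack([B, row])
    if B.shape[0] < ell:
        return B
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    s = np.sqrt(np.maximum(s**2 - s[-1]**2, 0.0))
    return (s[:, None] * Vt)[:-1]

def embedding(B, dim):
    """Anytime embedding: top right singular directions of the sketch,
    scaled by their singular values."""
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:dim].T * s[:dim]        # one dim-vector per node

rng = np.random.default_rng(0)
n, ell = 200, 16
B = np.empty((0, n))
for _ in range(n):
    sim_row = rng.standard_normal(n)   # stand-in for one transformed PPR row
    B = fd_append(B, sim_row, ell)
emb = embedding(B, dim=8)              # usable after any number of rows
```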


Author(s):  
Changsheng Li ◽  
Rongqing Li ◽  
Ye Yuan ◽  
Guoren Wang ◽  
Dong Xu

2020 ◽  
Author(s):  
Qianli Liao

We consider the task of matrix sketching: obtaining a significantly smaller representation of a matrix A that retains most of its information (in other words, approximates A well). In particular, we investigate a recent approach called Frequent Directions (FD), initially proposed by Liberty [5] in 2013, which has drawn wide attention due to its elegance, strong theoretical guarantees, and outstanding performance in practice. Two follow-up papers, [3] and [2] in 2014, further refined the theoretical bounds and improved the practical performance. In this report, we summarize the three papers and propose a Generalized Frequent Directions (GFD) algorithm for matrix sketching that captures all the previous FD algorithms as special cases without losing any of the theoretical bounds. Interestingly, our additive error bound appears to apply even to iSVD, a well-performing heuristic that previously lacked guarantees.
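The report's exact GFD parameterization is not reproduced here; as a hedged illustration, one natural way to capture both FD and iSVD as special cases is a single shrink weight alpha (this parameterization is an assumption for the demo, not necessarily the one the report proposes).

```python
import numpy as np

def gfd_step(B, row, ell, alpha=1.0):
    """One generalized shrinkage step on the sketch B. A hypothetical
    parameterization: alpha = 1.0 recovers the Frequent Directions
    shrink (subtract the smallest squared singular value from all of
    them); alpha = 0.0 recovers the iSVD heuristic (truncate the
    smallest direction, no shrinking)."""
    B = np.vstack([B, row])
    if B.shape[0] < ell:
        return B
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    s_new = np.sqrt(np.maximum(s**2 - alpha * s[-1]**2, 0.0))
    return (s_new[:, None] * Vt)[:ell - 1]

# Sketch a random 500 x 50 matrix A with ell = 10 and compare the
# covariance error ||A^T A - B^T B||_2 at the two extremes.
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 50)) @ np.diag(np.linspace(1, 0.01, 50))
for alpha in (1.0, 0.0):
    B = np.empty((0, 50))
    for a in A:
        B = gfd_step(B, a, ell=10, alpha=alpha)
    err = np.linalg.norm(A.T @ A - B.T @ B, 2)
    print(f"alpha={alpha}: spectral covariance error {err:.1f}")
```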


Author(s):  
Cheng Chen ◽  
Luo Luo ◽  
Weinan Zhang ◽  
Yong Yu ◽  
Yijiang Lian

The linear contextual bandit is a sequential decision-making problem in which an agent chooses among actions given their corresponding contexts. Since large-scale data sets are becoming increasingly common, we study linear contextual bandits in high-dimensional settings. Recent works focus on employing matrix sketching methods to accelerate contextual bandits; however, the matrix approximation error introduces additional terms into the regret bound. In this paper we first propose a novel matrix sketching method called Spectral Compensation Frequent Directions (SCFD). We then propose an efficient approach to contextual bandits that adopts SCFD to approximate the covariance matrices. By maintaining and manipulating sketched matrices, our method needs only O(md) space and O(md) updating time per round, where d is the dimensionality of the data and m is the sketching size. Theoretical analysis reveals that our method has better regret bounds than previous methods in high-dimensional cases. Experimental results demonstrate the effectiveness of our algorithm and verify our theoretical guarantees.
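SCFD's spectral-compensation term is specific to the paper and not reproduced here; below is a hedged sketch of the generic idea it builds on: maintain an FD sketch of the context rows so that B^T B tracks the covariance sum, and invert the sketched regularized covariance via the Woodbury identity. The sizes and the reward model are made up for the demo.

```python
import numpy as np

def fd_update(B, x, m):
    """FD update so that B.T @ B approximates the running sum of the
    outer products x x^T. (Plain per-step SVD for clarity; the O(md)
    amortized variants batch rows before shrinking.)"""
    B = np.vstack([B, x])
    if B.shape[0] < m:
        return B
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    s = np.sqrt(np.maximum(s**2 - s[-1]**2, 0.0))
    return (s[:, None] * Vt)[:-1]

def solve_sketched(B, b, lam):
    """Approximate (lam*I + A^T A)^{-1} b from the sketch alone, via
    the Woodbury identity on B's SVD."""
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    V = Vt.T                           # d x m right singular vectors
    scale = s**2 / (s**2 + lam)        # m shrinkage weights
    return (b - V @ (scale * (V.T @ b))) / lam

# Sketched ridge regression over a stream of rounds: the core of a
# LinUCB-style bandit with a sketched covariance matrix.
rng = np.random.default_rng(0)
d, m, lam = 100, 10, 1.0
B, bvec = np.empty((0, d)), np.zeros(d)
for t in range(500):
    x = rng.standard_normal(d)                 # context of the chosen arm
    r = x[0] + 0.1 * rng.standard_normal()     # toy observed reward
    B = fd_update(B, x, m)
    bvec += r * x
theta_hat = solve_sketched(B, bvec, lam)       # approximate ridge estimate
```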


2018 ◽  
Vol 25 (7) ◽  
pp. 1069-1073 ◽  
Author(s):  
Zilin Zhang ◽  
Yan Li ◽  
Zhengwen Zhang ◽  
Cheng Jin ◽  
Meiguo Gao
