A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits

2020 ◽  
Vol 34 (04) ◽  
pp. 6933-6940
Author(s):  
Huozhi Zhou ◽  
Lingda Wang ◽  
Lav Varshney ◽  
Ee-Peng Lim

We investigate the piecewise-stationary combinatorial semi-bandit problem. Compared to the original combinatorial semi-bandit problem, our setting assumes the reward distributions of base arms may change in a piecewise-stationary manner at unknown time steps. We propose an algorithm, GLR-CUCB, which combines an efficient combinatorial semi-bandit algorithm, CUCB, with an almost parameter-free change-point detector, the Generalized Likelihood Ratio Test (GLRT). Our analysis shows that the regret of GLR-CUCB is upper bounded by O(√(NKT log T)), where N is the number of piecewise-stationary segments, K is the number of base arms, and T is the number of time steps. As a complement, we also derive a nearly matching regret lower bound on the order of Ω(√(NKT)), for both piecewise-stationary multi-armed bandits and combinatorial semi-bandits, using information-theoretic techniques and judiciously constructed piecewise-stationary bandit instances. Our lower bound is tighter than the best available regret lower bound, which is Ω(√T). Numerical experiments on both synthetic and real-world datasets demonstrate the superiority of GLR-CUCB compared to other state-of-the-art algorithms.
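As an illustration of the change-point detection idea, the following is a minimal sketch of a Bernoulli generalized likelihood ratio statistic applied to the reward stream of a single base arm. It is not the authors' implementation: the function names, the clipping constant `eps`, and the detection threshold are assumptions made for this example, and the abstract does not specify how GLR-CUCB sets its threshold or when it restarts CUCB.

```python
import math

def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def glr_statistic(xs):
    """Largest GLR value over all split points of the observed rewards xs (values in [0, 1])."""
    n = len(xs)
    total = sum(xs)
    mean_all = total / n
    best = 0.0
    prefix = 0.0
    for s in range(1, n):
        prefix += xs[s - 1]
        mean_left = prefix / s
        mean_right = (total - prefix) / (n - s)
        stat = (s * kl_bernoulli(mean_left, mean_all)
                + (n - s) * kl_bernoulli(mean_right, mean_all))
        best = max(best, stat)
    return best

def change_detected(xs, threshold):
    """Flag a change when the statistic exceeds a threshold (hypothetical value, chosen by the caller)."""
    return len(xs) >= 2 and glr_statistic(xs) > threshold
```

In a bandit loop, a detector of this kind would be run per base arm, and a positive detection would trigger a restart of the underlying bandit algorithm's statistics.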

2021 ◽  
Vol 15 (5) ◽  
pp. 1-32
Author(s):  
Quang-huy Duong ◽  
Heri Ramampiaro ◽  
Kjetil Nørvåg ◽  
Thu-lan Dam

Dense subregion (subgraph and subtensor) detection is a well-studied area with a wide range of applications, and numerous efficient approaches and algorithms have been proposed. Approximation approaches are commonly used for detecting dense subregions due to the complexity of the exact methods. Existing algorithms are generally efficient for dense subtensor and subgraph detection, and can perform well in many applications. However, most existing works rely on the state-of-the-art greedy 2-approximation algorithm, which provides solutions with only a loose theoretical density guarantee. The main drawback of most of these algorithms is that they can estimate only one subtensor, or subgraph, at a time, with a low guarantee on its density. While some methods can estimate multiple subtensors, they give a guarantee on the density with respect to the input tensor for the first estimated subtensor only. We address these drawbacks by providing both theoretical and practical solutions for estimating multiple dense subtensors in tensor data and giving a higher lower bound on the density. In particular, we guarantee and prove a higher bound on the lower-bound density of the estimated subgraphs and subtensors. We also propose a novel approach to show that there are multiple dense subtensors with a guarantee on their density that is greater than the lower bound used in the state-of-the-art algorithms. We evaluate our approach with extensive experiments on several real-world datasets, which demonstrate its efficiency and feasibility.
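For context, the greedy 2-approximation mentioned above can be sketched for the subgraph case as a peeling procedure: repeatedly remove a minimum-degree vertex and keep the densest intermediate vertex set. This is the classical baseline, not the method proposed in the paper. The sketch assumes density is defined as edges divided by vertices, and the function name is hypothetical.

```python
def greedy_densest_subgraph(edges):
    """Greedy peeling on an undirected graph given as (u, v) pairs.

    Returns (vertex_set, density), with density = |E| / |V| of the best
    intermediate subgraph seen while peeling minimum-degree vertices.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    remaining = set(adj)
    best_nodes, best_density = set(remaining), 0.0

    while remaining:
        density = num_edges / len(remaining)
        if density > best_density:
            best_nodes, best_density = set(remaining), density
        # A linear scan for the minimum degree keeps the sketch short;
        # a bucket queue would make each step cheaper.
        u = min(remaining, key=lambda x: len(adj[x]))
        for v in adj[u]:
            adj[v].discard(u)
        num_edges -= len(adj[u])
        remaining.discard(u)
        del adj[u]

    return best_nodes, best_density
```

For example, on a 4-clique with a two-edge tail attached, peeling removes the tail first and returns the clique with density 1.5.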


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
João Lobo ◽  
Rui Henriques ◽  
Sara C. Madeira

Background: Three-way data have gained popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions over time, urban dynamics, or complex geophysical phenomena. Triclustering, the subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with state-of-the-art algorithms is paramount. These comparisons are usually performed using real data, without a known ground truth, thus limiting the assessments. In this context, we propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social data domains, with the additional advantage of further providing the ground truth (triclustering solution) as output. Results: G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions: Triclustering evaluation using G-Tric provides the possibility to combine both intrinsic and extrinsic metrics to compare solutions that produce more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric's potential to advance the triclustering state of the art by easing the process of evaluating the quality of new triclustering approaches.


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1407
Author(s):  
Peng Wang ◽  
Jing Zhou ◽  
Yuzhang Liu ◽  
Xingchen Zhou

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods only focus on triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circle convolution based on the embeddings of entities and entity types is used to map the head entity and tail entity to type-specific representations, and a translation-based score function is then used to learn the representations of triples. We evaluated our model on real-world datasets with two benchmark tasks: link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
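The two ingredients named in this abstract can be sketched as follows: a circular convolution that mixes an entity embedding with a type embedding, and a translation-based score of the form ||h + r - t||. How TransET actually combines them is not stated in the abstract, so `type_aware_score`, the L1 norm, and all other names here are assumptions for illustration only.

```python
def circular_convolution(a, b):
    """Circular convolution of two equal-length vectors."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def translation_score(head, relation, tail):
    """L1 distance between head + relation and tail (lower means more plausible)."""
    return sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))

def type_aware_score(head, head_type, relation, tail, tail_type):
    """Hypothetical combination: project each entity with its type, then score by translation."""
    head_proj = circular_convolution(head, head_type)
    tail_proj = circular_convolution(tail, tail_type)
    return translation_score(head_proj, relation, tail_proj)
```

With a type embedding of `[1, 0, ..., 0]`, the convolution leaves the entity embedding unchanged, so the hypothetical score reduces to a plain translation score.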


2021 ◽  
Vol 13 (9) ◽  
pp. 1628
Author(s):  
Seden Hazal Gulen Yilmaz ◽  
Chiara Zarro ◽  
Harun Taha Hayvaci ◽  
Silvia Liberata Ullo

The problem of detecting point-like targets over a glistening surface is investigated in this manuscript, and the design of an optimal waveform through a two-step process for a multipath exploitation radar is proposed. In the first step, a non-adaptive waveform is transmitted and a constrained Generalized Likelihood Ratio Test (GLRT) detector is derived at reception, which exploits multipath returns in the range cell under test by modelling the target echo as a superposition of the direct and multipath returns. Under the hypothesis of heterogeneous environments, and thus assuming a compound-Gaussian distribution for the clutter return, the clutter is estimated in the range cell under test through the secondary data, which are collected from the out-of-bin cells. The Fixed Point Estimate (FPE) algorithm is applied in the clutter estimation, which is then used to design the adaptive waveform for transmission in the second step of the algorithm, in order to suppress the clutter coming from the adjacent cells. The proposed GLRT is also used at the end of the second transmission for the final decision. Extensive performance evaluation of the proposed detector and adaptive waveform for various multipath scenarios is presented. The performance analysis shows that the proposed method improves both the Signal-to-Clutter Ratio (SCR) of the received signal and the detection performance with multipath exploitation.


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Wei Yang ◽  
Luhui Xu ◽  
Xiaopan Chen ◽  
Fengbin Zheng ◽  
Yang Liu

Learning a proper distance metric for histogram data plays a crucial role in many computer vision tasks. The chi-squared distance is a nonlinear metric and is widely used to compare histograms. In this paper, we show how to learn a general form of chi-squared distance based on the nearest neighbor model. In our method, the margin of a sample is first defined with respect to the nearest hits (nearest neighbors from the same class) and the nearest misses (nearest neighbors from different classes), and then the simplex-preserving linear transformation is trained by maximizing the margin while minimizing the distance between each sample and its nearest hits. With the iterative projected gradient method for optimization, we naturally introduce the ℓ2,1-norm regularization into the proposed method for sparse metric learning. Comparative studies with state-of-the-art approaches on five real-world datasets verify the effectiveness of the proposed method.
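To make the distance concrete, here is a minimal sketch of the chi-squared distance between two histograms and of its generalized form under a linear transformation. The factor 1/2 is one common convention and is an assumption here, as are the function names. The sketch only evaluates the distance for a given matrix and does not implement the paper's training procedure.

```python
def chi2_distance(x, y):
    """Chi-squared distance: 0.5 * sum((x_i - y_i)^2 / (x_i + y_i)), skipping empty bins."""
    total = 0.0
    for a, b in zip(x, y):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total

def transform(L, x):
    """Apply the matrix L (a list of rows) to the histogram x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in L]

def learned_chi2_distance(L, x, y):
    """Chi-squared distance after mapping both histograms through L.

    If L has non-negative entries and each column sums to one, it maps
    the probability simplex into itself (simplex-preserving).
    """
    return chi2_distance(transform(L, x), transform(L, y))
```

For instance, a matrix that merges two bins into one makes histograms that differ only between those bins indistinguishable, which is how a learned transformation can suppress uninformative differences.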


Algorithmica ◽  
2021 ◽  
Author(s):  
Seungbum Jo ◽  
Rahul Lingala ◽  
Srinivasa Rao Satti

We consider the problem of encoding two-dimensional arrays, whose elements come from a total order, for answering Top-$k$ queries. The aim is to obtain encodings that use space close to the information-theoretic lower bound and can be constructed efficiently. For an $m \times n$ array, with $m \le n$, we first propose an encoding for answering 1-sided Top-$k$ queries, whose query range is restricted to $[1 \dots m][1 \dots a]$, for $1 \le a \le n$. Next, we propose an encoding for answering general (4-sided) Top-$k$ queries that takes $m \lg \binom{(k+1)n}{n} + 2nm(m-1) + o(n)$ bits, which generalizes the joint Cartesian tree of Golin et al. [TCS 2016]. Compared with the trivial $O(nm \lg n)$-bit encoding, our encoding takes less space when $m = o(\lg n)$. In addition to the upper bound results for the encodings, we also give lower bounds on encodings for answering 1-sided and 4-sided Top-$k$ queries, which show that our upper bound results are almost optimal.
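To clarify what such a query returns, the following brute-force reference computes the positions of the k largest entries in a rectangular range. It reads the array directly, whereas the encodings discussed above answer the same query without access to the original array. The function name and the 0-based inclusive index convention are assumptions for this sketch, and entries are assumed distinct.

```python
import heapq

def top_k_positions(array, k, row_lo, row_hi, col_lo, col_hi):
    """Positions (row, col) of the k largest entries in the inclusive range, largest first."""
    cells = [
        (array[i][j], i, j)
        for i in range(row_lo, row_hi + 1)
        for j in range(col_lo, col_hi + 1)
    ]
    return [(i, j) for _, i, j in heapq.nlargest(k, cells)]
```

A 1-sided query in the sense of the abstract corresponds to fixing the row range to all rows and the column range to the first `a` columns, so only the right boundary varies.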

