Multi-Document Summarization with Determinantal Point Process Attention

2021 ◽  
Vol 71 ◽  
pp. 371-399
Author(s):  
Laura Perez-Beltrachini ◽  
Mirella Lapata

The ability to convey relevant and diverse information is critical in multi-document summarization and yet remains elusive for neural seq-to-seq models, whose outputs are often redundant and fail to correctly cover important details. In this work, we propose an attention mechanism which encourages greater focus on relevance and diversity. Attention weights are computed based on (proportional) probabilities given by Determinantal Point Processes (DPPs) defined on the set of content units to be summarized. DPPs have been successfully used in extractive summarization; here we use them to select relevant and diverse content for neural abstractive summarization. We integrate DPP-based attention with various seq-to-seq architectures, ranging from CNNs and LSTMs to Transformers. Experimental evaluation shows that our attention mechanism consistently improves summarization and delivers performance comparable with the state of the art on the MultiNews dataset.
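As a rough illustration of how DPP probabilities can drive attention (a minimal sketch under simple assumptions, not the authors' architecture), the snippet below builds an L-ensemble kernel from per-unit relevance scores and embeddings and uses the resulting marginal inclusion probabilities as attention weights; the function and argument names are illustrative.

```python
import numpy as np

def dpp_attention_weights(unit_embeddings, relevance):
    """Illustrative DPP-style attention: weights proportional to marginal
    inclusion probabilities of content units under an L-ensemble.

    unit_embeddings: (n, d) array, one row per content unit (assumed given).
    relevance:       (n,)  array of nonnegative relevance scores (assumed given).
    """
    # Quality/diversity decomposition: L_ij = q_i * (phi_i . phi_j) * q_j.
    phi = unit_embeddings / np.linalg.norm(unit_embeddings, axis=1, keepdims=True)
    q = relevance
    L = (q[:, None] * q[None, :]) * (phi @ phi.T)

    # Marginal kernel K = L (L + I)^{-1}; its diagonal gives inclusion probabilities.
    n = L.shape[0]
    K = L @ np.linalg.inv(L + np.eye(n))
    marginals = np.clip(np.diag(K), 0.0, 1.0)

    # Attention weights proportional to the DPP marginals.
    return marginals / marginals.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(5, 16))
    rel = rng.uniform(0.5, 1.5, size=5)
    print(dpp_attention_weights(emb, rel))
```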

2021 ◽  
Vol 58 (2) ◽  
pp. 469-483
Author(s):  
Jesper Møller ◽  
Eliza O’Reilly

For a determinantal point process (DPP) X with a kernel K whose spectrum is strictly less than one, André Goldman has established a coupling to its reduced Palm process $X^u$ at a point u with $K(u,u)>0$ so that, almost surely, $X^u$ is obtained by removing a finite number of points from X. We sharpen this result, assuming weaker conditions and establishing that $X^u$ can be obtained by removing at most one point from X, where we specify the distribution of the difference $\xi_u := X \setminus X^u$. This is used to discuss the degree of repulsiveness in DPPs in terms of $\xi_u$, including Ginibre point processes and other specific parametric models for DPPs.
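For context, it may help to recall a classical fact (not a result of this paper): the reduced Palm process of a DPP is itself a DPP, with kernel given by a rank-one update of K.

```latex
% Reduced Palm kernel of a DPP with kernel K at a point u with K(u,u) > 0
% (classical fact; the paper's contribution is the sharpened coupling):
K^{u}(x,y) \;=\; K(x,y) \;-\; \frac{K(x,u)\,K(u,y)}{K(u,u)},
\qquad\text{so } X^{u} \text{ is again determinantal.}
```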


Author(s):  
Jack Poulson

Determinantal point processes (DPPs) were introduced by Macchi (Macchi 1975 Adv. Appl. Probab. 7, 83–122) as a model for repulsive (fermionic) particle distributions, but their recent popularization is largely due to their usefulness for encouraging diversity in the final stage of a recommender system (Kulesza & Taskar 2012 Found. Trends Mach. Learn. 5, 123–286). The standard sampling scheme for finite DPPs is a spectral decomposition followed by an equivalent of a randomly diagonally pivoted Cholesky factorization of an orthogonal projection, which is only applicable to Hermitian kernels and has an expensive set-up cost. Launay et al. 2018 (http://arxiv.org/abs/1802.08429) and Chen & Zhang 2018 NeurIPS (https://papers.nips.cc/paper/7805-fast-greedy-map-inference-for-determinantal-point-process-to-improve-recommendation-diversity.pdf) have begun to connect DPP sampling to LDL^H factorizations as a means of avoiding the initial spectral decomposition, but existing approaches have only outperformed the spectral decomposition approach in special circumstances, where the number of kept modes is a small percentage of the ground set size. This article proves that trivial modifications of LU and LDL^H factorizations yield efficient direct sampling schemes for non-Hermitian and Hermitian DPP kernels, respectively. Furthermore, it is experimentally shown that even dynamically scheduled, shared-memory parallelizations of high-performance dense and sparse-direct factorizations can be trivially modified to yield DPP sampling schemes with essentially identical performance. The software developed as part of this research, Catamari (hodgestar.com/catamari), is released under the Mozilla Public License v2.0. It contains header-only, C++14 plus OpenMP 4.0 implementations of dense and sparse-direct, Hermitian and non-Hermitian DPP samplers. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
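The flavour of such a factorization-based sampler can be conveyed in a few lines of dense Python. This is a minimal teaching sketch of a Bernoulli-randomized, LDL^H-style elimination applied to a Hermitian marginal kernel, not Catamari's optimized C++/OpenMP implementation.

```python
import numpy as np

def sample_dpp_from_marginal_kernel(K, rng=None):
    """Bernoulli-randomized LDL^H-style elimination sampler for a Hermitian
    DPP marginal kernel K (eigenvalues in [0, 1]).  Dense teaching sketch."""
    rng = np.random.default_rng() if rng is None else rng
    A = np.array(K, dtype=complex)
    n = A.shape[0]
    keep = []
    for j in range(n):
        p = A[j, j].real                      # inclusion probability of item j,
        if rng.random() < p:                  # conditioned on earlier decisions
            keep.append(j)
        else:
            A[j, j] -= 1.0                    # exclusion shifts the pivot to p - 1
        # Schur-complement update: condition the remaining items on this decision.
        A[j + 1:, j] /= A[j, j]
        A[j + 1:, j + 1:] -= np.outer(A[j + 1:, j], A[j, j + 1:])
    return keep

if __name__ == "__main__":
    # Projection test kernel K = V V^H built from orthonormal columns of V.
    rng = np.random.default_rng(1)
    V, _ = np.linalg.qr(rng.normal(size=(8, 3)))
    print(sample_dpp_from_marginal_kernel(V @ V.conj().T, rng))
```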


2015 ◽  
Vol 52 (4) ◽  
pp. 1003-1012 ◽  
Author(s):  
Laurent Decreusefond ◽  
Ian Flint ◽  
Anais Vergne

The Ginibre point process (GPP) is one of the main examples of determinantal point processes on the complex plane. It is a recurring distribution in random matrix theory as well as a useful model in applied mathematics. In this paper we briefly overview the usual methods for the simulation of the GPP. Then we introduce a modified version of the GPP which constitutes a determinantal point process more suited for certain applications, and we detail its simulation. This modified GPP has the property of having a fixed number of points and having its support on a compact subset of the plane. See Decreusefond et al. (2013) for an extended version of this paper.
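One of the usual simulation routes alluded to above goes through random matrices: the eigenvalues of an $N \times N$ matrix with i.i.d. standard complex Gaussian entries form the finite Ginibre ensemble, which approximates the GPP on bounded regions as $N$ grows. A minimal sketch (not the modified, fixed-number-of-points process introduced in the paper):

```python
import numpy as np

def ginibre_eigenvalues(n, rng=None):
    """Eigenvalues of an n x n matrix with i.i.d. standard complex Gaussian
    entries.  For large n these approximate the infinite Ginibre point
    process on bounded regions of the plane."""
    rng = np.random.default_rng() if rng is None else rng
    G = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2.0)
    return np.linalg.eigvals(G)

if __name__ == "__main__":
    pts = ginibre_eigenvalues(200)
    print(pts[:5])  # a few sample points in the complex plane
```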


2019 ◽  
Vol 9 (3) ◽  
pp. 386 ◽  
Author(s):  
Xu-Wang Han ◽  
Hai-Tao Zheng ◽  
Jin-Yuan Chen ◽  
Cong-Zhi Zhao

Recently, neural sequence-to-sequence models have made impressive progress in abstractive document summarization. Unfortunately, as neural abstractive summarization research is still at an early stage, the performance of these models remains far from ideal. In this paper, we propose a novel method called Neural Abstractive Summarization with Diverse Decoding (NASDD). This method augments the standard attentional sequence-to-sequence model in two ways. First, we introduce a diversity-promoting beam search approach in the decoding process, which alleviates the serious diversity issue caused by standard beam search and hence increases the chance of generating more informative summary sequences. Second, we use the attention mechanism, combined with the key information of the input document, as an estimate of salient-information coverage, which aids in finding the optimal summary sequence. We carry out an experimental evaluation against state-of-the-art methods on the CNN/Daily Mail summarization dataset, and the results demonstrate the superiority of the proposed method.
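To give a feel for diversity-promoting decoding in general, the sketch below implements one beam-search step with a sibling-rank penalty, in the spirit of diverse decoding strategies. It is a hedged illustration, not the authors' NASDD procedure, and the `expand` callback standing in for the model's next-token log-probabilities is hypothetical.

```python
import heapq
import math
from typing import Callable, List, Tuple

def diverse_beam_step(
    beams: List[Tuple[float, List[int]]],                     # (score, token sequence)
    expand: Callable[[List[int]], List[Tuple[int, float]]],   # next tokens with log-probs
    beam_size: int,
    gamma: float = 0.5,                                       # sibling-rank penalty strength
) -> List[Tuple[float, List[int]]]:
    """One step of diversity-penalized beam search: expansions of the same
    parent hypothesis are ranked and penalized by gamma * rank, which keeps
    the beam from filling up with near-identical continuations."""
    candidates = []
    for score, seq in beams:
        # Rank this parent's expansions by log-probability (rank 0 = best sibling).
        siblings = sorted(expand(seq), key=lambda t: -t[1])
        for rank, (tok, logp) in enumerate(siblings[:beam_size]):
            candidates.append((score + logp - gamma * rank, seq + [tok]))
    # Keep the globally best beam_size penalized candidates.
    return heapq.nlargest(beam_size, candidates, key=lambda c: c[0])

if __name__ == "__main__":
    # Toy expansion function over a 4-token vocabulary.
    def expand(seq):
        return [(tok, -math.log(tok + 2)) for tok in range(4)]
    beams = [(0.0, [0]), (-0.1, [1])]
    print(diverse_beam_step(beams, expand, beam_size=3))
```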


2018 ◽  
Vol 55 (3) ◽  
pp. 760-788
Author(s):  
François Baccelli ◽  
Eliza O'Reilly

Goldman (2010) proved that the distribution of a stationary determinantal point process (DPP) $\Phi$ can be coupled with its reduced Palm version $\Phi^{0,!}$ such that there exists a point process $\eta$ where $\Phi = \Phi^{0,!} \cup \eta$ in distribution and $\Phi^{0,!} \cap \eta = \emptyset$. The points of $\eta$ characterize the repulsive nature of a typical point of $\Phi$. In this paper we use the first-moment measure of $\eta$ to study the repulsive behavior of DPPs in high dimensions. We show that many families of DPPs have the property that the total number of points in $\eta$ converges in probability to 0 as the space dimension $n \to \infty$. We also prove that for some DPPs, there exists an $R^*$ such that the decay of the first-moment measure of $\eta$ is slowest in a small annulus around the sphere of radius $\sqrt{n}R^*$. This $R^*$ can be interpreted as the asymptotic reach of repulsion of the DPP. Examples of classes of DPP models exhibiting this behavior are presented and an application to high-dimensional Boolean models is given.


1993 ◽  
Vol 30 (02) ◽  
pp. 365-372 ◽  
Author(s):  
Søren Asmussen ◽  
Ger Koole

A Markovian arrival stream is a marked point process generated by the state transitions of a given Markovian environmental process and Poisson arrival rates depending on the environment. It is shown that to a given marked point process there is a sequence of such Markovian arrival streams, indexed by $m$, which converges in distribution to the given process as $m \to \infty$. Various related corollaries (involving stationarity, convergence of moments and ergodicity) and counterexamples are discussed as well.
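A concrete special case is the Markov-modulated Poisson process, where arrivals occur at a Poisson rate attached to the current environment state. The sketch below simulates one with illustrative parameters and marks each arrival with the prevailing environment state (an illustrative choice of marking, not the paper's construction).

```python
import numpy as np

def simulate_mmpp(Q, rates, t_end, rng=None):
    """Simulate a Markov-modulated Poisson process: the environment follows
    the CTMC with generator Q, and while in state i arrivals occur at Poisson
    rate rates[i].  Returns (arrival_times, environment state at each arrival)."""
    rng = np.random.default_rng() if rng is None else rng
    Q = np.asarray(Q, dtype=float)
    state, t = 0, 0.0
    arrivals, marks = [], []
    while t < t_end:
        hold = -Q[state, state]                   # rate of leaving the current state
        total = hold + rates[state]               # two competing exponential clocks
        t += rng.exponential(1.0 / total)
        if t >= t_end:
            break
        if rng.random() < rates[state] / total:   # arrival event
            arrivals.append(t)
            marks.append(state)
        else:                                     # environment transition
            probs = np.maximum(Q[state], 0.0)
            probs[state] = 0.0
            state = rng.choice(len(probs), p=probs / probs.sum())
    return np.array(arrivals), np.array(marks)

if __name__ == "__main__":
    Q = [[-1.0, 1.0], [2.0, -2.0]]                # illustrative 2-state environment
    times, marks = simulate_mmpp(Q, rates=[0.5, 5.0], t_end=10.0)
    print(len(times), "arrivals")
```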


2022 ◽  
Vol 22 (3) ◽  
pp. 1-21
Author(s):  
Prayag Tiwari ◽  
Amit Kumar Jaiswal ◽  
Sahil Garg ◽  
Ilsun You

Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. A self-attention model takes only one sentence as input, with no extra information; one can then use the final hidden state or a pooled representation. However, text-matching problems can be interpreted in either symmetrical or asymmetrical scopes. For instance, paraphrase detection is a symmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage attractive properties of the self-attention mechanism and propose an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features, while updating the rest of the components. We evaluate our model on two benchmark datasets covering the tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with far fewer parameters.
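Generic inter-sequence (cross) attention of the scaled dot-product kind, on which components like these are built, can be sketched as follows; this is an illustrative baseline, not the authors' network.

```python
import numpy as np

def cross_attention(A, B):
    """Scaled dot-product attention from sequence A over sequence B:
    each token of A attends to every token of B.  A: (m, d), B: (n, d).
    Returns the (m, d) attended representation of A in terms of B."""
    d = A.shape[1]
    scores = A @ B.T / np.sqrt(d)                    # (m, n) similarity matrix
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # row-wise softmax
    return weights @ B

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    premise, hypothesis = rng.normal(size=(7, 32)), rng.normal(size=(5, 32))
    print(cross_attention(premise, hypothesis).shape)   # (7, 32)
```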


Author(s):  
Florence Merlevède ◽  
Magda Peligrad ◽  
Sergey Utev

In this chapter, we treat several examples of stationary processes which are asymptotically negatively dependent and for which the results of Chapter 9 apply. Many systems in nature are complex, consisting of the contributions of several independent components. Our first examples are functions of two independent sequences, one negatively dependent and one interlaced mixing. For instance, the class of asymptotically negatively dependent random variables is used to treat functions of a determinantal point process and a Gaussian process with a positive continuous spectral density. Another example is point processes based on asymptotically negatively or positively associated sequences and displaced according to a Gaussian sequence with a positive continuous spectral density. Other examples include exchangeable processes, the weighted empirical process, and the exchangeable determinantal point process.


2014 ◽  
Vol 70 (a1) ◽  
pp. C523-C523
Author(s):  
Michael Baake ◽  
Holger Koesters ◽  
Robert Moody

Getting a grasp of what aperiodic order really entails is going to require collecting and understanding many diverse examples. Aperiodic crystals are the tip of a largely unknown iceberg. Here we present a recently studied form of random point process in the (complex) plane which arises as the zero sets of a specific class of analytic functions given by power series with randomly chosen coefficients: Gaussian analytic functions (GAFs). These point sets differ from Poisson processes by having a sort of built-in repulsion between points, though the resulting sets almost surely fail both conditions of the Delone property. Remarkably, the point sets that arise as the zeros of GAFs determine a random point process which is, in distribution, invariant under rotation and translation. In addition, there is a logarithmic potential function for which the zeros are the attractors, and the resulting basins of attraction produce tilings of the plane by tiles which are, almost surely, all of the same area. We discuss GAFs along with their tilings and diffraction, and also note briefly their relationship to determinantal point processes, which are likewise of physical interest.
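Concretely, the planar GAF is $f(z)=\sum_{n\ge 0} a_n z^n/\sqrt{n!}$ with i.i.d. standard complex Gaussian coefficients $a_n$. A crude way to visualize its zero set is to truncate the series and take polynomial roots; the sketch below keeps only roots well inside the truncation radius, where the approximation is reasonable.

```python
import math
import numpy as np

def gaf_zeros(degree=40, rng=None):
    """Approximate zeros of the planar GAF f(z) = sum_n a_n z^n / sqrt(n!)
    by truncating the series at `degree` and taking polynomial roots.
    Only roots well inside the truncation radius are trustworthy."""
    rng = np.random.default_rng() if rng is None else rng
    a = (rng.normal(size=degree + 1) + 1j * rng.normal(size=degree + 1)) / np.sqrt(2.0)
    scale = np.sqrt(np.array([math.factorial(k) for k in range(degree + 1)], dtype=float))
    coeffs = a / scale
    roots = np.roots(coeffs[::-1])            # numpy expects the highest degree first
    return roots[np.abs(roots) < 0.5 * np.sqrt(degree)]   # keep well-resolved zeros

if __name__ == "__main__":
    zeros = gaf_zeros()
    print(len(zeros), "zeros kept near the origin")
```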

