Multi-Document Summarization with Determinantal Point Process Attention

2021 ◽  
Vol 71 ◽  
pp. 371-399
Author(s):  
Laura Perez-Beltrachini ◽  
Mirella Lapata

The ability to convey relevant and diverse information is critical in multi-document summarization and yet remains elusive for neural seq-to-seq models, whose outputs are often redundant and fail to correctly cover important details. In this work, we propose an attention mechanism which encourages greater focus on relevance and diversity. Attention weights are computed based on (proportional) probabilities given by Determinantal Point Processes (DPPs) defined on the set of content units to be summarized. DPPs have been successfully used in extractive summarization; here we use them to select relevant and diverse content for neural abstractive summarization. We integrate DPP-based attention with various seq-to-seq architectures, ranging from CNNs and LSTMs to Transformers. Experimental evaluation shows that our attention mechanism consistently improves summarization and delivers performance comparable with the state of the art on the MultiNews dataset.
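As a rough illustration of how DPP probabilities can drive attention (a minimal sketch under simple assumptions, not the authors' architecture), the snippet below builds an L-ensemble kernel from per-unit relevance scores and embeddings and uses the resulting marginal inclusion probabilities as attention weights; the function and argument names are illustrative.

```python
import numpy as np

def dpp_attention_weights(unit_embeddings, relevance):
    """Illustrative DPP-style attention: weights proportional to marginal
    inclusion probabilities of content units under an L-ensemble.

    unit_embeddings: (n, d) array, one row per content unit (assumed given).
    relevance:       (n,)  array of nonnegative relevance scores (assumed given).
    """
    # Quality/diversity decomposition: L_ij = q_i * (phi_i . phi_j) * q_j.
    phi = unit_embeddings / np.linalg.norm(unit_embeddings, axis=1, keepdims=True)
    q = relevance
    L = (q[:, None] * q[None, :]) * (phi @ phi.T)

    # Marginal kernel K = L (L + I)^{-1}; its diagonal gives inclusion probabilities.
    n = L.shape[0]
    K = L @ np.linalg.inv(L + np.eye(n))
    marginals = np.clip(np.diag(K), 0.0, 1.0)

    # Attention weights proportional to the DPP marginals.
    return marginals / marginals.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(5, 16))
    rel = rng.uniform(0.5, 1.5, size=5)
    print(dpp_attention_weights(emb, rel))
```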

2021 ◽  
Vol 58 (2) ◽  
pp. 469-483
Author(s):  
Jesper Møller ◽  
Eliza O’Reilly

For a determinantal point process (DPP) X with a kernel K whose spectrum is strictly less than one, André Goldman has established a coupling to its reduced Palm process $X^u$ at a point u with $K(u,u)>0$ so that, almost surely, $X^u$ is obtained by removing a finite number of points from X. We sharpen this result, assuming weaker conditions and establishing that $X^u$ can be obtained by removing at most one point from X, where we specify the distribution of the difference $\xi_u := X \setminus X^u$. This is used to discuss the degree of repulsiveness in DPPs in terms of $\xi_u$, including Ginibre point processes and other specific parametric models for DPPs.
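For context, it may help to recall a classical fact (not a result of this paper): the reduced Palm process of a DPP is itself a DPP, with kernel given by a rank-one update of K.

```latex
% Reduced Palm kernel of a DPP with kernel K at a point u with K(u,u) > 0
% (classical fact; the paper's contribution is the sharpened coupling):
K^{u}(x,y) \;=\; K(x,y) \;-\; \frac{K(x,u)\,K(u,y)}{K(u,u)},
\qquad\text{so } X^{u} \text{ is again determinantal.}
```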


Author(s):  
Jack Poulson

Determinantal point processes (DPPs) were introduced by Macchi (Macchi 1975 Adv. Appl. Probab. 7, 83–122) as a model for repulsive (fermionic) particle distributions, but their recent popularization is largely due to their usefulness for encouraging diversity in the final stage of a recommender system (Kulesza & Taskar 2012 Found. Trends Mach. Learn. 5, 123–286). The standard sampling scheme for finite DPPs is a spectral decomposition followed by an equivalent of a randomly diagonally pivoted Cholesky factorization of an orthogonal projection, which is only applicable to Hermitian kernels and has an expensive set-up cost. Launay et al. 2018 (http://arxiv.org/abs/1802.08429) and Chen & Zhang 2018 NeurIPS (https://papers.nips.cc/paper/7805-fast-greedy-map-inference-for-determinantal-point-process-to-improve-recommendation-diversity.pdf) have begun to connect DPP sampling to LDL^H factorizations as a means of avoiding the initial spectral decomposition, but existing approaches have only outperformed the spectral decomposition approach in special circumstances, where the number of kept modes is a small percentage of the ground set size. This article proves that trivial modifications of LU and LDL^H factorizations yield efficient direct sampling schemes for non-Hermitian and Hermitian DPP kernels, respectively. Furthermore, it is experimentally shown that even dynamically scheduled, shared-memory parallelizations of high-performance dense and sparse-direct factorizations can be trivially modified to yield DPP sampling schemes with essentially identical performance. The software developed as part of this research, Catamari (hodgestar.com/catamari), is released under the Mozilla Public License v2.0. It contains header-only, C++14 plus OpenMP 4.0 implementations of dense and sparse-direct, Hermitian and non-Hermitian DPP samplers. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
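The flavour of such a factorization-based sampler can be conveyed in a few lines of dense Python. This is a minimal teaching sketch of a Bernoulli-randomized, LDL^H-style elimination applied to a Hermitian marginal kernel, not Catamari's optimized C++/OpenMP implementation.

```python
import numpy as np

def sample_dpp_from_marginal_kernel(K, rng=None):
    """Bernoulli-randomized LDL^H-style elimination sampler for a Hermitian
    DPP marginal kernel K (eigenvalues in [0, 1]).  Dense teaching sketch."""
    rng = np.random.default_rng() if rng is None else rng
    A = np.array(K, dtype=complex)
    n = A.shape[0]
    keep = []
    for j in range(n):
        p = A[j, j].real                      # inclusion probability of item j,
        if rng.random() < p:                  # conditioned on earlier decisions
            keep.append(j)
        else:
            A[j, j] -= 1.0                    # exclusion shifts the pivot to p - 1
        # Schur-complement update: condition the remaining items on this decision.
        A[j + 1:, j] /= A[j, j]
        A[j + 1:, j + 1:] -= np.outer(A[j + 1:, j], A[j, j + 1:])
    return keep

if __name__ == "__main__":
    # Projection test kernel K = V V^H built from orthonormal columns of V.
    rng = np.random.default_rng(1)
    V, _ = np.linalg.qr(rng.normal(size=(8, 3)))
    print(sample_dpp_from_marginal_kernel(V @ V.conj().T, rng))
```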


2015 ◽  
Vol 52 (4) ◽  
pp. 1003-1012 ◽  
Author(s):  
Laurent Decreusefond ◽  
Ian Flint ◽  
Anais Vergne

The Ginibre point process (GPP) is one of the main examples of determinantal point processes on the complex plane. It is a recurring distribution in random matrix theory as well as a useful model in applied mathematics. In this paper we briefly overview the usual methods for the simulation of the GPP. Then we introduce a modified version of the GPP which constitutes a determinantal point process more suited for certain applications, and we detail its simulation. This modified GPP has the property of having a fixed number of points and having its support on a compact subset of the plane. See Decreusefond et al. (2013) for an extended version of this paper.
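One of the usual simulation routes alluded to above goes through random matrices: the eigenvalues of an $N \times N$ matrix with i.i.d. standard complex Gaussian entries form the finite Ginibre ensemble, which approximates the GPP on bounded regions as $N$ grows. A minimal sketch (not the modified, fixed-number-of-points process introduced in the paper):

```python
import numpy as np

def ginibre_eigenvalues(n, rng=None):
    """Eigenvalues of an n x n matrix with i.i.d. standard complex Gaussian
    entries.  For large n these approximate the infinite Ginibre point
    process on bounded regions of the plane."""
    rng = np.random.default_rng() if rng is None else rng
    G = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2.0)
    return np.linalg.eigvals(G)

if __name__ == "__main__":
    pts = ginibre_eigenvalues(200)
    print(pts[:5])  # a few sample points in the complex plane
```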


2019 ◽  
Vol 9 (3) ◽  
pp. 386 ◽  
Author(s):  
Xu-Wang Han ◽  
Hai-Tao Zheng ◽  
Jin-Yuan Chen ◽  
Cong-Zhi Zhao

Recently, neural sequence-to-sequence models have made impressive progress in abstractive document summarization. Unfortunately, as neural abstractive summarization research is still at an early stage, the performance of these models remains far from ideal. In this paper, we propose a novel method called Neural Abstractive Summarization with Diverse Decoding (NASDD). This method augments the standard attentional sequence-to-sequence model in two ways. First, we introduce a diversity-promoting beam search approach in the decoding process, which alleviates the serious diversity issue caused by standard beam search and hence increases the chance of generating more informative summary sequences. Second, we use the attention mechanism, combined with the key information of the input document, as an estimate of salient-information coverage, which aids in finding the optimal summary sequence. We carry out an experimental evaluation against state-of-the-art methods on the CNN/Daily Mail summarization dataset, and the results demonstrate the superiority of the proposed method.
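To give a feel for diversity-promoting decoding in general, the sketch below implements one beam-search step with a sibling-rank penalty, in the spirit of diverse decoding strategies. It is a hedged illustration, not the authors' NASDD procedure, and the `expand` callback standing in for the model's next-token log-probabilities is hypothetical.

```python
import heapq
import math
from typing import Callable, List, Tuple

def diverse_beam_step(
    beams: List[Tuple[float, List[int]]],                     # (score, token sequence)
    expand: Callable[[List[int]], List[Tuple[int, float]]],   # next tokens with log-probs
    beam_size: int,
    gamma: float = 0.5,                                       # sibling-rank penalty strength
) -> List[Tuple[float, List[int]]]:
    """One step of diversity-penalized beam search: expansions of the same
    parent hypothesis are ranked and penalized by gamma * rank, which keeps
    the beam from filling up with near-identical continuations."""
    candidates = []
    for score, seq in beams:
        # Rank this parent's expansions by log-probability (rank 0 = best sibling).
        siblings = sorted(expand(seq), key=lambda t: -t[1])
        for rank, (tok, logp) in enumerate(siblings[:beam_size]):
            candidates.append((score + logp - gamma * rank, seq + [tok]))
    # Keep the globally best beam_size penalized candidates.
    return heapq.nlargest(beam_size, candidates, key=lambda c: c[0])

if __name__ == "__main__":
    # Toy expansion function over a 4-token vocabulary.
    def expand(seq):
        return [(tok, -math.log(tok + 2)) for tok in range(4)]
    beams = [(0.0, [0]), (-0.1, [1])]
    print(diverse_beam_step(beams, expand, beam_size=3))
```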


2018 ◽  
Vol 55 (3) ◽  
pp. 760-788
Author(s):  
François Baccelli ◽  
Eliza O'Reilly

Goldman (2010) proved that the distribution of a stationary determinantal point process (DPP) $\Phi$ can be coupled with its reduced Palm version $\Phi^{0,!}$ such that there exists a point process $\eta$ where $\Phi = \Phi^{0,!} \cup \eta$ in distribution and $\Phi^{0,!} \cap \eta = \emptyset$. The points of $\eta$ characterize the repulsive nature of a typical point of $\Phi$. In this paper we use the first-moment measure of $\eta$ to study the repulsive behavior of DPPs in high dimensions. We show that many families of DPPs have the property that the total number of points in $\eta$ converges in probability to 0 as the space dimension $n \to \infty$. We also prove that for some DPPs, there exists an $R^*$ such that the decay of the first-moment measure of $\eta$ is slowest in a small annulus around the sphere of radius $\sqrt{n}R^*$. This $R^*$ can be interpreted as the asymptotic reach of repulsion of the DPP. Examples of classes of DPP models exhibiting this behavior are presented and an application to high-dimensional Boolean models is given.


1993 ◽  
Vol 30 (02) ◽  
pp. 365-372 ◽  
Author(s):  
Søren Asmussen ◽  
Ger Koole

A Markovian arrival stream is a marked point process generated by the state transitions of a given Markovian environmental process and Poisson arrival rates depending on the environment. It is shown that to a given marked point process there is a sequence of such Markovian arrival streams, indexed by $m$, which converges in distribution to the given process as $m \to \infty$. Various related corollaries (involving stationarity, convergence of moments and ergodicity) and counterexamples are discussed as well.
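A concrete special case is the Markov-modulated Poisson process, where arrivals occur at a Poisson rate attached to the current environment state. The sketch below simulates one with illustrative parameters and marks each arrival with the prevailing environment state (an illustrative choice of marking, not the paper's construction).

```python
import numpy as np

def simulate_mmpp(Q, rates, t_end, rng=None):
    """Simulate a Markov-modulated Poisson process: the environment follows
    the CTMC with generator Q, and while in state i arrivals occur at Poisson
    rate rates[i].  Returns (arrival_times, environment state at each arrival)."""
    rng = np.random.default_rng() if rng is None else rng
    Q = np.asarray(Q, dtype=float)
    state, t = 0, 0.0
    arrivals, marks = [], []
    while t < t_end:
        hold = -Q[state, state]                   # rate of leaving the current state
        total = hold + rates[state]               # two competing exponential clocks
        t += rng.exponential(1.0 / total)
        if t >= t_end:
            break
        if rng.random() < rates[state] / total:   # arrival event
            arrivals.append(t)
            marks.append(state)
        else:                                     # environment transition
            probs = np.maximum(Q[state], 0.0)
            probs[state] = 0.0
            state = rng.choice(len(probs), p=probs / probs.sum())
    return np.array(arrivals), np.array(marks)

if __name__ == "__main__":
    Q = [[-1.0, 1.0], [2.0, -2.0]]                # illustrative 2-state environment
    times, marks = simulate_mmpp(Q, rates=[0.5, 5.0], t_end=10.0)
    print(len(times), "arrivals")
```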


2022 ◽  
Vol 22 (3) ◽  
pp. 1-21
Author(s):  
Prayag Tiwari ◽  
Amit Kumar Jaiswal ◽  
Sahil Garg ◽  
Ilsun You

Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. A self-attention model takes only one sentence as input, with no extra information; one can then use the final hidden state or a pooled representation. However, text-matching problems can be interpreted in either symmetrical or asymmetrical scopes. For instance, paraphrase detection is a symmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage attractive properties of the self-attention mechanism and propose an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features, while updating the rest of the components. We evaluate our model on two benchmark datasets covering the tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with far fewer parameters.
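Generic inter-sequence (cross) attention of the scaled dot-product kind, on which components like these are built, can be sketched as follows; this is an illustrative baseline, not the authors' network.

```python
import numpy as np

def cross_attention(A, B):
    """Scaled dot-product attention from sequence A over sequence B:
    each token of A attends to every token of B.  A: (m, d), B: (n, d).
    Returns the (m, d) attended representation of A in terms of B."""
    d = A.shape[1]
    scores = A @ B.T / np.sqrt(d)                    # (m, n) similarity matrix
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # row-wise softmax
    return weights @ B

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    premise, hypothesis = rng.normal(size=(7, 32)), rng.normal(size=(5, 32))
    print(cross_attention(premise, hypothesis).shape)   # (7, 32)
```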


Author(s):  
Florence Merlevède ◽  
Magda Peligrad ◽  
Sergey Utev

In this chapter, we treat several examples of stationary processes which are asymptotically negatively dependent and for which the results of Chapter 9 apply. Many systems in nature are complex, consisting of the contributions of several independent components. Our first examples are functions of two independent sequences, one negatively dependent and one interlaced mixing. For instance, the class of asymptotically negatively dependent random variables is used to treat functions of a determinantal point process and a Gaussian process with a positive continuous spectral density. Another example is point processes based on asymptotically negatively or positively associated sequences and displaced according to a Gaussian sequence with a positive continuous spectral density. Other examples include exchangeable processes, the weighted empirical process, and the exchangeable determinantal point process.


2014 ◽  
Vol 70 (a1) ◽  
pp. C523-C523
Author(s):  
Michael Baake ◽  
Holger Koesters ◽  
Robert Moody

Getting a grasp of what aperiodic order really entails is going to require collecting and understanding many diverse examples. Aperiodic crystals are the tip of a largely unknown iceberg. Here we present a recently studied form of random point process in the (complex) plane which arises as the zero sets of a specific class of analytic functions given by power series with randomly chosen coefficients: Gaussian analytic functions (GAFs). These point sets differ from Poisson processes by having a sort of built-in repulsion between points, though the resulting sets almost surely fail both conditions of the Delone property. Remarkably, the point sets that arise as the zeros of GAFs determine a random point process which is, in distribution, invariant under rotation and translation. In addition, there is a logarithmic potential function for which the zeros are the attractors, and the resulting basins of attraction produce tilings of the plane by tiles which are, almost surely, all of the same area. We discuss GAFs along with their tilings and diffraction, and also note briefly their relationship to determinantal point processes, which are likewise of physical interest.
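Concretely, the planar GAF is $f(z)=\sum_{n\ge 0} a_n z^n/\sqrt{n!}$ with i.i.d. standard complex Gaussian coefficients $a_n$. A crude way to visualize its zero set is to truncate the series and take polynomial roots; the sketch below keeps only roots well inside the truncation radius, where the approximation is reasonable.

```python
import math
import numpy as np

def gaf_zeros(degree=40, rng=None):
    """Approximate zeros of the planar GAF f(z) = sum_n a_n z^n / sqrt(n!)
    by truncating the series at `degree` and taking polynomial roots.
    Only roots well inside the truncation radius are trustworthy."""
    rng = np.random.default_rng() if rng is None else rng
    a = (rng.normal(size=degree + 1) + 1j * rng.normal(size=degree + 1)) / np.sqrt(2.0)
    scale = np.sqrt(np.array([math.factorial(k) for k in range(degree + 1)], dtype=float))
    coeffs = a / scale
    roots = np.roots(coeffs[::-1])            # numpy expects the highest degree first
    return roots[np.abs(roots) < 0.5 * np.sqrt(degree)]   # keep well-resolved zeros

if __name__ == "__main__":
    zeros = gaf_zeros()
    print(len(zeros), "zeros kept near the origin")
```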

