A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings

Author(s):  
Mikel Artetxe ◽  
Gorka Labaka ◽  
Eneko Agirre
2020 ◽  
Vol 34 (05) ◽  
pp. 7797-7804
Author(s):  
Goran Glavaš ◽  
Swapna Somasundaran

Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval. Starting from an apparent link between text coherence and segmentation, we introduce a novel supervised model for text segmentation with simple but explicit coherence modeling. Our model – a neural architecture consisting of two hierarchically connected Transformer networks – is a multi-task learning model that couples the sentence-level segmentation objective with the coherence objective that differentiates correct sequences of sentences from corrupt ones. The proposed model, dubbed Coherence-Aware Text Segmentation (CATS), yields state-of-the-art segmentation performance on a collection of benchmark datasets. Furthermore, by coupling CATS with cross-lingual word embeddings, we demonstrate its effectiveness in zero-shot language transfer: it can successfully segment texts in languages unseen in training.


2012 ◽  
Vol 457-458 ◽  
pp. 1586-1594
Author(s):  
Yi Jing Liu ◽  
Li Ya Chai ◽  
Jing Min Liu ◽  
Bo Wen Li

Author(s):  
Chen Zhang ◽  
Ziying Liu ◽  
Changli Zhang ◽  
Xudong Li ◽  
Qiuna Wang
