Meerkat Mafia: Multilingual and Cross-Level Semantic Textual Similarity Systems

Author(s):  
Abhay Kashyap ◽  
Lushan Han ◽  
Roberto Yus ◽  
Jennifer Sleeman ◽  
Taneeya Satyapanich ◽  
...  

Author(s):  
Alok Debnath ◽  
Nikhil Pinnaparaju ◽  
Manish Shrivastava ◽  
Vasudeva Varma ◽  
Isabelle Augenstein

Author(s):  
Antonio L. Alfeo ◽  
Mario G. C. A. Cimino ◽  
Gigliola Vaglini

Abstract: In modern manufacturing, each technical assistance operation is digitally tracked. This produces a large amount of textual data that can be exploited as a knowledge base to improve these operations. For instance, an ongoing problem can be addressed by retrieving potential solutions from those used to cope with similar problems in past operations. To be effective, most approaches to semantic textual similarity need the support of a structured semantic context (e.g. an industry-specific ontology), which results in high development and management costs. We overcome this limitation with a textual similarity approach featuring three functional modules. The data preparation module removes punctuation and stop words and lemmatizes the remaining words. The pre-processed sentences are passed to the sentence embedding module, which is based on Sentence-BERT (Bidirectional Encoder Representations from Transformers) and transforms the sentences into fixed-length vectors. The cosine similarity of these vectors is processed by the scoring module to match the expected similarity between the two original sentences. Finally, this similarity measure is used to retrieve the most suitable recorded solutions for the ongoing problem. The effectiveness of the proposed approach is tested (i) against a state-of-the-art competitor and two well-known textual similarity approaches, and (ii) with two case studies, i.e. technical assistance reports from a private company and a benchmark dataset for semantic textual similarity. Compared with the state of the art, the proposed approach achieves comparable retrieval performance at a significantly lower management cost: 30-minute questionnaires are sufficient to obtain the semantic context knowledge to be injected into our textual search engine.
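
The flow described in the abstract (prepare the text, map each sentence to a fixed-length vector, rank past records by cosine similarity) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: a bag-of-words count vector stands in for the Sentence-BERT embedding, lemmatization and the scoring module are omitted, and the stop-word list, the function names (`prepare`, `embed`, `cosine`, `retrieve`) and the sample records are all hypothetical.

```python
import math
import string
from collections import Counter

# Hypothetical, tiny stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "on", "and", "from"}


def prepare(sentence):
    """Lowercase, strip punctuation, drop stop words (lemmatization omitted)."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = sentence.lower().translate(table)
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def embed(tokens, vocabulary):
    """Stand-in embedding: word counts over a fixed vocabulary.

    Assumption: this replaces Sentence-BERT purely to keep the sketch
    dependency-free; it does not capture semantics the way SBERT does.
    """
    counts = Counter(tokens)
    return [float(counts[word]) for word in vocabulary]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query, records, top_k=1):
    """Rank recorded problem descriptions by cosine similarity to the query."""
    prepared = [prepare(rec) for rec in records]
    query_tokens = prepare(query)
    vocabulary = sorted(set(query_tokens).union(*prepared))
    query_vec = embed(query_tokens, vocabulary)
    scored = [
        (cosine(query_vec, embed(tokens, vocabulary)), rec)
        for tokens, rec in zip(prepared, records)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical past assistance records and an ongoing problem.
records = [
    "The pump is leaking oil.",
    "Display shows error code on startup.",
    "Motor overheating after long operation.",
]
best_score, best_record = retrieve("Oil leaking from the pump", records)[0]
print(best_record, round(best_score, 3))
```

In a setup closer to the one described, `embed` would presumably call a pretrained sentence encoder, and the raw cosine value would pass through a scoring step before ranking; both are left out here.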


2014 ◽  
Author(s):  
Islam Beltagy ◽  
Katrin Erk ◽  
Raymond Mooney

2020 ◽  
Author(s):  
Nina Poerner ◽  
Ulli Waltinger ◽  
Hinrich Schütze

Author(s):  
Julien Hay ◽  
Tim Van de Cruys ◽  
Philippe Muller ◽  
Bich-Liên Doan ◽  
Fabrice Popineau ◽  
...  

Author(s):  
Mahtab Ahmed ◽  
Chahna Dixit ◽  
Robert E. Mercer ◽  
Atif Khan ◽  
Muhammad Rifayat Samee ◽  
...  
