universal sentence Latest Research Papers

Machine-learning as a validated tool to characterize individual differences in free recall of naturalistic events.

10.31234/osf.io/uygzv ◽

2021 ◽

Author(s):

Xinxu Shen ◽

Troy Houser ◽

David Victor Smith ◽

Vishnu P. Murty

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Individual Difference ◽

Language Processing ◽

Large Scale ◽

High Reliability ◽

Difference Analysis ◽

Universal Sentence ◽

Natural Language Processing Tool

The use of naturalistic stimuli, such as narrative movies, is gaining popularity in many fields that characterize memory, affect, and decision-making. Narrative recall paradigms are often used to capture the complexity and richness of memory for naturalistic events. However, scoring narrative recalls is time-consuming and prone to human biases. Here, we show the validity and reliability of using a natural language processing tool, the Universal Sentence Encoder (USE), to automatically score narrative recall. We compared scoring reliability between two independent raters (i.e., hand-scored) and between our automated algorithm and individual raters (i.e., automated) on trial-unique video clips of magic tricks. Study 1 showed that our automated segmentation approaches yielded high reliability and reflected measures yielded by hand-scoring, and further that the results using USE outperformed another popular natural language processing tool, GloVe. In Study 2, we tested whether our automated approach remained valid when testing individuals varying on clinically relevant dimensions that influence episodic memory: age and anxiety. We found that our automated approach was equally reliable across both age groups and anxiety groups, which shows the efficacy of our approach for assessing narrative recall in large-scale individual difference analysis. In sum, these findings suggest that machine learning approaches implementing USE are a promising tool for scoring large-scale narrative recalls and performing individual difference analyses in research using naturalistic stimuli.
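The rater-versus-rater and algorithm-versus-rater comparison described above can be illustrated with a simple agreement measure. The sketch below uses Pearson correlation purely as an assumption: the abstract does not say which reliability statistic the authors computed, and all scores shown are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-trial recall scores (not data from the study).
rater_a = [3, 5, 2, 4, 1]
rater_b = [3, 4, 2, 5, 1]
automated = [2.8, 4.6, 2.1, 4.2, 1.3]

human_human = pearson(rater_a, rater_b)    # hand-scored agreement
auto_human = pearson(automated, rater_a)   # automated vs. one rater
```

Comparing `auto_human` against `human_human` gives a rough sense of whether the automated scores track a human rater about as well as a second human does.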

Gated Recurrent Unit with Multilingual Universal Sentence Encoder for Arabic Aspect-Based Sentiment Analysis

Knowledge-Based Systems ◽

10.1016/j.knosys.2021.107540 ◽

2021 ◽

pp. 107540

Author(s):

Mohammad AL-Smadi ◽

Mahmoud M. Hammad ◽

Sa’ad A. Al-Zboon ◽

Saja AL-Tawalbeh ◽

Erik Cambria

Keyword(s):

Sentiment Analysis ◽

Universal Sentence ◽

Gated Recurrent Unit

Organisational values of National Health Service trusts in England: semantic analysis and relation to performance indicators

BMJ Leader ◽

10.1136/leader-2021-000512 ◽

2021 ◽

pp. leader-2021-000512

Author(s):

Amina Waheed ◽

Edward Presswood ◽

Gregory Scott

Keyword(s):

National Health Service ◽

Health Service ◽

Sickness Absence ◽

Language Processing ◽

National Health ◽

Semantic Analysis ◽

Care Quality ◽

Positive Effects ◽

Universal Sentence ◽

Organisational Values

Background: Organisational values are widely assumed to have positive effects on performance and staff. National Health Service (NHS) trusts in England have accordingly chosen their own organisational values. However, there has been no survey of the values adopted, and there is little evidence that the choice of values per se has consequences for outcomes. We comprehensively described trusts’ organisational values, using natural language processing to identify common themes. We tested whether the choice of themes was associated with outcomes for patients and staff.

Methods: We collected data on trusts’ values (from their websites), performance (Summary Hospital-level Mortality Indicator (SHMI) statistics, Care Quality Commission (CQC) ratings), sickness absence rates (SAR) and staff opinions (NHS Staff Survey responses). We first characterised values based on lexical properties, then progressed to semantic analysis, using Google’s Universal Sentence Encoder to transform values to high-dimensional embeddings and k-means clustering of the embeddings to semantically cluster values into 12 common themes. We tested for associations between trusts’ use of these themes and outcomes.

Results: Organisational values were obtained for 221 of 228 NHS trusts, with 985 values in total (480 unique). Semantic clustering identified themes including ‘care’, ‘value respect’ and ‘togetherness’. There was no significant association between themes and SHMI or CQC ratings. However, themes predicted trusts’ SAR (p=0.001, R²=0.159), with use of ‘care’, ‘value respect’, ‘aspirational’ and ‘people’ all significant predictors of increased sickness absence; themes also predicted staff opinions on ‘Equality, diversity and inclusion’ (p=0.011, R²=0.116), but with ‘supportive’ and ‘openness’ predicting more negative responses.

Conclusion: A trust’s adoption of individualised organisational values does not seem to make a positive difference to its patients or staff. These findings should give NHS managers pause for thought, challenging them to reconsider their reliance on value-defining initiatives, and to seek evidence that a focus on values has measurable benefits on outcomes.
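The embed-then-cluster step described in the methods could be sketched roughly as follows. This is a minimal pure-Python k-means over toy 2-D vectors standing in for sentence embeddings; the value strings, the vectors, and k = 2 are illustrative assumptions (the study reports 12 themes), and the abstract does not describe the authors' actual implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(vectors, k, iterations=50, seed=0):
    """Lloyd's algorithm: returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels, centroids


# Hypothetical values and toy "embeddings" (real sentence embeddings
# would have hundreds of dimensions).
values = ["caring", "compassion", "respect", "dignity"]
toy_embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

labels, centroids = kmeans(toy_embeddings, k=2)
themes = {}
for value, label in zip(values, labels):
    themes.setdefault(label, []).append(value)
```

Each entry in `themes` then groups values whose vectors lie close together, which is the sense in which clustering yields candidate themes that a human can name afterwards.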

An approach to identifying threats of extracting confidential data from automated control systems based on internet technologies

Business Informatics ◽

10.17323/2587-814x.2021.3.35.47 ◽

2021 ◽

Vol 15 (3) ◽

pp. 35-47

Author(s):

Vladimir Kuzmin ◽

Artem Menisov

Keyword(s):

Neural Network ◽

Information Security ◽

Control Systems ◽

Short Term Memory ◽

Methodological Approach ◽

The State ◽

External Information ◽

Automated Control ◽

Universal Sentence ◽

Confidential Data

Together with ubiquitous, global digitalization, cybercrime is growing and developing rapidly. The state considers the creation of an environment conducive to information security to be a strategic goal for the development of the information society in Russia. However, the question of how the “state of protection of the individual, society and the state from internal and external information threats” should be achieved in accordance with the “Information Security” and the “Digital Economy of Russia 2024” programs remains open. The aim of this study is to increase the efficiency with which automated control systems identify confidential data in HTML pages, in order to reduce the risk of this data being used in the preparatory and initial stages of attacks on the infrastructure of government organizations. The article describes an approach developed to identify confidential data by combining two neural network technologies: a universal sentence encoder and a bidirectional long short-term memory (BiLSTM) recurrent neural network. An assessment against a modern natural language text processing tool (spaCy) showed the merits and prospects of the practical application of the methodological approach.

Query-Based Retrieval Using Universal Sentence Encoder

Revue d intelligence artificielle ◽

10.18280/ria.350404 ◽

2021 ◽

Vol 35 (4) ◽

pp. 301-306

Author(s):

Godavarthi Deepthi ◽

A. Mary Sowjanya

Keyword(s):

Natural Language Processing ◽

Language Processing ◽

Correct Answer ◽

Question Answering ◽

Semantic Information ◽

Word Embeddings ◽

Question Answering System ◽

Universal Sentence ◽

User Query ◽

Averaging Network

In natural language processing, various tasks can be implemented with the features provided by word embeddings. For larger chunks of text such as sentences, however, word embeddings alone are not sufficient, and sentence embeddings can be used instead. In sentence embeddings, complete sentences along with their semantic information are represented as vectors so that the machine can more easily capture the context. In this paper, we propose a Question Answering System (QAS) based on sentence embeddings. Our goal is to answer a user query from the provided context by extracting the sentence in which the correct answer is present. Traditionally, InferSent models have been used on SQuAD for building QAS. More recently, Universal Sentence Encoder variants such as USECNN and USETrans have been developed. In this paper, we use another variant of the Universal Sentence Encoder, the Deep Averaging Network (DAN), to obtain pre-trained sentence embeddings. The results on the SQuAD 2.0 dataset indicate that our approach (USE with DAN) performs well compared to Facebook’s InferSent embeddings.
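The idea of a deep averaging encoder followed by sentence selection can be sketched as below. A deep averaging network averages the word vectors of a sentence and passes the average through feed-forward layers; here the word vectors, the single identity-weight layer, and the example sentences are all hypothetical stand-ins, not the pre-trained USE model or the authors' pipeline.

```python
import math
import re

# Hypothetical 3-d word vectors, invented for illustration only.
WORD_VECTORS = {
    "paris": [1.0, 0.0, 0.0],
    "capital": [0.8, 0.2, 0.0],
    "france": [0.9, 0.1, 0.0],
    "river": [0.0, 1.0, 0.0],
    "seine": [0.1, 0.9, 0.0],
    "flows": [0.0, 0.8, 0.2],
}

# Hypothetical untrained feed-forward layer (identity weights, zero bias).
WEIGHTS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
BIAS = [0.0, 0.0, 0.0]


def dan_encode(sentence):
    """Average known word vectors, then apply one tanh layer."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    vectors = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vectors:
        return [0.0] * len(BIAS)  # no known words: zero vector
    averaged = [sum(column) / len(vectors) for column in zip(*vectors)]
    return [
        math.tanh(sum(w * x for w, x in zip(row, averaged)) + b)
        for row, b in zip(WEIGHTS, BIAS)
    ]


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_sentence(query, sentences):
    """Return the context sentence most similar to the query."""
    query_vec = dan_encode(query)
    return max(sentences, key=lambda s: cosine(query_vec, dan_encode(s)))


context = [
    "Paris is the capital of France.",
    "The Seine river flows through the city.",
]
answer_sentence = best_sentence("What is the capital of France?", context)
```

In a real system the toy `dan_encode` would be replaced by a pre-trained encoder, while the selection step (highest cosine similarity to the query) would stay the same.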

Tracing CVE Vulnerability Information to CAPEC Attack Patterns Using Natural Language Processing Techniques

Information ◽

10.3390/info12080298 ◽

2021 ◽

Vol 12 (8) ◽

pp. 298

Author(s):

Kenta Kanakogi ◽

Hironori Washizaki ◽

Yoshiaki Fukazawa ◽

Shinpei Ogata ◽

Takao Okubo ◽

...

Keyword(s):

Language Processing ◽

Similarity Measures ◽

Knowledge Repository ◽

Test Input ◽

Attack Pattern ◽

Universal Sentence ◽

Vulnerability Management ◽

Attack Patterns ◽

The Common ◽

Processing Techniques

For effective vulnerability management, vulnerability and attack information must be collected quickly and efficiently. A security knowledge repository can collect such information. The Common Vulnerabilities and Exposures (CVE) list provides known vulnerabilities of products, while the Common Attack Pattern Enumeration and Classification (CAPEC) stores attack patterns, which are descriptions of common attributes and approaches employed by adversaries to exploit known weaknesses. Because the information in these two repositories is not linked, identifying related CAPEC attack information from CVE vulnerability information is challenging. Currently, the related CAPEC-ID can be traced from the CVE-ID using Common Weakness Enumeration (CWE) in some but not all cases. Here, we propose a method to automatically trace the related CAPEC-IDs from a CVE-ID using three similarity measures: TF–IDF, Universal Sentence Encoder (USE), and Sentence-BERT (SBERT). We prepared 58 CVE-IDs as test input data and tested whether we could trace the CAPEC-IDs related to each of them. We experimentally confirmed that TF–IDF is the best similarity measure, as it traced 48 of the 58 CVE-IDs to the related CAPEC-ID.
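A minimal sketch of TF–IDF-based tracing, assuming the approach is to rank attack-pattern descriptions by cosine similarity to a vulnerability description. The pattern identifiers and all texts below are invented placeholders rather than real CVE or CAPEC entries, and the smoothed IDF weighting is one common choice, not necessarily the one the authors used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse {term: weight} vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF: terms found in every document keep a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}
    return [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_patterns(vulnerability_text, patterns):
    """Rank pattern ids (dict id -> description) by similarity, best first."""
    ids = list(patterns)
    vectors = tfidf_vectors([vulnerability_text] + [patterns[i] for i in ids])
    query, candidates = vectors[0], vectors[1:]
    scored = [(cosine(query, vec), i) for i, vec in zip(ids, candidates)]
    return [i for _, i in sorted(scored, key=lambda p: p[0], reverse=True)]


# Hypothetical placeholder entries (not real CAPEC or CVE records).
patterns = {
    "PATTERN-A": "attacker injects sql commands through user input fields",
    "PATTERN-B": "attacker floods the server with network requests",
}
vulnerability = ("sql injection in login form allows crafted user input "
                 "to run sql commands")
ranking = rank_patterns(vulnerability, patterns)
```

Swapping `tfidf_vectors` for a sentence encoder such as USE or SBERT would change only how the vectors are produced; the ranking by cosine similarity would be unchanged.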

Toward Automated Factchecking

Digital Threats: Research and Practice ◽

10.1145/3412869 ◽

2021 ◽

Vol 2 (2) ◽

pp. 1-16

Author(s):

Lev Konstantinovskiy ◽

Oliver Price ◽

Mevan Babakar ◽

Arkaitz Zubiaga

Keyword(s):

State Of The Art ◽

Collaborative Work ◽

User Feedback ◽

Detection Task ◽

The State ◽

Relative Improvement ◽

Universal Sentence ◽

Art Methods ◽

Tv Shows ◽

Full Fact

In an effort to assist factcheckers in the process of factchecking, we tackle the claim detection task, one of the necessary stages prior to determining the veracity of a claim. It consists of identifying the set of sentences, out of a long text, deemed capable of being factchecked. This article is a collaborative work between Full Fact, an independent factchecking charity, and academic partners. Leveraging the expertise of professional factcheckers, we develop an annotation schema and a benchmark for automated claim detection that is more consistent across time, topics, and annotators than are previous approaches. Our annotation schema has been used to crowdsource the annotation of a dataset with sentences from UK political TV shows. We introduce an approach based on universal sentence representations to perform the classification, achieving an F1 score of 0.83, with over 5% relative improvement over the state-of-the-art methods ClaimBuster and ClaimRank. The system was deployed in production and received positive user feedback.

Leveraging Universal Sentence Encoder to Predict Movie Genre

2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS) ◽

10.1109/icaccs51430.2021.9441685 ◽

2021 ◽

Author(s):

Nikhil Kumar ◽

Sanjay Kumar ◽

Aditya Dev ◽

Siraz Naorem

Keyword(s):

Universal Sentence

Adversarial training for few-shot text classification

Intelligenza Artificiale ◽

10.3233/ia-200051 ◽

2021 ◽

Vol 14 (2) ◽

pp. 201-214

Author(s):

Danilo Croce ◽

Giuseppe Castellucci ◽

Roberto Basili

Keyword(s):

Supervised Learning ◽

Language Processing ◽

Reproducing Kernel ◽

Generative Adversarial Networks ◽

Training Material ◽

Semantic Classification ◽

Universal Sentence ◽

Kernel Hilbert Spaces ◽

Supervised Methods ◽

Low Dimensional

In recent years, Deep Learning methods have become very popular in classification tasks for Natural Language Processing (NLP); this is mainly due to their ability to reach high performance by relying on very simple input representations, i.e., raw tokens. One of the drawbacks of deep architectures is the large amount of annotated data required for effective training. Usually, in Machine Learning this problem is mitigated by the use of semi-supervised methods or, more recently, by Transfer Learning in the context of deep architectures. One recent promising method to enable semi-supervised learning in deep architectures has been formalized within Semi-Supervised Generative Adversarial Networks (SS-GANs) in the context of Computer Vision. In this paper, we adopt the SS-GAN framework to enable semi-supervised learning in the context of NLP. We demonstrate how an SS-GAN can boost the performance of simple architectures when operating on expressive low-dimensional embeddings; these are derived by combining the unsupervised approximation of linguistic Reproducing Kernel Hilbert Spaces and the so-called Universal Sentence Encoders. We experimentally evaluate the proposed approach on a semantic classification task, i.e., Question Classification, by considering different sizes of training material and different numbers of target classes. By applying such an adversarial schema to a simple Multi-Layer Perceptron, a classifier trained on a subset derived from 1% of the original training material achieves 92% accuracy. Moreover, when considering a complex classification schema, e.g., involving 50 classes, the proposed method outperforms state-of-the-art alternatives such as BERT.

Discrete Cosine Transform as Universal Sentence Encoder

10.18653/v1/2021.acl-short.53 ◽

2021 ◽

Author(s):

Nada Almarwani ◽

Mona Diab

Keyword(s):

Discrete Cosine Transform ◽

Cosine Transform ◽

Universal Sentence
