universal sentence
Recently Published Documents


TOTAL DOCUMENTS

45
(FIVE YEARS 28)

H-INDEX

6
(FIVE YEARS 2)

2021 ◽  
Author(s):  
Xinxu Shen ◽  
Troy Houser ◽  
David Victor Smith ◽  
Vishnu P. Murty

The use of naturalistic stimuli, such as narrative movies, is gaining popularity in many fields for characterizing memory, affect, and decision-making. Narrative recall paradigms are often used to capture the complexity and richness of memory for naturalistic events. However, scoring narrative recalls is time-consuming and prone to human biases. Here, we show the validity and reliability of using a natural language processing tool, the Universal Sentence Encoder (USE), to automatically score narrative recall. We compared the reliability of scoring between two independent raters (i.e., hand-scored) and between our automated algorithm and individual raters (i.e., automated) on trial-unique video clips of magic tricks. Study 1 showed that our automated segmentation approaches yielded high reliability and reflected measures yielded by hand-scoring, and further that the results using USE outperformed another popular natural language processing tool, GloVe. In Study 2, we tested whether our automated approach remained valid when testing individuals varying on clinically relevant dimensions that influence episodic memory: age and anxiety. We found that our automated approach was equally reliable across both age groups and anxiety groups, which shows the efficacy of our approach for assessing narrative recall in large-scale individual difference analysis. In sum, these findings suggest that machine learning approaches implementing USE are a promising tool for scoring large-scale narrative recalls and performing individual difference analyses in research using naturalistic stimuli.
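The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of one plausible similarity-based scheme: each reference detail of a clip is compared with every recalled sentence, and the best cosine similarity is kept as that detail's score. The function names and the bag-of-words `toy_embed` are hypothetical stand-ins; a real pipeline would substitute dense USE embeddings.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a sentence encoder: a bag-of-words count vector.

    A real pipeline would return a dense USE embedding here instead.
    """
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_recall(recall_sentences, reference_details, embed=toy_embed):
    """For each reference detail, return its best similarity to any recalled sentence."""
    recall_vecs = [embed(s) for s in recall_sentences]
    scores = []
    for detail in reference_details:
        detail_vec = embed(detail)
        best = max((cosine(detail_vec, r) for r in recall_vecs), default=0.0)
        scores.append(best)
    return scores
```

Automated scores of this kind could then be correlated with hand scores to estimate reliability, which is the comparison the abstract describes.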


2021 ◽  
pp. 107540
Author(s):  
Mohammad AL-Smadi ◽  
Mahmoud M. Hammad ◽  
Sa’ad A. Al-Zboon ◽  
Saja AL-Tawalbeh ◽  
Erik Cambria

2021 ◽  
Vol 15 (3) ◽  
pp. 35-47
Author(s):  
Vladimir Kuzmin ◽  
Artem Menisov

Alongside ubiquitous, global digitalization, cybercrime is growing and developing rapidly. The state considers the creation of an environment conducive to information security to be a strategic goal for the development of the information society in Russia. However, the question of how the “state of protection of the individual, society and the state from internal and external information threats” should be achieved in accordance with the “Information Security” and the “Digital Economy of Russia 2024” programs remains open. The aim of this study is to increase the efficiency with which automated control systems identify confidential data in HTML pages, in order to reduce the risk of this data being used in the preparatory and initial stages of attacks on the infrastructure of government organizations. The article describes an approach developed to identify confidential data by combining two neural network technologies: a universal sentence encoder and a bidirectional long short-term memory (BiLSTM) recurrent neural network. An assessment against a modern natural language text processing tool (spaCy) showed the merits and prospects of the practical application of this methodological approach.


BMJ Leader ◽  
2021 ◽  
pp. leader-2021-000512
Author(s):  
Amina Waheed ◽  
Edward Presswood ◽  
Gregory Scott

Background: Organisational values are widely assumed to have positive effects on performance and staff. National Health Service (NHS) trusts in England have accordingly chosen their own organisational values. However, there has been no survey of the values adopted, and there is little evidence that the choice of values per se has consequences for outcomes. We comprehensively described trusts’ organisational values, using natural language processing to identify common themes. We tested whether the choice of themes was associated with outcomes for patients and staff.
Methods: We collected data on trusts’ values (from their websites), performance (Summary Hospital-level Mortality Indicator (SHMI) statistics, Care Quality Commission (CQC) ratings), sickness absence rates (SAR) and staff opinions (NHS Staff Survey responses). We first characterised values based on lexical properties then progressed to semantic analysis, using Google’s Universal Sentence Encoder, to transform values to high-dimensional embeddings, and k-means clustering of embeddings to semantically cluster values into 12 common themes. We tested for associations between trusts’ use of these themes and outcomes.
Results: Organisational values were obtained for 221 of 228 NHS trusts, with 985 values in total (480 unique). Semantic clustering identified themes including ‘care’, ‘value respect’ and ‘togetherness’. There was no significant association between themes and SHMI or CQC ratings. However, themes predicted trusts’ SAR (p=0.001, R²=0.159), with use of ‘care’, ‘value respect’, ‘aspirational’ and ‘people’ all significant predictors of increased sickness absence; themes also predicted staff opinions on ‘Equality, diversity and inclusion’ (p=0.011, R²=0.116), but with ‘supportive’ and ‘openness’ predicting more negative responses.
Conclusion: A trust’s adoption of individualised organisational values does not seem to make a positive difference to its patients or staff. These findings should give NHS managers pause for thought, challenging them to reconsider their reliance on value-defining initiatives, and to seek evidence that a focus on values has measurable benefits on outcomes.
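The clustering step in the Methods can be illustrated with a small sketch of Lloyd's k-means algorithm applied to embedding vectors. This is not the authors' code: the `kmeans` function, its seeding rule (the first k points), and the two-dimensional toy vectors are assumptions for illustration. The study itself clustered USE embeddings into 12 themes.

```python
import math


def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means; the first k points seed the centroids (a toy choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # Assignments stopped changing, so the algorithm has converged.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy 2-D vectors standing in for sentence embeddings of organisational values.
toy_embeddings = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.1, 4.9], [0.1, 0.2]]
toy_labels, toy_centroids = kmeans(toy_embeddings, k=2)
```

In practice a library implementation with several random restarts would be preferable, since k-means results depend on the initial centroids.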


2021 ◽  
Vol 35 (4) ◽  
pp. 301-306
Author(s):  
Godavarthi Deepthi ◽  
A. Mary Sowjanya

In natural language processing, many tasks can be implemented with the features provided by word embeddings. However, word embeddings alone are not sufficient for representing larger chunks of text such as sentences. Sentence embeddings address this issue: complete sentences, along with their semantic information, are represented as vectors so that the machine can more easily capture the context. In this paper, we propose a Question Answering System (QAS) based on sentence embeddings. Our goal is to obtain the text from the provided context for a user query by extracting the sentence in which the correct answer is present. Traditionally, InferSent models have been used on SQuAD for building QAS. More recently, Universal Sentence Encoder variants such as USECNN and USETrans have been developed. In this paper, we use another variant of the Universal Sentence Encoder, the Deep Averaging Network (DAN), to obtain pre-trained sentence embeddings. The results on the SQuAD 2.0 dataset indicate that our approach (USE with DAN) performs well compared to Facebook’s InferSent embedding.
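As a rough illustration of the Deep Averaging Network idea mentioned above, the sketch below averages the input word vectors and passes the average through feedforward layers. The word vectors, weights, and function names are toy assumptions, not the pre-trained USE parameters, and the published encoder also averages bi-gram embeddings, which this sketch omits.

```python
import math


def average_vectors(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(dim) / n for dim in zip(*vectors)]


def dense_tanh(x, weights, bias):
    """One fully connected layer followed by a tanh non-linearity."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def dan_encode(tokens, word_vectors, layers):
    """Average the known word vectors, then apply each dense layer in turn."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        raise ValueError("no known tokens to encode")
    hidden = average_vectors(known)
    for weights, bias in layers:
        hidden = dense_tanh(hidden, weights, bias)
    return hidden


# Toy parameters (hypothetical): 2-D word vectors and one identity-weight layer.
toy_word_vectors = {"who": [1.0, 0.0], "wrote": [0.0, 1.0], "hamlet": [1.0, 1.0]}
toy_layers = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])]
question_vec = dan_encode(["who", "wrote", "hamlet"], toy_word_vectors, toy_layers)
```

Because the inputs are averaged, the encoding ignores word order. A question-answering system of the kind described would compare the encoded question with each encoded context sentence and return the closest one.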


Information ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 298
Author(s):  
Kenta Kanakogi ◽  
Hironori Washizaki ◽  
Yoshiaki Fukazawa ◽  
Shinpei Ogata ◽  
Takao Okubo ◽  
...  

For effective vulnerability management, vulnerability and attack information must be collected quickly and efficiently. A security knowledge repository can collect such information. The Common Vulnerabilities and Exposures (CVE) repository provides known vulnerabilities of products, while the Common Attack Pattern Enumeration and Classification (CAPEC) repository stores attack patterns, which are descriptions of common attributes and approaches employed by adversaries to exploit known weaknesses. Because the information in these two repositories is not linked, identifying related CAPEC attack information from CVE vulnerability information is challenging. Currently, the related CAPEC-ID can be traced from the CVE-ID using Common Weakness Enumeration (CWE) in some but not all cases. Here, we propose a method to automatically trace the related CAPEC-IDs from a CVE-ID using three similarity measures: TF–IDF, Universal Sentence Encoder (USE), and Sentence-BERT (SBERT). We prepared 58 CVE-IDs as test input data and tested whether we could trace the CAPEC-IDs related to each of them. We experimentally confirmed that TF–IDF is the best similarity measure, as it traced 48 of the 58 CVE-IDs to the related CAPEC-ID.
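The TF–IDF variant of this tracing idea can be sketched as follows: build TF–IDF vectors for a CVE description and the candidate CAPEC descriptions, then rank the candidates by cosine similarity. The function names and the smoothed IDF formula are assumptions made for this sketch, and the paper's exact weighting and preprocessing may differ.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) with a smoothed IDF for a list of texts."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in doc_freq.items()}
    return [
        {t: count * idf[t] for t, count in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_patterns(cve_text, capec_texts):
    """Return indices of capec_texts sorted from most to least similar to cve_text."""
    vectors = tfidf_vectors([cve_text] + list(capec_texts))
    query, candidates = vectors[0], vectors[1:]
    sims = [cosine(query, c) for c in candidates]
    return sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
```

For example, calling `rank_patterns` with one vulnerability description and a list of attack pattern descriptions returns candidate indices in descending order of similarity. Swapping the TF–IDF vectors for USE or SBERT embeddings would give the other two measures the abstract compares.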


2021 ◽  
Vol 2 (2) ◽  
pp. 1-16
Author(s):  
Lev Konstantinovskiy ◽  
Oliver Price ◽  
Mevan Babakar ◽  
Arkaitz Zubiaga

In an effort to assist factcheckers in the process of factchecking, we tackle the claim detection task, one of the necessary stages prior to determining the veracity of a claim. It consists of identifying the set of sentences, out of a long text, deemed capable of being factchecked. This article is a collaborative work between Full Fact, an independent factchecking charity, and academic partners. Leveraging the expertise of professional factcheckers, we develop an annotation schema and a benchmark for automated claim detection that is more consistent across time, topics, and annotators than are previous approaches. Our annotation schema has been used to crowdsource the annotation of a dataset with sentences from UK political TV shows. We introduce an approach based on universal sentence representations to perform the classification, achieving an F1 score of 0.83, with over 5% relative improvement over the state-of-the-art methods ClaimBuster and ClaimRank. The system was deployed in production and received positive user feedback.


2021 ◽  
Vol 14 (2) ◽  
pp. 201-214
Author(s):  
Danilo Croce ◽  
Giuseppe Castellucci ◽  
Roberto Basili

In recent years, Deep Learning methods have become very popular in classification tasks for Natural Language Processing (NLP); this is mainly due to their ability to reach high performance by relying on very simple input representations, i.e., raw tokens. One of the drawbacks of deep architectures is the large amount of annotated data required for effective training. In Machine Learning, this problem is usually mitigated by semi-supervised methods or, more recently, by Transfer Learning in the context of deep architectures. One recent promising method to enable semi-supervised learning in deep architectures has been formalized within Semi-Supervised Generative Adversarial Networks (SS-GANs) in the context of Computer Vision. In this paper, we adopt the SS-GAN framework to enable semi-supervised learning in the context of NLP. We demonstrate how an SS-GAN can boost the performance of simple architectures when operating in expressive low-dimensional embeddings; these are derived by combining the unsupervised approximation of linguistic Reproducing Kernel Hilbert Spaces and the so-called Universal Sentence Encoders. We experimentally evaluate the proposed approach over a semantic classification task, i.e., Question Classification, by considering different sizes of training material and different numbers of target classes. By applying such an adversarial schema to a simple Multi-Layer Perceptron, a classifier trained over a subset derived from 1% of the original training material achieves 92% accuracy. Moreover, when considering a complex classification schema, e.g., involving 50 classes, the proposed method outperforms state-of-the-art alternatives such as BERT.

