Tracing CVE Vulnerability Information to CAPEC Attack Patterns Using Natural Language Processing Techniques

Effective vulnerability management requires that vulnerability and attack information be collected quickly and efficiently. Security knowledge repositories are one source of such information. Common Vulnerabilities and Exposures (CVE) lists known vulnerabilities in products, while the Common Attack Pattern Enumeration and Classification (CAPEC) stores attack patterns, which describe the common attributes and approaches adversaries employ to exploit known weaknesses. Because the information in these two repositories is not linked, identifying CAPEC attack information related to a given CVE vulnerability is challenging. Currently, the related CAPEC-ID can be traced from a CVE-ID via the Common Weakness Enumeration (CWE) in some but not all cases. Here, we propose a method to automatically trace the related CAPEC-IDs from a CVE-ID using three similarity measures: TF–IDF, Universal Sentence Encoder (USE), and Sentence-BERT (SBERT). We prepared 58 CVE-IDs as test input and tested whether we could trace the CAPEC-IDs related to each of them. In our experiments, TF–IDF was the best of the three similarity measures, tracing 48 of the 58 CVE-IDs to the related CAPEC-ID.
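The TF–IDF matching idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the smoothed IDF formula, the function names, and the placeholder pattern IDs and descriptions below are all assumptions made for illustration. It ranks candidate attack-pattern descriptions by cosine similarity to a vulnerability description.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumption; other variants exist).
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_attack_patterns(cve_description, pattern_descriptions):
    """Rank pattern IDs by TF-IDF cosine similarity to the CVE text."""
    ids = list(pattern_descriptions)
    vectors = tfidf_vectors(
        [cve_description] + [pattern_descriptions[i] for i in ids]
    )
    query = vectors[0]
    scored = [(pid, cosine(query, vec)) for pid, vec in zip(ids, vectors[1:])]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical texts and placeholder IDs, not real CVE or CAPEC entries.
cve_text = ("SQL injection in the login form allows remote attackers "
            "to execute arbitrary SQL commands")
patterns = {
    "PATTERN-A": "An attacker crafts input that injects SQL commands "
                 "into a database query",
    "PATTERN-B": "An attacker floods a network service with traffic "
                 "to exhaust resources",
}
ranking = rank_attack_patterns(cve_text, patterns)
```

In this toy setup the first entry of `ranking` is the pattern whose wording overlaps most with the vulnerability text. A real pipeline would presumably add stop-word removal and draw its descriptions from the repositories themselves.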

Conceptual Graphs Based Approach for Subjective Answers Evaluation

Natural Language Processing ◽

10.4018/978-1-7998-0951-7.ch037 ◽

2020 ◽

pp. 770-790

Author(s):

Goonjan Jain ◽

D.K. Lobiyal

Keyword(s):

Natural Language Processing ◽

Language Processing ◽

Evaluation System ◽

Pearson Correlation ◽

Similarity Measures ◽

Conceptual Graphs ◽

Pearson Correlation Coefficient ◽

Evaluation Systems ◽

Automated Evaluation ◽

Processing Techniques

Automated evaluation systems for objective-type tests already exist, but building an automated evaluation system for subjective-type tests remains challenging. This paper therefore focuses on evaluating simple text-based subjective answers using natural language processing techniques. A student's answer is evaluated by comparing it with a model answer to the question. Because of variability in writing, a student's answer cannot be expected to match the model answer exactly. The authors therefore create conceptual graphs for both the student answer and the model answer and compute the similarity between these graphs using graph similarity measures. Marks are assigned to an answer based on this similarity. Finally, the authors compare the marks given by human graders with those of the proposed system using the Pearson correlation coefficient, and compare the proposed system's results with those of other existing evaluation systems. The experimental evaluation of the proposed system shows promising results.
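The comparison between human and system marks mentioned above relies on the Pearson correlation coefficient. A minimal sketch of that computation follows; the function name and the mark lists are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical marks for five answers (illustrative values only).
human_marks = [8.0, 6.5, 9.0, 4.0, 7.0]
system_marks = [7.5, 6.0, 9.5, 4.5, 6.5]
r = pearson(human_marks, system_marks)
```

A value of `r` close to 1 would indicate that the automated marks rise and fall together with the human marks. It does not by itself show that the two sets of marks agree in absolute terms.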

Conceptual Graphs Based Approach for Subjective Answers Evaluation

International Journal of Conceptual Structures and Smart Applications ◽

10.4018/ijcssa.2017070101 ◽

2017 ◽

Vol 5 (2) ◽

pp. 1-21

Author(s):

Goonjan Jain ◽

D.K. Lobiyal

Keyword(s):

Language Processing ◽

Evaluation System ◽

Pearson Correlation ◽

Similarity Measures ◽

Conceptual Graphs ◽

Pearson Correlation Coefficient ◽

Evaluation Systems ◽

Automated Evaluation ◽

Processing Techniques ◽

Objective Type

Adversarial training for few-shot text classification

Intelligenza Artificiale ◽

10.3233/ia-200051 ◽

2021 ◽

Vol 14 (2) ◽

pp. 201-214

Author(s):

Danilo Croce ◽

Giuseppe Castellucci ◽

Roberto Basili

Keyword(s):

Supervised Learning ◽

Language Processing ◽

Reproducing Kernel ◽

Generative Adversarial Networks ◽

Training Material ◽

Semantic Classification ◽

Universal Sentence ◽

Kernel Hilbert Spaces ◽

Supervised Methods ◽

Low Dimensional

In recent years, Deep Learning methods have become very popular in classification tasks for Natural Language Processing (NLP), mainly because they achieve high performance while relying on very simple input representations, i.e., raw tokens. One drawback of deep architectures is the large amount of annotated data required for effective training. In Machine Learning, this problem is usually mitigated with semi-supervised methods or, more recently in the context of deep architectures, with Transfer Learning. One recent promising method for enabling semi-supervised learning in deep architectures has been formalized within Semi-Supervised Generative Adversarial Networks (SS-GANs) in the context of Computer Vision. In this paper, we adopt the SS-GAN framework to enable semi-supervised learning in the context of NLP. We demonstrate how an SS-GAN can boost the performance of simple architectures when operating on expressive low-dimensional embeddings; these are derived by combining the unsupervised approximation of linguistic Reproducing Kernel Hilbert Spaces and the so-called Universal Sentence Encoders. We experimentally evaluate the proposed approach on a semantic classification task, i.e., Question Classification, considering different sizes of training material and different numbers of target classes. When this adversarial scheme is applied to a simple Multi-Layer Perceptron, a classifier trained on a subset derived from 1% of the original training material achieves 92% accuracy. Moreover, with a complex classification scheme, e.g., one involving 50 classes, the proposed method outperforms state-of-the-art alternatives such as BERT.

Twitter Sentiment Analysis towards COVID-19 Vaccines in the Philippines Using Naïve Bayes

Information ◽

10.3390/info12050204 ◽

2021 ◽

Vol 12 (5) ◽

pp. 204

Author(s):

Charlyn Villavicencio ◽

Julio Jerison Macrohon ◽

X. Alphonse Inbaraj ◽

Jyh-Horng Jeng ◽

Jer-Guang Hsieh

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Data Science ◽

Naive Bayes ◽

The Philippines ◽

Naïve Bayes ◽

Social Networking Site ◽

Bayes Model ◽

The Government ◽

Processing Techniques

A year into the COVID-19 pandemic and one of the longest recorded lockdowns in the world, the Philippines received its first delivery of COVID-19 vaccines on 1 March 2021 through WHO’s COVAX initiative. A month into the inoculation of frontline health professionals and other priority groups, the authors of this study gathered data from the social networking site Twitter on the sentiment of Filipinos regarding the Philippine government’s efforts. Natural language processing techniques were applied to understand the general sentiment, which can help the government analyze its response. The tweets were annotated, and a Naïve Bayes model was trained with the RapidMiner data science software to classify English and Filipino language tweets into positive, neutral, and negative polarities. The model achieved 81.77% accuracy, which exceeds the accuracy of recent sentiment analysis studies using Twitter data from the Philippines.
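The study above used RapidMiner, so the following is not the authors' pipeline. It is a minimal sketch of how a multinomial Naïve Bayes text classifier with Laplace smoothing assigns a polarity, written under several assumptions: whitespace tokenization, a smoothing constant of 1, and invented example tweets. The class name `NaiveBayesText` is hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.class_counts = Counter()           # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocabulary]
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(doc_count / total_docs)
            for token in tokens:
                score += math.log(
                    (self.word_counts[label][token] + self.alpha)
                    / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training tweets, not data from the study.
train_texts = [
    "the vaccine rollout is great news",
    "so happy and thankful for the vaccine",
    "the vaccine rollout is slow and disappointing",
    "angry about the long lockdown",
    "vaccines arrived at the airport today",
    "the schedule was announced today",
]
train_labels = ["positive", "positive", "negative",
                "negative", "neutral", "neutral"]
model = NaiveBayesText().fit(train_texts, train_labels)
prediction = model.predict("happy and thankful")
```

Working in log space avoids numerical underflow when many token probabilities are multiplied. The smoothing term keeps a token that was never seen with a class from forcing that class's probability to zero.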

Shall I Work with Them? A Knowledge Graph-Based Approach for Predicting Future Research Collaborations

Entropy ◽

10.3390/e23060664 ◽

2021 ◽

Vol 23 (6) ◽

pp. 664

Author(s):

Nikos Kanakaris ◽

Nikolaos Giarelis ◽

Ilias Siachos ◽

Nikos Karacapilidis

Keyword(s):

Language Processing ◽

Scientific Knowledge ◽

Link Prediction ◽

Performance Metrics ◽

Future Research ◽

Knowledge Graph ◽

Prediction Problem ◽

Textual Information ◽

Research Collaborations ◽

Processing Techniques

We consider the prediction of future research collaborations as a link prediction problem applied to a scientific knowledge graph. To the best of our knowledge, this is the first work on the prediction of future research collaborations that combines structural and textual information of a scientific knowledge graph through a purposeful integration of graph algorithms and natural language processing techniques. Our work: (i) investigates whether the integration of unstructured textual data into a single knowledge graph affects the performance of a link prediction model, (ii) studies the effect of previously proposed graph-kernel-based approaches on the performance of an ML model on the link prediction problem, and (iii) proposes a three-phase pipeline that enables the exploitation of structural and textual information, as well as of pre-trained word embeddings. We benchmark the proposed approach against classical link prediction algorithms using accuracy, recall, and precision as our performance metrics. Finally, we empirically test our approach on the link prediction problem using various feature combinations. Our experiments with the new COVID-19 Open Research Dataset demonstrate a significant improvement in the abovementioned performance metrics in the prediction of future research collaborations.
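The abstract does not name the classical link prediction algorithms used as baselines. As an illustration of what such a baseline can look like, the sketch below scores unconnected author pairs with the Jaccard coefficient of their co-author sets. The choice of Jaccard, the function name, and the toy co-authorship graph are all assumptions.

```python
from itertools import combinations


def jaccard_scores(adjacency):
    """Score each unconnected node pair by neighbor-set overlap."""
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # already collaborators, so nothing to predict
        common = adjacency[u] & adjacency[v]
        union = adjacency[u] | adjacency[v]
        scores[(u, v)] = len(common) / len(union) if union else 0.0
    return scores


# Hypothetical undirected co-authorship graph (author -> set of co-authors).
coauthors = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"carol", "erin"},
    "erin": {"dave"},
}
scores = jaccard_scores(coauthors)
# Candidate future collaborations, highest score first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A purely structural score like this one uses no textual information, which is the kind of signal the proposed pipeline adds on top of graph structure.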

A Natural Language Processing Approach to Measuring Treatment Adherence and Consistency Using Semantic Similarity

AERA Open ◽

10.1177/23328584211028615 ◽

2021 ◽

Vol 7 ◽

pp. 233285842110286

Author(s):

Kylie L. Anglin ◽

Vivian C. Wong ◽

Arielle Boguslav

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Semantic Similarity ◽

Language Processing ◽

Intervention Implementation ◽

Proof Of Concept ◽

Coaching Intervention ◽

Processing Techniques ◽

Teacher Coaching ◽

The Impact

Though there is widespread recognition of the importance of implementation research, evaluators often face intense logistical, budgetary, and methodological challenges in their efforts to assess intervention implementation in the field. This article proposes a set of natural language processing techniques called semantic similarity as an innovative and scalable method of measuring implementation constructs. Semantic similarity methods are an automated approach to quantifying the similarity between texts. By applying semantic similarity to transcripts of intervention sessions, researchers can use the method to determine whether an intervention was delivered with adherence to a structured protocol, and the extent to which an intervention was replicated with consistency across sessions, sites, and studies. This article provides an overview of semantic similarity methods, describes their application within the context of educational evaluations, and provides a proof of concept using an experimental study of the impact of a standardized teacher coaching intervention.

Exploring Library Core Competencies: A Text Mining Study of American Library Association’s Job Advertisements From 2006 Through 2017

Social Science Computer Review ◽

10.1177/08944393211027279 ◽

2021 ◽

pp. 089443932110272

Author(s):

Qinghong Yang ◽

Zehong Shi ◽

Yan Quan Liu

Keyword(s):

Language Processing ◽

Work Experience ◽

Core Competencies ◽

Core Competency ◽

Analytical Investigation ◽

Market Demand ◽

American Library ◽

Job Titles ◽

Job Advertisements ◽

Processing Techniques

Are core competency requirements for relevant positions in the library shifting? Applying natural language processing techniques to understand the current market demand for core competencies, this study explores job advertisements issued by the American Library Association (ALA) from 2006 to 2017. The analysis reveals that job demand continues to rise at a rate of 13% (2006–2017), that work experience requirements have been substantially extended, that job titles have become more diverse, and that rich service experience and continuous lifelong learning skills are increasingly prominent requirements for librarians. This analytical investigation sheds light on emerging demands in the American job market and on the prioritization and reprioritization of current core competency requirements for ALA librarians.

Detecting Malicious Windows Commands Using Natural Language Processing Techniques

Innovative Security Solutions for Information Technology and Communications - Lecture Notes in Computer Science ◽

10.1007/978-3-030-12942-2_13 ◽

2019 ◽

pp. 157-169

Author(s):

Muhammad Mudassar Yamin ◽

Basel Katt

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Processing Techniques

Compansion: From research prototype to practical integration

Natural Language Engineering ◽

10.1017/s1351324998001843 ◽

1998 ◽

Vol 4 (1) ◽

pp. 73-95 ◽

Cited By ~ 12

Author(s):

KATHLEEN F. MCCOY ◽

CHRISTOPHER A. PENNINGTON ◽

ARLENE LUBEROFF BADMAN

Keyword(s):

Language Processing ◽

Augmentative And Alternative Communication ◽

Communication Aids ◽

Alternative Communication ◽

Communicative Ability ◽

Linguistic Ability ◽

Research Prototype ◽

Intelligent Communication ◽

Processing Techniques ◽

Parser Generator

Augmentative and Alternative Communication (AAC) is the field of study concerned with providing devices and techniques to augment the communicative ability of a person whose disability makes it difficult to speak or otherwise communicate in an understandable fashion. For several years, we have been applying natural language processing techniques to the field of AAC to develop intelligent communication aids that attempt to provide linguistically correct output while increasing communication rate. Previous effort has resulted in a research prototype called Compansion that expands telegraphic input. In this paper we describe that research prototype and introduce the Intelligent Parser Generator (IPG). IPG is intended to be a practical embodiment of the research prototype aimed at a group of users who have cognitive impairments that affect their linguistic ability. We describe both the theoretical underpinnings of Compansion and the practical considerations in developing a usable system for this population of users.

Identification of spam comments using natural language processing techniques

2014 IEEE 10th International Conference on Intelligent Computer Communication and Processing (ICCP) ◽

10.1109/iccp.2014.6936976 ◽

2014 ◽

Cited By ~ 6

Author(s):

Cristina Radulescu ◽

Mihaela Dinsoreanu ◽

Rodica Potolea

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Processing Techniques
