CyBERT: Cybersecurity Claim Classification by Fine-Tuning the BERT Language Model

2021 ◽  
Vol 1 (4) ◽  
pp. 615-637
Author(s):  
Kimia Ameri ◽  
Michael Hempel ◽  
Hamid Sharif ◽  
Juan Lopez ◽  
Kalyan Perumalla

We introduce CyBERT, a cybersecurity feature claims classifier based on Bidirectional Encoder Representations from Transformers (BERT) and a key component in our semi-automated cybersecurity vetting of industrial control systems (ICS). To train CyBERT, we created a corpus of labeled sequences from ICS device documentation collected across a wide range of vendors and devices. This corpus provides the foundation for fine-tuning BERT’s language model, including a prediction-guided relabeling process. We propose an approach to obtaining optimal hyperparameters, including the learning rate, the number of dense layers, and their configuration, to increase the accuracy of our classifier. Fine-tuning all hyperparameters of the resulting model increased classification accuracy from 76%, obtained with the original BertForSequenceClassification architecture, to 94.4% with CyBERT. Furthermore, we evaluated CyBERT for the impact of randomness in the initialization, training, and data-sampling phases. CyBERT demonstrated a standard deviation of ±0.6% during validation across 100 random seed values. Finally, we compared the performance of CyBERT to other well-established language models, including GPT-2, ULMFiT, and ELMo, as well as neural network models such as CNN, LSTM, and BiLSTM. The results showed that CyBERT outperforms these models in validation accuracy and F1 score, validating its robustness and accuracy as a cybersecurity feature claims classifier.
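To make the setup concrete, the following is a minimal sketch of a BERT encoder with a configurable stack of dense layers on top, in the spirit of the architecture search described above; the layer sizes, dropout rate, and example input are illustrative assumptions, not the paper’s tuned values.

```python
# Minimal sketch: BERT encoder plus a configurable dense-layer head for claim
# classification. Hidden sizes, dropout, and the example sentence are illustrative
# assumptions, not the values reported in the abstract above.
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class ClaimClassifier(nn.Module):
    def __init__(self, num_dense_layers=2, hidden=256, num_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        layers, in_dim = [], self.bert.config.hidden_size
        for _ in range(num_dense_layers):          # configurable dense stack
            layers += [nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(0.1)]
            in_dim = hidden
        layers.append(nn.Linear(in_dim, num_labels))
        self.head = nn.Sequential(*layers)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(out.pooler_output)         # [CLS]-based pooled representation

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
batch = tokenizer(["Supports TLS 1.2 for all remote sessions."],
                  padding=True, truncation=True, return_tensors="pt")
logits = ClaimClassifier()(batch["input_ids"], batch["attention_mask"])
```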

Computers ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 166
Author(s):  
Bogdan Nicula ◽  
Mihai Dascalu ◽  
Natalie N. Newton ◽  
Ellen Orcutt ◽  
Danielle S. McNamara

Learning to paraphrase supports both writing ability and reading comprehension, particularly for less skilled learners. As such, educational tools that integrate automated evaluations of paraphrases can provide timely feedback to enhance learner paraphrasing skills more efficiently and effectively. Paraphrase identification is a popular NLP classification task that involves establishing whether two sentences share a similar meaning. Paraphrase quality assessment is a slightly more complex task, in which pairs of sentences are evaluated in depth across multiple dimensions. In this study, we focus on four dimensions: lexical, syntactic, semantic, and overall quality. Our study introduces and evaluates several machine learning models for estimating paraphrase quality across these four dimensions: handcrafted features combined with Extra Trees, Siamese neural networks using BiLSTM RNNs, and pretrained BERT-based models, together with transfer learning from a larger general paraphrase corpus. Two datasets are considered for the paraphrase quality tasks: ULPC (User Language Paraphrase Corpus), containing 1,998 paraphrases, and a smaller dataset of 115 paraphrases based on children’s inputs. The paraphrase identification dataset used for the transfer learning task is the MSRP dataset (Microsoft Research Paraphrase Corpus), containing 5,801 paraphrases. On the ULPC dataset, our BERT model improves upon the previous baseline by at least 0.1 in F1 score across the four dimensions. When fine-tuning from ULPC on the children’s dataset, both the BERT and Siamese neural network models improve upon their original scores by at least 0.11 in F1 score. These experiments suggest that transfer learning from generic paraphrase identification datasets can be successful while obtaining comparable results in fewer epochs.
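As an illustration of the BERT-based approach to multi-dimensional paraphrase quality, the sketch below scores a sentence pair across four dimensions; the head design and example pair are assumptions for illustration, not the study’s exact configuration.

```python
# Illustrative sketch of a BERT-based paraphrase-quality scorer over four
# dimensions (lexical, syntactic, semantic, overall). Head size, pooling choice,
# and the example pair are assumptions, not the study's configuration.
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class ParaphraseQuality(nn.Module):
    def __init__(self, encoder="bert-base-uncased", num_dims=4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder)
        self.scorer = nn.Linear(self.encoder.config.hidden_size, num_dims)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        return self.scorer(hidden[:, 0])            # score from the [CLS] token

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
pair = tok("The cat chased the mouse.", "A mouse was chased by the cat.",
           return_tensors="pt")
scores = ParaphraseQuality()(pair["input_ids"], pair["attention_mask"])
```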


2021 ◽  
Author(s):  
Flávio Arthur Oliveira Santos ◽  
Cleber Zanchettin ◽  
Leonardo Nogueira Matos ◽  
Paulo Novais

Robustness is a significant constraint in machine learning models: the performance of an algorithm must not deteriorate when it is trained and tested on slightly different data. Deep neural network models achieve impressive results in a wide range of computer vision applications. Still, in the presence of noise or region occlusion, some models exhibit inaccurate performance even on data seen during training. Moreover, some experiments suggest that deep learning models sometimes rely on the wrong parts of the input to perform inference. Active image augmentation (ADA) is an augmentation method that uses interpretability methods to augment the training data and improve model robustness against these problems. Although ADA presented interesting results, its original version only used vanilla backpropagation interpretability to train the U-Net model. In this work, we propose an extensive experimental analysis of the impact of the interpretability method on ADA. We use five interpretability methods: vanilla backpropagation, guided backpropagation, gradient-weighted class activation mapping (GradCam), guided GradCam, and InputXGradient. The results show that all methods achieve similar performance at the end of training, but when ADA is combined with GradCam, the U-Net model converges impressively fast.
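The sketch below illustrates the general idea of interpretability-guided augmentation using vanilla backpropagation saliency in PyTorch; it is a simplified illustration of the concept, not the ADA implementation evaluated in this work.

```python
# Simplified illustration of interpretability-guided augmentation: compute a
# vanilla-backpropagation saliency map and perturb the most influential pixels
# of a training image. A conceptual sketch, not the ADA implementation above.
import torch

def saliency_map(model, image, target_index):
    """Vanilla backprop: gradient of the target output w.r.t. the input image."""
    image = image.clone().requires_grad_(True)
    output = model(image.unsqueeze(0))
    output.flatten()[target_index].backward()
    return image.grad.abs().max(dim=0).values      # collapse channel dimension

def augment_by_saliency(model, image, target_index, quantile=0.9):
    """Zero out the top-saliency pixels to create an augmented training sample."""
    sal = saliency_map(model, image, target_index)
    threshold = torch.quantile(sal, quantile)
    mask = (sal < threshold).float()                # keep only low-saliency pixels
    return image * mask.unsqueeze(0)
```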


2017 ◽  
Author(s):  
Charlie W. Zhao ◽  
Mark J. Daley ◽  
J. Andrew Pruszynski

First-order tactile neurons have spatially complex receptive fields. Here we use machine learning tools to show that such complexity arises for a wide range of training sets and network architectures, and benefits network performance, especially on more difficult tasks and in the presence of noise. Our work suggests that spatially complex receptive fields are normatively good given the biological constraints of the tactile periphery.


2020 ◽  
Vol 34 (05) ◽  
pp. 9282-9289
Author(s):  
Qingyang Wu ◽  
Lei Li ◽  
Hao Zhou ◽  
Ying Zeng ◽  
Zhou Yu

Many social media news writers are not professionally trained, so social media platforms have to hire professional editors to adjust amateur headlines to attract more readers. We propose to automate this headline editing process through neural network models to provide more immediate writing support for these social media news writers. To train such a neural headline editing model, we collected a dataset that contains articles with original headlines and professionally edited headlines. However, it is expensive to collect a large number of professionally edited headlines. To solve this low-resource problem, we design an encoder-decoder model that leverages large-scale pre-trained language models. We further improve the pre-trained model’s quality by introducing a headline generation task as an intermediate task before the headline editing task. We also propose a Self Importance-Aware (SIA) loss to address the different levels of editing in the dataset by down-weighting the importance of easily classified tokens and sentences. With the help of Pre-training, Adaptation, and SIA, the model learns to generate headlines in the professional editor’s style. Experimental results show that our method significantly improves the quality of headline editing compared with previous methods.
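The following is a hedged sketch of the down-weighting idea behind such a loss: tokens the model already predicts with high confidence contribute less to the objective. The focal-style modulation shown here is an illustrative stand-in, not the paper’s exact SIA formulation.

```python
# Sketch of a token-level loss that down-weights easily classified tokens.
# The (1 - p)^gamma modulation is an illustrative stand-in for the idea
# described above, not the paper's exact SIA loss.
import torch.nn.functional as F

def down_weighted_token_loss(logits, targets, gamma=2.0, ignore_index=-100):
    """logits: (batch, seq_len, vocab); targets: (batch, seq_len)."""
    log_probs = F.log_softmax(logits, dim=-1)
    nll = F.nll_loss(log_probs.transpose(1, 2), targets,
                     reduction="none", ignore_index=ignore_index)
    p_correct = nll.neg().exp()                     # probability of the gold token
    weights = (1.0 - p_correct) ** gamma            # down-weight easy tokens
    mask = (targets != ignore_index).float()
    return (weights * nll * mask).sum() / mask.sum().clamp(min=1.0)
```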


2021 ◽  
Vol 9 (1) ◽  
pp. 15-31
Author(s):  
Ali Arishi ◽  
Krishna K Krishnan ◽  
Vatsal Maru

As the COVID-19 pandemic spreads across regions with varying intensity, supply chains (SC) need effective mechanisms to adjust to spikes in both the supply and demand of resources, as well as techniques to detect unexpected SC behavior at an early stage. During the pandemic, the demand for medical supplies and essential products rose unexpectedly while the availability of resources and raw materials decreased significantly, raising questions about the survivability of the SC and of society. Responding to this urgent demand quickly, and predicting how it will vary as the pandemic progresses, is a key modeling question. In this research, we address the impact of COVID-19 disruption on the performance of a manufacturing SC overwhelmed by unprecedented demand for urgent items by developing a digital twin model of the manufacturing SC. In this model, we combine system dynamics simulation and artificial intelligence to dynamically monitor SC performance and predict SC reaction patterns. Simulation modeling is used to study disruption propagation in the manufacturing SC and the efficiency of the recovery policy. Based on this model, we then develop artificial neural network models that learn from disruptions and make online predictions of potential risks. The digital twin model is intended to operate in real time, identifying disruptions and the respective SC reaction patterns early to increase SC visibility and resilience.
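The prediction side of such a digital twin could, for instance, be a small neural network trained on indicators exported from the simulation, as in the sketch below; the feature names and risk categories are hypothetical stand-ins, not the variables used in this work.

```python
# Illustrative sketch of a neural predictor fed by simulated SC indicators.
# Feature names (inventory level, demand spike ratio, lead time, backlog) and
# the three reaction-pattern classes are hypothetical examples only.
import torch
import torch.nn as nn

class DisruptionPredictor(nn.Module):
    def __init__(self, num_features=4, num_patterns=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, 32), nn.ReLU(),
            nn.Linear(32, num_patterns))            # e.g. nominal / delayed / disrupted

    def forward(self, x):
        return self.net(x)

# One simulated snapshot: [inventory_level, demand_spike_ratio, lead_time, backlog]
snapshot = torch.tensor([[0.35, 2.1, 14.0, 0.6]])
risk_logits = DisruptionPredictor()(snapshot)
```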


Author(s):  
Sacha J. van Albada ◽  
Jari Pronold ◽  
Alexander van Meegen ◽  
Markus Diesmann

We are entering an age of ‘big’ computational neuroscience, in which neural network models are increasing in size and in the number of underlying data sets. Consolidating the zoo of models into large-scale models simultaneously consistent with a wide range of data is only possible through the effort of large teams, which can be spread across multiple research institutions. To ensure that computational neuroscientists can build on each other’s work, it is important to make models publicly available as well-documented code. This chapter describes such an open-source model, which relates the connectivity structure of all vision-related cortical areas of the macaque monkey to their resting-state dynamics. We give a brief overview of how to use the executable model specification, which employs NEST as its simulation engine, and show its runtime scaling. The solutions found serve as an example of how to organize the workflow of future models from the raw experimental data to the visualization of the results, expose the challenges involved, and give guidance for the construction of an ICT infrastructure for neuroscience.
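For readers unfamiliar with the simulation engine, the following is a minimal NEST usage sketch (creating, connecting, and simulating a small population); it only illustrates how NEST is driven from Python and is not the multi-area model itself, whose parameters come from the executable specification.

```python
# Minimal NEST usage sketch (not the multi-area model): create a small LIF
# population, drive it with Poisson input, connect it recurrently, and simulate.
# All parameter values are illustrative.
import nest

nest.ResetKernel()
neurons = nest.Create("iaf_psc_exp", 100)                      # LIF population
noise = nest.Create("poisson_generator", params={"rate": 8000.0})
nest.Connect(noise, neurons, syn_spec={"weight": 10.0})        # external drive
nest.Connect(neurons, neurons,
             conn_spec={"rule": "fixed_indegree", "indegree": 10},
             syn_spec={"weight": 2.0, "delay": 1.5})
nest.Simulate(1000.0)                                          # simulate 1000 ms
```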


Author(s):  
Filipe Caldeira ◽  
Tiago Cruz ◽  
Paulo Simões ◽  
Edmundo Monteiro

Critical Infrastructures (CIs) such as power distribution are referred to as “critical” because, in case of failure, the impact on society and the economy can be enormous. CIs are exposed to a growing number of threats, and ICT security plays a major role in CI protection and risk prevention for single and interconnected CIs where cascading effects might occur. This chapter addresses CI protection by discussing the main results of the MICIE project, along with the mechanisms that manage the degree of confidence assigned to risk alerts, which improve the resilience of CIs when faced with inaccurate or inconsistent alerts. The CockpitCI project is also presented, aiming to improve the resilience and dependability of CIs through the automatic detection of cyber-threats and the sharing of real-time information about attacks among CIs. CockpitCI addresses one of MICIE's shortcomings by adding SCADA-oriented security detection capabilities, providing input for risk prediction models and for assessing the operational status of Industrial Control Systems.


2020 ◽  
Vol 31 (3) ◽  
pp. 287-296
Author(s):  
Ahmed A. Moustafa ◽  
Angela Porter ◽  
Ahmed M. Megreya

Many students suffer from anxiety when performing numerical calculations. Mathematics anxiety is a condition that has a negative effect on educational outcomes and future employment prospects. While there is a multitude of behavioral studies on mathematics anxiety, its underlying cognitive and neural mechanisms remain unclear. This article provides a systematic review of cognitive studies that investigated mathematics anxiety. As there are no prior neural network models of mathematics anxiety, this article discusses how previous neural network models of mathematical cognition could be adapted to simulate the neural and behavioral findings on mathematics anxiety. In other words, we provide a novel integrative network theory of the links between mathematics anxiety, cognition, and brain substrates. This theoretical framework may explain the impact of mathematics anxiety on a range of cognitive and neuropsychological tests. It could therefore improve our understanding of the cognitive and neurological mechanisms underlying mathematics anxiety and also has important applications. Indeed, a better understanding of mathematics anxiety could inform more effective therapeutic techniques that, in turn, could lead to significant improvements in educational outcomes.


Author(s):  
Mahantesh Halappanavar ◽  
John Feo ◽  
Oreste Villa ◽  
Antonino Tumeo ◽  
Alex Pothen

Graph matching is a prototypical combinatorial problem with many applications in high-performance scientific computing. Optimal algorithms for computing matchings are challenging to parallelize. Approximation algorithms are amenable to parallelization and are therefore important for computing matchings for large-scale problems; they also generate nearly optimal solutions that are sufficient for many applications. In this paper we present multithreaded algorithms for computing half-approximate weighted matching on state-of-the-art multicore (Intel Nehalem and AMD Magny-Cours), manycore (Nvidia Tesla and Nvidia Fermi), and massively multithreaded (Cray XMT) platforms. We provide two implementations: the first uses shared work queues and is suited to all platforms; the second, based on dataflow principles, exploits special features available on the Cray XMT. Using a carefully chosen dataset that exhibits characteristics from a wide range of applications, we show scalable performance across the different platforms. In particular, for one instance of the input, an R-MAT graph (RMAT-G), we show speedups of about [Formula: see text] on [Formula: see text] cores of an AMD Magny-Cours, [Formula: see text] on [Formula: see text] cores of Intel Nehalem, [Formula: see text] on Nvidia Tesla and [Formula: see text] on Nvidia Fermi relative to one core of Intel Nehalem, and [Formula: see text] on [Formula: see text] processors of Cray XMT. We demonstrate strong as well as weak scaling for graphs with up to a billion edges using up to 12,800 threads. We avoid excessive fine-tuning for each platform and retain the basic structure of the algorithm uniformly across platforms; the one exception is the dataflow algorithm designed specifically for the Cray XMT. To the best of the authors' knowledge, this is the first such large-scale study of the half-approximate weighted matching problem on multithreaded platforms. Driven by the critical enabling role of combinatorial algorithms such as matching in scientific computing and the emergence of informatics applications, there is a growing demand to support irregular computations on current and future computing platforms. In this context, we evaluate the capability of emerging multithreaded platforms to tolerate the latency induced by irregular memory access patterns and to support fine-grained parallelism via lightweight synchronization mechanisms. By contrasting the architectural features of these platforms against the Cray XMT, which is specifically designed to support irregular memory-intensive applications, we delineate the impact of these choices on performance.
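For reference, the half-approximation underlying these implementations can be sketched sequentially as a greedy scan of edges in non-increasing weight order; the multithreaded variants in the paper parallelize this idea via locally dominant edges, so the snippet below is only a conceptual baseline.

```python
# Sequential sketch of the classic greedy half-approximation for maximum weight
# matching: scan edges in non-increasing weight order and take an edge whenever
# both endpoints are still free. This illustrates the half-approximation only;
# the paper's parallel variants work with locally dominant edges instead.
def half_approx_matching(edges):
    """edges: iterable of (u, v, weight) tuples; returns the matched edges."""
    matched = set()
    matching = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        if u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching

print(half_approx_matching([(0, 1, 5.0), (1, 2, 7.0), (2, 3, 3.0), (0, 3, 4.0)]))
# -> [(1, 2, 7.0), (0, 3, 4.0)]
```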


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Muhammad Akram ◽  
Ather Ashraf ◽  
Mansoor Sarwar

Many problems of practical interest can be modeled and solved using graph algorithms; in general, graph theory has a wide range of applications in diverse fields. In this paper, intuitionistic fuzzy organizational and neural network models, intuitionistic fuzzy neurons in medical diagnosis, intuitionistic fuzzy digraphs in vulnerability assessment of gas pipeline networks, and intuitionistic fuzzy digraphs in travel time are presented as examples of intuitionistic fuzzy digraphs in decision support systems. We have also designed and implemented the algorithms for these decision support systems.
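As a concrete illustration of the underlying structure, an intuitionistic fuzzy digraph can be represented by attaching a membership degree and a non-membership degree to each directed edge, as in the small sketch below; the edge values are illustrative only.

```python
# Tiny sketch of an intuitionistic fuzzy digraph: each directed edge carries a
# membership degree mu and a non-membership degree nu with mu + nu <= 1.
# The node names and edge values are illustrative examples only.
class IFDigraph:
    def __init__(self):
        self.edges = {}                              # (u, v) -> (mu, nu)

    def add_edge(self, u, v, mu, nu):
        assert 0.0 <= mu and 0.0 <= nu and mu + nu <= 1.0
        self.edges[(u, v)] = (mu, nu)

    def hesitation(self, u, v):
        """Hesitation (uncertainty) degree pi = 1 - mu - nu."""
        mu, nu = self.edges[(u, v)]
        return 1.0 - mu - nu

g = IFDigraph()
g.add_edge("pipeline_A", "pipeline_B", mu=0.6, nu=0.3)
print(g.hesitation("pipeline_A", "pipeline_B"))      # ~0.1 (up to float rounding)
```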

