Cyber Security Tool Kit (CyberSecTK): A Python Library for Machine Learning and Cyber Security

Information ◽

10.3390/info11020100 ◽

2020 ◽

Vol 11 (2) ◽

pp. 100

Author(s):

Ricardo A. Calix ◽

Sumendra B. Singh ◽

Tingyu Chen ◽

Dingkai Zhang ◽

Michael Tu

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Cyber Security ◽

Research Work ◽

Data Sets ◽

Learning Approaches ◽

Related Data ◽

Research And Teaching ◽

Survey Results ◽

Program Modules

The cyber security toolkit, CyberSecTK, is a simple Python library for preprocessing and feature extraction of cyber-security-related data. As the digital universe expands, more and more data need to be processed using automated approaches. In recent years, cyber security professionals have seen opportunities to use machine learning approaches to help process and analyze their data. The challenge is that cyber security experts often lack the training needed to apply machine learning to their problems. The goal of this library is to help bridge this gap. In particular, we propose the development of a toolkit in Python that can process the most common types of cyber security data. This will help cyber security experts implement a basic machine learning pipeline from beginning to end. This research work is our first attempt to achieve this goal. The proposed toolkit is a suite of program modules, data sets, and tutorials supporting research and teaching in cyber security and defense. An example use case is presented and discussed. Survey results from students who used some of the modules in the library are also presented.
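
The preprocessing and feature extraction step described in this abstract can be illustrated with a minimal sketch. It does not use the CyberSecTK API; the record layout, the sample values, and the function name `extract_features` are all assumptions made up for illustration. It turns raw connection records into one fixed-length numeric vector per source address, which is the form most machine learning pipelines expect as input.

```python
from collections import Counter

# Hypothetical raw records: (source IP, destination port, bytes sent).
# These values are invented for illustration only.
RECORDS = [
    ("10.0.0.5", 80, 1200),
    ("10.0.0.5", 443, 300),
    ("10.0.0.9", 22, 50),
    ("10.0.0.5", 22, 75),
]


def extract_features(records):
    """Aggregate per-source numeric features from raw connection records."""
    conn_count = Counter()   # number of connections per source
    total_bytes = Counter()  # total bytes sent per source
    ports = {}               # distinct destination ports per source
    for src, port, nbytes in records:
        conn_count[src] += 1
        total_bytes[src] += nbytes
        ports.setdefault(src, set()).add(port)
    # Feature vector: [connection count, distinct ports, mean bytes per connection]
    return {
        src: [conn_count[src], len(ports[src]), total_bytes[src] / conn_count[src]]
        for src in conn_count
    }


features = extract_features(RECORDS)
for src, vec in sorted(features.items()):
    print(src, vec)
```

The resulting vectors could then be passed to any classifier; which features are actually useful depends on the data and the task.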

Machine Learning Approaches for the Analysis of Non-Metallic Inclusion Data Sets

AISTech2019 Proceedings of the Iron and Steel Technology Conference ◽

10.33313/377/275 ◽

2019 ◽

Author(s):

M. Webler ◽

B. Abdulsalam

Keyword(s):

Machine Learning ◽

Data Sets ◽

Learning Approaches ◽

Metallic Inclusion

Decision Tree: A Machine Learning for Intrusion Detection

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f1234.0486s419 ◽

2019 ◽

Vol 8 (6S4) ◽

pp. 1126-1130

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Detection System ◽

Research Work ◽

Machine Learning Techniques ◽

Data Sets ◽

Legitimate User ◽

Learning Techniques ◽

Three Stages

Intrusion is a major threat in which unauthorized access to data or to a legitimate network is gained by using a legitimate user's identity or by exploiting back doors and vulnerabilities in the network. Intrusion detection system (IDS) mechanisms are developed to detect intrusions at various levels. The objective of this research work is to improve IDS performance by applying machine learning techniques based on decision trees to detect and classify attacks. The adopted methodology processes the data sets in three stages. Experiments are conducted on the KDDCUP99 data sets, based on the number of features. The three Bayesian modes are analyzed for data sets of different sizes, based on the total number of attacks. The time the classifier takes to build the model is analyzed, and its accuracy is evaluated.
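
As a rough illustration of how a decision tree chooses a test for separating attacks from normal traffic, the sketch below finds the single best threshold split by Gini impurity. It is not the paper's method or data: the two features, the six toy records, and the function names are assumptions, and a full tree would apply the same search recursively to each side of the split.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels):
    """Return (feature index, threshold, weighted impurity) of the best binary split."""
    best = (None, None, gini(labels))  # start from the unsplit impurity
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must put samples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best


# Toy records: [duration in seconds, failed logins]; invented values.
X = [[2, 0], [3, 0], [1, 5], [40, 0], [2, 6], [35, 1]]
y = ["normal", "normal", "attack", "normal", "attack", "normal"]

feature, threshold, impurity = best_split(X, y)
print(feature, threshold, impurity)
```

On this toy data the chosen split is on the failed-login count, since it separates the two classes completely while connection duration does not.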

Deep Learning Approaches for Sentiment Analysis Challenges and Future Issues

10.4018/978-1-7998-8161-2.ch003 ◽

2022 ◽

pp. 27-50

Author(s):

Rajalaxmi Prabhu B. ◽

Seema S.

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Model Building ◽

Large Data ◽

Machine Learning Algorithms ◽

Large Data Sets ◽

Data Sets ◽

Learning Approaches ◽

Learning Techniques ◽

Important Challenge

A large amount of user-generated data is now available from large platforms, blogs, websites, and review sites. These data are usually unstructured. Automatically analyzing sentiment in these data is considered an important challenge. Several machine learning algorithms have been implemented to assess opinions in large data sets, and much research has been undertaken on machine learning approaches to sentiment analysis. Machine learning depends mainly on the data used for model building; hence, suitable feature extraction techniques also need to be applied. This chapter addresses several deep learning approaches, their challenges, and future issues. Deep learning techniques are considered important in predicting the sentiments of users. The chapter aims to analyze deep learning techniques for predicting sentiment and to explain the importance of several approaches for mining opinions and determining sentiment polarity.

Effectiveness of Machine Learning Approaches Towards Credibility Assessment of Crowdfunding Projects for Reliable Recommendations

Applied Sciences ◽

10.3390/app10249062 ◽

2020 ◽

Vol 10 (24) ◽

pp. 9062

Author(s):

Wafa Shafqat ◽

Yung-Cheol Byun ◽

Namje Park

Keyword(s):

Machine Learning ◽

Latent Dirichlet Allocation ◽

Short Term Memory ◽

Research Work ◽

Learning Approaches ◽

Credibility Assessment ◽

User Interests ◽

Machine Learning Approach ◽

Hybrid Machine ◽

Numeric Data

Recommendation systems aim to decipher user interests, preferences, and behavioral patterns automatically. However, it becomes harder to make trustworthy and reliable recommendations to users, especially when their hard-earned money is at risk. The credibility of the recommendation is of paramount importance in crowdfunding project recommendations. This research work devises a hybrid machine learning-based approach for credible crowdfunding project recommendations that incorporates backers’ sentiments and other influential features. The proposed model has four modules: a feature extraction module, a hybrid LDA-LSTM (latent Dirichlet allocation and long short-term memory) based latent topics evaluation module, a credibility formulation module, and a recommendation module. The credibility analysis correlates the project creator’s proficiency, reviewers’ sentiments, and their influence to estimate a project’s authenticity level, which makes the model robust to unauthentic and untrustworthy projects and profiles. The recommendation module selects the projects that match the user’s interests and have the highest credibility scores, and recommends them. The proposed recommendation method uses numeric data and sentiment expressions linked with comments, backers’ preferences, profile data, and the creator’s credibility for quantitative examination of several alternative projects. The evaluation shows that credibility assessment based on the hybrid machine learning approach yields better results (98% accuracy) than existing recommendation models. We also evaluated our credibility assessment technique on different categories of projects, i.e., suspended, canceled, delivered, and never delivered projects, and achieved satisfactory outcomes: 93%, 84%, 58%, and 93% of projects, respectively, were accurately classified into our desired range of credibility.

A Literature Review of the Detection and Categorization of various Arecanut Diseases using Image Processing and Machine Learning Approaches

International Journal of Applied Engineering and Management Letters ◽

10.47992/ijaeml.2581.7000.0112 ◽

2021 ◽

pp. 183-204

Author(s):

Puneeth B. R. ◽

Nethravathi P. S.

Keyword(s):

Machine Learning ◽

Literature Review ◽

Research Work ◽

Machine Learning Techniques ◽

Research Centre ◽

Learning Approaches ◽

Study Objective ◽

Survey Paper ◽

Multiple Data ◽

New Ideas

Background/Purpose: Every scholarly research project starts with a survey of the literature, which acts as a springboard for new ideas. The purpose of this literature review is to become familiar with the study domain and to assess the work's credibility. It also helps to integrate and summarize the subject. This article briefly discusses disease detection and classification to achieve the objectives of the study. Objective: The main objective of this literature survey is to explore the different techniques applied to identify and classify the various diseases of arecanut. This paper also recommends the methodology and techniques that can be used to achieve the objectives of the study. Design/Methodology/Approach: Multiple data sources, such as journals, conference proceedings, books, and research papers published in reputable journals, were used to compile the essential literature on the chosen topic, and information was collected from the arecanut research centre and many farmers in the South Canara and Udupi districts, before narrowing down the literature relevant to the research work. The shortlisted literature was carefully assessed by reading each paper and taking notes as appropriate. The information gathered was then examined to identify the potential gap in the study. Findings/Result: Based on the analysis of the papers reviewed and discussions with farmers and research centre officers, it is observed that not much work has been carried out on disease identification and classification in arecanut using machine learning techniques. This survey paper recommends techniques and a methodology that can be applied to identify and classify diseases in arecanut and to classify arecanuts into healthy and unhealthy. Research limitations/implications: The literature reviewed in this paper covers the detection and classification of different diseases in arecanut. Originality/Value: This paper focuses on various online research journals, conference papers, technical books, and web articles. Paper Type: Literature review paper on techniques and methods used to achieve the objectives.

Human Activity Recognition of Exoskeleton Robot with Supervised Learning Techniques

10.21203/rs.3.rs-1161576/v1 ◽

2021 ◽

Author(s):

Jiacheng Mai ◽

Zhiyuan Chen ◽

Chunzhi Yi ◽

Zhen Ding

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Activity Recognition ◽

Human Activity ◽

Human Activity Recognition ◽

The Body ◽

Lower Limbs ◽

Data Sets ◽

Learning Approaches ◽

Learning Method

Lower-limb exoskeleton robots improve human motor ability and can facilitate superior rehabilitative training. By training on large data sets, many currently available wearable mobile and signal devices can employ machine learning approaches to forecast and classify people's movement characteristics. This approach could help exoskeleton robots improve their ability to predict human activities. Two popular data sets are PAMAP2, which was obtained by measuring people's movement with inertial sensors, and WISDM, which collected people's activity information through mobile phones. Focusing on human activity recognition, this paper applied traditional machine learning and deep learning methods to train and test on these data sets. The decision tree model achieved the highest prediction performance on the two data sets, 99% and 72% respectively, and also had the lowest time consumption. In addition, a comparison of the signals collected from different parts of the human body showed that signals from the hands performed best at recognizing human movement types.

PMU10 MACHINE LEARNING APPROACHES TO FEATURE EXTRACTION WHEN PREDICTING MEDICATION COMPLIANCE AND PERSISTENCY

Value in Health ◽

10.1016/j.jval.2019.04.1173 ◽

2019 ◽

Vol 22 ◽

pp. S250-S251

Author(s):

L. Gautier ◽

S. Balkin

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Medication Compliance ◽

Learning Approaches

Wearable Devices Data for Activity Prediction Using Machine Learning Algorithms

International Journal of Big Data and Analytics in Healthcare ◽

10.4018/ijbdah.2019010103 ◽

2019 ◽

Vol 4 (1) ◽

pp. 32-46

Author(s):

Lakshmi Prayaga ◽

Krishna Devulapalli ◽

Chandra Prayaga

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Learning Algorithms ◽

Wearable Devices ◽

Machine Learning Algorithms ◽

Embedded Sensors ◽

Data Sets ◽

Activity Prediction ◽

Related Data ◽

Recent Trends

Wearable devices are contributing heavily to the proliferation of data and creating a rich source of material for data analytics. Recent trends in the design of wearable devices include several embedded sensors, which also provide useful data for many applications. This research presents results obtained from studying human-activity-related data collected from wearable devices. The activities considered for this study were working at the computer, standing and walking, standing, walking, walking up and down the stairs, and talking while walking. The research uses a portion of the data to train machine learning algorithms and build a model. The rest of the data is used as test data for predicting the activity of an individual. Details of data collection, processing, and presentation are also discussed. After studying the literature and the data sets, a Random Forest machine learning algorithm was determined to be the most applicable algorithm for analyzing data from wearable devices. The software used in this research includes the R statistical package and the SensorLog app.
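
The train-then-test procedure described in this abstract (one portion of the data builds the model, the rest measures prediction accuracy) can be sketched as follows. This is only an illustration: the study used R and a Random Forest, whereas the sketch below substitutes a much simpler nearest-centroid classifier in Python, and the two sensor features, the synthetic readings, and the 70/30 split are assumptions.

```python
import random


def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def fit(samples):
    """Compute one centroid per activity label (stand-in for model building)."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(model, key=lambda label: dist2(model[label]))


# Synthetic readings: [mean acceleration, acceleration variance]; invented values.
data = [([0.10 + 0.01 * i, 0.05], "sitting") for i in range(10)]
data += [([1.50 + 0.02 * i, 0.60], "walking") for i in range(10)]

rng = random.Random(42)          # fixed seed so the split is reproducible
rng.shuffle(data)
split = len(data) * 7 // 10      # 70% for training, 30% held out for testing
train, test = data[:split], data[split:]

model = fit(train)
correct = sum(predict(model, f) == label for f, label in test)
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the held-out samples are never seen during fitting, the reported accuracy estimates how the model would behave on new readings; real sensor data would be far noisier than these synthetic values.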

Machine Learning Approaches in Cyber Security Analytics

10.1007/978-981-15-1706-8 ◽

2020 ◽

Cited By ~ 1

Author(s):

Tony Thomas ◽

Athira P. Vijayaraghavan ◽

Sabu Emmanuel

Keyword(s):

Machine Learning ◽

Cyber Security ◽

Learning Approaches ◽

Security Analytics

Towards Mapping Images to Text Using Deep-Learning Architectures

Mathematics ◽

10.3390/math8091606 ◽

2020 ◽

Vol 8 (9) ◽

pp. 1606

Author(s):

Daniela Onita ◽

Adriana Birlutiu ◽

Liviu P. Dinu

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Deep Learning ◽

Image Features ◽

The Other ◽

Data Sets ◽

Network Architectures ◽

Learning Approaches ◽

Data Set ◽

Pixel Value

Images and text are types of content that are used together to convey a message. Mapping images to text can provide very useful information and can be used in many applications, for example in the medical domain, in applications for blind people, and in social networking. In this paper, we investigate an approach for mapping images to text using a Kernel Ridge Regression model. We considered two types of features: simple RGB pixel-value features and image features extracted with deep-learning approaches. We investigated several neural network architectures for image feature extraction: VGG16, Inception V3, ResNet50, and Xception. The experimental evaluation was performed on three data sets from different domains. The texts associated with the images are objective descriptions for two of the three data sets and subjective descriptions for the other data set. The experimental results show that the more complex deep-learning approaches used for feature extraction perform better than simple RGB pixel-value approaches. Moreover, the ResNet50 network architecture performs best among the four deep network architectures considered for extracting image features. The model error obtained using the ResNet50 network is lower by approx. 0.30 than that of the other neural network architectures. We extracted natural language descriptors of images and compared the original and generated descriptive words. Furthermore, we investigated whether performance differs with the type of text associated with the images: subjective or objective. The proposed model generated descriptions more similar to the original ones for the data set containing objective descriptions, whose vocabulary is simpler, bigger, and clearer.
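
For readers unfamiliar with Kernel Ridge Regression, the following is a minimal pure-Python sketch of the technique with an RBF kernel and a single scalar target. It is not the paper's implementation: the paper maps image features to text representations, while here the inputs, targets, kernel choice, and the `lam` and `gamma` values are assumptions chosen only to show the fit-and-predict mechanics, i.e. solving (K + lam*I) alpha = y and predicting with a kernel-weighted sum.

```python
import math


def rbf(a, b, gamma=1.0):
    """RBF (Gaussian) kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_krr(X, y, lam=1e-3, gamma=1.0):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(K, y)


def predict_krr(X_train, alpha, x, gamma=1.0):
    """Prediction is a kernel-weighted sum over the training points."""
    return sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, X_train))


# Toy data: one-dimensional inputs with targets y = x^2 (invented values).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 4.0, 9.0]

alpha = fit_krr(X, y)
preds = [predict_krr(X, alpha, x) for x in X]
print([round(p, 2) for p in preds])
```

With a small regularization value the model nearly interpolates the training targets; a larger `lam` trades training fit for smoother predictions.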
