Detect and Classify the Unpredictable Cyber-Attacks by using DNN Model

Machine learning techniques are widely used to develop intrusion detection systems (IDS) that detect and classify cyber-attacks at the network level and the host level in a timely and automated manner. However, many challenges arise, because malicious attacks change continually and occur in very large volumes that require a scalable solution. Several malware datasets are publicly available for further research by the cybersecurity community. However, no existing study has provided a detailed analysis of the performance of different machine learning algorithms on the different publicly available datasets. Because of the evolving attack methods and the dynamic nature of malware, the publicly available malware datasets need to be updated and benchmarked systematically. This paper explores the deep neural network (DNN), a type of deep learning model, to develop a flexible and effective IDS for detecting and classifying unforeseen and unpredictable cyber-attacks. The continuous change in network behavior and the rapid evolution of attacks make it necessary to evaluate various datasets generated over the years through static and dynamic approaches. This type of study helps to identify the best algorithm for detecting future cyber-attacks. A comprehensive comparison of DNNs with other classical machine learning classifiers is reported on several publicly available benchmark malware datasets. The optimal network parameters and network topologies for the DNNs are chosen through a hyperparameter selection method using the KDDCup 99 dataset. The DNN model that performs well on KDDCup 99 is then applied to other datasets, such as NSL-KDD, for benchmarking. The DNN model learns an abstract, high-dimensional feature representation of the IDS data by passing it through many hidden layers. Rigorous experimental testing shows that DNNs perform better than classical machine learning classifiers. Finally, we present a highly scalable, hybrid DNN framework called Scale-Hybrid-IDS-AlertNet, which can be used to effectively monitor network traffic and host-level events and to proactively alert on possible cyber-attacks.

Evaluation of Interstate Work Zone Mobility using Probe Vehicle Data and Machine Learning Techniques

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198119827936 ◽

2019 ◽

Vol 2673 (2) ◽

pp. 811-822 ◽

Cited By ~ 1

Author(s):

Mohsen Kamyab ◽

Stephen Remias ◽

Erfan Najmi ◽

Kerrick Hood ◽

Mustafa Al-Akshar ◽

...

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Work Zone ◽

Third Party ◽

Machine Learning Techniques ◽

Work Zones ◽

Vehicle Data ◽

Gps Devices ◽

Future Work ◽

The Impact

According to the Federal Highway Administration (FHWA), US work zones on freeways account for nearly 24% of nonrecurring freeway delays and 10% of overall congestion. Historically, there have been limited scalable datasets to investigate the specific causes of congestion due to work zones or to improve work zone planning processes to characterize the impact of work zone congestion. In recent years, third-party data vendors have provided scalable speed data from Global Positioning System (GPS) devices and cell phones which can be used to characterize mobility on all roadways. Each work zone has unique characteristics and varying mobility impacts which are predicted during the planning and design phases, but can realistically be quite different from what is ultimately experienced by the traveling public. This paper uses these datasets to introduce a scalable Work Zone Mobility Audit (WZMA) template. Additionally, the paper uses metrics developed for individual work zones to characterize the impact of more than 250 work zones varying in length and duration from Southeast Michigan. The authors make recommendations to work zone engineers on useful data to collect for improving the WZMA. As more systematic work zone data are collected, improved analytical assessment techniques, such as machine learning processes, can be used to identify the factors that will predict future work zone impacts. The paper concludes by demonstrating two machine learning algorithms, Random Forest and XGBoost, which show historical speed variation is a critical component when predicting the mobility impact of work zones.

Insider Threat Detection Using Supervised Machine Learning Algorithms on an Extremely Imbalanced Dataset

International Journal of Cyber Warfare and Terrorism ◽

10.4018/ijcwt.2020040101 ◽

2020 ◽

Vol 10 (2) ◽

pp. 1-26

Author(s):

Naghmeh Moradpoor Sheykhkanloo ◽

Adam Hall

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Third Party ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Insider Threat ◽

Threat Detection ◽

Imbalanced Dataset ◽

The Impact

An insider threat can take on many forms and fall under different categories. This includes malicious insider, careless/unaware/uneducated/naïve employee, and the third-party contractor. Machine learning techniques have been studied in published literature as a promising solution for such threats. However, they can be biased and/or inaccurate when the associated dataset is hugely imbalanced. Therefore, this article addresses the insider threat detection on an extremely imbalanced dataset which includes employing a popular balancing technique known as spread subsample. The results show that although balancing the dataset using this technique did not improve performance metrics, it did improve the time taken to build the model and the time taken to test the model. Additionally, the authors realised that running the chosen classifiers with parameters other than the default ones has an impact on both balanced and imbalanced scenarios, but the impact is significantly stronger when using the imbalanced dataset.
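The balancing step described above can be illustrated with a minimal sketch. It assumes that spread subsampling means undersampling the larger classes until no class exceeds a chosen multiple of the rarest class; the function name `spread_subsample`, the `max_spread` parameter, and the toy labels are hypothetical and are not taken from the article or from any particular toolkit.

```python
import random
from collections import defaultdict


def spread_subsample(rows, labels, max_spread=1.0, seed=0):
    """Undersample larger classes so that no class keeps more than
    max_spread times as many rows as the rarest class."""
    if max_spread < 1.0:
        raise ValueError("max_spread must be >= 1.0")

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # The rarest class sets the cap for every other class.
    smallest = min(len(members) for members in by_class.values())
    cap = int(smallest * max_spread)

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        kept = members if len(members) <= cap else rng.sample(members, cap)
        out_rows.extend(kept)
        out_labels.extend([label] * len(kept))
    return out_rows, out_labels


# Toy data (assumed): 990 "normal" events and 10 "insider" events.
rows = list(range(1000))
labels = ["normal"] * 990 + ["insider"] * 10
balanced_rows, balanced_labels = spread_subsample(rows, labels, max_spread=1.0)
```

With `max_spread=1.0` the toy set shrinks from 1,000 rows to 20, which is consistent with the shorter model-building and testing times the abstract reports, although the sketch says nothing about classification quality.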

A Comparative Evaluation of Supervised Machine Learning Classification Techniques for Engineering Design Applications

Journal of Mechanical Design ◽

10.1115/1.4044524 ◽

2019 ◽

Vol 141 (12) ◽

Author(s):

Conner Sharpe ◽

Tyler Wiest ◽

Pingfeng Wang ◽

Carolyn Conner Seepersad

Keyword(s):

Machine Learning ◽

Engineering Design ◽

Design Space ◽

Optimization Problems ◽

Machine Learning Algorithms ◽

Training Data ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Design Exploration ◽

Machine Learning Classification

Supervised machine learning techniques have proven to be effective tools for engineering design exploration and optimization applications, in which they are especially useful for mapping promising or feasible regions of the design space. The design space mappings can be used to inform early-stage design exploration, provide reliability assessments, and aid convergence in multiobjective or multilevel problems that require collaborative design teams. However, the accuracy of the mappings can vary based on problem factors such as the number of design variables, presence of discrete variables, multimodality of the underlying response function, and amount of training data available. Additionally, there are several useful machine learning algorithms available, and each has its own set of algorithmic hyperparameters that significantly affect accuracy and computational expense. This work elucidates the use of machine learning for engineering design exploration and optimization problems by investigating the performance of popular classification algorithms on a variety of example engineering optimization problems. The results are synthesized into a set of observations to provide engineers with intuition for applying these techniques to their own problems in the future, as well as recommendations based on problem type to aid engineers in algorithm selection and utilization.

Optimizing Laboratory Investigations of Saline Intrusion by Incorporating Machine Learning Techniques

Water ◽

10.3390/w12112996 ◽

2020 ◽

Vol 12 (11) ◽

pp. 2996

Author(s):

Georgios Etsias ◽

Gerard A. Hamill ◽

Eric M. Benner ◽

Jesús F. Águila ◽

Mark C. McDonnell ◽

...

Keyword(s):

Machine Learning ◽

Image Processing ◽

Porous Medium ◽

Glass Bead ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Image Processing Technique ◽

Saline Intrusion ◽

Learning Techniques ◽

The Impact

Deriving saltwater concentrations from the light intensity values of dyed saline solutions is a long-established image processing practice in laboratory scale investigations of saline intrusion. The current paper presents a novel methodology that employs the predictive ability of machine learning algorithms in order to determine saltwater concentration fields. The proposed approach consists of three distinct parts: image pre-processing, porous medium classification (glass bead structure recognition), and saltwater field generation (regression). It minimizes the need for aquifer-specific calibrations, significantly shortening the experimental procedure by up to 50% of the time required. A series of typical saline intrusion experiments were conducted in homogeneous and heterogeneous aquifers, consisting of glass beads of varying sizes, to recreate the necessary laboratory data. An innovative method of distinguishing and filtering out the common experimental error introduced by both backlighting and the optical irregularities of the glass bead medium was formulated. This enabled the acquisition of quality predictions by classical, easy-to-use machine learning techniques, such as feedforward Artificial Neural Networks, using a limited amount of training data, proving the applicability of the procedure. The new process was benchmarked against a traditional regression algorithm. A series of variables were utilized to quantify the variance between the results generated by the two procedures. No compromise was found to the quality of the derived concentration fields and it was established that the proposed image processing technique is robust when applied to homogeneous and heterogeneous domains alike, outperforming the classical approach in all test cases. Moreover, the method minimized the impact of experimental errors introduced by small movements of the camera and the presence of air bubbles trapped in the porous medium.

Towards Predicting Student’s Dropout in University Courses Using Different Machine Learning Techniques

Applied Sciences ◽

10.3390/app11073130 ◽

2021 ◽

Vol 11 (7) ◽

pp. 3130

Author(s):

Janka Kabathova ◽

Martin Drlik

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Classification Model ◽

Machine Learning Techniques ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Unseen Data ◽

E Learning ◽

The Impact

Early and precise prediction of student dropout from available educational data is a widely studied topic in learning analytics. Despite the amount of research already carried out, progress has not been significant, and the problem persists at all levels of educational data. Although various features have been studied, it remains an open question which features are appropriate for different machine learning classifiers applied to the typically scarce educational data available at the e-learning course level. The main goal of this research is therefore to emphasize the importance of the data understanding and data gathering phases, stress the limitations of the available educational datasets, compare the performance of several machine learning classifiers, and show that even a limited set of features available to teachers in an e-learning course can predict student dropout with sufficient accuracy if the performance metrics are thoroughly considered. Data collected over four academic years were analyzed. The features selected in this study proved applicable to predicting course completers and non-completers. Prediction accuracy varied between 77% and 93% on unseen data from the next academic year. In addition to the frequently used performance metrics, the homogeneity of the machine learning classifiers was compared to overcome the impact of the limited dataset size on the high values obtained for the performance metrics. The results showed that several machine learning algorithms could be successfully applied to a scarce dataset of educational data. At the same time, classification performance metrics should be thoroughly considered before deciding to deploy the best-performing classification model to predict potential dropout cases and design beneficial intervention mechanisms.
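The point about considering performance metrics thoroughly can be made concrete with a small sketch. It assumes a binary labelling in which 1 marks a dropout and 0 a completer; the function name `binary_metrics` and the toy predictions are hypothetical and do not reproduce the study's data or results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy course (assumed): 10 dropouts, 90 completers.
# The classifier flags only 2 of the 10 dropouts and no completers.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 2 + [0] * 98
metrics = binary_metrics(y_true, y_pred)
# accuracy = 0.92, precision = 1.0, recall = 0.2
```

In this toy case the accuracy looks high (0.92) even though 8 of the 10 dropouts are missed, which is why accuracy alone is a weak basis for choosing a model on a small, imbalanced dataset.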

Data Mining and Principal Component Analysis on Coimbra Breast Cancer Dataset

Proceedings of Intelligent Computing and Technologies Conference ◽

10.21467/proceedings.115.5 ◽

2021 ◽

Author(s):

Anupam Sen

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Breast Cancer Dataset ◽

Analysis Tool ◽

Machine Learning Classification

Machine learning (ML) techniques play an important role in the medical field. Early diagnosis is required to improve the treatment of carcinoma. In this analysis, the Breast Cancer Coimbra dataset (BCCD), with ten predictors, is analyzed to classify carcinoma. Feature selection methods and machine learning algorithms are applied to the dataset, which comes from the UCI repository. The WEKA (“Waikato Environment for Knowledge Analysis”) tool is used to run the machine learning techniques, and Principal Component Analysis (PCA) is used for feature extraction. Different machine learning classification algorithms are applied through WEKA, including Glmnet, Gbm, ada Boosting, Adabag Boosting, C50, Cforest, DcSVM, fnn, Ksvm, and Node Harvest; they are compared by accuracy and by values such as the Kappa statistic, Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). The 10-fold cross-validation method is used for training, testing, and validation.
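The comparison values named above follow standard definitions, sketched here in plain Python. This is an illustration only: the function names and toy labels are hypothetical, and the way a given tool derives the numeric predictions fed into MAE and RMSE may differ from this sketch.

```python
import math
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: a single class on both sides
    return (observed - expected) / (1.0 - expected)


def mean_absolute_error(targets, estimates):
    """Average absolute difference between targets and estimates."""
    return sum(abs(t - e) for t, e in zip(targets, estimates)) / len(targets)


def root_mean_square_error(targets, estimates):
    """Square root of the average squared difference."""
    return math.sqrt(
        sum((t - e) ** 2 for t, e in zip(targets, estimates)) / len(targets)
    )


# Toy labels (assumed): 1 = patient, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(y_true, y_pred)  # observed 0.80, chance 0.52
```

Here the raw agreement is 0.80 but kappa is about 0.58, because more than half of that agreement would be expected by chance given the class frequencies.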

Apparel Recommendation Engine Using Inverse Document Frequency and Weighted Average Word2vec

10.54216/jchci.010201 ◽

2021 ◽

pp. 46-56

Author(s):

Parvesh K ◽

Tharun C ◽

...

Keyword(s):

Machine Learning ◽

Computer Vision ◽

Business Models ◽

Rapid Development ◽

Weighted Average ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Document Frequency ◽

Learning Techniques ◽

Marked Area

The rapid development of e-commerce shopping marketplaces necessitates the use of recommendation engines and quick, precise, and efficient algorithms in order for the company's business models to generate a massive amount of profit. A computer vision software programme enables a computer to learn a great deal from digital images or movies. Machine learning methods are used in computer vision, and several machine learning techniques have been developed specifically for this purpose. Information retrieval is the process of extracting useful information from a dataset, and computer vision is the most commonly used tool for this purpose nowadays. This project consists of a series of modules that run sequentially to retrieve information from a marked area on a receipt. A receipt image is used as an input for the model, and the model first uses various image processing algorithms to clean the data, after which the pre-processed data is applied to machine learning algorithms to produce better results, and the result is a string of numerical digits including the decimal point. The program's accuracy is primarily determined by the image quality or pixel density, and it is necessary to ensure that an input receipt is not damaged and content is not blurred.

Application of Machine Learning Techniques to Predict the Impact of Health Insurance on the Wellbeing of an Individual

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b7247.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 3065-3070

Keyword(s):

Machine Learning ◽

Health Insurance ◽

Large Scale ◽

Insurance Industry ◽

Financial Burden ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Insurance Companies ◽

The Government ◽

The Impact

The healthcare domain in India has suffered considerably despite advances in technology. Several financing schemes are endorsed by insurance companies to lessen the financial burden faced by the government and the people. Nonetheless, the health insurance segment in India remains underdeveloped due to the various complexities it faces. This paper applies a heuristic sampling approach combined with ensemble machine learning algorithms to large-scale insurance business data to assess the current shape of the health insurance industry in India. Through data mining and data analytics, it is possible to furnish insights that assist ordinary people in reaching conclusions that help in the process of decision making.

Adversarial Machine Learning Attacks and Defense Methods in the Cyber Security Domain

ACM Computing Surveys ◽

10.1145/3453158 ◽

2021 ◽

Vol 54 (5) ◽

pp. 1-36

Author(s):

Ishai Rosenberg ◽

Asaf Shabtai ◽

Yuval Elovici ◽

Lior Rokach

Keyword(s):

Machine Learning ◽

Cyber Security ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Future Research ◽

Research Directions ◽

Future Research Directions ◽

Adversarial Attack ◽

The Impact

In recent years, machine learning algorithms, and more specifically deep learning algorithms, have been widely used in many fields, including cyber security. However, machine learning systems are vulnerable to adversarial attacks, and this limits the application of machine learning, especially in non-stationary, adversarial environments, such as the cyber security domain, where actual adversaries (e.g., malware developers) exist. This article comprehensively summarizes the latest research on adversarial attacks against security solutions based on machine learning techniques and illuminates the risks they pose. First, the adversarial attack methods are characterized based on their stage of occurrence, and the attacker’s goals and capabilities. Then, we categorize the applications of adversarial attack and defense methods in the cyber security domain. Finally, we highlight some characteristics identified in recent research and discuss the impact of recent advancements in other adversarial learning domains on future research directions in the cyber security domain. To the best of our knowledge, this work is the first to discuss the unique challenges of implementing end-to-end adversarial attacks in the cyber security domain, map them in a unified taxonomy, and use the taxonomy to highlight future research directions.

Performance Evaluation of Machine Learning Techniques for DOS Detection in Wireless Sensor Network

International Journal of Network Security & Its Applications ◽

10.5121/ijnsa.2021.13202 ◽

2021 ◽

Vol 13 (2) ◽

pp. 21-29

Author(s):

Lama Alsulaiman ◽

Saad Al-Ahmadi

Keyword(s):

Machine Learning ◽

Detection System ◽

Denial Of Service ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Wireless Sensor ◽

Security Threats ◽

Dos Attacks ◽

Machine Learning Classification ◽

Learning Techniques

The nature of Wireless Sensor Networks (WSNs) and their widespread use introduce many security threats and attacks. An effective Intrusion Detection System (IDS) should be used to detect attacks. Detecting such attacks is challenging, especially Denial of Service (DoS) attacks. Machine learning classification techniques have been used as an approach to DoS detection. This paper reports an experiment using the Waikato Environment for Knowledge Analysis (WEKA) to evaluate the efficiency of five machine learning algorithms for detecting flooding, grayhole, blackhole, and scheduling DoS attacks in WSNs. The evaluation is based on a dataset called WSN-DS. The results showed that the random forest classifier outperforms the other classifiers, with an accuracy of 99.72%.
