Mitigating Webshell Attacks through Machine Learning Techniques

A webshell is a command execution environment in the form of web pages. Attackers often use it as a backdoor for operating a web server, so accurate webshell detection is important for web server protection. Most security products detect webshells by feature matching, that is, by matching input scripts against pre-built collections of malicious code. Feature matching has a low detection rate for obfuscated webshells, whereas machine learning algorithms can detect webshells more efficiently and accurately. In this paper, we propose a new PHP webshell detection model, the NB-Opcode (naïve Bayes and opcode sequence) model, which combines naïve Bayes classifiers with opcode sequences. Experiments on a large number of samples show that the proposed method can effectively detect a range of webshells. Compared with traditional webshell detection methods, it improves both the efficiency and the accuracy of webshell detection.
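The abstract does not spell out how the opcode sequences are fed to the classifier, so the following is only a minimal sketch of the general idea: a multinomial naïve Bayes classifier with Laplace smoothing over opcode bigrams. The class name `OpcodeNaiveBayes`, the bigram choice, and the hand-written opcode sequences are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def ngrams(opcodes, n=2):
    """Return the n-grams (as tuples) of an opcode sequence."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


class OpcodeNaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing over opcode n-grams."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n
        self.alpha = alpha

    def fit(self, sequences, labels):
        self.classes = sorted(set(labels))
        self.docs = Counter(labels)  # number of training scripts per class
        self.counts = {c: Counter() for c in self.classes}
        for seq, label in zip(sequences, labels):
            self.counts[label].update(ngrams(seq, self.n))
        self.vocab = set().union(*self.counts.values())
        return self

    def log_scores(self, seq):
        """Return the unnormalised log posterior of each class for one script."""
        total_docs = sum(self.docs.values())
        scores = {}
        for c in self.classes:
            denom = sum(self.counts[c].values()) + self.alpha * len(self.vocab)
            score = math.log(self.docs[c] / total_docs)  # class prior
            for gram in ngrams(seq, self.n):
                if gram in self.vocab:  # n-grams never seen in training are skipped
                    score += math.log((self.counts[c][gram] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Hypothetical, hand-written opcode sequences; not real training data.
train = [
    (["INIT_FCALL", "FETCH_R", "SEND_VAR", "DO_ICALL", "ECHO", "RETURN"], "webshell"),
    (["FETCH_R", "FETCH_DIM_R", "INCLUDE_OR_EVAL", "RETURN"], "webshell"),
    (["ASSIGN", "CONCAT", "ECHO", "RETURN"], "benign"),
    (["INIT_FCALL", "SEND_VAL", "DO_ICALL", "ASSIGN", "ECHO", "RETURN"], "benign"),
]
model = OpcodeNaiveBayes(n=2).fit([seq for seq, _ in train], [lab for _, lab in train])
suspicious = model.predict(["FETCH_R", "FETCH_DIM_R", "INCLUDE_OR_EVAL", "RETURN"])
harmless = model.predict(["ASSIGN", "CONCAT", "ECHO", "RETURN"])
print(suspicious, harmless)
```

Working on opcodes rather than source text is what makes this kind of model less sensitive to surface obfuscation: renamed variables or re-encoded strings change the script text but tend to leave the compiled instruction sequence similar.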

Breast Cancer Prediction Using Classification Techniques of Machine Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2022.39743 ◽

2022 ◽

Vol 10 (1) ◽

pp. 51-57

Author(s):

Angela More

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Decision Tree Classifier ◽

Learning Techniques ◽

Tree Classifier

Abstract: Data analytics plays a vital role in diagnosis and treatment in the health care sector. To support practitioner decision-making, huge volumes of data should be processed with machine learning techniques to produce tools for prediction and classification. About 1 million cases of breast cancer are reported per year. We propose a prediction model designed specifically for breast cancer, using the machine learning algorithms decision tree classifier, Naïve Bayes, SVM, and K-Nearest Neighbour. The model predicts the type of tumour, which can be benign (noncancerous) or malignant (cancerous). It uses supervised learning, a machine learning approach in which both the dependent and the independent columns are provided to the machine, and applies a classification technique to predict the type of tumour. Keywords: Cancer, Machine learning, Prediction, Data Visualization, SVM, Naïve Bayes, Classification.

A Comparative Analysis to Visualize the Behavior of Different Machine Learning Algorithms for Normalized and Un-Normalized Data in Predicting Alzheimer’s Disease

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2019.8259 ◽

2019 ◽

Vol 16 (9) ◽

pp. 3840-3848

Author(s):

Neeraj Kumar ◽

Jatinder Manhas ◽

Vinod Sharma

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Neurodegenerative Disorder ◽

Learning Algorithms ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Linear Discriminant ◽

Age Related

Advances in technology have helped people live longer and better lives. But increased life expectancy has also raised the risk of age-related disorders, especially neurodegenerative disorders. Alzheimer’s is one such neurodegenerative disorder and is also the leading contributor to dementia in elderly people. Despite extensive research in this field, scientists have not found a cure for the disease to date. This makes early diagnosis of Alzheimer’s crucial for delaying its progression and improving the condition of the patient. Various techniques are employed for diagnosing Alzheimer’s, including neuropsychological tests, medical imaging, and blood-based biomarkers. In addition, various machine learning algorithms have been employed to diagnose Alzheimer’s in its early stages. In the current research, the authors compared the performance of several machine learning techniques, namely Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Naïve Bayes (NB), Support Vector Machines (SVM), Decision Trees (DT), Random Forests (RF), and Multi-Layer Perceptron (MLP), on an Alzheimer’s dataset. The paper demonstrates experimentally that normalization plays a predominant role in enhancing the efficiency of some machine learning algorithms, so it becomes imperative to choose the algorithms according to the available data. The efficiency of the methods was compared in terms of accuracy and F1-score. Naïve Bayes gave the better overall performance on both accuracy and F1-score and, along with LDA, DT, and RF, remained unaffected by normalization of the data. KNN, SVM, and MLP, in contrast, showed a drastic improvement in performance (17% to 86%) when given normalized rather than un-normalized data from the Alzheimer’s dataset.
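One way to see why a distance-based method such as KNN can react so strongly to normalization is a toy example in which one feature has a much larger numeric range than the other. The sketch below uses invented two-feature data (not the Alzheimer’s dataset) and hypothetical helper names; it applies min-max scaling and a 1-nearest-neighbour rule, and illustrates only the KNN side of the comparison.

```python
import math


def fit_min_max(rows):
    """Return per-feature minimums and maximums computed from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale(row, lo, hi):
    """Min-max scale one row relative to the training range of each feature."""
    return [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]


def nearest_label(train_x, train_y, query):
    """1-nearest-neighbour prediction using Euclidean distance."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


# Invented data: feature 0 (range about 0..1) separates the classes,
# feature 1 (range about 100..900) is noise with a much larger scale.
train_x = [[0.1, 100.0], [0.9, 900.0], [0.2, 800.0], [0.8, 200.0]]
train_y = [0, 1, 0, 1]
query = [0.15, 250.0]  # small feature 0, so class 0 is the intended answer

# Without scaling, the large-range noise feature dominates the distance.
raw_pred = nearest_label(train_x, train_y, query)

# With scaling, both features contribute on a comparable footing.
lo, hi = fit_min_max(train_x)
scaled_train = [scale(row, lo, hi) for row in train_x]
scaled_pred = nearest_label(scaled_train, train_y, scale(query, lo, hi))

print(raw_pred, scaled_pred)
```

The scaling parameters are computed from the training rows only and then reused for the query, which mirrors the usual practice of fitting preprocessing on training data.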

Using Machine Learning to Build a Classification Model for IoT Networks to Detect Attack Signatures

International Journal of Computer Networks & Communications ◽

10.5121/ijcnc.2020.12607 ◽

2020 ◽

Vol 12 (6) ◽

pp. 99-116

Author(s):

Mousa Al-Akhras ◽

Mohammed Alawairdhi ◽

Ali Alkoudari ◽

Samer Atawneh

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Denial Of Service ◽

Learning Algorithms ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Classification Model ◽

Security And Privacy ◽

K Nearest Neighbors ◽

Detection Model

The Internet of Things (IoT) has led to several security threats and challenges within society. Regardless of the benefits it has brought, the IoT could compromise the security and privacy of individuals and companies at various levels. Denial of Service (DoS) and Distributed DoS (DDoS) attacks, among others, are the most common attack types facing IoT networks. To counter such attacks, companies should implement an efficient classification/detection model, which is not an easy task. This paper proposes a classification model to examine the effectiveness of several machine learning algorithms, namely Random Forest (RF), k-Nearest Neighbors (KNN), and Naïve Bayes. The algorithms are used to detect attacks on the UNSW-NB15 benchmark dataset, which contains both normal and malicious network traffic instances. The experimental results reveal that the RF and KNN classifiers give the best performance, with an accuracy of 100% (without noise injection) and 99% (with 10% noise filtering), while the Naïve Bayes classifier gives the worst performance, with an accuracy of 95.35% without noise and 82.77% with 10% noise. Other evaluation metrics, such as precision and recall, also show the effectiveness of the RF and KNN classifiers over Naïve Bayes.
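For reference, accuracy, precision, and recall for a two-class attack detector can be computed from the confusion counts as sketched below. The function name `binary_metrics` and the label vectors are made up for illustration and are unrelated to the UNSW-NB15 results reported above.

```python
def binary_metrics(y_true, y_pred, positive="attack"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # attacks caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    fn = sum(t == positive and p != positive for t, p in pairs)  # attacks missed
    tn = sum(t != positive and p != positive for t, p in pairs)  # normal traffic passed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels: 3 attack records and 5 normal records.
y_true = ["attack", "attack", "attack", "normal", "normal", "normal", "normal", "normal"]
y_pred = ["attack", "attack", "normal", "attack", "normal", "normal", "normal", "normal"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision and recall are worth reporting next to accuracy in intrusion detection because attack and normal traffic are often imbalanced, and accuracy alone can hide a detector that misses most attacks.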

Classification of Student Performance Dataset using Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b1114.1292s219 ◽

2019 ◽

Vol 9 (2S2) ◽

pp. 752-757

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Student Performance ◽

Performance Metrics ◽

Naive Bayes ◽

Research Work ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector

The aim of this research work is to identify an efficient machine learning algorithm for predicting the behavior of a student from the student performance dataset. We applied Support Vector Machines, K-Nearest Neighbor, Decision Tree, and Naïve Bayes algorithms to predict the grade of a student and compared their prediction results in terms of various performance metrics. Students who visited many resources for reference, took part in academic discussions and interactions in the classroom, were absent for few days, and were cared for by their parents showed great improvement in the final grade. Among the machine learning techniques we used, SVM was the most accurate in terms of four important attributes. The accuracy of SVM after tuning is 0.80. KNN and decision tree achieve accuracies of 0.64 and 0.65, respectively, whereas Naïve Bayes achieves 0.77.

Analysis of Machine Learning Techniques for Anomaly-Based Intrusion Detection

International Journal of Distributed Artificial Intelligence ◽

10.4018/ijdai.2020010102 ◽

2020 ◽

Vol 12 (1) ◽

pp. 20-38

Author(s):

Winfred Yaokumah ◽

Isaac Wiafe

Keyword(s):

Machine Learning ◽

Random Forest ◽

Intrusion Detection ◽

Decision Tree ◽

Naive Bayes ◽

Weighted Average ◽

Absolute Error ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Machine Learning Techniques

Determining the machine learning (ML) technique that performs best on new datasets is an important factor in the design of effective anomaly-based intrusion detection systems. This study therefore evaluated four machine learning algorithms (naïve Bayes, k-nearest neighbors, decision tree, and random forest) on the UNSW-NB15 dataset for intrusion detection. The experimental results showed that random forest and decision tree classifiers are effective for detecting intrusion. Random forest had the highest weighted average accuracy of 89.66% and a mean absolute error (MAE) of 0.0252, whereas decision tree recorded 89.20% and 0.0242, respectively. The naïve Bayes classifier had the worst results on the dataset, with 56.43% accuracy and an MAE of 0.0867. However, contrary to existing knowledge, naïve Bayes was observed to be potent in classifying backdoor attacks, and it performed relatively well in classes where tree-based classifiers demonstrated abysmal performance.

Sentiment Analysis using various Machine Learning and Deep Learning Techniques

Journal of the Nigerian Society of Physical Sciences ◽

10.46481/jnsps.2021.308 ◽

2021 ◽

pp. 385-394

Author(s):

V Umarani ◽

A Julian ◽

J Deepa

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Sentiment Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Analysis Process ◽

Learning Techniques

Sentiment analysis has gained a lot of attention from researchers in the last year because it has been widely applied to a variety of application domains such as business, government, education, sports, tourism, biomedicine, and telecommunication services. Sentiment analysis is an automated computational method for studying or evaluating sentiments, feelings, and emotions expressed as comments, feedback, or critiques. The process can be automated using machine learning techniques, which analyse text patterns faster. Supervised machine learning is the most widely used mechanism for sentiment analysis. The proposed work discusses the flow of the sentiment analysis process and investigates common supervised machine learning techniques, such as multinomial naïve Bayes, Bernoulli naïve Bayes, logistic regression, support vector machine, random forest, K-nearest neighbor, and decision tree, as well as deep learning techniques such as Long Short-Term Memory and Convolutional Neural Network. The work examines these learning methods on a standard data set. The experimental results show the performance of the classifiers in terms of precision, recall, F1-score, ROC curve, accuracy, running time, and k-fold cross-validation. They help in appreciating the novelty of the deep learning techniques and give users an overview for choosing the right technique for their application.

Predicting Student’s Performance Using Machine Learning Algorithm

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-1209 ◽

2021 ◽

pp. 53-58

Author(s):

Sheela Rani P ◽

Dhivya S ◽

Dharshini Priya M ◽

Dharmila Chowdary A

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Prediction Model ◽

Naive Bayes ◽

Learning Algorithm ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Approaches ◽

K Nearest Neighbors

Machine learning is a new research discipline that uses knowledge to improve learning, optimizing the training method and developing the environment within which learning happens. There are two sorts of machine learning approaches, supervised and unsupervised, which are used to extract the knowledge that helps decision-makers take correct intervention in the future. This paper introduces a prediction model for the factors that influence students' academic performance, using supervised machine learning algorithms such as support vector machine, KNN (k-nearest neighbors), Naïve Bayes, and logistic regression. The results of the various algorithms are compared, and it is shown that the support vector machine and Naïve Bayes perform well, achieving improved accuracy compared with the other algorithms. The final prediction model in this paper may have fairly high prediction accuracy. The objective is not just to predict the future performance of students but also to provide the best technique for finding the most impactful features that influence students while studying.

Cyber Bullying Detection for Twitter Using ML Classification Algorithms

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38701 ◽

2021 ◽

Vol 9 (11) ◽

pp. 24-29

Author(s):

Muskan Patidar

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language ◽

Naive Bayes ◽

Learning Algorithms ◽

Naïve Bayes ◽

Cyber Bullying ◽

Machine Learning Algorithms ◽

Support Vector ◽

Classification Algorithms

Abstract: Social networking platforms have given us more opportunities than ever before, and their benefits are undeniable. Despite these benefits, people may be humiliated, insulted, bullied, and harassed by anonymous users, strangers, or peers. Cyberbullying refers to the use of technology to humiliate and slander other people. It takes the form of hate messages sent through social media and emails. With the exponential increase in social media users, cyberbullying has emerged as a form of bullying through electronic messages. We propose a possible solution to this problem: our project aims to detect cyberbullying in tweets using ML classification algorithms such as Naïve Bayes, KNN, Decision Tree, Random Forest, and Support Vector, and we also apply NLTK (Natural Language Toolkit) unigram, bigram, trigram, and n-gram features to Naïve Bayes to check its accuracy. Finally, we compare the results of the proposed and baseline features with other machine learning algorithms. The findings of the comparison indicate the significance of the proposed features in cyberbullying detection. Keywords: Cyber bullying, Machine Learning Algorithms, Twitter, Natural Language Toolkit

Preliminary Screening of COVID-19 Infection Employing Machine Learning Techniques From Simple Blood Profile

International Journal of Quantitative Structure-Property Relationships ◽

10.4018/ijqspr.2021070103 ◽

2021 ◽

Vol 6 (3) ◽

pp. 35-47

Author(s):

Anirudh Reddy Cingireddy ◽

Robin Ghosh ◽

Supratik Kar ◽

Venkata Melapu ◽

Sravanthi Joginipeli ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Naive Bayes ◽

Albert Einstein ◽

Naïve Bayes ◽

Machine Learning Techniques ◽

Support Vector ◽

Blood Profile ◽

Molecular Tests ◽

Large Populations

Frequent testing of the entire population would help to identify individuals with active COVID-19 and allow us to identify concealed carriers. Molecular tests, antigen tests, and antibody tests are being widely used to confirm COVID-19 in the population. Molecular tests such as the real-time reverse transcription-polymerase chain reaction (rRT-PCR) test take from a minimum of 3 hours to a maximum of 4 days to return results. To overcome this issue, the authors suggest using machine learning and data mining tools to filter large populations at a preliminary level. The ML tools could reduce the testing population size by 20 to 30%. In this study, they used a subset of features from full blood profiles drawn from patients at the Israelita Albert Einstein hospital in Brazil. They used classification models, namely KNN, logistic regression, XGBoost, naïve Bayes, decision tree, random forest, support vector machine, and multilayer perceptron, and validated them with k-fold cross-validation. Naïve Bayes, KNN, and random forest stand out as the most predictive ones, with 88% accuracy each.
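The k-fold cross-validation mentioned above can be sketched in a few lines. The version below is a simplified illustration rather than the study's procedure: it splits the samples into contiguous folds without shuffling or stratification, and it uses a majority-class baseline as a stand-in for a real classifier. The helper names and the ten toy labels are hypothetical.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, non-overlapping folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # first folds absorb the remainder
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(labels, k):
    """Return the per-fold accuracy of a majority-class baseline."""
    accuracies = []
    for train, test in k_fold_indices(len(labels), k):
        # "Training" here just means finding the most common training label.
        majority = Counter(labels[i] for i in train).most_common(1)[0][0]
        correct = sum(labels[i] == majority for i in test)
        accuracies.append(correct / len(test))
    return accuracies


# Toy labels (1 = positive, 0 = negative); invented for illustration only.
labels = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
fold_accuracies = cross_validate(labels, k=5)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(fold_accuracies, mean_accuracy)
```

Each sample is used for testing exactly once, so the mean over folds gives a less optimistic estimate than evaluating on the training data. With real data, shuffling or stratifying before splitting is usually advisable.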

Intelligent Malware Detection Using Deep Dilated Residual Networks for Cyber Security

Research Anthology on Artificial Intelligence Applications in Security ◽

10.4018/978-1-7998-7705-9.ch050 ◽

2021 ◽

pp. 1085-1099

Author(s):

S. Abijah Roseline ◽

S. Geetha

Keyword(s):

Machine Learning ◽

Cyber Security ◽

Machine Learning Algorithms ◽

Human Interaction ◽

Machine Learning Techniques ◽

Detection Methods ◽

Security Threat ◽

Signature Detection ◽

Learning Techniques ◽

Feature Based

Malware is the most serious security threat, potentially targeting billions of devices such as personal computers and smartphones across the world. Malware classification and detection is a challenging task due to the targeted, zero-day, and stealthy nature of advanced and new malware. Traditional signature detection methods such as antivirus software were effective for detecting known malware. At present, there are various solutions for detecting unknown malware that employ feature-based machine learning algorithms. Machine learning techniques detect known malware effectively but are not optimal and show a low accuracy rate for unknown malware. This chapter explores a novel deep learning model, the deep dilated residual network, for malware image classification. The proposed model showed a higher accuracy of 98.50% and 99.14% on the Kaggle Malimg and BIG 2015 datasets, respectively. New malware can be handled in real time with minimal human interaction using the proposed deep residual model.
