Detecting the Phishing Website with the Highest Accuracy

TEM Journal ◽  
2021 ◽  
pp. 947-953
Author(s):  
Hesham Abusaimeh

Phishing attacks are increasing, and it has become necessary to adopt appropriate methods for responding to them effectively. This paper aims to uncover phishing sites by analyzing a three-module set, in order to prevent damage and raise awareness of phishing attacks. Based on the analyzed content, a countermeasure is proposed for each type of phishing attack using website features. These features are classified in order to determine the effectiveness of the countermeasure. Finally, the proposed method enhances site security as an anti-phishing technology. Phishing detection uses three classification algorithms (decision tree, support vector machine, and random forest), which are combined into the single system proposed in this paper in order to obtain the highest accuracy in detecting phishing sites. The proposed algorithm achieved an accuracy of 98.52%, higher than that of the other methods.
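The abstract does not state how the three classifiers are combined. One common option is hard majority voting, sketched below in plain Python. The function name `majority_vote` and the label strings are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers for one website.

    `predictions` holds one label per classifier, for example the outputs
    of a decision tree, an SVM, and a random forest (hypothetical inputs).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    # most_common(1) returns [(label, count)]; with three voters and two
    # classes there is always a strict majority.
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical per-classifier outputs for a single URL.
votes = ["phishing", "legitimate", "phishing"]
print(majority_vote(votes))
```

With an odd number of voters and two classes, ties cannot occur, which is one reason a three-model ensemble is a convenient choice for this kind of combination.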

Nowadays, heart disease is a leading cause of death among all diseases. Due to the lack of resources in the medical field, the prediction of heart disease has become a major problem. For early diagnosis and treatment, classification algorithms such as Decision Tree and Random Forest are used. Data mining techniques are used to predict heart disease and to compare the accuracy of these algorithms. The main aim of this paper is to predict heart disease based on the dataset values. In this paper we compare the accuracy of the above two algorithms. To implement these methods, the following steps are used. In the first phase, a dataset of 13 attributes is collected and classified using the Decision Tree and Random Forest algorithms. Finally, the accuracy is recorded for both algorithms. We observed that random forest produces better results than decision tree in the prediction of heart disease.


Author(s):  
M. Nirmala

Abstract: Data mining in educational systems has grown tremendously in the past and is still growing in the present era. This study focuses on the academic standpoint: student performance is evaluated using various parameters, namely scholastic, demographic, and emotional features. Various machine learning methodologies are adopted to extract hidden knowledge from the educational data set provided, which helps identify the features that most affect student academic performance; knowing these features in turn helps yield deeper insights into student performance in academics. The full machine learning workflow, from problem definition to model prediction, is carried out in this study. A supervised learning methodology is adopted, and various feature engineering methods are applied to make the ML model appropriate for training and evaluation. This is a prediction problem, and various classification algorithms, namely Logistic Regression, Random Forest, SVM, KNN, XGBOOST, and Decision Tree, are used to fit the student data appropriately. Keywords: Scholastic, Demographic, Emotional, Logistic Regression, Random Forest, SVM, KNN, XGBOOST, Decision Tree.


2020 ◽  
Vol 3 (1) ◽  
pp. 64-74
Author(s):  
Ahmed A. Elsherif ◽  
Arwa A. Aldaej

One of the major challenges facing the acceptance and growth of business and governmental sites is the botnet-based DDoS attack. A flooding DDoS attack strikes a victim machine by sending a vast amount of malicious traffic, causing a significant drop in quality of service (QoS) in IoT devices. It is not easy to detect and tackle flooding DDoS attacks, owing to the large number of attacking machines, the use of source-address spoofing, and the overlap between legitimate and malicious traffic. New kinds of attacks are identified daily, and some remain undiscovered. Accordingly, this paper aims to improve the classification of network traffic that attackers try to make ambiguous or misleading. Recorded simulated traffic was used for both samples, normal and DDoS attack traffic, with approximately 104,000 cases of each; both datasets, which were created for this study, serve as input data for building a classification model to be used as a tool to mitigate the risk of attack. The next step is to put the datasets into a format suitable for classification. This is done through preprocessing techniques that convert categorical data into numerical data. A classification process is then applied to the captured datasets to create a classification model using five classification algorithms: Decision Tree, Support Vector Machine, Naive Bayes, K-Neighbours, and Random Forest. The core classification code is written in Python and is controlled through a user interface. The highest prediction, precision and accuracy are obtained using the Decision Tree and Random Forest classification algorithms, which also have the lowest processing time.
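The preprocessing step that converts categorical data into numerical data can be illustrated with a simple label encoding. This is a minimal sketch in plain Python; the abstract does not name the encoding the authors used, and the `protocol` values and helper names below are hypothetical.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer code.

    Categories are sorted first so the mapping is reproducible.
    """
    return {category: code for code, category in enumerate(sorted(set(values)))}


def encode(values, mapping):
    """Replace every category with its integer code."""
    return [mapping[value] for value in values]


# Hypothetical categorical column from captured traffic records.
protocols = ["tcp", "udp", "icmp", "tcp"]
mapping = fit_label_encoder(protocols)
codes = encode(protocols, mapping)
print(mapping)
print(codes)
```

Integer codes impose an artificial ordering on the categories. Tree-based models usually tolerate this, while distance-based or margin-based models such as K-nearest neighbours or SVM may behave better with one-hot encoding.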


2014 ◽  
Vol 2014 ◽  
pp. 1-6 ◽  
Author(s):  
Andronicus A. Akinyelu ◽  
Aderemi O. Adewumi

Phishing is one of the major challenges faced by the world of e-commerce today. Because of phishing attacks, billions of dollars have been lost by many companies and individuals. In 2012, an online report put the loss due to phishing attacks at about $1.5 billion. The global impact of phishing attacks will continue to increase and thus requires more efficient phishing detection techniques to curb the menace. This paper investigates and reports the use of the random forest machine learning algorithm in the classification of phishing attacks, with the major objective of developing an improved phishing email classifier with better prediction accuracy and fewer features. From a dataset consisting of 2000 phishing and ham emails, a set of prominent phishing email features (identified from the literature) was extracted and used by the machine learning algorithm, with a resulting classification accuracy of 99.7% and low false negative (FN) and false positive (FP) rates.
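The three figures reported here (accuracy, FP rate, FN rate) all derive from the same confusion-matrix counts. The sketch below shows the standard definitions in plain Python; the function name `binary_rates` and the toy labels are illustrative assumptions and do not reproduce the paper's data.

```python
def binary_rates(y_true, y_pred, positive="phishing"):
    """Compute accuracy, false positive rate, and false negative rate."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive and guess == positive:
            tp += 1
        elif truth == positive:
            fn += 1  # phishing email classified as ham
        elif guess == positive:
            fp += 1  # ham email classified as phishing
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Guard against empty classes instead of dividing by zero.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "fp_rate": ratio(fp, fp + tn),
        "fn_rate": ratio(fn, fn + tp),
    }


# Toy example: 3 phishing and 5 ham emails, with one error of each kind.
y_true = ["phishing"] * 3 + ["ham"] * 5
y_pred = ["phishing", "phishing", "ham", "ham", "ham", "ham", "ham", "phishing"]
rates = binary_rates(y_true, y_pred)
print(rates)
```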


Author(s):  
Nur Sholihah Zaini ◽  
Deris Stiawan ◽  
Mohd Faizal Ab Razak ◽  
Ahmad Firdaus ◽  
Wan Isni Sofiah Wan Din ◽  
...  

With the increasing development of the Internet, more and more applications are deployed on websites that can be accessed directly over the network. This development has attracted attackers who use phishing websites to compromise computer systems. Several solutions have been proposed to detect phishing attacks. However, there is still room for improvement in tackling this threat. This paper aims to investigate and evaluate the effectiveness of the machine learning approach in the classification of phishing attacks. It applies a heuristic approach with a machine learning classifier to identify phishing attacks observed in website applications. The study compares five classifiers to find the best machine learning classifier for detecting phishing attacks. It demonstrates that random forest is able to achieve high detection accuracy, with a true positive rate of 94.79% using website features. The results indicate that random forest is an effective classifier for detecting phishing attacks.


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 268-269
Author(s):  
Jaime Speiser ◽  
Kathryn Callahan ◽  
Jason Fanning ◽  
Thomas Gill ◽  
Anne Newman ◽  
...  

Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism of these methods remains due to lack of reproducibility and difficulty understanding the complex algorithms behind models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability. We discuss the underlying algorithms of decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study. Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. Illustrated in data from the LIFE study, the performance of prediction models for serious fall injury was moderate at best (area under the receiver operating characteristic curve of 0.54 for decision tree and 0.66 for random forest). Machine learning methods may offer improved performance compared to traditional models for modeling outcomes in aging, but their use should be justified and output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.
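The area under the ROC curve quoted above can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition in plain Python; the function name `roc_auc` and the toy scores are illustrative assumptions and are unrelated to the LIFE study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    `labels` are 1 for the event (e.g. a serious fall injury) and 0
    otherwise; `scores` are the model's predicted risks. Ties between a
    positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example with two events and two non-events.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))
```

A value of 0.5 corresponds to ranking no better than chance, which is why 0.54 and 0.66 are described as moderate at best.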


2021 ◽  
Vol 11 (4) ◽  
pp. 1378
Author(s):  
Seung Hyun Lee ◽  
Jaeho Son

It has been pointed out that carrying a heavy object that exceeds a certain weight is a major factor placing physical burden on the musculoskeletal system of workers at construction sites. However, due to the nature of the construction site, where a large number of workers work simultaneously in an irregular space, it is difficult to determine the weight of the object carried by a worker in real time or to keep track of a worker who carries excess weight. This paper proposes a prototype system to track the weight of heavy objects carried by construction workers by developing smart safety shoes with FSR (Force Sensitive Resistor) sensors. The system consists of smart safety shoes with sensors attached, a mobile device for collecting initial sensing data, and a web-based server computer for storing, preprocessing and analyzing such data. The effectiveness and accuracy of the weight tracking system were verified through experiments in which each experimenter lifted a weight from +0 kg to +20 kg in 5 kg increments. The results of the experiment were analyzed by a newly developed machine learning based model, which adopts effective classification algorithms such as decision tree, random forest, gradient boosting algorithm (GBM), and light GBM. The classification algorithms showed similar, high average accuracy in classifying the weight, in the following order: random forest (90.9%), light GBM (90.5%), decision tree (90.3%), and GBM (89%). Overall, the proposed weight tracking system has a 90.2% average accuracy in classifying how much weight each experimenter carries.
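The overall figure of 90.2% is consistent with the unweighted mean of the four per-algorithm accuracies listed above, as the short check below shows. This assumes the authors used a simple average, which the abstract does not state explicitly.

```python
# Per-algorithm accuracies (in percent) as reported in the abstract.
accuracies = {
    "random forest": 90.9,
    "light GBM": 90.5,
    "decision tree": 90.3,
    "GBM": 89.0,
}

# Unweighted arithmetic mean across the four classifiers.
mean_accuracy = sum(accuracies.values()) / len(accuracies)
print(round(mean_accuracy, 1))
```

The exact mean is 90.175, which rounds to 90.2 at one decimal place.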

