A Novel Phishing Email Detection Algorithm based on Multinomial Naive Bayes Classifier and Natural Language Processing

Author(s):  
Omar Abdelaziz ◽  
Sahana Deb ◽  
Rania Hodhod ◽  
Lydia Ray
2021 ◽  
Vol 30 (1) ◽  
pp. 774-792
Author(s):  
Mazin Abed Mohammed ◽  
Dheyaa Ahmed Ibrahim ◽  
Akbal Omran Salman

Abstract: Spam electronic mails (emails) are unwanted, often harmful commercial messages sent to organizations or individuals. Although such emails are frequently used to advertise services and products, they sometimes contain links to malware or to phishing websites through which private information can be stolen. This study shows how an adaptive intelligent learning approach, based on a visual anti-spam model for multiple natural languages, can be used to detect abnormal situations effectively. The approach is applied to spam filtering, where it achieves high performance with a low false detection rate. It works in three main phases to determine whether an email is legitimate, based on knowledge gathered during training. The proposed approach uses two models to identify phishing emails. The first is a new trainable model, based on a Naive Bayes classifier, that identifies the language of the email. It is trained on three languages (Arabic, English and Chinese), and the label it predicts is passed to the second model. The second model, also a Naive Bayes classifier, is trained on two classes (phishing and normal email) for each language and makes the final decision on whether an email is phishing. The proposed strategy is implemented using the Java environment and the JADE agent platform. The AIA learning model was tested on a dataset of 2,000 emails, and the results show that the model accurately detects and filters a wide range of spam emails.
The results of our study suggest that the Naive Bayes classifier performed best when tested on the largest database (overall accuracy of 98.4%, false positive rate of 0.08%, and false negative rate of 2.90%). This indicates that our Naive Bayes classifier should also work well when applied to a real-world database, which is more typical but not the largest.
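The two-stage idea described in the abstract (identify the language first, then apply a per-language phishing classifier) can be illustrated with a minimal Python sketch. This is not the authors' Java/JADE implementation: the `MultinomialNB` class, the helper names, the character-level language features, and the toy training sentences are all assumptions made for illustration, and only an English second-stage model is trained here.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Tiny multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                if tok in self.vocab:  # ignore tokens never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def chars(text):
    """Character tokens: enough to separate scripts in this toy example."""
    return [c for c in text.lower() if not c.isspace()]


def words(text):
    """Whitespace word tokens for the phishing/normal model."""
    return text.lower().split()


# Hypothetical toy data; a real system would train on a labelled corpus.
LANG_TRAIN = [
    ("please verify your account now", "en"),
    ("meeting notes for monday", "en"),
    ("يرجى تأكيد حسابك الآن", "ar"),
    ("ملاحظات الاجتماع ليوم الاثنين", "ar"),
    ("请立即验证您的账户", "zh"),
    ("星期一的会议记录", "zh"),
]
EN_TRAIN = [
    ("verify your account password now", "phishing"),
    ("urgent click link to confirm your bank password", "phishing"),
    ("meeting notes attached for monday", "normal"),
    ("lunch on friday with the team", "normal"),
]

# Stage 1: language identifier. Stage 2: one phishing model per language.
lang_model = MultinomialNB().fit(
    [chars(t) for t, _ in LANG_TRAIN], [y for _, y in LANG_TRAIN]
)
phish_models = {
    "en": MultinomialNB().fit(
        [words(t) for t, _ in EN_TRAIN], [y for _, y in EN_TRAIN]
    ),
    # "ar" and "zh" models would be trained the same way on their own data.
}


def classify_email(text):
    """Return (language, decision); decision is None if no model exists."""
    lang = lang_model.predict(chars(text))
    model = phish_models.get(lang)
    return lang, (model.predict(words(text)) if model else None)
```

Calling `classify_email("please verify your password now")` routes the message to the English model via the predicted language label, which mirrors how the first model's output selects the second model in the described approach.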


2021 ◽  
Vol 12 (03) ◽  
pp. 15-24
Author(s):  
Swetha Sree Cheeti ◽  
Yanyan Li ◽  
Ahmad Hadaegh

The education system has been gravely affected by the worldwide spread of Covid-19. In this paper we present a thorough sentiment analysis of education-related tweets on the Twitter platform and draw conclusions about the pandemic's impact on people's emotions as it advanced over the months. More than ninety thousand tweets about changes to education systems around the world were gathered from Twitter. Sentiment analysis was performed on the gathered dataset using Natural Language Toolkit (NLTK) functionality and a Naive Bayes classifier. Based on the results of this analysis, we show the impact of Covid-19 on education and how people's sentiment changed in response to changes in the education system. We thus aim to provide a better understanding of people's sentiment on education as they tried to cope with the pandemic in such unprecedented times.


JNANALOKA ◽  
2021 ◽  
pp. 81-86
Author(s):  
Sigit Suryono ◽  
Emha Taufiq Luthfi

Sentiment analysis is a branch of text mining. It is an important resource for evaluation and decision-making on a given topic. The main goal of sentiment analysis is to determine the polarity of sentiment: positive, negative, or neutral. One source of such sentiment is Twitter. In this paper, tweets related to the searched keywords were collected from Twitter using the Twitter API, and the raw data were processed with the Natural Language Toolkit in the Python programming language. After processing, the tweets were classified with a Naïve Bayes classifier to determine the accuracy of the classification process. Classification was carried out in RapidMiner. Across four trials, the accuracy was 62.98% in the first trial, 64.95% in the second, 66.36% in the third, and 66.79% in the fourth. The classification yielded 28% positive sentiment, 20% negative sentiment, and 52% neutral sentiment.


2013 ◽  
Vol 3 (2) ◽  
pp. 7-15
Author(s):  
S. Praveena ◽  
S.P. Singh ◽  
I.V. Muralikrishna ◽  
...  

2021 ◽  
Author(s):  
Deniz Ertuncay ◽  
Giovanni Costa

Abstract: Near-fault ground motions may contain impulsive behavior in velocity records. To calculate the probability of occurrence of impulsive signals, a large dataset was collected from various national data providers and strong-motion databases. The dataset includes many parameters that carry information on earthquake physics, ruptured faults, ground motion, and the distance between the station and several parts of the ruptured fault. The relation between these parameters and impulsive signals is calculated. Fault type, moment magnitude, and the distance and azimuth between a site of interest and the surface projection of the ruptured fault are found to be correlated with the impulsiveness of the signals. Separate models are created for strike-slip and non-strike-slip faults using a multivariate naïve Bayes classifier. The naïve Bayes classifier provides the probability of observing impulsive signals. The models have comparable accuracy rates, and they are more consistent across fault types than those of previous studies.
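How a naïve Bayes classifier yields a probability, not just a label, can be sketched in Python as follows. The abstract does not state which likelihood model the authors use, so the Gaussian per-feature likelihoods here are an assumption, as are the function names, the two features (moment magnitude and distance in km), and the made-up training values.

```python
import math
from statistics import mean, pstdev


def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, std) for each class."""
    model = {}
    for cls in set(labels):
        cls_rows = [r for r, y in zip(rows, labels) if y == cls]
        stats = [
            (mean(col), max(pstdev(col), 1e-6))  # floor std to avoid /0
            for col in zip(*cls_rows)
        ]
        model[cls] = (len(cls_rows) / len(rows), stats)
    return model


def posterior(model, x):
    """Return P(class | x), assuming features are independent per class."""
    log_joint = {}
    for cls, (prior, stats) in model.items():
        lp = math.log(prior)
        for value, (mu, sigma) in zip(x, stats):
            lp += -0.5 * math.log(2 * math.pi * sigma ** 2)
            lp -= (value - mu) ** 2 / (2 * sigma ** 2)
        log_joint[cls] = lp
    # Normalise in log space for numerical stability.
    top = max(log_joint.values())
    norm = sum(math.exp(v - top) for v in log_joint.values())
    return {cls: math.exp(v - top) / norm for cls, v in log_joint.items()}


# Hypothetical records: (moment magnitude, distance to fault in km).
ROWS = [(7.1, 5.0), (6.8, 8.0), (7.3, 3.0), (5.9, 40.0), (6.2, 35.0), (6.0, 55.0)]
LABELS = ["impulsive"] * 3 + ["non_impulsive"] * 3

model = fit_gaussian_nb(ROWS, LABELS)
probs = posterior(model, (7.0, 6.0))  # probability per class for a new site
```

In this sketch a separate `model` would be fitted for each fault-type group, in the spirit of the separate strike-slip and non-strike-slip models the abstract mentions.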

