Detection of Fake Profiles on Twitter Using Hybrid SVM Algorithm

2021 ◽  
Vol 309 ◽  
pp. 01046
Author(s):  
Sarangam Kodati ◽  
Kumbala Pradeep Reddy ◽  
Sreenivas Mekala ◽  
PL Srinivasa Murthy ◽  
P Chandra Sekhar Reddy

Online social networks (OSNs) are an emerging communication medium that enables the establishment and management of social relationships among huge numbers of users. The rapid growth of OSNs and the large amount of subscriber personal data they hold have attracted attackers, who spread malicious activity, share false news, and even steal personal data. Twitter is one of the biggest micro-blogging social network platforms: more than half a billion tweets are posted daily, many of which involve malware activity. Analysing who is promoting threats in social networks requires classifying user profiles. Traditional classification methods for detecting fake profiles on social networks need to improve their classification accuracy; this paper therefore focuses on machine learning algorithms and proposes the detection of fake profiles on Twitter using a hybrid Support Vector Machine (SVM) algorithm. The machine-learning-based hybrid SVM algorithm classifies Twitter accounts as fake or genuine, applying dimension reduction and feature selection techniques to detect bots. The proposed hybrid SVM algorithm uses a small number of features and correctly classifies 98% of the accounts.
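The pipeline the abstract describes (feature selection, then dimension reduction, then an SVM) maps naturally onto scikit-learn primitives. The sketch below is an illustrative reconstruction, not the authors' code: the data is synthetic and the feature counts are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in for profile features (follower counts, tweet rate, etc.).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           random_state=42)  # y: 0 = genuine, 1 = fake
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=42)

hybrid_svm = Pipeline([
    ("select", SelectKBest(f_classif, k=12)),  # keep the most relevant features
    ("reduce", PCA(n_components=6)),           # compress them further
    ("svm", SVC(kernel="rbf", C=1.0)),         # final SVM classifier
])
hybrid_svm.fit(X_train, y_train)
accuracy = hybrid_svm.score(X_test, y_test)
print(f"accuracy: {accuracy:.2f}")
```

The "less number of features" claim corresponds to the `SelectKBest` and `PCA` stages, which shrink the input before the SVM ever sees it.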

Cybersecurity ◽  
2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Md. Shafiur Rahman ◽  
Sajal Halder ◽  
Md. Ashraf Uddin ◽  
Uzzal Kumar Acharjee

Abstract: Anomaly detection has been an essential and dynamic research area in data mining. A wide range of applications, including social media platforms, have adopted state-of-the-art methods to identify anomalies and ensure users' security and privacy. A social network is a forum used by different groups of people to express their thoughts, communicate with each other, and share content. Social networks also facilitate abnormal activities: the spread of fake news, rumours, misinformation, unsolicited messages, and propaganda posts with malicious links. Detecting such abnormalities is therefore an important data analysis task for identifying normal and abnormal users on social networks. In this paper, we develop a hybrid anomaly detection method named DT-SVMNB that cascades several machine learning algorithms, including a decision tree (C5.0), a Support Vector Machine (SVM), and a Naïve Bayesian classifier (NBC), to classify normal and abnormal users in social networks. We extract a list of unique features derived from users' profiles and contents. Using two kinds of dataset with the selected features, the proposed machine learning model, DT-SVMNB, is trained. Our model classifies users in the social network as depressed or suicidal. We conducted experiments on our model using synthetic and real datasets from social networks. The performance analysis demonstrates around 98% accuracy, which shows the effectiveness and efficiency of our proposed system.
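One plausible reading of the DT-SVMNB cascade is that each stage keeps only its confident predictions and passes the rest along. The sketch below illustrates that idea on synthetic data; it is an assumption about the design, not the paper's implementation, and scikit-learn's CART decision tree stands in for C5.0, which scikit-learn does not provide.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1200, n_features=15, n_informative=6,
                           random_state=0)  # y: 0 = normal, 1 = abnormal
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Stage models: decision tree, then SVM, with naive Bayes as the final arbiter.
dt = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)
svm = SVC(probability=True, random_state=0).fit(X_tr, y_tr)
nb = GaussianNB().fit(X_tr, y_tr)

def cascade_predict(X, threshold=0.9):
    """Accept a stage's prediction only if it is confident; else fall through."""
    out = np.empty(len(X), dtype=int)
    for i, x in enumerate(X):
        x = x.reshape(1, -1)
        for model in (dt, svm):
            proba = model.predict_proba(x)[0]
            if proba.max() >= threshold:
                out[i] = proba.argmax()
                break
        else:  # neither early stage was confident
            out[i] = nb.predict(x)[0]
    return out

accuracy = (cascade_predict(X_te) == y_te).mean()
print(f"cascade accuracy: {accuracy:.2f}")
```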


2020 ◽  
Vol 10 (15) ◽  
pp. 5047 ◽  
Author(s):  
Viet-Ha Nhu ◽  
Danesh Zandi ◽  
Himan Shahabi ◽  
Kamran Chapi ◽  
Ataollah Shirzadi ◽  
...  

This paper aims to apply and compare the performance of three machine learning algorithms, namely support vector machine (SVM), Bayesian logistic regression (BLR), and alternating decision tree (ADTree), to map landslide susceptibility along the mountainous road of the Salavat Abad saddle, Kurdistan province, Iran. Based on field surveys, we identified 66 shallow landslide locations, recording them with a global positioning system (GPS) receiver, Google Earth imagery, and black-and-white aerial photographs (scale 1:20,000), and compiled 19 landslide conditioning factors, which we then tested using the information gain ratio (IGR) technique. We checked the validity of the models using statistical metrics, including sensitivity, specificity, accuracy, kappa, root mean square error (RMSE), and area under the receiver operating characteristic curve (AUC). We found that, although all three machine learning algorithms yielded excellent performance, the SVM algorithm (AUC = 0.984) slightly outperformed the BLR (AUC = 0.980) and ADTree (AUC = 0.977) algorithms. We observed that all three algorithms are useful and effective tools for identifying shallow landslide-prone areas, and that the BLR algorithm can be used, like the SVM algorithm, as a soft computing benchmark to check the performance of models in the future.
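A model comparison by AUC, as in the study above, can be sketched as follows. Synthetic data stands in for the 19 conditioning factors, and scikit-learn's logistic regression and decision tree stand in for BLR and ADTree, neither of which scikit-learn implements.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for landslide conditioning factors (slope, aspect, ...).
X, y = make_classification(n_samples=500, n_features=19, n_informative=10,
                           random_state=1)  # y: 1 = landslide, 0 = non-landslide
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

models = {
    "SVM": SVC(probability=True, random_state=1),
    "LR (stand-in for BLR)": LogisticRegression(max_iter=1000),
    "DT (stand-in for ADTree)": DecisionTreeClassifier(max_depth=4, random_state=1),
}
aucs = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    aucs[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {aucs[name]:.3f}")
```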


The glass industry is considered one of the most important industries in the world. Glass is used everywhere, from water bottles to X-ray and gamma-ray protection. It is a non-crystalline, amorphous solid that is most often transparent. Glass has many uses, and during the investigation of a crime scene, investigators need to know what type of glass is present. To determine the type of glass, we use an online dataset and machine learning. We apply ML algorithms such as the Artificial Neural Network (ANN), K-nearest neighbours (KNN), Support Vector Machine (SVM), Random Forest, and Logistic Regression algorithms. Comparing all the algorithms, Random Forest performed best in glass classification.
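A minimal sketch of such an algorithm comparison using cross-validation is shown below. Synthetic data stands in for the real dataset (the widely used UCI glass dataset has nine chemical features and six glass types, which the shapes here mimic); the ranking on synthetic data will not necessarily match the paper's.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for nine chemical features (RI, Na, Mg, ...) and six glass types.
X, y = make_classification(n_samples=600, n_features=9, n_informative=6,
                           n_classes=6, n_clusters_per_class=1, random_state=7)

scores = {}
for name, model in {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=7),
}.items():
    scores[name] = cross_val_score(model, X, y, cv=5).mean()  # 5-fold CV accuracy
    print(f"{name}: {scores[name]:.3f}")
```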


2021 ◽  
Vol 11 (20) ◽  
pp. 9487
Author(s):  
Mohammed Al-Sarem ◽  
Faisal Saeed ◽  
Zeyad Ghaleb Al-Mekhlafi ◽  
Badiea Abdulkarem Mohammed ◽  
Mohammed Hadwan ◽  
...  

The widespread usage of social media has led to the increasing popularity of online advertisements, which have been accompanied by a disturbing spread of clickbait headlines. Clickbait dissatisfies users because the article content does not match their expectations. Detecting clickbait posts in online social networks is an important task in fighting this issue. Clickbait posts use phrases that are mainly posted to attract a user's attention and induce a click on a specific fake link/website; such headlines use misleading titles that can hide important information about the target website. It is very difficult to recognize clickbait headlines manually, so there is a need for an intelligent method to detect clickbait and fake advertisements on social networks. Several machine learning methods have been applied for this detection purpose. However, the obtained performance (accuracy) only reached 87% and still needs to be improved. In addition, most of the existing studies were conducted on English headlines and contents; few studies focused specifically on detecting clickbait headlines in Arabic. Therefore, this study constructed the first Arabic clickbait headline news dataset and presents an improved multiple-feature-based approach for detecting clickbait news on social networks in the Arabic language. The proposed approach includes three main phases: data collection, data preparation, and machine learning model training and testing. The collected dataset included 54,893 Arabic news items from Twitter (after pre-processing). Among these, 23,981 were clickbait news (43.69%) and 30,912 were legitimate news (56.31%). The dataset was pre-processed and the most important features were selected using the ANOVA F-test. Several machine learning (ML) methods were then applied with hyper-parameter tuning to find the optimal settings.
Finally, the ML models were evaluated, and the overall performance is reported in this paper. The experimental results show that the Support Vector Machine (SVM) with the top 10% of ANOVA F-test features (user-based features (UFs) and content-based features (CFs)) obtained the best performance, achieving 92.16% detection accuracy.
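The "top 10% of ANOVA F-test features feeding an SVM" step maps directly onto scikit-learn's `SelectPercentile(f_classif)`. The sketch below uses synthetic numeric features as a stand-in for the user-based and content-based features; it illustrates the mechanism, not the study's results.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectPercentile, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in for UF/CF clickbait features extracted from headlines.
X, y = make_classification(n_samples=2000, n_features=100, n_informative=10,
                           random_state=3)  # y: 1 = clickbait, 0 = legitimate
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=3)

clf = Pipeline([
    ("anova", SelectPercentile(f_classif, percentile=10)),  # keep top 10% of features
    ("svm", SVC(kernel="linear", C=1.0)),
])
clf.fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
print(f"accuracy with top-10% ANOVA features: {accuracy:.2%}")
```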


2019 ◽  
Vol 72 (6) ◽  
pp. 431-437 ◽  
Author(s):  
Laura Bigorra ◽  
Iciar Larriba ◽  
Ricardo Gutiérrez-Gallego

Aims: Red blood cell (RBC) lysis resistance interferes with the white blood cell (WBC) count and differential; still, its detection relies on the identification of an abnormal scattergram, which is not clearly signalled by specific flags on the Beckman-Coulter DXH-800. The aims were to analyse precisely the effect of RBC lysis resistance interference on WBC counts, differentials and cell population data (CPD), and then to design, develop and implement a novel diagnostic machine learning (ML) model to optimise the detection of samples presenting this phenomenon.
Methods: WBC counts, differentials and CPD from 232 patients (anaemia or liver disease) were compared with 100 healthy controls (HC) using analysis of variance. The data were analysed after a corrective action, and the analyser differentials were also compared with the digital leucocyte differentials. The ML support vector machine (SVM) algorithm was trained with 70% of the samples (n=233) and the remaining 30% (n=99) were employed exclusively during the validation phase.
Results: We identified that the impedance WBC count was not affected by the RBC lysis resistance interference, while the DXH-800 differentials overestimated lymphoid subpopulations (17.6%), sometimes even yielding spurious lymphocytosis; the latter were corrected when sample dilution was performed. The ML-SVM algorithm allowed the classification of the pathological groups compared with HC, with validation accuracies of 97.98%, 100% and 88.78% for the global, anaemia and liver disease groups, respectively.
Conclusions: The proposed algorithm has impressive discriminatory potential, and its application would be a valuable support system for detecting spurious results due to RBC lysis resistance.
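The 70%/30% train/validation protocol with global and per-group accuracies can be sketched as follows. The data is synthetic (only the sample sizes mirror the study) and the three-class framing (healthy control, anaemia, liver disease) is an assumption made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for CPD features; classes: 0 = HC, 1 = anaemia, 2 = liver disease.
X, y = make_classification(n_samples=332, n_features=12, n_informative=8,
                           n_classes=3, n_clusters_per_class=1, random_state=4)
# 70% training / 30% validation, mirroring the study design.
X_tr, X_va, y_tr, y_va = train_test_split(X, y, train_size=0.7, random_state=4,
                                          stratify=y)

# Scaling matters for SVMs when features live on very different scales.
svm = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr)
pred = svm.predict(X_va)

accuracy = (pred == y_va).mean()
print(f"global validation accuracy: {accuracy:.2%}")
for cls in np.unique(y_va):
    acc = (pred[y_va == cls] == cls).mean()
    print(f"class {cls} accuracy: {acc:.2%}")
```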


2021 ◽  
Vol 15 (23) ◽  
pp. 136-147
Author(s):  
Hajar A. Alharbi ◽  
Hessa I. Alshaya ◽  
Meshaiel M. Alsheail ◽  
Mukhlisah H. Koujan

Graduation projects (GPs) are important because they reflect the academic profile and achievement of the students. For many years, graduation projects have been completed by information technology department students. Most of these projects have great value, and some were published in scientific journals and international conferences. However, these projects are stored haphazardly in an archive room, with only a small subset kept as electronic PDF files on a hard disk, which wastes time and effort and prevents anyone from benefiting from them. There is no system to classify and store these projects in a way that makes them useful. In this paper, we review some of the best machine learning algorithms for classifying the text of graduation projects, namely the support vector machine (SVM), logistic regression (LR), and random forest (RF) algorithms, which can deal with an extremely small dataset. After comparing these algorithms based on accuracy, we chose the SVM algorithm to classify the projects. We also discuss how to deal with a very small dataset and solve this problem.
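For an extremely small labelled set, leave-one-out cross-validation is a common way to squeeze a usable accuracy estimate out of the data. A sketch with invented toy project titles (not the authors' data; real inputs would be full titles or abstracts):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in for project texts; real labels would be project topic areas.
titles = [
    "deep learning model for image classification",
    "convolutional neural network for object detection",
    "secure web application with encryption",
    "network intrusion detection and firewall design",
    "mobile app for student attendance tracking",
    "android application for campus navigation",
]
labels = ["AI", "AI", "security", "security", "mobile", "mobile"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
# Leave-one-out CV: train on n-1 samples, test on the held-out one, repeat.
score = cross_val_score(clf, titles, labels, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {score:.2f}")
```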


Author(s):  
Faraz Ahmad ◽  
S. A. M. Rizvi

Twitter, one of the most influential social media platforms, facilitates the spreading of information in the form of text, images, and videos. However, the credibility of posted content is still trailed by a question mark. Introduction: In this paper, a model has been developed for finding a user's credibility based on the tweets they have posted on the Twitter social network. The model consists of machine learning algorithms that assist not only in categorizing tweets into credibility classes but also in finding a user's credibility rating on the platform. Methods and results: The dataset and associated features of 100,000 tweets were extracted and pre-processed. Furthermore, the credibility class labelling of tweets was performed by four different human annotators. The MeaningCloud and Natural Language Understanding platforms were used to calculate polarity, sentiment, and emotion scores. The K-Means algorithm was applied to find clusters of tweets based on the feature set, whereas random forest, support vector machine, naïve Bayes, K-nearest neighbours (KNN), J48 decision tree, and multilayer perceptron classifiers were used to classify the tweets into credibility classes. All the classifiers provided a significant level of accuracy, precision, and recall for all the given credibility classes.
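The K-Means step for grouping tweets by their feature vectors can be sketched as below, with synthetic blobs standing in for the polarity/sentiment/emotion scores; the cluster count of four is an arbitrary assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for per-tweet feature vectors (polarity, sentiment, emotions).
X, _ = make_blobs(n_samples=300, centers=4, n_features=5, random_state=5)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=5).fit(X)
cluster_sizes = np.bincount(kmeans.labels_)  # how many tweets fell in each cluster
print("cluster sizes:", cluster_sizes)
```

In the study's pipeline, cluster membership like `kmeans.labels_` would feed downstream classifiers alongside the raw features.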


Sensors ◽  
2021 ◽  
Vol 21 (10) ◽  
pp. 3559
Author(s):  
Rana Massoud ◽  
Riccardo Berta ◽  
Stefan Poslad ◽  
Alessandro De Gloria ◽  
Francesco Bellotti

Internet of Things technologies are spurring new types of instructional games, namely reality-enhanced serious games (RESGs), that support training directly in the field. This paper investigates a key feature of RESGs, i.e., user performance evaluation using real data, and studies an application of RESGs for promoting fuel-efficient driving, using fuel consumption as an indicator of driver performance. In particular, we propose a reference model for supporting a novel smart sensing dataflow involving the combination of two modules, based on machine learning, to be employed in RESGs in parallel and in real time. The first module concerns quantitative performance assessment, while the second one targets verbal recommendation. For the assessment module, we compared the performance of three well-established machine learning algorithms: support vector regression, random forest and artificial neural networks. The experiments show that random forest achieves a slightly better performance assessment correlation than the others but requires a higher inference time. The instant recommendation module, implemented using fuzzy logic, triggers advice when inefficient driving patterns are detected. The dataflow has been tested with data from the enviroCar public dataset, exploiting on-board diagnostics II (OBD-II) standard vehicular interface information. The data covers various driving environments and vehicle models, which makes the system robust for real-world conditions. The results show the feasibility and effectiveness of the proposed approach, attaining a high estimation correlation (R2 = 0.99, with random forest) and prompt verbal feedback to the driver. An important word of caution concerns users' privacy, as the modules rely on sensitive personal data, and provide information that by no means should be misused.
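The random forest performance-assessment module can be sketched as a regression, with synthetic signals standing in for the enviroCar OBD-II data (speed, RPM, throttle position, and so on); the R2 here says nothing about the paper's 0.99 result.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for OBD-II signals (speed, RPM, throttle, ...) vs. fuel use.
X, y = make_regression(n_samples=1000, n_features=8, noise=5.0, random_state=9)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=9)

rf = RandomForestRegressor(n_estimators=200, random_state=9).fit(X_tr, y_tr)
r2 = r2_score(y_te, rf.predict(X_te))  # estimation quality on held-out trips
print(f"R2 = {r2:.3f}")
```

In the proposed dataflow, this estimate would run in parallel with the fuzzy-logic recommendation module, which the sketch does not attempt to reproduce.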


The tremendous rise of technology and social media sites has enabled everyone to express and share their thoughts and feelings with millions of people around the world. Online social networks such as Google+, Instagram, Facebook, Twitter, and LinkedIn have become significant media for communication. With these sites, users can generate, send, and receive data among large numbers of people. Along with the advantages, these platforms have some issues regarding user safety, such as the creation and sharing of suicidal thoughts. Therefore, in this paper we report on the performance of five machine learning algorithms, namely Support Vector Machine, Random Forest, Decision Tree, Naïve Bayes, and Prism, with the aim of identifying and classifying suicide-related text on Twitter and contributing to research on suicide ideation on communication networks. Firstly, these algorithms identify the most worrying tweets, such as expressions of suicide ideation and reports of suicidal thoughts, and also detect flippant references to suicide. Along with the ML classifiers, one of the most powerful NLP techniques, opinion summarization, is used to classify suicidal and non-suicidal tweets. The outcome of the analysis indicates that the Prism classifier achieved the best accuracy in observing people's emotions and extracting suicide-related information from Twitter compared with the other machine learning algorithms.
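A minimal sketch of flagging concerning tweets as a text-classification problem follows. The examples are invented purely for illustration, and multinomial naïve Bayes stands in for the classifiers above, since Prism (a rule-induction algorithm) has no scikit-learn implementation; a real study would use a properly collected and annotated corpus.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy examples for illustration only.
tweets = [
    "I can't go on anymore, nothing matters",
    "I want to end it all tonight",
    "feeling hopeless and alone these days",
    "great game last night, what a win",
    "loving this sunny weather today",
    "just finished a good book, highly recommend",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = concerning, 0 = not concerning

clf = make_pipeline(CountVectorizer(), MultinomialNB()).fit(tweets, labels)
pred = clf.predict(["I feel so hopeless and alone"])[0]
print("flagged" if pred == 1 else "ok")
```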

