Determining an Optimal Data Classification Model for Credibility-Based Fake News Detection

Author(s):  
Amit Neil Ramkissoon ◽  
Shareeda Mohammed ◽  
Wayne Goodridge

Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 556
Author(s):  
Thaer Thaher ◽  
Mahmoud Saheb ◽  
Hamza Turabieh ◽  
Hamouda Chantar

Fake or false information on social media platforms is a significant challenge: it deliberately misleads users through rumors, propaganda, or deceptive claims about a person, organization, or service. Twitter is one of the most widely used social media platforms, especially in the Arab region, where the number of users is steadily increasing, accompanied by an increase in the rate of fake news. This has drawn researchers' attention to providing a safe online environment free of misleading information. This paper proposes a smart classification model for the early detection of fake news in Arabic tweets utilizing Natural Language Processing (NLP) techniques, Machine Learning (ML) models, and the Harris Hawks Optimizer (HHO) as a wrapper-based feature selection approach. An Arabic Twitter corpus composed of 1862 previously annotated tweets was used to assess the efficiency of the proposed model. The Bag of Words (BoW) model with different term-weighting schemes is used for feature extraction. Eight well-known learning algorithms are investigated with varying combinations of features, including user-profile, content-based, and word features. Reported results show that Logistic Regression (LR) with Term Frequency-Inverse Document Frequency (TF-IDF) features achieves the best rank. Moreover, feature selection based on the binary HHO algorithm plays a vital role in reducing dimensionality, thereby enhancing the learning model's performance for fake news detection. Interestingly, the proposed BHHO-LR model yields an improvement of about 5% compared with previous works on the same dataset.
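As a rough illustration of the TF-IDF plus Logistic Regression baseline described above, the sketch below uses scikit-learn. It is not the authors' exact pipeline: the corpus loading, preprocessing, and the HHO feature-selection step are omitted, and the inputs are assumed to be raw tweet texts with binary labels (1 = fake, 0 = credible).

```python
# Minimal sketch of a TF-IDF + Logistic Regression tweet classifier.
# Corpus loading, Arabic-specific preprocessing, and BHHO feature selection
# are assumed to happen elsewhere.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report


def train_tfidf_lr(tweets, labels):
    """Train and evaluate a TF-IDF + LR fake-news classifier."""
    X_train, X_test, y_train, y_test = train_test_split(
        tweets, labels, test_size=0.2, random_state=42, stratify=labels)

    # Bag-of-Words representation weighted with TF-IDF
    vectorizer = TfidfVectorizer(max_features=5000)
    X_train_vec = vectorizer.fit_transform(X_train)
    X_test_vec = vectorizer.transform(X_test)

    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train_vec, y_train)
    print(classification_report(y_test, clf.predict(X_test_vec)))
    return vectorizer, clf
```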


Author(s):  
Sakshi Kaushal ◽  
Bala Buksh

Cloud computing is one of the most popular terms among enterprises and in the news. The concept has become practical because of fast Internet bandwidth and advanced cooperation technology. Resources on the cloud can be accessed through the Internet without building one's own infrastructure, so cloud computing must manage security in cloud applications effectively. Data classification is a machine learning technique used to predict the class of unclassified data. Data mining uses different tools to discover unknown, valid patterns and relationships in a dataset; these tools include mathematical algorithms, statistical models, and Machine Learning (ML) algorithms. In this paper, the authors use an improved Bayesian technique to classify the data and encrypt the sensitive data using hybrid steganography. The encrypted and non-encrypted sensitive data are sent to the cloud environment, and the parameters are evaluated with different encryption algorithms.
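A minimal sketch of the classify-then-protect idea is given below, using a standard (not the paper's "improved") Naive Bayes classifier to flag records as sensitive before upload. The feature representation, labels, and the hybrid steganography step are assumptions and are only stubbed here.

```python
# Sketch: Naive Bayes flags sensitive records; only those would be
# encrypted/hidden before being sent to the cloud.
from sklearn.naive_bayes import GaussianNB


def split_sensitive(features, labels, new_records):
    """Train on labelled records (1 = sensitive, 0 = non-sensitive) and
    partition new records accordingly."""
    model = GaussianNB()
    model.fit(features, labels)
    predictions = model.predict(new_records)

    sensitive = [r for r, p in zip(new_records, predictions) if p == 1]
    public = [r for r, p in zip(new_records, predictions) if p == 0]
    # The sensitive partition would then go through the protection step
    # (e.g. the hybrid steganography described above) before upload.
    return sensitive, public
```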


In recent years, deep learning models have become a significant research area because of their applicability in diverse domains. In this paper, we employ an optimal deep neural network (DNN) based model for classifying diabetes disease. The DNN diagnoses patient disease effectively with better performance. To further improve classifier efficiency, a multilayer perceptron (MLP) is employed to remove misclassified instances from the dataset. The processed data is then provided again as input to the DNN-based classification model. The use of the MLP significantly helps to remove the misclassified instances. The presented optimal data classification model is evaluated on the PIMA Indians Diabetes dataset, which holds the medical details of 768 patients with 8 attributes per record. The obtained simulation results verify the superiority of the presented model over the compared methods.
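The two-stage idea above can be sketched roughly as follows: an MLP is trained as a filter, instances it misclassifies are dropped, and a deeper network is trained on the cleaned data. The architectures, hyperparameters, and dataset loading are assumptions; the abstract does not specify them.

```python
# Sketch of "filter with an MLP, then train the DNN on the cleaned data".
# X is a NumPy feature matrix (e.g. the 8 PIMA attributes), y the labels.
import numpy as np
from sklearn.neural_network import MLPClassifier


def filter_then_classify(X, y):
    """Remove instances misclassified by an MLP, then train a deeper model."""
    # Stage 1: small MLP used as a noise filter
    mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
    mlp.fit(X, y)
    keep = mlp.predict(X) == y          # keep only correctly classified rows
    X_clean, y_clean = X[keep], y[keep]

    # Stage 2: deeper network trained on the filtered data
    dnn = MLPClassifier(hidden_layer_sizes=(64, 32, 16), max_iter=2000,
                        random_state=0)
    dnn.fit(X_clean, y_clean)
    return dnn
```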


2021 ◽  
Vol 2136 (1) ◽  
pp. 012057
Author(s):  
Han Zhou

Abstract: With the widespread adoption of network technical services and database construction systems, more and more data are used by enterprises and individuals, and existing technology struggles to meet the analysis requirements of the big data era. New technologies and methods must therefore be explored in practice so that big data can be used reasonably. On the basis of a review of current big data technology and the operating status of its systems, this paper designs algorithms according to a big data classification model and verifies the effectiveness of the analysis model's algorithms in practice.


Author(s):  
Aung Myo Thaw ◽  
Nataly Zhukova ◽  
Tin Tun Aung ◽  
Vladimir Chernokulsky

2020 ◽  
Vol 34 (04) ◽  
pp. 6680-6687
Author(s):  
Jian Yin ◽  
Chunjing Gan ◽  
Kaiqi Zhao ◽  
Xuan Lin ◽  
Zhe Quan ◽  
...  

Recently, imbalanced data classification has received much attention due to its wide applications. In the literature, existing studies have attempted to improve classification performance by considering various factors such as the imbalanced distribution, cost-sensitive learning, data space improvement, and ensemble learning. Nevertheless, most existing methods focus on only part of these main aspects/factors. In this work, we propose a novel imbalanced data classification model that considers all of these main aspects. To evaluate the performance of our proposed model, we conducted experiments on 14 public datasets. The results show that our model outperforms state-of-the-art methods in terms of recall, G-mean, F-measure, and AUC.
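For reference, the evaluation metrics reported above can be computed as in the sketch below, using scikit-learn only. The inputs (true labels, predicted labels, and predicted scores for the positive class) are assumed to come from any fitted binary classifier; this is not the authors' evaluation code.

```python
# Sketch: recall, G-mean, F-measure and AUC for a binary imbalanced problem.
import numpy as np
from sklearn.metrics import recall_score, f1_score, roc_auc_score


def imbalance_metrics(y_true, y_pred, y_score):
    """Return recall (positive class), G-mean, F-measure and AUC."""
    per_class_recall = recall_score(y_true, y_pred, average=None)
    # G-mean: geometric mean of the per-class recalls
    g_mean = float(np.sqrt(np.prod(per_class_recall)))
    return {
        "recall": recall_score(y_true, y_pred, pos_label=1),
        "g_mean": g_mean,
        "f_measure": f1_score(y_true, y_pred, pos_label=1),
        "auc": roc_auc_score(y_true, y_score),
    }
```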

