Feature-Based Learning Model for Fake News Detection and Classification

Author(s):  
G. Purna Chandar Rao ◽  
V. B. Narasimha

The widespread adoption of social media makes it important to verify content authenticity and raise awareness of news that might be fake. A Natural Language Processing (NLP) model is therefore required to identify content properties for language-driven feature generation. The present research work utilizes language-driven features that capture grammatical, sentiment, syntactic, and readability properties. Because language-level features are quite complex, features are extracted from each news item to address the dimensionality problem. A Dropout layer-based Long Short-Term Memory (LSTM) model for sequential learning achieved better results in fake news detection. The results validate that the extracted linguistic features, when combined, achieve better classification accuracy. The proposed dropout-based LSTM model obtained an accuracy of 95.3% for fake news classification and detection, compared to a sequential neural model for fake news detection.
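
Below is a minimal sketch of the kind of dropout-regularized LSTM classifier the abstract describes. The layer sizes, vocabulary size, and dropout rates are illustrative assumptions, not the authors' reported configuration.

```python
# Minimal sketch of a dropout-regularized LSTM fake-news classifier.
# Hyperparameters (vocab size, embedding dim, units, dropout rates) are
# illustrative assumptions, not the configuration reported in the paper.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000   # assumed vocabulary size
MAX_LEN = 300        # assumed maximum sequence length

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 128),        # token embeddings
    layers.Dropout(0.3),                      # dropout layer before the LSTM
    layers.LSTM(64),                          # sequential learning over tokens
    layers.Dropout(0.3),                      # dropout layer after the LSTM
    layers.Dense(1, activation="sigmoid"),    # fake (1) vs. real (0)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```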

Author(s):  
Arkadipta De ◽  
Dibyanayan Bandyopadhyay ◽  
Baban Gain ◽  
Asif Ekbal

Fake news classification is one of the most interesting problems that has attracted considerable attention from researchers in artificial intelligence, natural language processing, and machine learning (ML). Most current work on fake news detection is in English, which has limited its widespread usability, especially outside the English-literate population. Although there has been a growth in multilingual web content, fake news classification in low-resource languages is still a challenge due to the non-availability of annotated corpora and tools. This article proposes an effective neural model based on the multilingual Bidirectional Encoder Representations from Transformers (BERT) for domain-agnostic multilingual fake news classification. A wide variety of experiments, including language-specific and domain-specific settings, are conducted. The proposed model achieves high accuracy in domain-specific and domain-agnostic experiments and outperforms the current state-of-the-art models. We perform experiments in zero-shot settings to assess the effectiveness of language-agnostic feature transfer across different languages, showing encouraging results. Cross-domain transfer experiments are also performed to assess the language-independent feature transfer of the model. We also offer a multilingual multidomain fake news detection dataset of five languages and seven different domains that could be useful for research and development in resource-scarce scenarios.
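
A hedged sketch of the general approach, fine-tuning multilingual BERT for binary fake-news classification with the Hugging Face transformers library. The training loop is omitted, and the example sentence and label mapping are placeholders, not the authors' setup.

```python
# Sketch of multilingual fake-news classification with mBERT
# (bert-base-multilingual-cased). Fine-tuning is omitted; the input
# sentence and the 0/1 label mapping are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2  # fake vs. real
)

inputs = tokenizer("Example headline in any supported language",
                   return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()  # 0 = real, 1 = fake (assumed mapping)
```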


2018 ◽  
Vol 18 (2) ◽  
pp. 159-177
Author(s):  
Paweł Kaczmarczyk

Abstract The aim of this research study is to test the effectiveness of the single-sectional integrated model, in which a neural network is applied to support a regression, as a consistent tool for short-term forecasting of hourly demand (in seconds) for telecommunications services. The theoretical part of the paper presents the idea of the single-sectional integrated model and the differences between this model and a multi-sectional integrated model. Moreover, the research methodology is described, i.e. the elements used in the constructed model (the feedforward neural model and the regression with dichotomous explanatory variables) and the manner of their integration are discussed. The empirical part of this work presents the results of the experiments carried out. The effectiveness (in terms of approximation and prediction) of the explored single-sectional integrated model is compared with the effectiveness of the non-supported regression model and the multi-sectional integrated model. This research shows that the single-sectional integrated model yields better results than both the non-integrated regression and the multi-sectional integrated model. The originality of this paper lies in the single-sectional integrated model created for the analysed phenomenon, the verification of its effectiveness, and the comparison of the constructed model with other models.
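
The abstract does not specify the exact integration scheme, so the following is only one plausible reading: a regression on dichotomous (dummy) hour-of-day variables captures the periodic pattern, and a feedforward network models what the regression misses. All data and sizes are synthetic placeholders.

```python
# Hedged sketch of one way to integrate a feedforward neural model with a
# regression on dichotomous explanatory variables: the dummy-variable
# regression captures the hour-of-day pattern, and an MLP is fit to its
# residuals. This is an illustrative reading, not the paper's exact scheme.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=(1000, 1))             # hour of day
demand = 100 + 10 * np.sin(hours[:, 0]) + rng.normal(0, 5, 1000)  # synthetic

X_dummy = OneHotEncoder().fit_transform(hours).toarray()  # dichotomous vars
reg = LinearRegression().fit(X_dummy, demand)             # base regression
residuals = demand - reg.predict(X_dummy)

mlp = MLPRegressor(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
mlp.fit(X_dummy, residuals)                               # neural support

forecast = reg.predict(X_dummy) + mlp.predict(X_dummy)    # integrated forecast
```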


2020 ◽  
Vol 9 (1) ◽  
pp. 2668-2671

Nowadays, the prediction of fake news is an important task. The spread of fake news misleads people, obscures the truth, and stirs up public opinion. It can influence people in society, leading to losses in many directions, financial, psychological, and political, for example by affecting voting decisions during elections. Our research work aims to find a reliable and accurate model that categorizes a given news item in a dataset as fake or real. Existing techniques approach the problem from a deep learning perspective, applying Recurrent Neural Network (RNN) models such as vanilla RNNs, Gated Recurrent Units (GRUs), and Long Short-Term Memory networks (LSTMs) to the LIAR dataset. We therefore propose a different approach that increases accuracy by hybridizing Decision Tree and Random Forest classifiers.
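
One simple way to hybridize the two classifiers is a soft-voting ensemble over TF-IDF features, sketched below. The abstract does not state the exact hybridization strategy, so the voting scheme, feature extraction, and parameters are assumptions for illustration.

```python
# Sketch of a Decision Tree + Random Forest hybrid via soft voting over
# TF-IDF features. The exact hybridization strategy and all parameters
# are illustrative assumptions, not the paper's reported configuration.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

hybrid = VotingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(max_depth=20)),
        ("rf", RandomForestClassifier(n_estimators=200)),
    ],
    voting="soft",  # average the predicted probabilities of both models
)
model = make_pipeline(TfidfVectorizer(max_features=5000), hybrid)

# model.fit(train_texts, train_labels)   # e.g., statements from the LIAR dataset
# preds = model.predict(test_texts)      # fake vs. real
```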


Author(s):  
Samrudhi Naik

Abstract: The spreading of fake news has given rise to many problems in society, owing to its ability to cause considerable social and national damage with destructive impacts. It can be very difficult to know whether news is genuine or fake, so detecting fake news is very important. "Fake News" is a term used to represent fabricated news or propaganda comprising misinformation communicated through traditional media channels like print and television, as well as nontraditional media channels like social media. Techniques of NLP and machine learning can be used to create models which help detect fake news. In this paper we present six LSTM models built using the techniques of NLP and ML. Datasets in comma-separated values format, pertaining to the political domain, were used in the project. Different attributes, such as the title and text of the news headline/article, were used to perform the fake news detection. The results showed that the proposed solution performs well in terms of accuracy, precision, and recall. The performance analysis across all the models showed that the models using GloVe and Word2vec embeddings work better than the models using TF-IDF. In future work, a larger dataset, as well as other factors such as the author and publisher of the news, could be used to determine the credibility of the news. Further research on images, videos, and images containing text could also help improve the models. Keywords: Fake news detection, LSTM (Long Short-Term Memory), Word2Vec, TF-IDF, Natural Language Processing.
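
A minimal sketch of the better-performing setup the abstract reports: an LSTM over frozen pretrained word vectors (GloVe or Word2vec). Building the embedding matrix from actual pretrained vectors is assumed and stubbed out here; all sizes are illustrative.

```python
# Sketch of an LSTM classifier over pretrained word embeddings, the setup
# reported to outperform TF-IDF. The embedding matrix here is a zero-filled
# stub; in practice it would be filled with GloVe/Word2vec lookups.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE, EMB_DIM = 20000, 100          # illustrative sizes

# Assumed: rows filled from pretrained GloVe/Word2vec vectors for the vocab.
embedding_matrix = np.zeros((VOCAB_SIZE, EMB_DIM))

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, EMB_DIM,
                     weights=[embedding_matrix], trainable=False),
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),   # fake vs. real headline/article
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```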


Designs ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 42
Author(s):  
Eric Lazarski ◽  
Mahmood Al-Khassaweneh ◽  
Cynthia Howard

In recent years, disinformation and “fake news” have been spreading throughout the internet at rates never seen before. This has led fact-checking organizations, groups that seek out claims and comment on their veracity, to emerge worldwide to stem the tide of misinformation. However, even with the many human-powered fact-checking organizations currently in operation, disinformation continues to run rampant throughout the Web, and the existing organizations are unable to keep up. This paper discusses in detail recent advances in computer science that use natural language processing to automate fact checking. It follows the entire process of automated fact checking using natural language processing, from detecting claims to fact checking to outputting results. In summary, automated fact checking works well in some cases, though generalized fact checking still needs improvement prior to widespread use.
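
The pipeline the survey walks through can be summarized schematically as below. Every stage is a stub; real systems plug trained claim-detection, retrieval, and verification models into each step, and the function names and labels are assumptions for illustration only.

```python
# Schematic sketch of the automated fact-checking pipeline surveyed above:
# claim detection -> evidence retrieval -> verdict. All stages are stubs.
from dataclasses import dataclass

@dataclass
class Verdict:
    claim: str
    label: str      # e.g., "supported", "refuted", "not enough info"
    evidence: list

def detect_claims(text: str) -> list:
    # Stub: a claim-detection model would score sentences for check-worthiness.
    return [s.strip() for s in text.split(".") if s.strip()]

def retrieve_evidence(claim: str) -> list:
    # Stub: real systems query a document store or search API here.
    return []

def check_claim(claim: str) -> Verdict:
    evidence = retrieve_evidence(claim)
    # Stub: a verification model would compare the claim against the evidence.
    label = "not enough info" if not evidence else "supported"
    return Verdict(claim, label, evidence)

for claim in detect_claims("The earth orbits the sun. Fake news spreads fast."):
    print(check_claim(claim))
```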


Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 556
Author(s):  
Thaer Thaher ◽  
Mahmoud Saheb ◽  
Hamza Turabieh ◽  
Hamouda Chantar

Fake or false information on social media platforms is a significant challenge that deliberately misleads users through rumors, propaganda, or deceptive information about a person, organization, or service. Twitter is one of the most widely used social media platforms, especially in the Arab region, where the number of users is steadily increasing, accompanied by an increase in the rate of fake news. This has drawn the attention of researchers seeking to provide a safe online environment free of misleading information. This paper proposes a smart classification model for the early detection of fake news in Arabic tweets utilizing Natural Language Processing (NLP) techniques, Machine Learning (ML) models, and the Harris Hawks Optimizer (HHO) as a wrapper-based feature selection approach. An Arabic Twitter corpus composed of 1862 previously annotated tweets was used to assess the efficiency of the proposed model. The Bag of Words (BoW) model is applied using different term-weighting schemes for feature extraction. Eight well-known learning algorithms are investigated with varying combinations of features, including user-profile, content-based, and word features. Reported results show that Logistic Regression (LR) with Term Frequency-Inverse Document Frequency (TF-IDF) achieves the best rank. Moreover, feature selection based on the binary HHO algorithm plays a vital role in reducing dimensionality, thereby enhancing the learning model's performance for fake news detection. Interestingly, the proposed BHHO-LR model yields an improvement of 5% compared with previous works on the same dataset.
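
A minimal sketch of the best-performing learning component named above, Logistic Regression over TF-IDF term weights. The binary HHO wrapper is not a standard library routine, so a chi-squared filter stands in here as an explicitly swapped-in placeholder for the feature selection step; all parameters are assumptions.

```python
# Sketch of the LR + TF-IDF component reported as best-ranked. SelectKBest
# with chi2 is a placeholder standing in for the paper's binary HHO
# wrapper-based feature selection; parameters are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

model = make_pipeline(
    TfidfVectorizer(max_features=5000),   # BoW with TF-IDF term weighting
    SelectKBest(chi2, k=1000),            # placeholder for binary HHO selection
    LogisticRegression(max_iter=1000),
)

# model.fit(tweet_texts, labels)   # e.g., the 1862 annotated Arabic tweets
```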


2021 ◽  
pp. 1-10
Author(s):  
Hye-Jeong Song ◽  
Tak-Sung Heo ◽  
Jong-Dae Kim ◽  
Chan-Young Park ◽  
Yu-Seop Kim

Sentence similarity evaluation is a significant task used in machine translation, classification, and information extraction in the field of natural language processing. When two sentences are given, an accurate judgment should be made as to whether their meanings are equivalent, even if their words and contexts differ. To this end, existing studies have measured the similarity of sentences by focusing on the analysis of words, morphemes, and letters. To measure sentence similarity, this study uses Sent2Vec, a sentence embedding, as well as morpheme word embeddings. Vectors representing words are input to a one-dimensional convolutional neural network (1D-CNN) with kernels of various sizes and to a bidirectional long short-term memory network (Bi-LSTM). Self-attention is applied to the features transformed through the Bi-LSTM. Subsequently, the vectors produced by the 1D-CNN and by self-attention are passed through global max pooling and global average pooling, respectively, to extract representative values. The vectors generated through this process are concatenated with the vector generated through Sent2Vec and represented as a single vector. This vector is input to a softmax layer, and finally the similarity between the two sentences is determined. The proposed model improves accuracy by up to 5.42 percentage points compared with conventional sentence similarity estimation models.
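
A hedged Keras sketch of the described architecture follows. All dimensions and layer sizes are assumptions, the Sent2Vec vector is taken as a precomputed input, and Keras's generic Attention layer stands in for the paper's self-attention formulation.

```python
# Sketch of the described architecture: word embeddings feed parallel 1D-CNN
# branches (several kernel sizes, global max pooling) and a Bi-LSTM with
# self-attention (global average pooling); the pooled outputs are concatenated
# with a Sent2Vec sentence vector and fed to softmax. Sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

MAX_LEN, EMB_DIM, S2V_DIM = 50, 100, 300   # assumed dimensions

words = layers.Input(shape=(MAX_LEN, EMB_DIM))   # morpheme word embeddings
s2v = layers.Input(shape=(S2V_DIM,))             # precomputed Sent2Vec vector

# 1D-CNN branches with various kernel sizes, each globally max-pooled
convs = [layers.GlobalMaxPooling1D()(
             layers.Conv1D(64, k, activation="relu")(words))
         for k in (2, 3, 4)]

# Bi-LSTM, then self-attention over its states, then global average pooling
h = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(words)
h = layers.Attention()([h, h])            # query = value = Bi-LSTM features
h = layers.GlobalAveragePooling1D()(h)

merged = layers.Concatenate()(convs + [h, s2v])
out = layers.Dense(2, activation="softmax")(merged)  # similar / not similar

model = models.Model([words, s2v], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```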


2021 ◽  
Vol 54 (2) ◽  
pp. 1-37
Author(s):  
Dhivya Chandrasekaran ◽  
Vijay Mago

Estimating the semantic similarity between text data is one of the challenging and open research problems in the field of Natural Language Processing (NLP). The versatility of natural language makes it difficult to define rule-based methods for determining semantic similarity measures. To address this issue, various semantic similarity methods have been proposed over the years. This survey article traces the evolution of such methods beginning from traditional NLP techniques such as kernel-based methods to the most recent research work on transformer-based models, categorizing them based on their underlying principles as knowledge-based, corpus-based, deep neural network–based methods, and hybrid methods. Discussing the strengths and weaknesses of each method, this survey provides a comprehensive view of existing systems in place for new researchers to experiment and develop innovative ideas to address the issue of semantic similarity.
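
As a concrete instance of the corpus-based family the survey covers, the snippet below computes cosine similarity between TF-IDF vectors; the sentence pair is an invented example chosen to expose the weakness that deep and knowledge-based methods aim to fix.

```python
# Minimal example of a corpus-based similarity measure: cosine similarity
# between TF-IDF vectors. The sentence pair is an illustrative assumption.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = ["A cat sat on the mat.", "A feline rested on the rug."]
vectors = TfidfVectorizer().fit_transform(sentences)
print(cosine_similarity(vectors[0], vectors[1])[0, 0])
# Low score despite near-identical meaning: surface word overlap is small,
# the gap that knowledge-based and embedding-based methods try to close.
```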


2021 ◽  
pp. 001112872110364
Author(s):  
Natalia Redondo ◽  
Marina J. Muñoz-Rivas ◽  
Arthur L. Cantos ◽  
Jose Luis Graña

The Transtheoretical Model (TTM) of behavior change predicts that patients go through different stages of change prior to changing their problematic behavior. This study aims to evaluate the utility and validity of this model in a sample of 549 court-ordered partner-violent men. Three types of perpetrators were revealed with respect to their readiness to change. Those in a more advanced stage of change use more processes to change their problem and present with higher levels of intimate partner violence (IPV). Low readiness to change and treatment drop-out predict short-term criminal justice recidivism, while treatment drop-out predicts medium- and long-term recidivism. The results highlight the applicability of the TTM to IPV and its usefulness in designing behavioral interventions with this population.

