Big Data and quality data for fake news and misinformation detection

2019 ◽  
Vol 6 (1) ◽  
pp. 205395171984331 ◽  
Author(s):  
Fatemeh Torabi Asr ◽  
Maite Taboada

Fake news has become an important topic of research in a variety of disciplines including linguistics and computer science. In this paper, we explain how the problem is approached from the perspective of natural language processing, with the goal of building a system to automatically detect misinformation in news. The main challenge in this line of research is collecting quality data, i.e., instances of fake and real news articles on a balanced distribution of topics. We review available datasets and introduce the MisInfoText repository as a contribution of our lab to the community. We make available the full text of the news articles, together with veracity labels previously assigned based on manual assessment of the articles’ truth content. We also perform a topic modelling experiment to elaborate on the gaps and sources of imbalance in currently available datasets to guide future efforts. We appeal to the community to collect more data and to make it available for research purposes.
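The topic-modelling gap analysis mentioned above can be illustrated with a much simpler keyword-based sketch (the paper's experiment would use a proper topic model such as LDA); the topic lexicons and articles below are invented for illustration:

```python
from collections import Counter

# Hypothetical topic keyword lexicons; a real analysis would derive topics
# from the data with a topic model rather than fixed word lists.
TOPIC_KEYWORDS = {
    "politics": {"election", "senate", "vote"},
    "health": {"vaccine", "virus", "hospital"},
}

def topic_distribution(articles):
    """Count how many articles touch each topic, to reveal topical
    imbalance in a labelled fake/real news dataset."""
    counts = Counter()
    for text in articles:
        tokens = set(text.lower().split())
        for topic, keywords in TOPIC_KEYWORDS.items():
            if tokens & keywords:
                counts[topic] += 1
    return counts

articles = [
    "The senate vote was delayed",
    "A new vaccine reached the hospital",
    "Election results contested in the senate",
]
print(topic_distribution(articles))  # politics: 2, health: 1
```

A skewed distribution like this one is exactly the kind of dataset imbalance the authors argue future data-collection efforts should correct.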

Designs ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 42
Author(s):  
Eric Lazarski ◽  
Mahmood Al-Khassaweneh ◽  
Cynthia Howard

In recent years, disinformation and “fake news” have been spreading throughout the internet at rates never seen before. This has created the need for fact-checking organizations, groups that seek out claims and comment on their veracity, to spawn worldwide to stem the tide of misinformation. However, even with the many human-powered fact-checking organizations that are currently in operation, disinformation continues to run rampant throughout the Web, and the existing organizations are unable to keep up. This paper discusses in detail recent advances in computer science to use natural language processing to automate fact checking. It follows the entire process of automated fact checking using natural language processing, from detecting claims to fact checking to outputting results. In summary, automated fact checking works well in some cases, though generalized fact checking still needs improvement prior to widespread use.
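The claim detection → fact checking → output pipeline described above can be sketched minimally; the fact store, the "is/are" heuristic for check-worthiness, and the example text are all illustrative stand-ins for the trained claim-detection models and knowledge bases real systems use:

```python
import re

# Toy fact store mapping normalised claims to verdicts. A real system would
# retrieve evidence from a knowledge base or document collection instead.
FACT_STORE = {
    "paris is the capital of france": True,
    "the earth is flat": False,
}

def detect_claims(text):
    """Step 1: split into sentences and keep check-worthy ones.
    'Check-worthy' is approximated here as containing 'is' or 'are'."""
    sentences = re.split(r"[.!?]\s*", text)
    return [s.strip().lower() for s in sentences
            if re.search(r"\b(is|are)\b", s, re.I)]

def check_claim(claim):
    """Steps 2-3: look the claim up and output a verdict."""
    verdict = FACT_STORE.get(claim)
    if verdict is None:
        return "unverifiable"
    return "supported" if verdict else "refuted"

text = "Paris is the capital of France. The earth is flat!"
for claim in detect_claims(text):
    print(claim, "->", check_claim(claim))
```

The "unverifiable" branch is where generalized fact checking still struggles: most real claims fall outside any curated fact store, which matches the paper's conclusion that broad coverage needs further work.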


2021 ◽  
Vol 50 (2-3) ◽  
pp. 17-22
Author(s):  
Johannes Brunzel

This article explains how the method of quantitative text analysis can be a key means of increasing business efficiency. It goes beyond listing the opportunities and risks of using artificial intelligence and Big Data analytics by deriving, in a practice-oriented way, important developments in quantitative content analysis from the business and economics literature. The article then divides the most important implementation steps into (1) collection of quantitative text data, (2) performing generic text analysis, and (3) performing natural language processing. As a main result, the article finds that while natural language processing approaches offer deeper and more complex insights, the potential of generic text analysis is not yet exhausted, owing to its flexibility and comparatively simple applicability in a corporate context. In addition, managers face the dichotomous decision of whether programming-based or commercial solutions are appropriate for carrying out the text analysis.
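Step (2), generic text analysis, can be as simple as term-frequency counting over a document collection, which is part of what makes it attractive in a corporate context; the stop-word list and documents below are invented:

```python
import re
from collections import Counter

# Minimal sketch of "generic" quantitative text analysis: term frequencies
# over a small collection of (invented) corporate documents.
STOPWORDS = {"the", "and", "of", "to", "a"}

def term_frequencies(documents):
    """Count content-word occurrences across all documents."""
    counts = Counter()
    for doc in documents:
        for token in re.findall(r"[a-z]+", doc.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return counts

reports = [
    "Revenue growth of the new segment",
    "Growth and revenue targets",
]
print(term_frequencies(reports).most_common(2))
```

Even this programming-based minimum already answers questions like "which terms dominate our reports?"; the article's point is that the step up to full NLP buys deeper insight at the cost of this simplicity.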


2018 ◽  
Vol 7 (3.33) ◽  
pp. 168
Author(s):  
Yonglak SHON ◽  
Jaeyoung PARK ◽  
Jangmook KANG ◽  
Sangwon LEE

LOD data sets consist of RDF triples based on an ontology, a specification of existing facts, linked to previously published knowledge according to Linked Data principles. These structured LOD clouds form a large global data network, which gives users a more accurate foundation for retrieving the information they want. However, when the same real-world object appears under different identifiers across several LOD data sets, it is difficult to establish that the entries are in fact identical: objects with different URIs must initially be presumed distinct, and their similarities must be examined closely before they can be judged identical. This study proposes a model, RILE, that evaluates similarity by comparing the object values of specified predicates. In experiments with the model, we confirmed an improvement in the confidence level of the connections obtained by extracting link values.
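The core idea attributed to RILE above, comparing object values of specified predicates to judge whether differently named resources are identical, can be sketched as follows; the triples and the Jaccard-style scoring are illustrative assumptions, not the model's actual algorithm:

```python
# Sketch: two resources with different URIs are scored by how much their
# object values overlap on the predicates they share. Triples are invented.
def predicate_objects(triples, subject):
    """Map predicate -> set of object values for one subject URI."""
    result = {}
    for s, p, o in triples:
        if s == subject:
            result.setdefault(p, set()).add(o)
    return result

def similarity(triples_a, subj_a, triples_b, subj_b):
    """Average Jaccard overlap of object values on shared predicates."""
    po_a = predicate_objects(triples_a, subj_a)
    po_b = predicate_objects(triples_b, subj_b)
    shared = set(po_a) & set(po_b)
    if not shared:
        return 0.0
    scores = [len(po_a[p] & po_b[p]) / len(po_a[p] | po_b[p])
              for p in shared]
    return sum(scores) / len(scores)

ds1 = [("ex:Berlin", "name", "Berlin"), ("ex:Berlin", "country", "Germany")]
ds2 = [("dbr:Berlin", "name", "Berlin"), ("dbr:Berlin", "country", "Germany")]
print(similarity(ds1, "ex:Berlin", ds2, "dbr:Berlin"))  # 1.0
```

A high score suggests the two URIs denote the same object and could be linked (e.g., with `owl:sameAs`); a low score leaves them presumed distinct.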


2021 ◽  
Vol 28 (3) ◽  
pp. 442-446
Author(s):  
Valentin Kuleto ◽  
Milena Ilić

AI is a branch of computer science that emphasises the development of intelligent machines that think and work like humans. Examples of AI applications include speech recognition, natural language processing, and image recognition. The term ML refers to the application of AI that enables systems to learn and improve from experience, without the explicit need for programming, using various problem-solving algorithms. In machine learning, computers learn from the data they process rather than from program instructions.
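The distinction between learning from data and following program instructions can be made concrete with one of the simplest ML algorithms, a one-nearest-neighbour classifier; the labelled examples below are invented:

```python
# No rule for "spam" is ever written down: the classifier's behaviour comes
# entirely from the labelled training examples it is given.
def predict(training_data, point):
    """Return the label of the training example closest to `point`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(training_data, key=lambda ex: dist(ex[0], point))
    return nearest[1]

# Labelled examples: (feature vector, label).
training = [((0.9, 0.8), "spam"), ((0.1, 0.2), "ham"), ((0.8, 0.9), "spam")]
print(predict(training, (0.85, 0.85)))  # spam
print(predict(training, (0.0, 0.1)))    # ham
```

Changing the training data changes the predictions with no change to the code, which is precisely the "learning based on experience" the abstract describes.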


Author(s):  
Kanza Noor Syeda ◽  
Syed Noorulhassan Shirazi ◽  
Syed Asad Ali Naqvi ◽  
Howard J Parkinson ◽  
Gary Bamford

Due to modern powerful computing and the explosion in data availability and advanced analytics, there are opportunities to use a Big Data approach to proactively identify high-risk scenarios on the railway. In this chapter, we examine the need for developing machine intelligence to identify heightened risk on the railway. In doing so, we explain the potential of a new data-driven approach in the railway, and then focus the rest of the chapter on Natural Language Processing (NLP) and its potential for analysing accident data. We review and analyse investigation reports of railway accidents in the UK, published by the Rail Accident Investigation Branch (RAIB), aiming to reveal the presence of entities that are informative of causes and failures, whether human, technical, or external. We give an overview of a framework based on NLP and machine learning to analyse the raw text of RAIB reports, which would assist risk and incident analysis experts in studying causal relationships between causes and failures and improving overall safety in the rail industry.
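A keyword-lexicon sketch of tagging report sentences with human, technical, or external cause categories is shown below; the lexicons and the report text are invented, and the framework described in the chapter would rely on trained NLP entity recognisers rather than fixed word lists:

```python
import re

# Invented cause lexicons; a production system would learn these entities
# from annotated RAIB reports rather than hard-code them.
CAUSE_LEXICON = {
    "human": {"driver", "signaller", "fatigue", "error"},
    "technical": {"brake", "signal", "track", "failure"},
    "external": {"flood", "landslip", "weather", "trespass"},
}

def tag_causes(report):
    """Tag each sentence with the cause categories its words suggest."""
    tags = []
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        cats = sorted(c for c, kw in CAUSE_LEXICON.items() if tokens & kw)
        if cats:
            tags.append((sentence, cats))
    return tags

report = ("The driver error was compounded by fatigue. "
          "A brake failure followed. Heavy weather reduced visibility.")
for sentence, cats in tag_causes(report):
    print(cats, "<-", sentence)
```

Aggregating such tags across a corpus of reports is the kind of raw material the chapter's framework would feed into causal analysis of failures.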


Author(s):  
Uma Maheswari Sadasivam ◽  
Nitin Ganesan

Fake news is the topic generating the most discussion these days, whether around elections, the COVID-19 pandemic, or social unrest. Many social websites have started to fact-check the news and articles posted on them, because fake news creates confusion and chaos and misleads the community and society. In this cyber era, citizen journalism is increasingly common: citizens themselves collect, report, disseminate, and analyse news and information. This means anyone can publish news on social websites, which can leave readers with unreliable information. In order to keep every nation a safe place to live, by holding fair and square elections, stopping the spread of hatred based on race, religion, caste, or creed, providing reliable information about COVID-19, and preventing social unrest, we need to keep a tab on fake news. This chapter presents a way to detect fake news using deep learning techniques and natural language processing.
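As a minimal stand-in for the deep learning approach the chapter presents, the same tokenise → train → classify pipeline can be illustrated with a bag-of-words Naive Bayes classifier; the toy headlines and the choice of Naive Bayes are assumptions for illustration only:

```python
import math
from collections import Counter

def train(examples):
    """examples: list of (text, label). Returns word counts per label."""
    counts = {"fake": Counter(), "real": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Pick the label with the highest smoothed log-likelihood."""
    vocab = set(counts["fake"]) | set(counts["real"])
    best_label, best_score = None, float("-inf")
    for label, words in counts.items():
        total = sum(words.values())
        score = 0.0
        for token in text.lower().split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((words[token] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

examples = [
    ("shocking miracle cure doctors hate", "fake"),
    ("aliens rigged the election shocking", "fake"),
    ("government publishes annual budget report", "real"),
    ("health ministry releases vaccine trial data", "real"),
]
model = train(examples)
print(predict(model, "shocking miracle election cure"))  # fake
print(predict(model, "ministry budget report data"))     # real
```

A deep learning detector replaces the hand-built counts with learned embeddings and a neural network, but the surrounding pipeline of labelled data in, verdict out, is the same.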


2020 ◽  
pp. 168-187
Author(s):  
George A. Khachatryan

What are the relative merits of instruction modeling and other approaches to the design of blended learning programs? This chapter discusses several prevailing approaches, including applied learning science, personalization, and the use of big data in education. Many programs are designed around a single claimed feature of good instruction; terming such thinking “featurism,” this chapter argues that it is reductionist and less likely to be successful than more comprehensive approaches (such as instruction modeling). However, instruction modeling is not simply an alternative to other approaches: as the example of cognitive psychology illustrates, instruction modeling can often be fruitfully combined with other methods. Just as good software developers blend different approaches (e.g., using usability testing and the psychology of attention in designing interfaces), good instructional designers should draw on a wide range of techniques. This chapter discusses how instruction modeling can work in concert with big data, natural language processing, and other important approaches.

