An End-to-End Rumor Detection Model Based on Feature Aggregation

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Aoshuang Ye ◽  
Lina Wang ◽  
Run Wang ◽  
Wenqi Wang ◽  
Jianpeng Ke ◽  
...  

The social network has become the primary medium of rumor propagation, and manual identification of rumors is extremely time-consuming and laborious, so it is crucial to identify rumors automatically. Machine learning technology is widely applied to the identification and detection of misinformation on social networks. However, traditional machine learning methods rely heavily on feature engineering and domain knowledge, and their ability to learn temporal features is insufficient. Furthermore, the features used by deep learning methods based on natural language processing are heavily limited. It is therefore of great significance and practical value to study rumor detection methods that are independent of feature engineering and that effectively aggregate heterogeneous features to adapt to complex and variable social networks. In this paper, a deep neural network (DNN)-based feature aggregation modeling method is proposed, which makes full use of the propagation pattern and text content features of social network events without feature engineering or domain knowledge. The experimental results show that the feature aggregation model achieves 94.4% accuracy, the best performance among recent works.
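As a rough illustration of the idea (not the authors' exact architecture), the sketch below fuses a text-content representation with a propagation-pattern sequence in a single DNN; all layer sizes and input shapes are assumptions.

```python
# Minimal sketch of feature aggregation: encode the text-content and
# propagation-pattern views separately, concatenate them, and classify.
# Layer sizes and input dimensions are illustrative, not from the paper.
import torch
import torch.nn as nn

class FeatureAggregationModel(nn.Module):
    def __init__(self, text_dim=768, prop_dim=64, hidden_dim=128):
        super().__init__()
        self.text_encoder = nn.Sequential(nn.Linear(text_dim, hidden_dim), nn.ReLU())
        # a GRU captures the temporal order of the propagation sequence
        self.prop_encoder = nn.GRU(prop_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim * 2, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2))  # rumor vs. non-rumor

    def forward(self, text_feats, prop_seq):
        t = self.text_encoder(text_feats)             # (batch, hidden)
        _, h = self.prop_encoder(prop_seq)            # (1, batch, hidden)
        fused = torch.cat([t, h.squeeze(0)], dim=1)   # aggregate both views
        return self.classifier(fused)

model = FeatureAggregationModel()
logits = model(torch.randn(4, 768), torch.randn(4, 10, 64))
```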

2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e13588-e13588
Author(s):  
Laura Sachse ◽  
Smriti Dasari ◽  
Marc Ackermann ◽  
Emily Patnaude ◽  
Stephanie OLeary ◽  
...  

e13588 Background: Pre-screening for clinical trials is becoming more challenging as inclusion/exclusion criteria become increasingly complex. Oncology precision medicine provides an exciting opportunity to simplify this process and quickly match patients with trials by leveraging machine learning technology. The Tempus TIME Trial site network matches patients to relevant, open, and recruiting clinical trials, personalized to each patient’s clinical and molecular biology. Methods: Tempus screens patients at sites within the TIME Trial Network to find high-fidelity matches to clinical trials. The patient records include documentation submitted alongside NGS orders as well as electronic medical records (EMR) ingested through EMR Integrations. While Tempus-sequenced patients were automatically matched to trials using a Tempus-built matching application, EMR records were run through a natural language processing (NLP) data abstraction model to identify patients with an actionable gene of interest. Structured data were analyzed to filter to patients who lack a deceased date and have an encounter date within a predefined time period. Tempus abstractors manually validated the resulting unstructured records to ensure each patient was matched to a TIME Trial at a site capable of running the trial. For all high-level patient matches, a Tempus Clinical Navigator manually evaluated other clinical criteria to confirm trial matches and communicated with the site about trial options. Results: Patient matching was accelerated by combining NLP gene and report detection (which isolated 17% of records) with manual screening. As a result, Tempus has efficiently screened over 190,000 patients using proprietary NLP technology and matched 332 patients to 21 unique interventional clinical trials since program launch. Tempus continues to optimize its NLP models to increase high-fidelity trial matching at scale. Conclusions: The TIME Trial Network is an evolving, dynamic program that efficiently matches patients with clinical trial sites using both EMR and Tempus sequencing data. Here, we show how machine learning technology can be utilized to efficiently identify and recruit patients to clinical trials, thereby personalizing trial enrollment for each patient.
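A minimal sketch of the structured pre-filter described above, with hypothetical field names and a simple keyword matcher standing in for Tempus's proprietary NLP model:

```python
# Illustrative sketch only: keep patients with no deceased date and a recent
# encounter, then flag records whose notes mention an actionable gene.
# Field names and the gene list are hypothetical; a production system would
# use a trained NLP abstraction model rather than a regex.
from datetime import date, timedelta
import re

ACTIONABLE_GENES = {"EGFR", "ALK", "BRAF", "KRAS"}  # example gene list
GENE_PATTERN = re.compile(r"\b(" + "|".join(ACTIONABLE_GENES) + r")\b")

def prescreen(patients, window_days=365):
    cutoff = date.today() - timedelta(days=window_days)
    for p in patients:
        if p.get("deceased_date") is not None:
            continue  # exclude deceased patients
        if p["last_encounter"] < cutoff:
            continue  # exclude patients without a recent encounter
        genes = set(GENE_PATTERN.findall(p["note_text"]))
        if genes:
            yield p["patient_id"], genes  # candidate for manual validation

patients = [{"patient_id": "P1", "deceased_date": None,
             "last_encounter": date.today(), "note_text": "EGFR exon 19 deletion"}]
print(list(prescreen(patients)))
```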


2010 ◽  
Vol 7 (1) ◽  
pp. 25-46 ◽  
Author(s):  
Sunita Goel ◽  
Jagdish Gangolly ◽  
Sue R. Faerman ◽  
Ozlem Uzuner

ABSTRACT: Extensive research has been done on the analytical and empirical examination of financial data in annual reports to detect fraud; however, there is scant research on the analysis of text in annual reports to detect fraud. The basic premise of this research is that there are clues hidden in the text that can be detected to determine the likelihood of fraud. In this research, we examine both the verbal content and the presentation style of the qualitative portion of the annual reports using natural language processing tools and explore linguistic features that distinguish fraudulent annual reports from nonfraudulent annual reports. Our results indicate that employment of linguistic features is an effective means for detecting fraud. We were able to improve the prediction accuracy of our fraud detection model from initial baseline results of 56.75 percent accuracy, using a “bag of words” approach, to 89.51 percent accuracy when we incorporated linguistically motivated features inspired by our informed reasoning and domain knowledge.
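For illustration, a “bag of words” baseline of the kind the authors improved upon can be set up as below (scikit-learn, with a toy corpus standing in for the annual-report text):

```python
# Sketch of a bag-of-words baseline: TF-IDF unigrams feeding a linear
# classifier. The labeled corpus here is a stand-in; in the full model,
# linguistically motivated features (readability, voice, hedging, etc.)
# would be appended as extra columns before the classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reports = ["management discussion text ...", "another MD&A section ..."]
labels = [1, 0]  # 1 = fraudulent, 0 = nonfraudulent

bow_model = make_pipeline(TfidfVectorizer(stop_words="english"),
                          LogisticRegression(max_iter=1000))
bow_model.fit(reports, labels)
print(bow_model.predict(["new annual report text ..."]))
```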


Author(s):  
Shaila S. G. ◽  
Sunanda Rajkumari ◽  
Vadivel Ayyasamy

Deep learning is playing a vital role, with great success, in various applications such as digital image processing, human-computer interaction, computer vision, natural language processing, robotics, and biological applications. Unlike traditional machine learning approaches, deep learning learns effectively and makes better use of the data set for feature extraction. Because of its iterative learning ability, deep learning has become more popular in present-day research.


Author(s):  
BURCU YILMAZ ◽  
Hilal Genc ◽  
Mustafa Agriman ◽  
Bugra Kaan Demirdover ◽  
Mert Erdemir ◽  
...  

Graphs are powerful data structures that allow us to represent varying relationships within data. In the past, because of the time complexity of processing graph models, graphs were rarely involved in machine learning tasks. In recent years, especially with advances in deep learning techniques, an increasing number of graph models for feature engineering and machine learning have been proposed. Recently, there has been an increase in approaches that automatically learn to encode graph structure into a low-dimensional embedding. These approaches are accompanied by models for machine learning tasks, and they fall into two categories. The first focuses on feature engineering techniques on graphs. The second incorporates graph structure into the machine learning model to learn over graph neighborhoods. In this chapter, the authors focus on advances in applications of graphs to NLP using recent deep learning models.
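A minimal sketch of the second family of models, neighborhood aggregation, on a toy graph with random features (a full graph neural network would stack such layers and learn the projection by backpropagation):

```python
# One neighborhood-aggregation step: each node's embedding is a transformed
# mean of its own and its neighbors' features. Graph and dimensions are toy.
import numpy as np

def aggregate_layer(A, X, W):
    """A: adjacency matrix; X: node features; W: learned projection."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # row-normalize neighborhoods
    return np.maximum(D_inv @ A_hat @ X @ W, 0.0)  # ReLU nonlinearity

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path graph
X = np.random.randn(3, 8)   # 8-dimensional node features
W = np.random.randn(8, 2)   # project into a 2-dimensional embedding space
embeddings = aggregate_layer(A, X, W)  # (3, 2) low-dimensional embeddings
```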


Author(s):  
Hadeer Elziaat ◽  
Nashwa El-Bendary ◽  
Ramadan Moawad

Freezing of gait (FoG) is a common symptom of Parkinson's disease (PD) that causes an intermittent absence of forward progression of the patient's feet while walking. Accordingly, momentary FoG episodes are always accompanied by falls. This chapter presents a novel multi-feature fusion model for the early detection of FoG episodes in patients with PD. Two feature engineering schemes are investigated: time-domain hand-crafted feature engineering and convolutional neural network (CNN)-based spectrogram feature learning. Data from tri-axial accelerometer sensors for patients with PD are used to characterize the performance of the proposed model through several experiments with various machine learning (ML) algorithms. The experimental results showed that the multi-feature fusion approach outperformed typical single feature sets. In conclusion, the significance of this chapter is to highlight the impact of fusing multiple feature sets by investigating the performance of a model for the early detection of FoG episodes.
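The sketch below illustrates the two feature schemes on a single accelerometer window; the sampling rate and window length are assumptions, and this is not the chapter's exact pipeline:

```python
# Two feature views of one tri-axial accelerometer window: hand-crafted
# time-domain statistics vs. a spectrogram that a CNN would consume.
import numpy as np
from scipy.signal import spectrogram

fs = 64                              # assumed sampling rate (Hz)
window = np.random.randn(3, fs * 4)  # 4-second window, axes x/y/z

# Scheme 1: hand-crafted time-domain features per axis
time_features = np.concatenate([
    window.mean(axis=1), window.std(axis=1),
    np.abs(np.diff(window, axis=1)).mean(axis=1),  # mean jerk magnitude
])

# Scheme 2: spectrogram of the signal magnitude, fed to a CNN as an image
magnitude = np.linalg.norm(window, axis=0)
freqs, times, spec = spectrogram(magnitude, fs=fs, nperseg=64)
# In the fusion model, both representations are combined before the classifier.
```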


2020 ◽  
Vol 8 (5) ◽  
pp. 2722-2727

Many people are adopting smart assistant devices such as Google Home, and the days of solely engaging with a service through a keyboard are over. These new modes of user interaction are aided in part by advancements in Artificial Intelligence and Machine Learning; this research investigates how those advancements are being used to improve many services. In particular, it looks at the development of the Google Assistant as a channel for information distribution. This project implements an Android-based chatbot to assist with an organization's basic processes, using Google tools such as Dialogflow, which relies on natural language processing (NLP), along with Actions on Google and Google Cloud Platform, which expose artificial intelligence and machine learning methods such as natural language understanding. Users interact with the Google Assistant using natural language as input, and the chatbot (i.e., the Google Assistant) is trained with the Dialogflow machine learning tool and appropriate methods so that it can generate dynamic responses. The chatbot allows users to view their personal academic information, schedule meetings with higher officials, and access information about organizational resources, automating the organization's processes from within the chatbot, i.e., the Google Assistant. The project uses OAuth authentication for security. Dialogflow helps interpret users' queries using machine learning algorithms. As a further enhancement, this Google Assistant will use the Cloud Vision API. Dialogflow is the key component used to develop the Google Assistant.
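As a sketch of the fulfillment side of such an assistant, the Flask webhook below returns a dynamic response for intents that Dialogflow has already matched; the intent and parameter names are hypothetical:

```python
# Minimal Dialogflow ES fulfillment webhook: Dialogflow performs the NLP
# and intent matching, then POSTs the matched intent here for a dynamic
# reply. Intent and parameter names below are hypothetical examples.
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def webhook():
    req = request.get_json()
    intent = req["queryResult"]["intent"]["displayName"]
    params = req["queryResult"]["parameters"]
    if intent == "GetAcademicInfo":        # hypothetical intent
        text = f"Fetching academic records for student {params.get('student_id')}."
    elif intent == "ScheduleMeeting":      # hypothetical intent
        text = f"Meeting requested with {params.get('official')}."
    else:
        text = "Sorry, I can't help with that yet."
    return jsonify({"fulfillmentText": text})  # spoken/displayed by the Assistant

if __name__ == "__main__":
    app.run(port=8080)
```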


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Xiaoying Shen ◽  
Chao Yuan

A digital campus generates a large amount of student-related data, and how to analyze and apply these data has become key to improving the level of student management. Analyzing student behavior data can not only assist schools in providing early warning of dangerous events and strengthening campus safety but can also describe student behavior with real data, thereby providing quantitative support for scholarship and grant evaluation. This paper takes the students of a university as the research object, collects various data from the digital campus platform, and uses an adaptive K-means algorithm, a machine learning method, to cluster the data. The behavior of college students is analyzed from the clustering results so as to provide a basis for education management and for improving learning ability. Specifically, students' study, life, and consumption data are selected to describe their behavior at school. These data are input into the adaptive K-means algorithm to obtain different types of consumption habits, living habits, and learning habits. The analysis reveals a group of students with low financial ability, students who spend too much time online, and students who borrow too few books. Based on the characteristics of these problems, targeted management suggestions are provided to teachers and schools. Student behavior analysis based on machine learning technology provides a reference for formulating school management policies and gives teachers information about students' personality characteristics, which helps improve teaching effectiveness. In short, the results of student behavior analysis can provide a basis for the school to formulate reasonable management policies, thereby promoting precision management and scientific decision-making.
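One common way to make K-means “adaptive” is to select the number of clusters automatically, e.g., by silhouette score; the sketch below illustrates this on synthetic student-behavior features (the paper's exact adaptation scheme may differ):

```python
# Adaptive cluster-count selection via silhouette score on synthetic data.
# Feature columns are illustrative stand-ins for campus behavior data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# rows: students; columns: e.g. canteen spending, online hours, books borrowed
X = StandardScaler().fit_transform(np.random.rand(200, 3))

best_k, best_score, best_labels = None, -1.0, None
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)  # higher = better-separated clusters
    if score > best_score:
        best_k, best_score, best_labels = k, score, labels

print(f"selected k={best_k} (silhouette={best_score:.2f})")
```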


2021 ◽  
Vol 13 ◽  
Author(s):  
Aparna Balagopalan ◽  
Benjamin Eyre ◽  
Jessica Robin ◽  
Frank Rudzicz ◽  
Jekaterina Novikova

Introduction: Research related to the automatic detection of Alzheimer's disease (AD) is important, given the high prevalence of AD and the high cost of traditional diagnostic methods. Since AD significantly affects the content and acoustics of spontaneous speech, natural language processing and machine learning provide promising techniques for reliably detecting AD. There has been a recent proliferation of classification models for AD, but these vary in the datasets used, model types, and training and testing paradigms. In this study, we compare and contrast the performance of two common approaches for automatic AD detection from speech on the same, well-matched dataset, to determine the advantages of using domain knowledge vs. pre-trained transfer models. Methods: Audio recordings and corresponding manually transcribed speech transcripts of a picture description task administered to 156 demographically matched older adults, 78 with Alzheimer's disease (AD) and 78 cognitively intact (healthy), were classified using machine learning and natural language processing as “AD” or “non-AD.” The audio was acoustically enhanced and post-processed to improve the quality of the speech recording as well as to control for variation caused by recording conditions. Two approaches were used for classification of these speech samples: (1) using domain knowledge: extracting an extensive set of clinically relevant linguistic and acoustic features derived from speech and transcripts based on prior literature; and (2) using transfer learning and leveraging large pre-trained machine learning models: using transcript representations that are automatically derived from state-of-the-art pre-trained language models, by fine-tuning Bidirectional Encoder Representations from Transformers (BERT)-based sequence classification models. Results: We compared the utility of speech transcript representations obtained from recent natural language processing models (i.e., BERT) to more clinically interpretable language feature-based methods. Both the feature-based approaches and the fine-tuned BERT models significantly outperformed the baseline linguistic model using a small set of linguistic features, demonstrating the importance of extensive linguistic information for detecting cognitive impairments relating to AD. We observed that fine-tuned BERT models numerically outperformed feature-based approaches on the AD detection task, but the difference was not statistically significant. Our main contribution is the observation that, when tested on the same demographically balanced dataset and on independent, unseen data, both domain-knowledge and pre-trained linguistic models have good predictive performance for detecting AD based on speech. It is notable that linguistic information alone is capable of achieving comparable, and even numerically better, performance than models including both acoustic and linguistic features here.
We also try to shed light on the inner workings of the more black-box natural language processing model by performing an interpretability analysis, and find that attention weights reveal interesting patterns, such as higher attribution to more important information content units in the picture description task, as well as to pauses and filler words. Conclusion: This approach supports the value of well-performing machine learning and linguistically focused processing techniques to detect AD from speech, and highlights the need to compare model performance on carefully balanced datasets, using consistent training parameters and independent test datasets, in order to determine the best-performing predictive model.
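A minimal sketch of the transfer-learning arm using the Hugging Face transformers API, with a toy transcript standing in for the picture-description data (dataset wiring and the optimizer loop are omitted):

```python
# Fine-tuning a BERT sequence classifier on transcripts (AD vs. non-AD).
# The transcript and label below are illustrative placeholders.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # AD vs. non-AD

transcripts = ["the boy is on the stool um reaching for the cookie jar"]
labels = torch.tensor([1])

batch = tokenizer(transcripts, padding=True, truncation=True,
                  return_tensors="pt")
outputs = model(**batch, labels=labels)  # forward pass returns loss and logits
outputs.loss.backward()                  # one fine-tuning step (optimizer omitted)
print(outputs.logits)
```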


Among the foremost challenges with big data is how to go about analyzing it. What new tools are needed to be able to properly investigate and model the large quantities of highly complex, often messy data? Chapter 4 addresses this question by introducing and briefly exploring the fields of Machine Learning, Natural Language Processing, and Social Network Analysis, focusing on how these methods and toolsets can be utilized to make sense of big data. The authors provide a broad overview of tools, ideas, and caveats for each of these fields. This chapter ends with a look at how one major public university in the United States, the University of Texas at Arlington, is beginning to address some of the questions surrounding big data in an institutional setting. A list of additional readings is provided.

