An End-to-End Rumor Detection Model Based on Feature Aggregation

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Aoshuang Ye ◽  
Lina Wang ◽  
Run Wang ◽  
Wenqi Wang ◽  
Jianpeng Ke ◽  
...  

The social network has become the primary medium of rumor propagation, and manual identification of rumors is extremely time-consuming and laborious, so it is crucial to identify rumors automatically. Machine learning technology is widely applied to the identification and detection of misinformation on social networks. However, traditional machine learning methods rely heavily on feature engineering and domain knowledge, and their ability to learn temporal features is insufficient. Furthermore, the features used by deep learning methods based on natural language processing are heavily limited. It is therefore of great significance and practical value to study rumor detection methods that are independent of feature engineering and that effectively aggregate heterogeneous features to adapt to complex and variable social networks. In this paper, a deep neural network (DNN)-based feature aggregation modeling method is proposed, which makes full use of the propagation pattern and text content features of social network events without feature engineering or domain knowledge. The experimental results show that the feature aggregation model achieves 94.4% accuracy, the best performance among recent works.
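As a rough illustration of the idea (not the authors' exact architecture), the sketch below fuses a text-content representation with a propagation-pattern sequence in a single DNN; all layer sizes and input shapes are assumptions.

```python
# Minimal sketch of feature aggregation: encode the text-content and
# propagation-pattern views separately, concatenate them, and classify.
# Layer sizes and input dimensions are illustrative, not from the paper.
import torch
import torch.nn as nn

class FeatureAggregationModel(nn.Module):
    def __init__(self, text_dim=768, prop_dim=64, hidden_dim=128):
        super().__init__()
        self.text_encoder = nn.Sequential(nn.Linear(text_dim, hidden_dim), nn.ReLU())
        # a GRU captures the temporal order of the propagation sequence
        self.prop_encoder = nn.GRU(prop_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim * 2, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2))  # rumor vs. non-rumor

    def forward(self, text_feats, prop_seq):
        t = self.text_encoder(text_feats)             # (batch, hidden)
        _, h = self.prop_encoder(prop_seq)            # (1, batch, hidden)
        fused = torch.cat([t, h.squeeze(0)], dim=1)   # aggregate both views
        return self.classifier(fused)

model = FeatureAggregationModel()
logits = model(torch.randn(4, 768), torch.randn(4, 10, 64))
```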

2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e13588-e13588
Author(s):  
Laura Sachse ◽  
Smriti Dasari ◽  
Marc Ackermann ◽  
Emily Patnaude ◽  
Stephanie OLeary ◽  
...  

e13588 Background: Pre-screening for clinical trials is becoming more challenging as inclusion/exclusion criteria become increasingly complex. Oncology precision medicine provides an exciting opportunity to simplify this process and quickly match patients with trials by leveraging machine learning technology. The Tempus TIME Trial site network matches patients to relevant, open, and recruiting clinical trials, personalized to each patient’s clinical and molecular biology. Methods: Tempus screens patients at sites within the TIME Trial Network to find high-fidelity matches to clinical trials. The patient records include documentation submitted alongside NGS orders as well as electronic medical records (EMR) ingested through EMR Integrations. While Tempus-sequenced patients were automatically matched to trials using a Tempus-built matching application, EMR records were run through a natural language processing (NLP) data abstraction model to identify patients with an actionable gene of interest. Structured data were analyzed to filter to patients who lack a deceased date and have an encounter date within a predefined time period. Tempus abstractors manually validated the resulting unstructured records to ensure each patient was matched to a TIME Trial at a site capable of running the trial. For all high-level patient matches, a Tempus Clinical Navigator manually evaluated other clinical criteria to confirm trial matches and communicated with the site about trial options. Results: Patient matching was accelerated by combining NLP gene and report detection (which isolated 17% of records) with manual screening. As a result, Tempus has efficiently screened over 190,000 patients using proprietary NLP technology and matched 332 patients to 21 unique interventional clinical trials since program launch. Tempus continues to optimize its NLP models to increase high-fidelity trial matching at scale. Conclusions: The TIME Trial Network is an evolving, dynamic program that efficiently matches patients with clinical trial sites using both EMR and Tempus sequencing data. Here, we show how machine learning technology can be utilized to efficiently identify and recruit patients to clinical trials, thereby personalizing trial enrollment for each patient.
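A minimal sketch of the structured pre-filter described above, with hypothetical field names and a simple keyword matcher standing in for Tempus's proprietary NLP model:

```python
# Illustrative sketch only: keep patients with no deceased date and a recent
# encounter, then flag records whose notes mention an actionable gene.
# Field names and the gene list are hypothetical; a production system would
# use a trained NLP abstraction model rather than a regex.
from datetime import date, timedelta
import re

ACTIONABLE_GENES = {"EGFR", "ALK", "BRAF", "KRAS"}  # example gene list
GENE_PATTERN = re.compile(r"\b(" + "|".join(ACTIONABLE_GENES) + r")\b")

def prescreen(patients, window_days=365):
    cutoff = date.today() - timedelta(days=window_days)
    for p in patients:
        if p.get("deceased_date") is not None:
            continue  # exclude deceased patients
        if p["last_encounter"] < cutoff:
            continue  # exclude patients without a recent encounter
        genes = set(GENE_PATTERN.findall(p["note_text"]))
        if genes:
            yield p["patient_id"], genes  # candidate for manual validation

patients = [{"patient_id": "P1", "deceased_date": None,
             "last_encounter": date.today(), "note_text": "EGFR exon 19 deletion"}]
print(list(prescreen(patients)))
```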


2010 ◽  
Vol 7 (1) ◽  
pp. 25-46 ◽  
Author(s):  
Sunita Goel ◽  
Jagdish Gangolly ◽  
Sue R. Faerman ◽  
Ozlem Uzuner

ABSTRACT: Extensive research has been done on the analytical and empirical examination of financial data in annual reports to detect fraud; however, there is scant research on the analysis of text in annual reports to detect fraud. The basic premise of this research is that there are clues hidden in the text that can be detected to determine the likelihood of fraud. In this research, we examine both the verbal content and the presentation style of the qualitative portion of the annual reports using natural language processing tools and explore linguistic features that distinguish fraudulent annual reports from nonfraudulent annual reports. Our results indicate that employment of linguistic features is an effective means for detecting fraud. We were able to improve the prediction accuracy of our fraud detection model from initial baseline results of 56.75 percent accuracy, using a “bag of words” approach, to 89.51 percent accuracy when we incorporated linguistically motivated features inspired by our informed reasoning and domain knowledge.
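For illustration, a “bag of words” baseline of the kind the authors improved upon can be set up as below (scikit-learn, with a toy corpus standing in for the annual-report text):

```python
# Sketch of a bag-of-words baseline: TF-IDF unigrams feeding a linear
# classifier. The labeled corpus here is a stand-in; in the full model,
# linguistically motivated features (readability, voice, hedging, etc.)
# would be appended as extra columns before the classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reports = ["management discussion text ...", "another MD&A section ..."]
labels = [1, 0]  # 1 = fraudulent, 0 = nonfraudulent

bow_model = make_pipeline(TfidfVectorizer(stop_words="english"),
                          LogisticRegression(max_iter=1000))
bow_model.fit(reports, labels)
print(bow_model.predict(["new annual report text ..."]))
```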


Author(s):  
Shaila S. G. ◽  
Sunanda Rajkumari ◽  
Vadivel Ayyasamy

Deep learning is playing a vital role, with great success, in various applications such as digital image processing, human-computer interaction, computer vision, natural language processing, robotics, and biological applications. Unlike traditional machine learning approaches, deep learning learns effectively and makes better use of the data set for feature extraction. Because of its iterative learning ability, deep learning has become more popular in present-day research.


Author(s):  
BURCU YILMAZ ◽  
Hilal Genc ◽  
Mustafa Agriman ◽  
Bugra Kaan Demirdover ◽  
Mert Erdemir ◽  
...  

Graphs are powerful data structures that allow us to represent varying relationships within data. In the past, because of the time complexity of processing graph models, graphs were rarely involved in machine learning tasks. In recent years, especially with advances in deep learning techniques, an increasing number of graph models for feature engineering and machine learning have been proposed. Recently, there has been an increase in approaches that automatically learn to encode graph structure into a low-dimensional embedding. These approaches are accompanied by models for machine learning tasks, and they fall into two categories. The first focuses on feature engineering techniques on graphs. The second incorporates graph structure into the machine learning model to learn over graph neighborhoods. In this chapter, the authors focus on advances in applications of graphs to NLP using recent deep learning models.
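A minimal sketch of the second family of models, neighborhood aggregation, on a toy graph with random features (a full graph neural network would stack such layers and learn the projection by backpropagation):

```python
# One neighborhood-aggregation step: each node's embedding is a transformed
# mean of its own and its neighbors' features. Graph and dimensions are toy.
import numpy as np

def aggregate_layer(A, X, W):
    """A: adjacency matrix; X: node features; W: learned projection."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # row-normalize neighborhoods
    return np.maximum(D_inv @ A_hat @ X @ W, 0.0)  # ReLU nonlinearity

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path graph
X = np.random.randn(3, 8)   # 8-dimensional node features
W = np.random.randn(8, 2)   # project into a 2-dimensional embedding space
embeddings = aggregate_layer(A, X, W)  # (3, 2) low-dimensional embeddings
```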


Author(s):  
Hadeer Elziaat ◽  
Nashwa El-Bendary ◽  
Ramadan Moawad

Freezing of gait (FoG) is a common symptom of Parkinson's disease (PD) that causes an intermittent absence of forward progression of the patient's feet while walking. Accordingly, momentary FoG episodes are always accompanied by falls. This chapter presents a novel multi-feature fusion model for the early detection of FoG episodes in patients with PD. Two feature engineering schemes are investigated: time-domain hand-crafted feature engineering and convolutional neural network (CNN)-based spectrogram feature learning. Data from tri-axial accelerometer sensors for patients with PD are used to characterize the performance of the proposed model through several experiments with various machine learning (ML) algorithms. The experimental results showed that the multi-feature fusion approach outperformed typical single feature sets. In conclusion, the significance of this chapter is to highlight the impact of fusing multiple feature sets by investigating the performance of a model for the early detection of FoG episodes.
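The sketch below illustrates the two feature schemes on a single accelerometer window; the sampling rate and window length are assumptions, and this is not the chapter's exact pipeline:

```python
# Two feature views of one tri-axial accelerometer window: hand-crafted
# time-domain statistics vs. a spectrogram that a CNN would consume.
import numpy as np
from scipy.signal import spectrogram

fs = 64                              # assumed sampling rate (Hz)
window = np.random.randn(3, fs * 4)  # 4-second window, axes x/y/z

# Scheme 1: hand-crafted time-domain features per axis
time_features = np.concatenate([
    window.mean(axis=1), window.std(axis=1),
    np.abs(np.diff(window, axis=1)).mean(axis=1),  # mean jerk magnitude
])

# Scheme 2: spectrogram of the signal magnitude, fed to a CNN as an image
magnitude = np.linalg.norm(window, axis=0)
freqs, times, spec = spectrogram(magnitude, fs=fs, nperseg=64)
# In the fusion model, both representations are combined before the classifier.
```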


2020 ◽  
Vol 8 (5) ◽  
pp. 2722-2727

Many people are adopting smart assistant devices such as Google Home, and the days of solely engaging with a service through a keyboard are over. These new modes of user interaction are aided in part by advancements in Artificial Intelligence and Machine Learning; this research investigates how those advancements are being used to improve many services. In particular, it looks at the development of the Google Assistant as a channel for information distribution. This project implements an Android-based chatbot to assist with an organization's basic processes, using Google tools such as Dialogflow, which relies on natural language processing (NLP), along with Actions on Google and Google Cloud Platform, which expose artificial intelligence and machine learning methods such as natural language understanding. Users interact with the Google Assistant using natural language as input, and the chatbot (i.e., the Google Assistant) is trained with the Dialogflow machine learning tool and appropriate methods so that it can generate dynamic responses. The chatbot allows users to view their personal academic information, schedule meetings with higher officials, and access information about organizational resources, automating the organization's processes from within the chatbot, i.e., the Google Assistant. The project uses OAuth authentication for security. Dialogflow helps interpret users' queries using machine learning algorithms. As a further enhancement, this Google Assistant will use the Cloud Vision API. Dialogflow is the key component used to develop the Google Assistant.
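As a sketch of the fulfillment side of such an assistant, the Flask webhook below returns a dynamic response for intents that Dialogflow has already matched; the intent and parameter names are hypothetical:

```python
# Minimal Dialogflow ES fulfillment webhook: Dialogflow performs the NLP
# and intent matching, then POSTs the matched intent here for a dynamic
# reply. Intent and parameter names below are hypothetical examples.
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def webhook():
    req = request.get_json()
    intent = req["queryResult"]["intent"]["displayName"]
    params = req["queryResult"]["parameters"]
    if intent == "GetAcademicInfo":        # hypothetical intent
        text = f"Fetching academic records for student {params.get('student_id')}."
    elif intent == "ScheduleMeeting":      # hypothetical intent
        text = f"Meeting requested with {params.get('official')}."
    else:
        text = "Sorry, I can't help with that yet."
    return jsonify({"fulfillmentText": text})  # spoken/displayed by the Assistant

if __name__ == "__main__":
    app.run(port=8080)
```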


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Xiaoying Shen ◽  
Chao Yuan

A digital campus generates a large amount of student-related data, and how to analyze and apply these data has become key to improving the level of student management. Analyzing student behavior data can not only assist schools in providing early warning of dangerous events and strengthening campus safety but can also describe student behavior with real data, thereby providing quantitative support for scholarship and grant evaluation. This paper takes the students of a university as the research object, collects various data from the digital campus platform, and uses an adaptive K-means algorithm, a machine learning method, to cluster the data. The behavior of college students is analyzed from the clustering results so as to provide a basis for education management and for improving learning ability. Specifically, students' study, life, and consumption data are selected to describe their behavior at school. These data are input into the adaptive K-means algorithm to obtain different types of consumption habits, living habits, and learning habits. The analysis reveals a group of students with low financial ability, students who spend too much time online, and students who borrow too few books. Based on the characteristics of these problems, targeted management suggestions are provided to teachers and schools. Student behavior analysis based on machine learning technology provides a reference for formulating school management policies and gives teachers information about students' personality characteristics, which helps improve teaching effectiveness. In short, the results of student behavior analysis can provide a basis for the school to formulate reasonable management policies, thereby promoting precision management and scientific decision-making.
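One common way to make K-means “adaptive” is to select the number of clusters automatically, e.g., by silhouette score; the sketch below illustrates this on synthetic student-behavior features (the paper's exact adaptation scheme may differ):

```python
# Adaptive cluster-count selection via silhouette score on synthetic data.
# Feature columns are illustrative stand-ins for campus behavior data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# rows: students; columns: e.g. canteen spending, online hours, books borrowed
X = StandardScaler().fit_transform(np.random.rand(200, 3))

best_k, best_score, best_labels = None, -1.0, None
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)  # higher = better-separated clusters
    if score > best_score:
        best_k, best_score, best_labels = k, score, labels

print(f"selected k={best_k} (silhouette={best_score:.2f})")
```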


2021 ◽  
Vol 13 ◽  
Author(s):  
Aparna Balagopalan ◽  
Benjamin Eyre ◽  
Jessica Robin ◽  
Frank Rudzicz ◽  
Jekaterina Novikova

Introduction: Research related to the automatic detection of Alzheimer's disease (AD) is important, given the high prevalence of AD and the high cost of traditional diagnostic methods. Since AD significantly affects the content and acoustics of spontaneous speech, natural language processing and machine learning provide promising techniques for reliably detecting AD. There has been a recent proliferation of classification models for AD, but these vary in the datasets used, model types, and training and testing paradigms. In this study, we compare and contrast the performance of two common approaches for automatic AD detection from speech on the same, well-matched dataset, to determine the advantages of using domain knowledge vs. pre-trained transfer models. Methods: Audio recordings and corresponding manually transcribed speech transcripts of a picture description task administered to 156 demographically matched older adults, 78 with Alzheimer's disease (AD) and 78 cognitively intact (healthy), were classified using machine learning and natural language processing as “AD” or “non-AD.” The audio was acoustically enhanced and post-processed to improve the quality of the speech recording as well as to control for variation caused by recording conditions. Two approaches were used for classification of these speech samples: (1) using domain knowledge: extracting an extensive set of clinically relevant linguistic and acoustic features derived from speech and transcripts based on prior literature; and (2) using transfer learning and leveraging large pre-trained machine learning models: using transcript representations that are automatically derived from state-of-the-art pre-trained language models, by fine-tuning Bidirectional Encoder Representations from Transformers (BERT)-based sequence classification models. Results: We compared the utility of speech transcript representations obtained from recent natural language processing models (i.e., BERT) to more clinically interpretable language feature-based methods. Both the feature-based approaches and the fine-tuned BERT models significantly outperformed the baseline linguistic model using a small set of linguistic features, demonstrating the importance of extensive linguistic information for detecting cognitive impairments relating to AD. We observed that fine-tuned BERT models numerically outperformed feature-based approaches on the AD detection task, but the difference was not statistically significant. Our main contribution is the observation that, when tested on the same demographically balanced dataset and on independent, unseen data, both domain-knowledge and pre-trained linguistic models have good predictive performance for detecting AD based on speech. It is notable that linguistic information alone is capable of achieving comparable, and even numerically better, performance than models including both acoustic and linguistic features here.
We also try to shed light on the inner workings of the more black-box natural language processing model by performing an interpretability analysis, and find that attention weights reveal interesting patterns, such as higher attribution to more important information content units in the picture description task, as well as to pauses and filler words. Conclusion: This approach supports the value of well-performing machine learning and linguistically focused processing techniques to detect AD from speech, and highlights the need to compare model performance on carefully balanced datasets, using consistent training parameters and independent test datasets, in order to determine the best-performing predictive model.
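A minimal sketch of the transfer-learning arm using the Hugging Face transformers API, with a toy transcript standing in for the picture-description data (dataset wiring and the optimizer loop are omitted):

```python
# Fine-tuning a BERT sequence classifier on transcripts (AD vs. non-AD).
# The transcript and label below are illustrative placeholders.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # AD vs. non-AD

transcripts = ["the boy is on the stool um reaching for the cookie jar"]
labels = torch.tensor([1])

batch = tokenizer(transcripts, padding=True, truncation=True,
                  return_tensors="pt")
outputs = model(**batch, labels=labels)  # forward pass returns loss and logits
outputs.loss.backward()                  # one fine-tuning step (optimizer omitted)
print(outputs.logits)
```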


Among the foremost challenges with big data is how to go about analyzing it. What new tools are needed to be able to properly investigate and model the large quantities of highly complex, often messy data? Chapter 4 addresses this question by introducing and briefly exploring the fields of Machine Learning, Natural Language Processing, and Social Network Analysis, focusing on how these methods and toolsets can be utilized to make sense of big data. The authors provide a broad overview of tools, ideas, and caveats for each of these fields. This chapter ends with a look at how one major public university in the United States, the University of Texas at Arlington, is beginning to address some of the questions surrounding big data in an institutional setting. A list of additional readings is provided.

