AI-Crime Hunter: An AI Mixture of Experts for Crime Discovery on Twitter

Author(s):  
Niloufar Shoeibi ◽  
Nastaran Shoeibi ◽  
Guillermo Hernández ◽  
Pablo Chamoso ◽  
Juan Manuel Corchado

Maintaining a healthy cyber society is a big challenge due to the users’ freedom of expression and behavior. It can be solved by monitoring and analyzing the users’ behavior and taking proper actions towards them. This research aims to present a platform that monitors the public content on Twitter by extracting tweet data. After maintaining the data, the users’ interactions are analyzed using Graph Analysis methods. Then the users’ behavioral patterns are analyzed by applying Metadata Analysis, in which the timeline of each profile is obtained; also, the time-series behavioral features of users are investigated. Then, in the Abnormal Behavior Detection Filtering component, the interesting profiles are selected for further examination. Finally, in the Contextual Analysis component, the contents are analyzed using natural language processing techniques: a binary text classification model (SVM + TF-IDF with 88.89% accuracy) detects whether a tweet is related to crime. Then, a sentiment analysis method is applied to the crime-related tweets to perform aspect-based sentiment analysis (DistilBERT + FFNN with 80% accuracy), because sharing positive opinions about a crime-related topic can threaten society. This platform aims to provide the end-user (the police) with suggestions to control hate speech or terrorist propaganda.

Electronics ◽  
2021 ◽  
Vol 10 (24) ◽  
pp. 3081
Author(s):  
Niloufar Shoeibi ◽  
Nastaran Shoeibi ◽  
Guillermo Hernández ◽  
Pablo Chamoso ◽  
Juan M. Corchado

Maintaining a healthy cyber society is a great challenge due to the users’ freedom of expression and behavior. This can be solved by monitoring and analyzing the users’ behavior and taking proper actions. This research aims to present a platform that monitors the public content on Twitter by extracting tweet data. After maintaining the data, the users’ interactions are analyzed using graph analysis methods. Then, the users’ behavioral patterns are analyzed by applying metadata analysis, in which the timeline of each profile is obtained; also, the time-series behavioral features of users are investigated. Then, in the abnormal behavior detection and filtering component, the interesting profiles are selected for further examinations. Finally, in the contextual analysis component, the contents are analyzed using natural language processing techniques; a binary text classification model (SVM (Support Vector Machine) + TF-IDF (Term Frequency—Inverse Document Frequency) with 88.89% accuracy) is used to detect if a tweet is related to crime or not. Then, a sentiment analysis method is applied to the crime-related tweets to perform aspect-based sentiment analysis (DistilBERT + FFNN (Feed-Forward Neural Network) with 80% accuracy), because sharing positive opinions about a crime-related topic can threaten society. This platform aims to provide the end-user (the police) with suggestions to control hate speech or terrorist propaganda.
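The TF-IDF features named in this abstract can be illustrated with a minimal sketch. It assumes one common variant (term frequency as count divided by document length, inverse document frequency as the natural log of N/df); the abstract does not say which variant or tokenizer the authors used, and the toy tweets and the `tf_idf` function name are hypothetical. In the described pipeline, vectors like these would then be passed to an SVM classifier, which is not shown here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy tweets, not data from the paper.
tweets = [
    "robbery reported downtown",
    "sunny day downtown",
    "robbery suspect arrested",
]
vectors = tf_idf(tweets)
```

Terms that occur in fewer documents (such as "reported") receive a higher weight than terms shared across documents (such as "robbery"), which is what makes the representation useful for separating topics.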


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Omar Alqaryouti ◽  
Nur Siyam ◽  
Azza Abdel Monem ◽  
Khaled Shaalan

Digital resources such as smart application reviews and online feedback information are important sources for seeking customers’ feedback and input. This paper aims to help government entities gain insights on the needs and expectations of their customers. Towards this end, we propose an aspect-based sentiment analysis hybrid approach that integrates domain lexicons and rules to analyse the entities’ smart app reviews. The proposed model aims to extract the important aspects from the reviews and classify the corresponding sentiments. This approach adopts language processing techniques, rules, and lexicons to address several sentiment analysis challenges and produce summarized results. According to the reported results, the aspect extraction accuracy improves significantly when the implicit aspects are considered. Also, the integrated classification model outperforms the lexicon-based baseline and the other rule combinations by 5% in terms of accuracy on average. Also, when using the same dataset, the proposed approach outperforms machine learning approaches that use a support vector machine (SVM). However, using these lexicons and rules as input features to the SVM model has achieved higher accuracy than other SVM models.
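To make the idea of combining lexicons with rules concrete, here is a small sketch under stated assumptions. The aspect terms, the sentiment lexicon, the single negation rule, and the `aspect_sentiments` function are all hypothetical illustrations; the abstract does not list the authors' actual lexicons or rules.

```python
# Hypothetical domain lexicons, invented for illustration only.
ASPECT_TERMS = {
    "login": "authentication",
    "password": "authentication",
    "payment": "payment",
    "fee": "payment",
}
SENTIMENT_LEXICON = {
    "easy": 1, "fast": 1, "great": 1,
    "slow": -1, "broken": -1, "confusing": -1,
}
NEGATIONS = {"not", "never", "no"}


def aspect_sentiments(review):
    """Score each aspect mentioned in a review, clause by clause."""
    results = {}
    for clause in review.lower().replace(",", ".").split("."):
        tokens = clause.split()
        aspects = {ASPECT_TERMS[t] for t in tokens if t in ASPECT_TERMS}
        score = 0
        for i, token in enumerate(tokens):
            if token in SENTIMENT_LEXICON:
                polarity = SENTIMENT_LEXICON[token]
                # Rule: a negation word directly before flips the polarity.
                if i > 0 and tokens[i - 1] in NEGATIONS:
                    polarity = -polarity
                score += polarity
        for aspect in aspects:
            results[aspect] = results.get(aspect, 0) + score
    return results


scores = aspect_sentiments("Login is not easy. Payment was fast")
```

Here the negation rule turns "not easy" into a negative score for the authentication aspect, while the payment aspect stays positive. Handling implicit aspects, which the abstract reports as important, would need additional rules beyond this sketch.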


Information ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 204
Author(s):  
Charlyn Villavicencio ◽  
Julio Jerison Macrohon ◽  
X. Alphonse Inbaraj ◽  
Jyh-Horng Jeng ◽  
Jer-Guang Hsieh

A year into the COVID-19 pandemic and one of the longest recorded lockdowns in the world, the Philippines received its first delivery of COVID-19 vaccines on 1 March 2021 through WHO’s COVAX initiative. A month into inoculation of all frontline health professionals and other priority groups, the authors of this study gathered data on the sentiment of Filipinos regarding the Philippine government’s efforts using the social networking site Twitter. Natural language processing techniques were applied to understand the general sentiment, which can help the government in analyzing its response. The tweets were annotated, and a Naïve Bayes model was trained through the RapidMiner data science software to classify English and Filipino language tweets into positive, neutral, and negative polarities. The results yielded an 81.77% accuracy, which exceeds the accuracy of recent sentiment analysis studies using Twitter data from the Philippines.
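The study used RapidMiner, so the following is not the authors' setup; it is a plain-Python sketch of how a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing assigns one of three polarities. The `NaiveBayes` class and the toy training sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, not tweets from the study.
texts = [
    "vaccine rollout is great",
    "great news for health workers",
    "vaccine arrived today",
    "delivery scheduled for monday",
    "rollout is slow and bad",
    "bad planning again",
]
labels = ["positive", "positive", "neutral", "neutral", "negative", "negative"]
model = NaiveBayes().fit(texts, labels)
```

Calling `model.predict("bad planning")` selects the class whose word distribution makes the text most likely. A real bilingual corpus would need proper tokenization and far more annotated examples than this toy set.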


2021 ◽  
pp. 1-13
Author(s):  
Qingtian Zeng ◽  
Xishi Zhao ◽  
Xiaohui Hu ◽  
Hua Duan ◽  
Zhongying Zhao ◽  
...  

Word embeddings have been successfully applied in many natural language processing tasks due to their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text documents ignore emotional information, which is a significant research problem that must be addressed. To solve the above problem, we propose an emotional word embedding (EWE) model for sentiment analysis in this paper. This method first applies pre-trained word vectors to represent document features using two different linear weighting methods. Then, the resulting document vectors are input to a classification model and used to train a text sentiment classifier, which is based on a neural network. In this way, the emotional polarity of the text is propagated into the word vectors. The experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model achieves superior performance on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.
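The step of building document features from pre-trained word vectors by linear weighting can be sketched as a weighted average. The abstract does not name the two weighting methods, so the uniform weights and the hand-picked per-word weights below are assumptions, as are the two-dimensional toy embeddings and the `doc_vector` function.

```python
def doc_vector(tokens, embeddings, weights=None):
    """Weighted average of the vectors of tokens found in `embeddings`.

    `weights` maps word -> non-negative weight; None means uniform weights.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokens:
        if token not in embeddings:
            continue  # out-of-vocabulary tokens are skipped
        w = 1.0 if weights is None else weights.get(token, 0.0)
        for i, x in enumerate(embeddings[token]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total  # all-zero vector when nothing could be weighted
    return [x / weight_sum for x in total]


# Hypothetical 2-dimensional embeddings for illustration.
embeddings = {"good": [1.0, 0.0], "movie": [0.0, 1.0], "bad": [-1.0, 0.0]}
tokens = ["good", "movie"]
uniform = doc_vector(tokens, embeddings)
weighted = doc_vector(tokens, embeddings, {"good": 3.0, "movie": 1.0})
```

With uniform weights both words contribute equally; with the second scheme the document vector moves toward "good". In the described model, vectors of this kind feed a neural sentiment classifier whose training signal is propagated back into the word vectors, a step this sketch does not implement.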


2021 ◽  
Author(s):  
Lucas Rodrigues ◽  
Antonio Jacob Junior ◽  
Fábio Lobato

Posts with defamatory content or hate speech are constantly found on social media. The results for readers are numerous, not restricted only to the psychological impact, but also to the growth of this social phenomenon. With the General Law on the Protection of Personal Data and the Marco Civil da Internet, service providers became responsible for the content in their platforms. Considering the importance of this issue, this paper aims to analyze the content published (news and comments) on the G1 News Portal with techniques based on data visualization and Natural Language Processing, such as sentiment analysis and topic modeling. The results show that even with most of the comments being neutral or negative and classified or not as hate speech, the majority of them were accepted by the users.


2021 ◽  
Vol 20 ◽  
pp. 149-167
Author(s):  
Huanzhuo Ye ◽  
Yuan Li

This study proposes a service quality evaluation model framework that integrates automatic data acquisition, intelligent data processing, and real-time data analysis, with online comment data as the data source. By introducing natural language processing technology based on management methods, it breaks with the traditional over-reliance on human resources for service quality evaluation. The framework is mainly divided into text data preparation, fine-grained sentiment analysis, and fuzzy cloud evaluation modules. The data preparation module is responsible for preparing the initial data, and the fine-grained sentiment analysis module is responsible for pre-training a fine-grained sentiment classification model. The fuzzy cloud evaluation module uses the data obtained from the first two modules to evaluate service quality. Applying the model to the catering industry proves its feasibility, and the individuality, efficiency, dynamicity, and intelligence of the model give it an advantage in the practice of service quality evaluation.


10.28945/4319 ◽  
2019 ◽  

[This Proceedings paper was revised and published in the 2019 issue of the journal Informing Science: The International Journal of an Emerging Transdiscipline, Volume 22] Aim/Purpose: The aim of this paper is to propose an ensemble-learner-based classification model for distinguishing clickbaits from genuine article headlines. Background: Clickbaits are online articles with deliberately designed misleading titles for luring more and more readers to open the intended web page. Clickbaits are used to tempt visitors to click on a particular link, either to monetize the landing page or to spread false news for sensationalization. The presence of clickbaits on any news aggregator portal may lead to an unpleasant experience for readers. Therefore, it is essential to distinguish clickbaits from authentic headlines to mitigate their impact on readers’ perception. Methodology: A total of one hundred thousand article headlines, consisting of clickbaits and authentic news headlines, are collected from news aggregator sites. The collected data samples are divided into five training sets of balanced and unbalanced data. Natural language processing techniques are used to extract 19 manually selected features from article headlines. Contribution: Three ensemble learning techniques, namely bagging, boosting, and random forests, are used to design a classifier model for classifying a given headline as clickbait or non-clickbait. The performances of the learners are evaluated using accuracy, precision, recall, and F-measure. Findings: It is observed that the random forest classifier detects clickbaits better than the other classifiers, with an accuracy of 91.16% and an overall precision, recall, and F-measure of 91%.
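The evaluation measures listed in this abstract (accuracy, precision, recall, and F-measure) follow standard definitions, which the short sketch below computes for a binary clickbait label. The label vectors are hypothetical toy values, not results from the paper, and `binary_metrics` is an illustrative helper name.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = clickbait, 0 = genuine headline.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

On unbalanced training sets such as those mentioned in the methodology, accuracy alone can be misleading, which is one reason to report precision, recall, and F-measure alongside it.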


Author(s):  
Bhushan R. Chincholkar

Sentiment analysis is one of the fastest growing fields, with demand and potential benefits that increase every day. Sentiment analysis aims to classify the polarity of a document through natural language processing and text analysis. With the help of the internet and modern technology, there has been tremendous growth in the amount of data. Each individual is in a position to express his/her own ideas freely on social media. All of this data can be analyzed and used to draw benefits and quality information. In this paper, the focus is on cyber-hate classification based on public opinion or views, since the spread of hate speech using social media can have disruptive impacts on social sentiment analysis. In particular, we propose a modified approach with two-stage training for dealing with text ambiguity and classifying text into three types (positive, negative, and neutral sentiment), and compare its performance with popular methods as well as some existing fuzzy approaches. Afterward, we compare the performance of the proposed approach with commonly used sentiment classifiers that are known to perform well in this task. The experimental results indicate that our modified approach performs marginally better than the other algorithms.


Author(s):  
C. Selvi ◽  
Niveda. C. P

Digital sources such as smart application opinions and online feedback statistics are crucial resources for seeking customers’ remarks and input. However, the reviews are often disorganized, leading to difficulties in information navigation and knowledge acquisition. This problem is overcome by generating aspect-sentiment based embeddings for hotels and companies from reliable reviews of them. The important product aspects are identified based on two observations: 1) the important aspects are usually commented on by a large number of consumers and 2) consumer opinions on the important aspects greatly influence their overall opinions. For hotel reviews, aspect frequency and the influence of the opinions given to each aspect on the overall opinions are identified, whereas for company reviews the approach adopts language processing techniques, policies, and lexicons to address several sentiment evaluation challenges and convey summarized results. Moreover, aspect ranking achieves significant performance improvements, which demonstrates the capacity of aspect ranking to facilitate real-world applications.


2020 ◽  
Author(s):  
Sohini Sengupta ◽  
Sareeta Mugde ◽  
Garima Sharma

Twitter is one of the world's biggest social media platforms, hosting an abundant number of user-generated posts. It is considered a gold mine of data. The majority of tweets are public and thereby pullable, unlike on other social media platforms. In this paper we analyze the topics related to mental health that have recently (June 2020) been discussed on Twitter. Also, amidst the ongoing pandemic, we find out whether COVID-19 emerges as one of the factors impacting mental health. Further, we perform an overall sentiment analysis to better understand the emotions of users.

