AI-Crime Hunter: An AI Mixture of Experts for Crime Discovery on Twitter

Electronics ◽  
2021 ◽  
Vol 10 (24) ◽  
pp. 3081
Author(s):  
Niloufar Shoeibi ◽  
Nastaran Shoeibi ◽  
Guillermo Hernández ◽  
Pablo Chamoso ◽  
Juan M. Corchado

Maintaining a healthy cyber society is a great challenge due to the users’ freedom of expression and behavior. This can be solved by monitoring and analyzing the users’ behavior and taking proper actions. This research aims to present a platform that monitors the public content on Twitter by extracting tweet data. After maintaining the data, the users’ interactions are analyzed using graph analysis methods. Then, the users’ behavioral patterns are analyzed by applying metadata analysis, in which the timeline of each profile is obtained; also, the time-series behavioral features of users are investigated. Then, in the abnormal behavior detection and filtering component, the interesting profiles are selected for further examinations. Finally, in the contextual analysis component, the contents are analyzed using natural language processing techniques; a binary text classification model (SVM (Support Vector Machine) + TF-IDF (Term Frequency—Inverse Document Frequency) with 88.89% accuracy) is used to detect if a tweet is related to crime or not. Then, a sentiment analysis method is applied to the crime-related tweets to perform aspect-based sentiment analysis (DistilBERT + FFNN (Feed-Forward Neural Network) with 80% accuracy), because sharing positive opinions about a crime-related topic can threaten society. This platform aims to provide the end-user (the police) with suggestions to control hate speech or terrorist propaganda.
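The TF-IDF weighting named in the abstract above can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the whitespace tokenizer, the smoothed IDF variant, the `tf_idf` function name and the toy tweets are all assumptions, and the SVM classifier that would consume these weights is left out.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: term frequency normalised by document length,
    multiplied by a smoothed IDF, ln((1 + N) / (1 + df)) + 1.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# Invented toy tweets, used only to exercise the function.
tweets = [
    "police report a robbery downtown",
    "great weather for a walk downtown",
    "robbery suspect arrested by police",
]
weights = tf_idf(tweets)
```

Terms that occur in fewer documents receive a larger weight, so in the first toy tweet "report" outweighs "robbery", which also appears in the third.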

Author(s):  
Niloufar Shoeibi ◽  
Nastaran Shoeibi ◽  
Guillermo Hernández ◽  
Pablo Chamoso ◽  
Juan Manuel Corchado

Maintaining a healthy cyber society is a big challenge due to the users’ freedom of expression and behavior. It can be addressed by monitoring and analyzing the users’ behavior and taking proper actions. This research aims to present a platform that monitors the public content on Twitter by extracting tweet data. After maintaining the data, the users’ interactions are analyzed using Graph Analysis methods. Then the users’ behavioral patterns are analyzed by applying Metadata Analysis, in which the timeline of each profile is obtained; also, the time-series behavioral features of users are investigated. Then, in the Abnormal Behavior Detection Filtering component, the interesting profiles are selected for further examination. Finally, in the Contextual Analysis component, the contents are analyzed using natural language processing techniques; a binary text classification model (SVM + TF-IDF with 88.89% accuracy) detects whether a tweet is related to crime. Then, a sentiment analysis method is applied to the crime-related tweets to perform aspect-based sentiment analysis (DistilBERT + FFNN with 80% accuracy), because sharing positive opinions about a crime-related topic can threaten society. This platform aims to provide the end-user (the police) with suggestions to control hate speech or terrorist propaganda.


2018 ◽  
Vol 36 (4) ◽  
pp. 677-695 ◽  
Author(s):  
Shrawan Kumar Trivedi ◽  
Shubhamoy Dey ◽  
Anil Kumar

Purpose Sentiment analysis and opinion mining are emerging areas of research for analyzing Web data and capturing users’ sentiments. This research aims to present sentiment analysis of an Indian movie review corpus using natural language processing and various machine learning classifiers. Design/methodology/approach In this paper, a comparative study between three machine learning classifiers (Bayesian, naïve Bayesian and support vector machine [SVM]) was performed. All the classifiers were trained on the words/features of the corpus extracted, using five different feature selection algorithms (Chi-square, info-gain, gain ratio, one-R and relief-F [RF] attributes), and a comparative study was performed between them. The classifiers and feature selection approaches were evaluated using different metrics (F-value, false-positive [FP] rate and training time). Findings The results of this study show that, for the maximum number of features, the RF feature selection approach was found to be the best, with better F-values, a low FP rate and less time needed to train the classifiers, whereas for the least number of features, one-R was better than RF. When the evaluation was performed for machine learning classifiers, SVM was found to be superior, although the Bayesian classifier was comparable with SVM. Originality/value This is a novel research where Indian review data were collected and then a classification model for sentiment polarity (positive/negative) was constructed.
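Of the feature selection algorithms listed in the abstract above, chi-square is the simplest to show in a few lines. The sketch below scores each term with the standard chi-square statistic for a 2x2 term/class contingency table and keeps the top-ranked terms. The function names `chi_square` and `rank_terms` and the four toy reviews are hypothetical; the paper's own tooling and corpus are not described in the abstract.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: positive documents containing the term
    b: negative documents containing the term
    c: positive documents without the term
    d: negative documents without the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no information
    return n * (a * d - b * c) ** 2 / denom


def rank_terms(docs, labels, top_k):
    """Rank vocabulary terms by chi-square score (ties broken alphabetically)."""
    vocab = set().union(*docs)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    scores = {}
    for term in vocab:
        a = sum(1 for doc, y in zip(docs, labels) if y and term in doc)
        b = sum(1 for doc, y in zip(docs, labels) if not y and term in doc)
        scores[term] = chi_square(a, b, n_pos - a, n_neg - b)
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Invented toy reviews as token sets; True marks a positive review.
reviews = [
    {"great", "plot"},
    {"great", "acting"},
    {"boring", "plot"},
    {"boring", "acting"},
]
labels = [True, True, False, False]
selected = rank_terms(reviews, labels, top_k=2)
```

In this toy set "great" and "boring" each separate the classes perfectly, while "plot" and "acting" occur equally in both and score zero.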


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Omar Alqaryouti ◽  
Nur Siyam ◽  
Azza Abdel Monem ◽  
Khaled Shaalan

Digital resources such as smart application reviews and online feedback are important sources of customers’ feedback and input. This paper aims to help government entities gain insights into the needs and expectations of their customers. Towards this end, we propose a hybrid aspect-based sentiment analysis approach that integrates domain lexicons and rules to analyse reviews of the entities’ smart apps. The proposed model aims to extract the important aspects from the reviews and classify the corresponding sentiments. This approach adopts language processing techniques, rules, and lexicons to address several sentiment analysis challenges and produce summarized results. According to the reported results, the aspect extraction accuracy improves significantly when the implicit aspects are considered. Also, the integrated classification model outperforms the lexicon-based baseline and the other rule combinations by 5% in terms of accuracy on average. Also, when using the same dataset, the proposed approach outperforms machine learning approaches that use a support vector machine (SVM). However, using these lexicons and rules as input features to the SVM model has achieved higher accuracy than other SVM models.


2021 ◽  
pp. 1-13
Author(s):  
Qingtian Zeng ◽  
Xishi Zhao ◽  
Xiaohui Hu ◽  
Hua Duan ◽  
Zhongying Zhao ◽  
...  

Word embeddings have been successfully applied in many natural language processing tasks due to their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text documents ignore emotional information, which is a significant research problem that must be addressed. To solve the above problem, we propose an emotional word embedding (EWE) model for sentiment analysis in this paper. This method first applies pre-trained word vectors to represent document features using two different linear weighting methods. Then, the resulting document vectors are input to a classification model and used to train a text sentiment classifier, which is based on a neural network. In this way, the emotional polarity of the text is propagated into the word vectors. The experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model achieves superior performance on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.
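The first step described above, building a document vector as a linear weighting of pre-trained word vectors, might look roughly like the following. The abstract does not say which two weighting methods the EWE model uses, so the plain average and the per-term weights shown here are assumptions, as are the `weighted_doc_vector` name and the two-dimensional toy embeddings.

```python
def weighted_doc_vector(tokens, embeddings, weights=None):
    """Linear combination of word vectors, normalised by the total weight.

    With weights=None every in-vocabulary token counts equally, which
    gives the plain average. Otherwise each token is scaled by
    weights[token], and tokens missing from weights get weight 0.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in embeddings:
            continue  # skip out-of-vocabulary tokens
        w = 1.0 if weights is None else weights.get(tok, 0.0)
        for i, x in enumerate(embeddings[tok]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0:
        return total  # no usable tokens: return the zero vector
    return [x / weight_sum for x in total]


# Invented two-dimensional embeddings, for illustration only.
toy_embeddings = {"good": [1.0, 0.0], "film": [0.0, 1.0]}
doc = ["good", "film", "unknown"]

uniform = weighted_doc_vector(doc, toy_embeddings)
emphasised = weighted_doc_vector(doc, toy_embeddings, {"good": 3.0, "film": 1.0})
```

The resulting vectors would then be fed to the sentiment classifier; that training step, and the propagation of polarity back into the word vectors, is beyond this sketch.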


2021 ◽  
Author(s):  
Lucas Rodrigues ◽  
Antonio Jacob Junior ◽  
Fábio Lobato

Posts with defamatory content or hate speech are constantly found on social media. The results for readers are numerous, not restricted only to the psychological impact, but also to the growth of this social phenomenon. With the General Law on the Protection of Personal Data and the Marco Civil da Internet, service providers became responsible for the content in their platforms. Considering the importance of this issue, this paper aims to analyze the content published (news and comments) on the G1 News Portal with techniques based on data visualization and Natural Language Processing, such as sentiment analysis and topic modeling. The results show that even with most of the comments being neutral or negative and classified or not as hate speech, the majority of them were accepted by the users.


2018 ◽  
Vol 7 (3.34) ◽  
pp. 156
Author(s):  
Basavaraj G.M ◽  
Dr Ashok Kusagur

Much research has been carried out in the field of crowd behavior recognition systems. Recognizing crowd behavior in videos is highly challenging because of occlusions and irregular human movement. This paper gives an overview of an optical flow model along with an SVM (Support Vector Machine) classification model. The proposed approach evaluates sudden changes in the motion of an event and classifies that event into one of two categories: Normal or Abnormal. Geometric means of the location, direction, and displacement of the feature points of each frame are estimated. The Harris corner detector is used in each frame for tracking a set of feature points. The proposed approach is very effective in real-time scenarios such as public places where security is most important. After analyzing the results, an ROC (receiver operating characteristic) curve is plotted, which gives the classification accuracy. We also present a frame-level comparison with ground truth and social force model (SFM) techniques. Our proposed approach gives promising results compared with all state-of-the-art methods.


2020 ◽  
Vol 24 (5) ◽  
pp. 1141-1160
Author(s):  
Tomás Alegre Sepúlveda ◽  
Brian Keith Norambuena

In this paper, we apply sentiment analysis methods in the context of the first round of the 2017 Chilean elections. The purpose of this work is to estimate the voting intention associated with each candidate in order to contrast this with the results from classical methods (e.g., polls and surveys). The data are collected from Twitter, because of its high usage in Chile and in the sentiment analysis literature. We obtained tweets associated with the three main candidates: Sebastián Piñera (SP), Alejandro Guillier (AG) and Beatriz Sánchez (BS). For each candidate, we estimated the voting intention and compared it to the traditional methods. To do this, we first acquired the data and labeled the tweets as positive or negative. Afterward, we built a model using machine learning techniques. The classification model had an accuracy of 76.45% using support vector machines, which yielded the best model for our case. Finally, we use a formula to estimate the voting intention from the number of positive and negative tweets for each candidate. For the last period, we obtained a voting intention of 35.84% for SP, compared to a range of 34–44% according to traditional polls and 36% in the actual elections. For AG we obtained an estimate of 37%, compared with a range of 15.40% to 30.00% for traditional polls and 20.27% in the elections. For BS we obtained an estimate of 27.77%, compared with the range of 8.50% to 11.00% given by traditional polls and an actual result of 22.70% in the elections. These results are promising, in some cases providing an estimate closer to reality than traditional polls. Some differences can be explained due to the fact that some candidates have been omitted, even though they held a significant number of votes.


In recent years, sentiment analysis has become broadly relevant. Extracting insights from social data is a practice that is being widely adopted across the globe. Sentiment analysis has become a hot topic of technical and market research in the areas of Natural Language Processing (NLP) and Machine Learning. It is enormously useful in social media monitoring, as it gives an overview of the wider public opinion behind certain topics. Analysis of social media streams is typically limited to basic sentiment analysis and count-based metrics. This amounts to scratching the surface and missing the high-value insights that are waiting to be discovered. Much work remains to be done, but improvements are being made every day. Sentiment analysis is a way to evaluate written or spoken language to determine whether an expression is favorable, unfavorable, or neutral, and to what degree. Today’s algorithm-based sentiment analysis tools can handle vast amounts of customer feedback consistently and accurately. Paired with text analytics, sentiment analysis reveals the customer’s opinion on topics ranging from your products and services to your location, your advertisements, or even your competitors. This work examines the problem of studying texts, such as posts and reviews, uploaded by users on Twitter. The Support Vector Machine (SVM), the k-nearest neighbors algorithm (KNN) and a proposed optimized feature sets model are used to process the tweet features and to recognize the hidden sentiments in these tweets. Used in combination, these concepts become a very significant tool for analyzing millions of varied conversations with human-level accuracy. The proposed optimized feature sets model for sentiment analysis is evaluated with the metrics of Precision, Recall, F-score, and Accuracy.
Also, average weighted F1-scores are useful for the multi-class problem of classifying tweets as Positive, Negative or Neutral. The running time of the technique is evaluated by running the different methods in the same experimental setup, consisting of a cluster of 8 nodes. The proposed optimized feature sets model for sentiment analysis reaches 82% accuracy, compared with 78.6% for SVM and 75% for KNN. Further, while analyzing the sentiments of tweets, we considered only tweets in English as identified by the Twitter streaming API.
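The evaluation metrics named above follow standard definitions, which can be written out for a three-class problem as below. This is a generic sketch rather than the authors' evaluation code; the function names and the four toy labels are assumptions, and the weighted F1 here weights each class's F1 by its number of true instances.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores[label] = (precision, recall, f1)
    return scores


def weighted_f1(y_true, y_pred):
    """Average of per-class F1, weighted by each class's true count."""
    support = Counter(y_true)
    scores = per_class_scores(y_true, y_pred)
    return sum(support[lab] * scores[lab][2] for lab in scores) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented labels for four tweets, for illustration only.
y_true = ["pos", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neu"]
scores = per_class_scores(y_true, y_pred)
```

Weighting by class support keeps a rare class from dominating the average, which matters when Positive, Negative and Neutral tweets are unevenly distributed.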


Sentiment analysis is the study of individuals’ opinions and feedback towards an entity, which can be items, services, movies, people or events. The opinions are mostly expressed as remarks or reviews. With social networks, forums and websites, these reviews have become a significant factor in a customer’s decision whether or not to buy something. These days, vast scalable computing environments provide very sophisticated ways of carrying out various data-intensive natural language processing (NLP) and machine-learning tasks to examine these reviews. One such example is text classification, an effective method for predicting customers’ sentiment. In this paper, we focus our sentiment analysis work on a movie review database. We examine the sentiment expressions to classify the polarity of the movie reviews on a scale of 0 (highly disliked) to 4 (highly liked), perform feature extraction and ranking, and use these features to train our multilabel classifier to assign each movie review its correct rating. This paper incorporates sentiment analysis using feature-based opinion mining and supervised machine learning. The main focus is to determine the polarity of reviews using nouns, verbs, and adjectives as opinion words. In addition, a comparative study of different classification approaches has been performed to determine the most appropriate classifier for our problem domain. In our study, we used six different machine learning algorithms: Naïve Bayes, Logistic Regression, SVM (Support Vector Machine), RF (Random Forest), KNN (K nearest neighbors) and SoftMax Regression.

