Identification of HATE speech tweets in Pashto language using Machine Learning techniques

In recent years, researchers have been strongly drawn to sentiment analysis, especially to hate speech detection systems. The proliferation of hate speech in different languages is a significant concern on social media. Hate speech has a great impact on society, and the use of hateful words harms the dignity of others. Hate speech detection systems are important for stopping hateful words from turning into crimes. In this research, a framework is developed for a hate speech detection system in the Pashto language. Because no related data is available, a dataset is created from data collected from Twitter. Most of the research in this domain has been done for other languages, where hate speech detection is quite mature. For morphologically rich languages, however, little work has been done, especially for Pashto. This research collected tweets related to ethnicity and religion from Twitter. The collected data was annotated manually and categorized as hate or not hate by comparing it with offensive content. To examine the impact of different features/attributes on hate speech detection systems, this study performed experiments with existing classifiers, i.e., SVM, Naïve Bayes, Decision Tree, and KNN. On a dataset of 500, SVM produced the highest result among all the classifiers, i.e., 74%. On a dataset of 1500, KNN and Decision Tree produced the same result, i.e., 65.0%. On a dataset of 2800, Decision Tree produced the highest result, i.e., 72%, and SVM produced 71.9%.
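As a rough illustration of the kind of experiment described above, the following sketch trains one of the named classifier families (Naïve Bayes) on a bag-of-words representation and reports accuracy. It is not the authors' implementation: the function names, the add-one smoothing, and the toy data (with placeholder tokens such as `insult_a` standing in for annotated Pashto tweets) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model from whitespace-tokenized docs."""
    class_counts = Counter(labels)       # number of documents per class
    word_counts = defaultdict(Counter)   # token counts per class
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = doc.split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, doc):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in doc.split():
            if tok in vocab:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the annotation."""
    correct = sum(predict_nb(model, d) == y for d, y in zip(docs, labels))
    return correct / len(labels)

# Hypothetical toy data; real input would be manually annotated tweets.
train_docs = [
    "group_x are insult_a",
    "insult_a insult_b group_y",
    "lovely weather today",
    "great match today friends",
]
train_labels = ["hate", "hate", "not_hate", "not_hate"]
test_docs = ["group_y are insult_b", "lovely match friends"]
test_labels = ["hate", "not_hate"]

model = train_nb(train_docs, train_labels)
print(accuracy(model, test_docs, test_labels))
```

Repeating the same train/evaluate loop for each classifier and each dataset size would give a comparison table of the kind the abstract reports.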

2018 ◽ Vol 34 (3) ◽ pp. 569-581
Author(s): Sujata Rani ◽ Parteek Kumar

Abstract: In this article, an innovative approach to sentiment analysis (SA) is presented. The proposed system handles Romanized or abbreviated text and spelling variations in the text when performing sentiment analysis. The training data set of 3,000 movie reviews and tweets has been manually labeled by native speakers of Hindi in three classes, i.e., positive, negative, and neutral. The system uses the WEKA (Waikato Environment for Knowledge Analysis) tool to convert this string data into numerical matrices and applies three machine learning techniques, i.e., Naive Bayes (NB), J48, and support vector machine (SVM). The proposed system has been tested on 100 movie reviews and tweets, and SVM performed best among the classifiers, with an accuracy of 68% for movie reviews and 82% for tweets. The results of the proposed system are very promising and can be used in emerging applications such as SA of product reviews and social media analysis. Additionally, the proposed system can be used for other cultural/social benefits, such as predicting and fighting riots.
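The step of converting string data into numerical matrices can be sketched in plain Python as a document-term count matrix. This is only an illustration of the idea, not WEKA's own procedure: the `SPELLING_VARIANTS` map, the function names, and the Romanized sample reviews are hypothetical.

```python
# Hypothetical map from spelling variants to a canonical form.
SPELLING_VARIANTS = {"accha": "acha", "achha": "acha"}

def tokenize(text):
    """Lowercase, split on whitespace, and normalize known spelling variants."""
    return [SPELLING_VARIANTS.get(tok, tok) for tok in text.lower().split()]

def build_vocabulary(texts):
    """Map each distinct token to a column index (alphabetical order)."""
    vocab = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(vocab)}

def to_count_matrix(texts, vocab):
    """Return one row per text, with per-token counts in vocabulary order."""
    matrix = []
    for text in texts:
        row = [0] * len(vocab)
        for tok in tokenize(text):
            if tok in vocab:
                row[vocab[tok]] += 1
        matrix.append(row)
    return matrix

# Hypothetical Romanized reviews, for illustration only.
reviews = ["Movie bahut accha tha", "movie bekar tha", "acha acha"]
vocab = build_vocabulary(reviews)
matrix = to_count_matrix(reviews, vocab)
print(vocab)
print(matrix)
```

Each row can then be passed to a classifier such as NB, J48, or SVM together with its manually assigned label.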


2020 ◽ pp. 193-201
Author(s): Hayder A. Alatabi ◽ Ayad R. Abbas

In recent years, social media has achieved widespread use worldwide; statistics indicate that more than three billion people are on social media, leading to large quantities of data online. To analyze these large quantities of data, a special classification method known as sentiment analysis is used. This paper presents a new sentiment analysis system based on machine learning techniques, which aims to create a process for extracting polarity from social media texts. Using machine learning techniques, sentiment analysis has achieved great success around the world. This paper investigates this topic and proposes a sentiment analysis system built on the Bayesian Rough Decision Tree (BRDT) algorithm. The experimental results show the success of this system, with an accuracy of more than 95% on social media data.


Author(s): Jothikumar R. ◽ Vijay Anand R. ◽ Visu P. ◽ Kumar R. ◽ Susi S. ◽ ...

Sentiment analysis refers to extracting sentiments from natural language and identifying the attitude toward a specific topic. The novel coronavirus, a harmful disease, is spreading unexpectedly across the world and causes respiratory tract infections that can range from mild to severe. Because of its rapid spread and the lack of a known cure, it has ushered in a feeling of stress and anxiety. In this chapter, a machine learning based approach is used to discover the sentiments of tweets related to COVID and the resulting lockdown. The chapter examines tweets identified by hashtags for coronavirus and lockdown. The tweets were labeled positive, negative, or neutral, and a list of classifiers was used to analyze accuracy and performance. The classifiers used fall under four models: decision tree, regression, support vector machine, and naïve Bayes.


With the huge growth of the Internet, more users have engaged with health networks, for example medical forums, to gather health-related information, to share experiences about medications, treatments, and diagnoses, or to connect with other users with similar conditions on social media. A lot of research has focused on examining Twitter health tweets for topic modeling using a number of clustering approaches, but little has addressed sentiment analysis. The fact that such data carries potential information for revealing people's opinions about health services and behaviors makes it an interesting study. In this paper, general sentiment in Twitter health data is investigated, using Twitter to measure and monitor the occurrence of social health problems. The approach is based on two stages: the first stage separates potentially relevant tweets using a set of specially crafted regular expressions, and the second classifies these initial messages using machine learning techniques. Using the Twitter search API and geocoded Twitter metadata, social media tweets were selected to start filtering. Once tweets are correctly identified, the classifier is applied to the data to filter the tweets. Classification results were improved by examining ROC and F-measure values. This report indicates that such a method provides a viable solution for quantifying and tracking the progression of health status within society.
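The first stage (regular-expression filtering) and the F-measure used for evaluation can be sketched as follows. This is a minimal illustration under stated assumptions: the patterns in `HEALTH_PATTERNS`, the sample tweets, and the function names are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical health-related patterns; a real system would use a larger set.
HEALTH_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"\bflu\b", r"\bfever\b", r"\bheadaches?\b")
]

def is_candidate(tweet):
    """Stage 1: keep a tweet if any health pattern matches."""
    return any(p.search(tweet) for p in HEALTH_PATTERNS)

def f_measure(y_true, y_pred, positive=True):
    """F1 score: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

tweets = [
    "Down with the flu again",
    "Bieber fever at the concert",
    "Great game last night",
    "Headaches all week, seeing a doctor",
]
candidates = [t for t in tweets if is_candidate(t)]
print(candidates)

# Treating every candidate as health-related gives one false positive here.
y_true = [True, False, True]
y_pred = [True, True, True]
print(f_measure(y_true, y_pred))
```

The "Bieber fever" tweet passes the pattern filter although it is not about health, which is why a second, learned classification stage is applied to the candidates.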


The challenges of handling hate speech are not new. Over the past few years, with the increased use of the internet, hateful activity across social media has been increasing rapidly. Improved technology has made it possible to create platforms where people can feel free to share their opinions and experiences. That alone would not be a problem, but hateful comments targeting a person or a community also run throughout social media. Hate speech is a statement that targets a person or a community, discriminating on the basis of caste, creed, nationality, etc. Our project aims to address this problem by using machine learning techniques to automatically detect hate speech and classify it into various classes such as extremely positive, positive, neutral, etc. We used a classifier that works on lexicons and compared it with other classifiers that do not use lexicons. The intended beneficiaries of this model are people who are targeted on social media. Based on the results, they can calculate the intensity of the comments.
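A lexicon-based classifier of the general kind mentioned above can be sketched as a word-score lookup followed by thresholding. Everything here is an assumption made for illustration: the `LEXICON` entries and weights, the thresholds, and the two negative class names (the abstract names only "extremely positive", "positive", and "neutral").

```python
# Hypothetical lexicon: token -> polarity weight.
LEXICON = {"love": 2, "great": 1, "good": 1, "bad": -1, "awful": -2, "hate": -2}

def score(text):
    """Sum lexicon weights over tokens, ignoring case and edge punctuation."""
    return sum(LEXICON.get(tok.strip(".,!?").lower(), 0) for tok in text.split())

def classify(text):
    """Map the summed score to an intensity class (thresholds are assumed)."""
    s = score(text)
    if s >= 3:
        return "extremely positive"
    if s >= 1:
        return "positive"
    if s <= -3:
        return "extremely negative"  # assumed class, not named in the abstract
    if s <= -1:
        return "negative"            # assumed class, not named in the abstract
    return "neutral"

for comment in ["I love this great team!", "Good game.", "The bus was late."]:
    print(comment, "->", classify(comment))
```

The absolute value of the score gives a simple notion of comment intensity, which a classifier trained without a lexicon would have to learn from labeled examples instead.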


2021 ◽ Vol 179 ◽ pp. 821-828
Author(s): Andry Chowanda ◽ Rhio Sutoyo ◽ Meiliana ◽ Sansiri Tanachutiwat
