hate speech
Recently Published Documents





2022 ◽  
Vol 2 (11) ◽  
pp. 1532-1554
Lilis Erna Yulianti

The virtual world is not a borderless space in which we are free to do anything. Just as the real world has norms, ethics, and etiquette, cyberspace requires netiquette. Netiquette, as a moral code for healthy internet use, is needed so that digital communication between netizens runs harmoniously, with mutual respect and without conflict or deviant behavior, making netizens' lives more comfortable. Practiced continuously over the long term, netiquette has a positive impact on netizens and their social environment. For netizens, it strengthens soft skills and helps form a generation with character, integrity, morality, and a healthy mentality, whose members earn appreciation from others that reinforces their motivation to keep doing good. For the social environment, it makes interactions healthier and communication patterns more humane. In practice, however, many disputes, violations, and crimes still occur on social media and online media, for example: the rise of pornographic content, hate speech, hoaxes, cyberbullying, insults, online fraud, digital sexual crimes, child trafficking, online prostitution, and various other cybercrimes. In response to these problems, the research entitled "Netiquette Strengthening Soft Skills Netizens for Generation of Character" aims to compare the ethical violations committed by netizens on social media and online media with the ethical guidelines for cyberspace (netiquette). This research uses a qualitative method with a literature review approach.

2022 ◽  
Vol 11 (2) ◽  

In recent times, transfer learning models have been known to exhibit good results in text classification tasks such as question answering, summarization, and next-word prediction, but they have not yet been used extensively for hate speech detection. We anticipate that these networks may also give better results on another text classification task, namely hate speech detection. This paper introduces a novel method of hate speech detection based on attention networks, using the BERT attention model. We have conducted exhaustive experiments and evaluations on publicly available datasets using several evaluation metrics (precision, recall, and F1 score). We show that our model outperforms all the state-of-the-art methods by almost 4%. We also discuss in detail the technical challenges faced during the implementation of the proposed model.
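The evaluation metrics named in this abstract (precision, recall, and F1 score) can be sketched for a binary hate/non-hate task as follows. This is a minimal illustration, not the authors' evaluation code; the function name `precision_recall_f1` and the toy labels are assumptions made for the example.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels: 1 = hate speech, 0 = not hate speech.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints "precision=0.75 recall=0.75 f1=0.75"
```

In this toy case there are three true positives, one false positive, and one false negative, so all three metrics come out equal.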

Tharindu Ranasinghe ◽  
Marcos Zampieri

Offensive content is pervasive in social media and a reason for concern to companies and government organizations. Several studies have been recently published investigating methods to detect the various forms of such content (e.g., hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English, partly because most of the available annotated datasets contain English data. In this article, we take advantage of available English datasets by applying cross-lingual contextual word embeddings and transfer learning to make predictions in low-resource languages. We project predictions on comparable data in Arabic, Bengali, Danish, Greek, Hindi, Spanish, and Turkish. We report results of 0.8415 F1 macro for Bengali in the TRAC-2 shared task [23], 0.8532 F1 macro for Danish and 0.8701 F1 macro for Greek in OffensEval 2020 [58], 0.8568 F1 macro for Hindi in the HASOC 2019 shared task [27], and 0.7513 F1 macro for Spanish in SemEval-2019 Task 5 (HatEval) [7], showing that our approach compares favorably to the best systems submitted to recent shared tasks on these languages. Additionally, we report competitive performance on Arabic and Turkish using the training and development sets of the OffensEval 2020 shared task. The results for all languages confirm the robustness of cross-lingual contextual embeddings and transfer learning for this task.
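The "F1 macro" figures reported in this abstract are unweighted averages of per-class F1 scores. The sketch below shows that computation on toy data; it is not the shared tasks' official scorer, and the names `per_class_f1` and `macro_f1` as well as the example labels are assumptions made for illustration.

```python
from typing import Dict, Sequence


def per_class_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Compute F1 separately for every label seen in the gold or predicted data."""
    scores: Dict[str, float] = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: "OFF" = offensive, "NOT" = not offensive.
gold = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT", "NOT", "OFF"]
class_scores = per_class_f1(gold, pred)
macro = macro_f1(gold, pred)
print(class_scores, macro)
# prints "{'NOT': 0.75, 'OFF': 0.5} 0.625"
```

Because macro-averaging weights every class equally regardless of its frequency, a weak score on a minority class such as the offensive label lowers the overall figure as much as a weak score on the majority class would.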

2022 ◽  
Raed Toghuj

Recently, forensic linguistics has gained significance in many fields, especially in judicial systems, legal and forensic matters, investigation, and open-source intelligence across the globe. The term typically refers to the legal and professional analysis of recorded or written language by experts (forensic linguists) to provide expert and correct interpretation. It is used particularly in legal matters, especially in the court and criminal justice systems. In the court system, forensic linguistics is heavily applied to examine language evidence, whether voice-recorded or handwritten, in civil matters or crimes. The analysis is carried out for two major reasons. First, it is used during investigations to help identify witnesses or suspects in specific cases or scenes, or to determine the significance of a piece of writing or an utterance to a case. Second, forensic linguistics plays a pivotal role when written or spoken language samples are presented to a court as evidence. In such contexts, forensic linguists provide expert testimony on the correct interpretation of the samples. As such, language analysis is significant in any judicial matter or system, provided that the language in question constitutes a crime. In most cases, crimes such as threats, hate speech, bribery, hate literature, and coercion, among others, necessitate an expert linguist for correct and, most importantly, professional interpretation. Evidently, forensic linguistics is relied upon to establish the truth from recorded speech and written language in the face of a crime or a relevant legal investigation. This paper discusses the different ways and methods in which forensic linguistics is applied to investigate and provide professional interpretation of recorded and written language in evidentiary and investigative contexts.

2022 ◽  
Vol 3 (2) ◽  
Siva Sai ◽  
Naman Deep Srivastava ◽  
Yashvardhan Sharma

2022 ◽  
Vol 2022 ◽  
pp. 1-17
Rukhma Qasim ◽  
Waqas Haider Bangyal ◽  
Mohammed A. Alqarni ◽  
Abdulwahab Ali Almazroi

The text classification problem has been thoroughly studied in information retrieval and data mining. It is beneficial in multiple tasks, including medical diagnosis, health care departments, targeted marketing, the entertainment industry, and group filtering processes. Recent innovations in both data mining and natural language processing (NLP) have drawn the attention of researchers from all over the world to developing automated systems for text classification. NLP allows documents containing different texts to be categorized. A huge amount of data is generated by users on social media sites. Three datasets have been used for experimental purposes: the COVID-19 fake news dataset, the COVID-19 English tweet dataset, and the extremist-non-extremist dataset, which contain news blogs, posts, and tweets related to coronavirus and hate speech. Transfer learning approaches have not previously been tested on the COVID-19 fake news and extremist-non-extremist datasets. Therefore, the proposed work applies transfer learning classification models to both of these datasets to check their performance. Models are trained and evaluated on accuracy, precision, recall, and F1-score. Heat maps are also generated for every model. In the end, future directions are proposed.

Ioannis Mollas ◽  
Zoe Chrysopoulou ◽  
Stamatis Karlos ◽  
Grigorios Tsoumakas

Online hate speech is a recent problem in our society that is rising at a steady pace by leveraging the vulnerabilities of the corresponding regimes that characterise most social media platforms. This phenomenon is primarily fostered by offensive comments, either during user interaction or in the form of a posted multimedia context. Nowadays, giant corporations own platforms where millions of users log in every day, and protection from exposure to similar phenomena appears to be necessary to comply with the corresponding legislation and maintain a high level of service quality. A robust and reliable system for detecting and preventing the uploading of relevant content will have a significant impact on our digitally interconnected society. Several aspects of our daily lives are undeniably linked to our social profiles, making us vulnerable to abusive behaviours. As a result, the lack of accurate hate speech detection mechanisms would severely degrade the overall user experience, although its erroneous operation would pose many ethical concerns. In this paper, we present ‘ETHOS’ (multi-labEl haTe speecH detectiOn dataSet), a textual dataset with two variants, binary and multi-label, based on YouTube and Reddit comments validated using the Figure-Eight crowdsourcing platform. Furthermore, we present the annotation protocol used to create this dataset: an active sampling procedure for balancing our data in relation to the various aspects defined. Our key assumption is that, even when gaining only a small amount of labelled data from such a time-consuming process, we can guarantee hate speech occurrences in the examined material.
