Lexicon-Based Indonesian Local Language Abusive Words Dictionary to Detect Hate Speech in Social Media

Author(s):  
Mardhiya Hayaty ◽  
Sumarni Adi ◽  
Anggit Dwi Hartanto

Background: Hate speech is an expression directed at a person or a group of people that conveys hatred and/or anger toward them. On social media, users are free to express themselves, including by writing harsh words and sharing them with groups of people, which can trigger division and conflict between groups. Several researchers have studied hate speech detection in social media using machine learning-based and lexicon-based approaches. The machine learning approach has a weakness: the manual labelling process, in which an annotator separates positive, negative, and neutral opinions, is time-consuming and tiring. Objective: This study aims to produce a dictionary of abusive words from local languages in Indonesia. A lexicon-based approach depends heavily on the language covered by the words in its dictionary. Indonesia has thousands of tribes with 2500 local languages, and 80% of the population of Indonesia uses local languages in communication, which makes detecting hate speech on social media a significant challenge. Methods: Surveys of abusive words were conducted using the proportionate stratified random sampling technique in 4 major tribes on the island of Java, namely Betawi, Sundanese, Javanese, and Madurese. Results: The experiment produced a dictionary of 250 abusive words from 4 major Indonesian tribes for detecting hate speech in Indonesian social media using the lexicon-based approach. Conclusion: A stratified random sampling technique was applied in 4 major Indonesian tribes to produce 250 abusive words for hate speech detection using the lexicon-based approach.
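As an illustration of the lexicon-based approach described in this abstract, the following is a minimal sketch of dictionary lookup over tokenized text. It is not the authors' implementation: the lexicon entries are neutral placeholder tokens (the study's 250-word dictionary is not reproduced here), and the names `ABUSIVE_LEXICON`, `tokenize`, `find_abusive_terms`, and `is_flagged` are hypothetical.

```python
import re

# Hypothetical placeholder lexicon mapping a term to its source language.
# The study's actual 250-word dictionary is not reproduced here, so
# neutral stand-in tokens are used instead of real abusive words.
ABUSIVE_LEXICON = {
    "badword_a": "Javanese",
    "badword_b": "Sundanese",
    "badword_c": "Betawi",
    "badword_d": "Madurese",
}

# Lowercase letters, digits, and underscores form a token.
TOKEN_PATTERN = re.compile(r"[a-z0-9_]+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def find_abusive_terms(text, lexicon=ABUSIVE_LEXICON):
    """Return (term, language) pairs for lexicon entries found in the text,
    in order of appearance."""
    return [(tok, lexicon[tok]) for tok in tokenize(text) if tok in lexicon]


def is_flagged(text, lexicon=ABUSIVE_LEXICON):
    """Flag the text if it contains at least one lexicon entry."""
    return bool(find_abusive_terms(text, lexicon))
```

A call such as `find_abusive_terms("You BADWORD_A, really!")` returns the matched term with its language label. A dictionary hit only shows that an abusive word is present; whether the post counts as hate speech would still depend on context that this sketch does not model.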

Jurnal Common ◽  
2020 ◽  
Vol 3 (2) ◽  
pp. 120-141
Author(s):  
Eddy Syarif

This research is based on the theory of social assessment, a part of communication theory that describes how individuals assess messages when reading, listening to, or responding to them. The research uses a quantitative approach with a survey method and focuses on the effect of hate speech in social media on the attitudes of youth in the Condet area, Jakarta. The main hypothesis was tested with path analysis, calculated in SPSS (Statistical Package for the Social Sciences) from Microsoft Excel spreadsheets. Data were obtained with a questionnaire given to a sample of 212 respondents selected through a stratified random sampling technique. The research hypothesis was rejected: hate speech on social media had no direct or indirect influence on the attitudes and opinions of youth in the Condet area of Jakarta. The affective aspect, which relates to a person's emotional outlook, showed no influence, and the conative aspect of hate speech acts showed no indirect effect either. Youth hold an unfavorable opinion of hate speech in social media, and the social media they visit most often have shifted from Facebook (FB), Twitter, and YouTube to WhatsApp (WA) and Instagram (IG).


2021 ◽  
Vol 13 (3) ◽  
pp. 80
Author(s):  
Lazaros Vrysis ◽  
Nikolaos Vryzas ◽  
Rigas Kotsakis ◽  
Theodora Saridou ◽  
Maria Matsiola ◽  
...  

Social media services make it possible for an increasing number of people to express their opinion publicly. In this context, large amounts of hateful comments are published daily. The PHARM project aims at monitoring and modeling hate speech against refugees and migrants in Greece, Italy, and Spain. To this end, a web interface for creating and querying a multi-source database of hate speech-related content is implemented and evaluated. The selected sources include Twitter, YouTube, and Facebook comments and posts, as well as comments and articles from a selected list of websites. The interface allows users to search the existing database, scrape social media using keywords, annotate records through a dedicated platform, and contribute new content to the database. Furthermore, functionality for hate speech detection and sentiment analysis of texts is provided, making use of novel methods and machine learning models. The interface can be accessed online through a graphical user interface compatible with modern internet browsers. To evaluate the interface, a multifactor questionnaire was formulated to record the users’ opinions about the web interface and the corresponding functionality.


2021 ◽  
Vol 11 (18) ◽  
pp. 8575
Author(s):  
Sudhir Kumar Mohapatra ◽  
Srinivas Prasad ◽  
Dwiti Krishna Bebarta ◽  
Tapan Kumar Das ◽  
Kathiravan Srinivasan ◽  
...  

Hate speech on social media may spread quickly among online users and may subsequently escalate into local violence and heinous crimes. This paper proposes a hate speech detection model based on machine learning and text mining feature extraction techniques. The authors collected English-Odia code-mixed hate speech data from a public Facebook page and manually organized it into three classes. To build both binary and ternary datasets, the data were additionally converted into binary classes. The modeling combines a machine learning algorithm with feature extraction. Support vector machine (SVM), naïve Bayes (NB), and random forest (RF) models were trained using the whole dataset, with features based on word unigrams, bigrams, trigrams, combined n-grams, term frequency-inverse document frequency (TF-IDF), combined n-grams weighted by TF-IDF, and word2vec for both datasets. Using the two datasets, the authors developed two kinds of models for each feature: binary models and ternary models. The models based on SVM with word2vec performed better than the NB and RF models in both the binary and ternary categories. The results show that the ternary models produced less confusion between hate and non-hate speech than the binary models.
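To make the "combined n-grams weighted by TF-IDF" feature concrete, here is a minimal standard-library sketch. It is an assumption-laden illustration, not the paper's pipeline: the function names `word_ngrams` and `tfidf_vectors` are hypothetical, tokenization is plain whitespace splitting, and the weighting uses the unsmoothed textbook form tf × log(N / df), whereas library implementations often apply smoothing and normalization.

```python
import math
from collections import Counter


def word_ngrams(tokens, n_min=1, n_max=3):
    """Return all word n-grams of length n_min..n_max as space-joined strings."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(documents, n_min=1, n_max=3):
    """Map each document to a sparse {ngram: weight} dict.

    Weight = (count / total n-grams in the document) * log(N / df),
    where df is the number of documents containing the n-gram.
    """
    doc_grams = [
        Counter(word_ngrams(doc.lower().split(), n_min, n_max))
        for doc in documents
    ]
    n_docs = len(documents)

    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter()
    for grams in doc_grams:
        doc_freq.update(grams.keys())

    vectors = []
    for grams in doc_grams:
        total = sum(grams.values())
        vectors.append({
            gram: (count / total) * math.log(n_docs / doc_freq[gram])
            for gram, count in grams.items()
        })
    return vectors
```

With this weighting, an n-gram that occurs in every document receives weight zero, while n-grams concentrated in few documents are weighted up. The resulting sparse vectors are the kind of input that a classifier such as an SVM, naïve Bayes, or random forest model would then be trained on.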


Author(s):  
Rizqa Harmiliya ◽  
Mulawarman Mulawarman ◽  
Eko Nusantoro

This study aims to determine the relationship between peer social relation patterns and social media use among junior high school students, both partially and jointly. The study uses a descriptive quantitative correlational design. The sample comprised 213 students drawn from a population of 542 using the proportionate stratified random sampling technique. Data were collected with a peer social relation pattern scale and a social media use questionnaire, with reliabilities of 0.837 and 0.886, respectively. The data were analyzed using product-moment correlation. The results show a significant relationship between peer social relation patterns and social media use (r = 0.221; p < 0.05). It can therefore be concluded that there is a significant relationship between peer social relation patterns and social media use: the higher the social media use, the higher the students' peer social relation pattern.
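The reported significance can be checked with the standard t-test for a Pearson product-moment correlation, t = r·sqrt((n − 2) / (1 − r²)) with n − 2 degrees of freedom. The sketch below applies it to the values in this abstract (r = 0.221, n = 213); the function name `pearson_t_statistic` is hypothetical, and the critical value of about 1.97 for a two-tailed test at the 0.05 level with 211 degrees of freedom is taken from standard t tables rather than from the study.

```python
import math


def pearson_t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation r from n pairs
    differs from zero; it has n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Values reported in the abstract.
t_value = pearson_t_statistic(0.221, 213)

# Approximate two-tailed critical value at alpha = 0.05 for df = 211
# (from standard t tables; an assumption, not a figure from the study).
T_CRITICAL = 1.97

print(f"t = {t_value:.2f}, significant at 0.05: {abs(t_value) > T_CRITICAL}")
```

The statistic comes out at roughly 3.29, above the critical value, which is consistent with the abstract's p < 0.05. The correlation itself is weak, so significance here reflects the sample size as much as the strength of the relationship.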


2019 ◽  
Vol 6 (2) ◽  
pp. 129-140
Author(s):  
Marianus Mantovanny Tapung ◽  
Ambros Leonangung Edu ◽  
Petrus Redy Partus Jaya

Abstract: This study aims to describe the media ability and critical power of students in Manggarai Regency. Media ability is measured by the following indicators: the types of social media used most often, the ability to operate social media, the media content sought most often, critical power toward social media content, and the social media content shared most often. The research was a descriptive cross-sectional study. The respondents were 353 students selected using the proportional stratified random sampling technique. Data were collected using an online questionnaire. To ensure the credibility of the data, the researchers applied the one vote method and cross-checked responses through interviews by mobile phone. Data are presented as tables and graphs. The results show that the social media used most by students are Facebook and WhatsApp. Students operate both of these themselves, and most are able to do so proficiently. However, many students use the media to access content that does not support their intellectual knowledge and insight. Media use is limited to building social relations or friendships among students. This habit affects students' critical power in distinguishing hoax content from factual content. The results illustrate a discrepancy between students' ability to operate media and their critical power. This condition should be a concern for educators and stakeholders in Manggarai Regency. Keywords: Media, Critical Power, Manggarai Flores Students
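Proportional (proportionate) stratified random sampling, the technique named in this abstract, gives each stratum a share of the sample equal to its share of the population and then draws at random within each stratum. The following is a minimal sketch under stated assumptions: the abstract does not report the strata or their sizes, so the strata in the usage note are hypothetical, as are the function names `proportionate_allocation` and `stratified_sample`. Fractional shares are resolved with largest-remainder rounding, which is one common choice among several.

```python
import random


def proportionate_allocation(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes.

    Fractional shares are floored, and the leftover units go to the
    strata with the largest remainders so the total is exact.
    """
    total = sum(strata_sizes.values())
    if not 0 <= sample_size <= total:
        raise ValueError("sample_size must be between 0 and the population size")
    exact = {name: size * sample_size / total for name, size in strata_sizes.items()}
    allocation = {name: int(share) for name, share in exact.items()}
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - allocation[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


def stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample without replacement inside each stratum.

    strata maps a stratum name to a list of its members.
    """
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportionate_allocation(sizes, sample_size)
    return {
        name: rng.sample(members, allocation[name])
        for name, members in strata.items()
    }
```

For example, with hypothetical strata of 500, 300, and 200 students and the 353 respondents reported above, `proportionate_allocation({"A": 500, "B": 300, "C": 200}, 353)` assigns 176, 106, and 71 respondents respectively.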

