Aplikasi Text Mining untuk Klasterisasi Aduan Masyarakat Kota Semarang Menggunakan Algoritma K-means (Text Mining Application for Clustering Public Complaints in Semarang City Using the K-means Algorithm)

2021 ◽  
Vol 18 (2) ◽  
pp. 215
Author(s):  
Dita Afida ◽  
Erika Devi Udayanti ◽  
Etika Kartikadarma

<p>Social media strongly supports government activities, especially in promoting openness and community-based governance. One example is the Semarang City government, whose Center for Community Complaints Management (P3M) manages public complaints arriving through several communication channels, one of which is Twitter. The complaints received each day vary widely, which makes it difficult for managers to categorize complaint reports by the relevant Local Government Organization (OPD). This paper addresses how to cluster community complaints. The data come from Twitter, collected with the keyword "Laporhendi". The text of the complaint tweets was analyzed with text mining methods. Preprocessing consists of case folding, tokenizing, stemming, and stopword removal, followed by term weighting with tf-idf. The k-means algorithm is then used to divide the complaints into clusters. The clustering results are evaluated with purity to determine the accuracy of the grouping.</p>
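The steps named in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the stopword list and sample tweets are hypothetical, stemming is omitted, the tf-idf variant (raw count times log(N/df)) and the k-means initialization (first k documents) are choices made here for simplicity, and purity assumes that reference labels are available.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; the list used in the paper is not given.
STOPWORDS = {"di", "yang", "dan", "ke", "ini"}

def preprocess(text):
    """Case folding, tokenizing, and stopword removal (stemming omitted)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def tfidf(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n / df[term]) for term, count in Counter(doc).items()}
        for doc in docs
    ]

def to_dense(weights):
    """Turn sparse weight dicts into equal-length vectors over a shared vocabulary."""
    vocab = sorted({term for w in weights for term in w})
    return [[w.get(term, 0.0) for term in vocab] for w in weights]

def kmeans(points, k, iters=20):
    """Plain k-means; assumes len(points) >= k and seeds with the first k points."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def purity(cluster_ids, true_labels):
    """Fraction of documents that carry the majority label of their cluster."""
    by_cluster = {}
    for cid, label in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cid, Counter())[label] += 1
    majority_total = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return majority_total / len(true_labels)

# Hypothetical complaint tweets, for illustration only.
tweets = ["Jalan rusak di Semarang", "Lampu jalan mati", "Banjir di jalan", "Sampah menumpuk"]
clusters = kmeans(to_dense(tfidf([preprocess(t) for t in tweets])), k=2)
print(clusters)
```

Seeding with the first k documents keeps the sketch deterministic; a real run would more likely use random or k-means++ initialization and several restarts.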

Author(s):  
Nourah F. Bin Hathlian ◽  
Alaaeldin M. Hafez

The need for designing Arabic text mining systems for the use on social media posts is increasingly becoming a significant and attractive research area. It serves and enhances the knowledge needed in various domains. The main focus of this paper is to propose a novel framework combining sentiment analysis with subjective analysis on Arabic social media posts to determine whether people are interested or not interested in a defined subject. For those purposes, text classification methods—including preprocessing and machine learning mechanisms—are applied. Essentially, the performance of the framework is tested using Twitter as a data source, where possible volunteers on a certain subject are identified based on their posted tweets along with their subject-related information. Twitter is considered because of its popularity and its rich content from online microblogging services. The results obtained are very promising with an accuracy of 89%, thereby encouraging further research.


2022 ◽  
pp. 57-90
Author(s):  
Surabhi Verma ◽  
Ankit Kumar Jain

People regularly use social media to express their opinions about a wide variety of topics, goods, and services, which makes it a rich source for text mining and sentiment analysis. Sentiment analysis is a form of text analysis that determines the polarity (positive, negative, or neutral) of a text, document, paragraph, or clause. This chapter offers an overview of the subject by examining the algorithms proposed for sentiment analysis on Twitter and briefly explaining them. The authors also address related areas, including monitoring sentiment over time, regional variation in views, neutral tweet analysis, sarcasm detection, and various other tasks that have drawn researchers' attention. All the services used are briefly summarized. The key contribution of this survey is a taxonomy based on the proposed methods and a discussion of recent research developments in the area and related fields.


Author(s):  
Junzo Watada ◽  
Keisuke Aoki ◽  
Masahiro Kawano ◽  
Muhammad Suzuri Hitam ◽  
...  

The availability of multimedia text document information has spread interest in text mining among researchers. Text documents integrate numerical and linguistic data, making text mining interesting and challenging. We propose text mining based on a fuzzy quantification model and a fuzzy thesaurus. In text mining, we focus on: 1) breaking sentences in Japanese text down into words; 2) a fuzzy thesaurus for finding words that match keywords in text; and 3) fuzzy multivariate analysis to analyze semantic meaning in predefined case studies. We use a fuzzy thesaurus to translate words written in Chinese and Japanese characters into keywords. This speeds up processing without requiring a dictionary to separate words. Fuzzy multivariate analysis is used to analyze the processed data and to extract latent, mutually related structures in text data, i.e., to extract otherwise obscured knowledge. We apply dual scaling to mining library and Web page text information, and propose integrating the result into Kansei engineering for possible application in sales, marketing, and production.


2020 ◽  
Vol 16 (3) ◽  
pp. 273
Author(s):  
Nawang Indah Cahyaningrum ◽  
Danty Welmin Yoshida Fatima ◽  
Wisnu Adi Kusuma ◽  
Sekar Ayu Ramadhani ◽  
Muhammad Rizqi Destanto ◽  
...  

Twitter is a social media platform where users can share their responses to a phenomenon through tweets. This research used 5000 tweets in Bahasa Indonesia containing the keyword “RUU KUHP” (Draft Law of KUHP), posted from 16 to 22 September 2019. The tweets were processed in RStudio using sentiment analysis, one of the text mining methods. The research aims to classify Twitter users’ responses to RUU KUHP as negative, positive, or neutral sentiment. It also aims to identify the frequencies of topics related to RUU KUHP through visualization with bar plots and word clouds, and to find the words associated with the most frequent words. The results show that Twitter users’ responses to RUU KUHP tend to be neutral, meaning that they did not take sides between agreeing and disagreeing. The 10 most frequent words are kpk, tunda, dpr, pasal, kesal, jokowi, presiden, masuk, ya, and sahkan. The study also reports the other words associated with them, along with their probabilities.
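The word-frequency step described here can be illustrated with a short sketch. The study itself used RStudio; the Python below is only an analogous illustration, and the function name `top_words`, the tokenization rule, and the sample tweets are assumptions introduced for this example.

```python
import re
from collections import Counter

def top_words(tweets, n=10, stopwords=frozenset()):
    """Count lowercase word occurrences across tweets and return the n most frequent."""
    counts = Counter()
    for tweet in tweets:
        words = re.findall(r"[a-z]+", tweet.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts.most_common(n)

# Hypothetical tweets, for illustration only.
sample = ["Tunda RUU KUHP", "DPR tunda pasal itu", "Presiden minta tunda"]
print(top_words(sample, n=3))
```

The resulting (word, count) pairs are the kind of data that would feed a bar plot or word cloud.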


2021 ◽  
Author(s):  
Fei Shen ◽  
Wenting Yu ◽  
Chen Min ◽  
Qianying Ye ◽  
Chuanli Xia ◽  
...  

Text mining has been a dominant approach to extracting useful information from massive unstructured data online. But existing tools for Chinese word segmentation are not ideal for processing social media text data in Cantonese. This project developed CyberCan (https://github.com/shenfei1010/CyberCan), a lexicon of contemporary Cantonese based on more than 100 million pieces of internet texts. We compared the performance of CyberCan with existing Mandarin and Cantonese lexicons in terms of their word segmentation performance. Findings suggest that CyberCan outperforms all existing lexicons by a considerable margin.
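To show how a lexicon can drive word segmentation, here is a minimal sketch of forward maximum matching, a common dictionary-based technique. The abstract does not say which segmentation algorithm was used with CyberCan, so the algorithm choice, the function name `segment`, the `max_len` parameter, and the tiny lexicon are all assumptions made for illustration.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching: take the longest lexicon entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical three-entry lexicon, not taken from CyberCan.
toy_lexicon = {"佢哋", "去咗", "香港"}
print(segment("佢哋去咗香港", toy_lexicon))
```

With this approach, segmentation quality depends directly on lexicon coverage: any word missing from the lexicon is split into single characters.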


2020 ◽  
Vol 34 (5) ◽  
pp. 826-844 ◽  
Author(s):  
Louis Tay ◽  
Sang Eun Woo ◽  
Louis Hickman ◽  
Rachel M. Saef

In the age of big data, substantial research is now moving toward using digital footprints like social media text data to assess personality. Nevertheless, there are concerns and questions regarding the psychometric and validity evidence of such approaches. We seek to address this issue by focusing on social media text data and (i) conducting a review of psychometric validation efforts in social media text mining (SMTM) for personality assessment and discussing additional work that needs to be done; (ii) considering additional validity issues from the standpoint of reference (i.e. ‘ground truth’) and causality (i.e. how personality determines variations in scores derived from SMTM); and (iii) discussing the unique issues of generalizability when validating SMTM for personality assessment across different social media platforms and populations. In doing so, we explicate the key validity and validation issues that need to be considered as a field to advance SMTM for personality assessment, and, more generally, machine learning personality assessment methods. © 2020 European Association of Personality Psychology


2020 ◽  
pp. 1483-1495
Author(s):  
Nourah F. Bin Hathlian ◽  
Alaaeldin M. Hafez

The need for designing Arabic text mining systems for the use on social media posts is increasingly becoming a significant and attractive research area. It serves and enhances the knowledge needed in various domains. The main focus of this paper is to propose a novel framework combining sentiment analysis with subjective analysis on Arabic social media posts to determine whether people are interested or not interested in a defined subject. For those purposes, text classification methods—including preprocessing and machine learning mechanisms—are applied. Essentially, the performance of the framework is tested using Twitter as a data source, where possible volunteers on a certain subject are identified based on their posted tweets along with their subject-related information. Twitter is considered because of its popularity and its rich content from online microblogging services. The results obtained are very promising with an accuracy of 89%, thereby encouraging further research.


In stock market research, stock prediction is a challenging task because of its dynamic nature, which is closely tied to the wealth of a nation and to opinion about a stock. It is very difficult for investors to decide when to buy or sell a stock because stock data are noisy and chaotic. Stock prediction mostly depends on numerical data obtained from technical measures or on text data that data sources provide as sentiment. Changes in fundamental measures derived from the exchange rate, gold price, and crude oil price also determine the stock value. User-generated sentiment content available on social media such as Twitter and on news sites also plays an important role in determining the stock price. Most existing work uses only one kind of measure (technical, fundamental, or sentiment) to predict the stock price. The proposed method therefore employs combined measures derived from technical, fundamental, and sentiment data. Twitter and Money Control serve as the data sources for the opinion data used to predict the stock price. The results of the proposed system are compared with those of other methods using measures such as accuracy, sensitivity, specificity, Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE). The proposed methods for stock prediction are found to outperform the existing techniques.
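The evaluation measures named in this abstract have standard definitions, sketched below. The function names and sample values are hypothetical, and the abstract does not state how the classification measures were applied; the sketch assumes binary labels (for example, 1 for a price rise and 0 for a fall).

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent; actual values must be nonzero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def classification_metrics(actual, predicted):
    """Accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

# Hypothetical prices and direction labels, for illustration only.
print(rmse([100, 200], [110, 190]))
print(mape([100, 200], [110, 190]))
print(classification_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

RMSE is expressed in price units and penalizes large errors more heavily, while MAPE is scale-free, which is one reason the two are often reported together.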


2018 ◽  
Vol 9 (1) ◽  
pp. 18-28 ◽  
Author(s):  
Amir Karami ◽  
London S. Bennett ◽  
Xiaoyun He

Opinion polls have been the bridge between public opinion and politicians in elections. However, developing surveys to disclose people's feedback with respect to economic issues is limited, expensive, and time-consuming. In recent years, social media such as Twitter has enabled people to share their opinions regarding elections. Social media has provided a platform for collecting a large amount of social media data. This article proposes a computational public opinion mining approach to explore the discussion of economic issues in social media during an election. Current related studies use text mining methods independently for election analysis and election prediction; this research combines two text mining methods: sentiment analysis and topic modeling. The proposed approach has effectively been deployed on millions of tweets to analyze economic concerns of people during the 2012 US presidential election.


Author(s):  
Jonathan S. Lewis

Text mining presents an efficient, scalable method to separate signals and noise in large-scale text data, and therefore to effectively analyze open-ended survey responses as well as the tremendous amount of text that students, faculty, and staff produce through their interactions online. Traditional qualitative methods are impractical when working with these data, and text mining methods are consonant with current literature on thematic analysis. This chapter provides a tutorial for researchers new to this method, including a lengthy discussion of preprocessing tasks and knowledge extraction from both supervised and unsupervised activities, potential data sources, and the range of software (both proprietary and open-source) available to them. Examples are provided throughout the paper of text mining at work in two studies involving data collected from college students. Limitations of this method and implications for future research and policy are discussed.

