PENGARUH STEMMING TERHADAP EKSTRAKSI TOPIK MENGGUNAKAN METODE TF*IDF*DF PADA APLIKASI PDS (The Effect of Stemming on Topic Extraction Using the TF*IDF*DF Method in the PDS Application)

2017 ◽  
Vol 2 (1) ◽  
Author(s):  
Luthfan Hadi Pramono ◽  
Cuk Subiyantoro

Personal Digital Secretary (PDS) is a system developed to act as a "personal secretary" that works alongside users digitally. PDS delivers information to users in the form of email, social media, and news. To select relevant news from outside sources, PDS extracts user topics from email and social media, so that the news it delivers corresponds to the user's interests. Topic extraction in PDS uses a modified weighting method based on the TF*IDF algorithm, named TF*IDF*DF. In further development, a stemming process was added in the hope of obtaining more appropriate topics. The research shows that the terms obtained from topic extraction differ depending on whether stemming is applied. News retrieved using topics extracted with stemming is more focused than news retrieved using topics extracted without it. With stemming added to the TF*IDF*DF algorithm, the extracted terms are reduced to their base words, and these base forms serve as indicators of a topic. Keywords: User topic, topic extraction, TF*IDF, topic model, feature selection.
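The effect of stemming on this kind of weighting can be illustrated with a minimal sketch. The abstract does not give the exact TF*IDF*DF formula, so the definitions here are assumptions: `tf` is a term's total count across a user's messages, `df` is the number of messages containing it, and `idf` is the smoothed value `log(1 + N/df)`. The `toy_stem` suffix stripper is a hypothetical stand-in for a real stemmer and is not the one used in PDS.

```python
import math
from collections import Counter


def toy_stem(word):
    """Hypothetical one-pass suffix stripper, for illustration only."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tf_idf_df(docs, stem=None):
    """Score each term as tf * idf * df over a list of tokenized documents.

    Assumed definitions (not taken from the paper):
      tf  = total occurrences of the term across all documents
      df  = number of documents that contain the term
      idf = log(1 + N / df), with N the number of documents
    """
    if stem is not None:
        docs = [[stem(token) for token in doc] for doc in docs]
    n_docs = len(docs)
    tf = Counter(token for doc in docs for token in doc)
    df = Counter(token for doc in docs for token in set(doc))
    return {t: tf[t] * math.log(1 + n_docs / df[t]) * df[t] for t in tf}


messages = [
    ["reports", "due", "friday"],
    ["reporting", "budget"],
    ["report", "budget", "friday"],
]

plain = tf_idf_df(messages)
stemmed = tf_idf_df(messages, stem=toy_stem)

# Without stemming, "report", "reports" and "reporting" are scored separately;
# with stemming they merge into the single base form "report".
top_plain = max(plain, key=plain.get)
top_stemmed = max(stemmed, key=stemmed.get)
```

Under these assumptions, the three inflected forms each appear in only one message before stemming, so none of them ranks first; after stemming, the merged base word has the highest weight, which is consistent with the abstract's observation that stemmed terms give a more focused topic.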

2020 ◽  
Vol 14 (02) ◽  
pp. 273-293
Author(s):  
Yingcheng Sun ◽  
Richard Kolacinski ◽  
Kenneth Loparo

With the explosive growth of online discussions published every day on social media platforms, comprehension and discovery of the most popular topics have become a challenging problem. Conventional topic models have had limited success in online discussions because the corpus is extremely sparse and noisy. To overcome their limitations, we use the discussion thread tree structure and propose a “popularity” metric to quantify the number of replies to a comment to extend the frequency of word occurrences, and the “transitivity” concept to characterize topic dependency among nodes in a nested discussion thread. We build a Conversational Structure Aware Topic Model (CSATM) based on popularity and transitivity to infer topics and their assignments to comments. Experiments on real forum datasets are used to demonstrate improved performance for topic extraction with six different measurements of coherence and impressive accuracy for topic assignments.
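The idea of using reply counts to extend word frequencies can be sketched as follows. This is not the CSATM model itself: the weighting rule (each word in a comment counts `1 + number of direct replies` times) and the thread representation are assumptions chosen for illustration, and the transitivity component is omitted.

```python
from collections import Counter


def popularity_weighted_counts(comments):
    """Return word counts in which each comment is weighted by its popularity.

    comments: dict mapping comment id -> (parent id or None, list of tokens).
    Assumed weighting (illustrative only): weight = 1 + number of direct replies.
    """
    replies = Counter(
        parent for parent, _ in comments.values() if parent is not None
    )
    weighted = Counter()
    for comment_id, (_, tokens) in comments.items():
        weight = 1 + replies[comment_id]  # Counter returns 0 for unseen ids
        for token in tokens:
            weighted[token] += weight
    return weighted


# A small hypothetical thread: c2 and c3 reply to c1, and c4 replies to c2.
thread = {
    "c1": (None, ["battery", "life"]),
    "c2": ("c1", ["battery", "drain"]),
    "c3": ("c1", ["screen"]),
    "c4": ("c2", ["drain"]),
}

counts = popularity_weighted_counts(thread)
```

In this toy thread, "battery" occurs in the two most-replied-to comments and ends up with the largest weighted count, even though "drain" occurs just as often in raw terms. That is the sense in which popularity can compensate for sparse word occurrences in short comments.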


2017 ◽  
Vol 35 (4) ◽  
pp. 770-782 ◽  
Author(s):  
Qingqing Zhou ◽  
Chengzhi Zhang

Purpose The development of social media has led to large numbers of internet users now producing massive amounts of user-generated content (UGC). UGC, which shows users’ opinions about events directly, is valuable for monitoring public opinion. Current research has focused on analysing topic evolutions in UGC. However, few studies pay attention to emotion evolutions of sub-topics about popular events. Important details about users’ opinions might be missed, as users’ emotions are ignored. This paper aims to extract sub-topics about a popular event from UGC and investigate the emotion evolutions of each sub-topic. Design/methodology/approach This paper first collects UGC about a popular event as experimental data and conducts subjectivity classification on the data to get a subjective corpus. Second, the subjective corpus is classified into different emotion categories using supervised emotion classification. Meanwhile, a topic model is used to extract sub-topics about the event from the subjective corpus. Finally, the authors use the results of emotion classification and sub-topic extraction to analyse emotion evolutions over time. Findings Experimental results show that specific primary emotions exist in each sub-topic and evolve differently. Moreover, the authors find that the performance of the emotion classifier is optimal with term frequency and relevance frequency as the feature-weighting method. Originality/value To the best of the authors’ knowledge, this is the first research to mine emotion evolutions of sub-topics about an event with UGC. It mines users’ opinions about sub-topics of an event, which may offer more details that are useful for analysing users’ emotions in preparation for decision-making.
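The term frequency and relevance frequency weighting named in the findings can be sketched as below. The abstract does not state which variant the authors used; this sketch assumes the commonly cited form `rf = log2(2 + a / max(1, c))`, where `a` and `c` are the numbers of positive-class and negative-class documents containing the term. The function names and the sample documents are hypothetical.

```python
import math
from collections import Counter


def relevance_frequency(term, pos_docs, neg_docs):
    """rf = log2(2 + a / max(1, c)) for one target class.

    a: number of positive-class documents containing the term
    c: number of negative-class documents containing the term
    """
    a = sum(term in doc for doc in pos_docs)
    c = sum(term in doc for doc in neg_docs)
    return math.log2(2 + a / max(1, c))


def tf_rf(tokens, pos_docs, neg_docs):
    """Weight each term in one document by term frequency times rf."""
    tf = Counter(tokens)
    return {
        term: count * relevance_frequency(term, pos_docs, neg_docs)
        for term, count in tf.items()
    }


# Hypothetical training documents, stored as sets of terms, for one emotion
# class ("anger") against all other classes.
anger_docs = [{"angry", "delay"}, {"angry", "refund"}]
other_docs = [{"delay", "thanks"}]

weights = tf_rf(["angry", "angry", "delay"], anger_docs, other_docs)
```

A term concentrated in the target emotion class ("angry") receives a higher rf than one spread across classes ("delay"), so the resulting features favour words that discriminate between emotions rather than words that are merely rare.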


2021 ◽  
Vol 2 (2) ◽  
pp. 1-31
Author(s):  
Esteban A. Ríssola ◽  
David E. Losada ◽  
Fabio Crestani

Mental state assessment by analysing user-generated content is a field that has recently attracted considerable attention. Today, many people are increasingly utilising online social media platforms to share their feelings and moods. This provides a unique opportunity for researchers and health practitioners to proactively identify linguistic markers or patterns that correlate with mental disorders such as depression, schizophrenia or suicide behaviour. This survey describes and reviews the approaches that have been proposed for mental state assessment and identification of disorders using online digital records. The presented studies are organised according to the assessment technology and the feature extraction process conducted. We also present a series of studies which explore different aspects of the language and behaviour of individuals suffering from mental disorders, and discuss various aspects related to the development of experimental frameworks. Furthermore, ethical considerations regarding the treatment of individuals’ data are outlined. The main contributions of this survey are a comprehensive analysis of the proposed approaches for online mental state assessment on social media, a structured categorisation of the methods according to their design principles, lessons learnt over the years and a discussion on possible avenues for future research.


2021 ◽  
Vol 10 (7) ◽  
pp. 474
Author(s):  
Bingqing Wang ◽  
Bin Meng ◽  
Juan Wang ◽  
Siyu Chen ◽  
Jian Liu

Social media data contains information expressed in real time, including text and geographical location. As a new data source for crowd behavior research in the era of big data, it can reflect some aspects of residents’ behavior. In this study, a text classification model based on BERT and the Transformers framework was constructed and used to classify and extract more than 210,000 residents’ festival activities from 1.13 million Sina Weibo (Chinese “Twitter”) posts collected in Beijing in 2019. On this basis, word frequency statistics, part-of-speech analysis, a topic model, sentiment analysis, and other methods were used to perceive different types of festival activities and quantitatively analyze the spatial differences of different types of festivals. The results show that traditional culture significantly influences residents’ festivals, reflecting residents’ motivation to participate in festivals and how residents participate in festivals and express their emotions. There are apparent spatial differences among residents in participating in festival activities. The main festival activities are distributed in the central area within the Fifth Ring Road in Beijing. In contrast, expressions of feeling during festivals are mainly distributed outside the Fifth Ring Road. The research integrates natural language processing technology, topic model analysis, spatial statistical analysis, and other technologies. It also broadens the application of social media data, especially text data, providing a new research paradigm for studying residents’ festival activities that incorporates residents’ perception of festivals. The research results provide a basis for the design and management of the Chinese festival system.


2021 ◽  
Vol 19 (7) ◽  
pp. 59-82
Author(s):  
Md Ashraf Ahmed, PhD Candidate ◽  
Arif Mohaimin Sadri, PhD ◽  
M. Hadi Amini, PhD, DEng

Risk perception and risk-averting behaviors of public agencies during the emergence and spread of COVID-19 can be retrieved through online social media (Twitter), and such interactions can be echoed in other information outlets. This study collected time-sensitive online social media data and analyzed patterns of health risk communication of public health and emergency agencies during the emergence and spread of the novel coronavirus using data-driven methods. The major focus is on understanding how policy-making agencies communicate risk and response information through social media during a pandemic and influence community response (i.e., timing of lockdown, timing of reopening, etc.) and disease outbreak indicators (i.e., number of confirmed cases and number of deaths). Twitter data from six major public organizations (1,000-4,500 tweets per organization) were collected from February 21, 2020 to June 6, 2020. Several machine learning algorithms, including a dynamic topic model and sentiment analysis, were applied over time to identify the topic dynamics over the specific timeline of the pandemic. Organizations emphasized various topics at different points in the timeline, e.g., the importance of wearing face masks, home quarantine, understanding the symptoms, social distancing and contact tracing, emerging community transmission, lack of personal protective equipment, COVID-19 testing and medical supplies, the effect of tobacco, pandemic stress management, increasing hospitalization rate, the upcoming hurricane season, use of convalescent plasma for COVID-19 treatment, maintaining hygiene, and the role of healthcare podcasts. The findings can help emergency management, policymakers, and public health agencies identify targeted information dissemination policies for a public with diverse needs, based on how local, federal, and international agencies reacted to COVID-19.


2014 ◽  
Vol 24 (2) ◽  
pp. 181-204 ◽  
Author(s):  
Jeff McCarthy ◽  
Jennifer Rowley ◽  
Catherine Jane Ashworth ◽  
Elke Pioch

Purpose – The purpose of this paper is to contribute knowledge on the issues and benefits associated with managing brand presence and relationships through social media. UK football clubs are big businesses, with committed communities of fans, so are an ideal context from which to develop an understanding of the issues and challenges facing organisations as they seek to protect and promote their brand online. Design/methodology/approach – Due to the emergent nature of social media, and the criticality of the relationships between clubs and their fans, an exploratory study using a multiple case study approach was used to gather rich insights into the phenomenon. Findings – Clubs agreed that further development of social media strategies had potential to deliver interaction and engagement, community growth and belonging, traffic flow to official web sites and commercial gain. However, in developing their social media strategies they had two key concerns. The first concern was the control of the brand presence and image in social media, and how to respond to the opportunities that social media present to fans to impact on the brand. The second concern was how to strike an appropriate balance between strategies that deliver short-term revenue, and those that build longer term brand loyalty. Originality/value – This research is the first to offer insights into the issues facing organisations when developing their social media strategy.


Author(s):  
Nadejda Komendantova ◽  
Love Ekenberg ◽  
Mattias Svahn ◽  
Aron Larsson ◽  
Syed Iftikhar Hussain Shah ◽  
...  

Abstract Misinformation in social media is an actual and contested policy problem given its outreach and the variety of stakeholders involved. In particular, increased social media use makes the spread of misinformation almost universal. Here we demonstrate a framework for evaluating tools for detecting misinformation using a preference elicitation approach, as well as an integrated decision analytic process for evaluating desirable features of systems for combatting misinformation. The framework was tested in three countries (Austria, Greece, and Sweden) with three groups of stakeholders (policymakers, journalists, and citizens). Multi-criteria decision analysis was the methodological basis for the research. The results showed that participants prioritised information regarding the actors behind the distribution of misinformation and tracing the life cycle of misinformative posts. Another important criterion was whether someone intended to delude others, which shows a preference for trust, accountability, and quality in, for instance, journalism. Also, how misinformation travels is important. However, all criteria that involved active contributions to dealing with misinformation were ranked low in importance, which shows that participants may not have felt personally involved enough in the subject or situation. The results also show differences in preferences for tools that are influenced by cultural background and that might be considered in the further development of tools.


KALPATARU ◽  
2018 ◽  
Vol 27 (1) ◽  
pp. 31
Author(s):  
Marlon Ririmasse

Abstract. Social media has become a medium that touches almost every aspect of human life, from information technology to culture, including the discipline of archaeology. For more than two decades, social media has been not only an informal space for meeting and exchanging ideas but also an effective engine driving the academic dynamics of archaeology, including as an agent of interaction between archaeology and the public. Social media serves as one of the most effective spaces for broadening public knowledge of archaeology, including in Maluku. This paper examines the relationship between archaeology and social media in support of expanding knowledge of archaeology and cultural history for the public in Maluku. The method used is a literature study. The results indicate that social media has become one of the main agents in the publication of archaeological knowledge in Maluku and holds strong prospects for further development. Keywords: Archaeology, public, social media, Maluku


2021 ◽  
pp. 1-10
Author(s):  
Wang Gao ◽  
Hongtao Deng ◽  
Xun Zhu ◽  
Yuan Fang

Harmful information identification is a critical research topic in natural language processing. Existing approaches have been focused either on rule-based methods or harmful text identification of normal documents. In this paper, we propose a BERT-based model to identify harmful information from social media, called Topic-BERT. Firstly, Topic-BERT utilizes BERT to take additional information as input to alleviate the sparseness of short texts. The GPU-DMM topic model is used to capture hidden topics of short texts for attention weight calculation. Secondly, the proposed model divides harmful short text identification into two stages, and different granularity labels are identified by two similar sub-models. Finally, we conduct extensive experiments on a real-world social media dataset to evaluate our model. Experimental results demonstrate that our model can significantly improve the classification performance compared with baseline methods.


2019 ◽  
Vol 1 (1) ◽  
pp. 45-78
Author(s):  
Chankyung Pak

Abstract To disseminate their stories efficiently via social media, news organizations make decisions that resemble traditional editorial decisions. However, the decisions for social media may deviate from traditional ones because they are often made outside the newsroom and guided by audience metrics. This study focuses on selective link sharing as quasi-gatekeeping on Twitter ‐ conditioning a link sharing decision about news content. It illustrates how selective link sharing resembles and deviates from gatekeeping for the publication of news stories. Using a computational data collection method and a machine learning technique called Structural Topic Model (STM), this study shows that selective link sharing generates a different topic distribution between news websites and Twitter and thus significantly revokes the specialty of news organizations. This finding implies that emergent logic, which governs news organizations’ decisions for social media, can undermine the provision of diverse news.

