Daily and hourly mood pattern discovery of Turkish Twitter users

2015 ◽  
Vol 5 (2) ◽  
pp. 90
Author(s):  
Mete Celik ◽  
Ahmet Sakir Dokuz

The massive growth of data-related applications and the widespread use of web technologies have started the big data era. Social media is one of the major sources of big data. Mining social media data gives companies and organizations useful insights for developing their services, products, or organizations. This study analyzes Turkish Twitter users based on their daily and hourly social media posts, so that the daily and hourly mood patterns of Turkish social media users can be revealed as positive or negative. For this purpose, the Support Vector Machines (SVM) classification algorithm and the Term Frequency-Inverse Document Frequency (TF-IDF) term weighting technique were used. To the best of our knowledge, this is the first attempt to analyze all of people's posts on social media and to generate results for temporal indicators at the macro and micro levels.

Keywords: big data, social media, text classification, SVM, TF-IDF term weighting, daily and hourly mood patterns.
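As a rough illustration of the two building blocks named in the abstract above, the following sketch computes TF-IDF weights for tokenized posts and aggregates already-classified posts by hour of day. It is not the authors' implementation: the function names, the `"pos"`/`"neg"` labels, and the particular TF-IDF variant (`tf = count / len(doc)`, `idf = log(N / df)`) are assumptions chosen for illustration, and the SVM step itself is left out (labels are assumed to come from some classifier).

```python
import math
from collections import Counter, defaultdict
from datetime import datetime


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = log(N / df).
    Other smoothing and normalization schemes are common.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def positive_share_by_hour(labelled_posts):
    """Map hour of day (0-23) to the fraction of posts labelled 'pos'.

    labelled_posts: iterable of (datetime, label) pairs, where the label
    is assumed to be the output of a sentiment classifier.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for timestamp, label in labelled_posts:
        totals[timestamp.hour] += 1
        if label == "pos":
            positives[timestamp.hour] += 1
    return {hour: positives[hour] / totals[hour] for hour in sorted(totals)}
```

Grouping by `timestamp.weekday()` instead of `timestamp.hour` would give the corresponding daily pattern under the same assumptions.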

2021 ◽  
Author(s):  
Steven F. Lehrer ◽  
Tian Xie

There exists significant hype regarding how much machine learning and incorporating social media data can improve forecast accuracy in commercial applications. To assess if the hype is warranted, we use data from the film industry in simulation experiments that contrast econometric approaches with tools from the predictive analytics literature. Further, we propose new strategies that combine elements from each literature in a bid to capture richer patterns of heterogeneity in the underlying relationship governing revenue. Our results demonstrate the importance of social media data and value from hybrid strategies that combine econometrics and machine learning when conducting forecasts with new big data sources. Specifically, although both least squares support vector regression and recursive partitioning strategies greatly outperform dimension reduction strategies and traditional econometrics approaches in forecast accuracy, there are further significant gains from using hybrid approaches. Further, Monte Carlo experiments demonstrate that these benefits arise from the significant heterogeneity in how social media measures and other film characteristics influence box office outcomes. This paper was accepted by J. George Shanthikumar, big data analytics.


JAMIA Open ◽  
2021 ◽  
Vol 4 (2) ◽  
Author(s):  
Yuan-Chi Yang ◽  
Mohammed Ali Al-Garadi ◽  
Jennifer S Love ◽  
Jeanmarie Perrone ◽  
Abeed Sarker

Abstract
Objective: Biomedical research involving social media data is gradually moving from population-level to targeted, cohort-level data analysis. Though crucial for biomedical studies, social media users' demographic information (eg, gender) is often not explicitly known from profiles. Here, we present an automatic gender classification system for social media and we illustrate how gender information can be incorporated into a social media-based health-related study.
Materials and Methods: We used a large Twitter dataset composed of public, gender-labeled users (Dataset-1) for training and evaluating the gender detection pipeline. We experimented with machine learning algorithms including support vector machines (SVMs) and deep-learning models, and public packages including M3. We considered users' information including profile and tweets for classification. We also developed a meta-classifier ensemble that strategically uses the predicted scores from the classifiers. We then applied the best-performing pipeline to Twitter users who have self-reported nonmedical use of prescription medications (Dataset-2) to assess the system's utility.
Results and Discussion: We collected 67 181 and 176 683 users for Dataset-1 and Dataset-2, respectively. A meta-classifier involving SVM and M3 performed the best (Dataset-1 accuracy: 94.4% [95% confidence interval: 94.0–94.8%]; Dataset-2: 94.4% [95% confidence interval: 92.0–96.6%]). Including automatically classified information in the analyses of Dataset-2 revealed gender-specific trends: the proportions of females closely resemble data from the National Survey of Drug Use and Health 2018 (tranquilizers: 0.50 vs 0.50; stimulants: 0.50 vs 0.45) and data on overdose emergency room visits due to opioids from the Nationwide Emergency Department Sample (pain relievers: 0.38 vs 0.37).
Conclusion: Our publicly available, automated gender detection pipeline may aid cohort-specific social media data analyses (https://bitbucket.org/sarkerlab/gender-detection-for-public).
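The abstract above does not say how its meta-classifier combines the predicted scores, so the following is only one simple possibility, sketched under stated assumptions: a weighted average of per-classifier probabilities followed by a threshold. The function name `combine_scores`, the classifier keys, the weights, and the 0.5 threshold are all hypothetical and are not taken from the published pipeline.

```python
def combine_scores(scores, weights=None, threshold=0.5):
    """Weighted-average per-classifier scores for one user, then threshold.

    scores: dict mapping a classifier name to its predicted probability
    that the user belongs to the positive class (assumed to lie in [0, 1]).
    weights: optional dict with one weight per classifier; equal weights
    are used when omitted.
    Returns (combined_score, predicted_positive).
    """
    if not scores:
        raise ValueError("at least one classifier score is required")
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    combined = sum(scores[name] * weights[name] for name in scores) / total_weight
    return combined, combined >= threshold
```

For example, `combine_scores({"svm": 0.8, "m3": 0.6})` averages the two scores with equal weight. A learned meta-classifier would typically fit the weights on held-out data rather than fix them by hand.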


Author(s):  
Philip Habel ◽  
Yannis Theocharis

In the last decade, big data, and social media in particular, have seen increased popularity among citizens, organizations, politicians, and other elites, which in turn has created new and promising avenues for scholars studying long-standing questions of communication flows and influence. Studies of social media play a prominent role in our evolving understanding of the supply and demand sides of the political process, including the novel strategies adopted by elites to persuade and mobilize publics, as well as the ways in which citizens react, interact with elites and others, and utilize platforms to persuade audiences. While recognizing some challenges, this chapter speaks to the myriad of opportunities that social media data afford for evaluating questions of mobilization and persuasion, ultimately bringing us closer to a more complete understanding of Lasswell’s (1948) famous maxim: “who, says what, in which channel, to whom, [and] with what effect.”


2018 ◽  
Vol 03 (03) ◽  
pp. 1850003 ◽  
Author(s):  
Jared Oliverio

Big Data is a very popular term today. Everywhere you turn, companies and organizations are talking about their Big Data solutions and analytic applications. The source of the data used in these applications varies. However, one type of data is of great interest to most organizations: social media data. Social media applications are used by a large percentage of the world’s population. The ability to instantly connect with and reach other people and companies across distances is an important part of today’s society. Social media applications allow users to share comments, opinions, ideas, and media with friends, family, businesses, and organizations. The data contained in these comments, ideas, and media are valuable to many types of organizations. Through data mining and analysis, it is possible to predict specific behavior in users of the applications. Currently, several technologies aid in collecting, analyzing, and displaying this data. These technologies allow users to apply this data to solve different problems in different organizations, including those in the finance, medicine, environmental, education, and advertising industries. This paper aims to highlight the current technologies used in mining and analyzing social media data, the industries using this data, and the future of this field.


2018 ◽  
Vol 5 (2) ◽  
pp. 205395171880773 ◽  
Author(s):  
Cheryl Cooky ◽  
Jasmine R Linabary ◽  
Danielle J Corple

Social media offers an attractive site for Big Data research. Access to big social media data, however, is controlled by companies that privilege corporate, governmental, and private research firms. Additionally, Institutional Review Boards’ regulative practices and slow adaptation to emerging ethical dilemmas in online contexts create challenges for Big Data researchers. We examine these challenges in the context of a feminist qualitative Big Data analysis of the hashtag event #WhyIStayed. We argue that power, context, and subjugated knowledges must each be central considerations in conducting Big Data social media research. In doing so, this paper offers a feminist practice of holistic reflexivity in order to help social media researchers navigate and negotiate this terrain.


2016 ◽  
Author(s):  
Jonathan Mellon

This chapter discusses the use of large quantities of incidentally collected data (ICD) to make inferences about politics. This type of data is sometimes referred to as “big data” but I avoid this term because of its conflicting definitions (Monroe, 2012; Ward & Barker, 2013). ICD is data that was created or collected primarily for a purpose other than analysis. Within this broad definition, this chapter focuses particularly on data generated through user interactions with websites. While ICD has been around for at least half a century, the Internet greatly expanded the availability and reduced the cost of ICD. Examples of ICD include data on Internet searches, social media data, and user data from civic platforms. This chapter briefly explains some sources and uses of ICD and then discusses some of the potential issues of analysis and interpretation that arise when using ICD, including the different approaches to inference that researchers can use.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Fengjun Tian ◽  
Yang Yang ◽  
Zhenxing Mao ◽  
Wenyue Tang

Purpose: This paper aims to compare the forecasting performance of different models with and without big data predictors from search engines and social media.
Design/methodology/approach: Using daily tourist arrival data for Mount Longhu, China in 2018 and 2019, the authors estimated ARMA, ARMAX, Markov-switching auto-regression (MSAR), lasso, elastic net, post-lasso and post-elastic net models to conduct one- to seven-days-ahead forecasting. Search engine data and social media data from WeChat, Douyin and Weibo were incorporated to improve forecasting accuracy.
Findings: Results show that search engine data can substantially reduce forecasting error, whereas social media data has very limited value. Compared to the ARMAX/MSAR model without big data predictors, the corresponding post-lasso model reduced forecasting error by 39.29% based on mean square percentage error, 33.95% based on root mean square percentage error, 46.96% based on root mean squared error and 45.67% based on mean absolute scaled error.
Practical implications: Results highlight the importance of incorporating big data predictors into daily demand forecasting for tourism attractions.
Originality/value: This study represents a pioneering attempt to apply regularized regression (e.g. lasso and elastic net models) in tourism forecasting and to explore various daily big data indicators across platforms as predictors.
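The four error measures named in the findings above can be sketched as follows. These are common textbook definitions, not necessarily the exact formulas the authors used: percentage errors are expressed as fractions rather than multiplied by 100, and the mean absolute scaled error is scaled by the in-sample error of a one-step naive forecast. The function names and the `training` argument are assumptions made for illustration.

```python
import math


def mspe(actual, forecast):
    """Mean square percentage error, with errors as fractions of the actual value."""
    return sum(((a - f) / a) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def rmspe(actual, forecast):
    """Root mean square percentage error."""
    return math.sqrt(mspe(actual, forecast))


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mase(actual, forecast, training):
    """Mean absolute scaled error.

    Scales the forecast MAE by the MAE of a one-step naive forecast
    (previous value carried forward) on the training series.
    """
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_mae = sum(
        abs(training[i] - training[i - 1]) for i in range(1, len(training))
    ) / (len(training) - 1)
    return mae / naive_mae


def pct_reduction(baseline_error, model_error):
    """Percentage by which model_error improves on baseline_error."""
    return 100 * (baseline_error - model_error) / baseline_error
```

A reduction figure like those quoted in the findings would correspond to `pct_reduction` applied to the baseline model's error and the competing model's error under the same measure.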

