BESOCIAL: A Sustainable Knowledge Graph-Based Workflow for Social Media Archiving

2021 ◽  
Author(s):  
Sven Lieber ◽  
Dylan Van Assche ◽  
Sally Chambers ◽  
Fien Messens ◽  
Friedel Geeraert ◽  
...  

Social media as infrastructure for public discourse provide valuable information that needs to be preserved. Several tools for social media harvesting exist, but still only fragmented workflows may be formed with different combinations of such tools. On top of that, social media data but also preservation-related metadata standards are heterogeneous, resulting in a costly manual process. In the framework of BESOCIAL at the Royal Library of Belgium (KBR), we develop a sustainable social media archiving workflow that integrates heterogeneous data sources in a Europeana and PREMIS-based data model to describe data preserved by open source tools. This allows data stewardship on a uniform representation and we generate metadata records automatically via queries. In this paper, we present a comparison of social media harvesting tools and our Knowledge Graph-based solution which reuses off-the-shelf open source tools to harvest social media and automatically generate preservation-related metadata records. We validate our solution by generating Encoded Archival Description (EAD) and bibliographic MARC records for preservation of harvested social media collections from Twitter collected at KBR. Other archiving institutions can build upon our solution and customize it to their own social media archiving policies.

Author(s):  
Igor Araujo ◽  
Paulo Henrique Lopes Rettore ◽  
João Guilherme Maia de Menezes

Understanding urban mobility, transit, people's viewpoints, and social behaviors has recently been the focus of much research and investment. However, data access is restricted to private companies and governments. In addition, the cost of creating a sensor infrastructure in a given area is prohibitive. Location-Based Social Media (LBSM) may therefore provide a new way to better comprehend social behaviors from the users' viewpoint. In this work, we propose the use of LBSM as participatory sensing, designing the Participatory Social Sensor (PSS), a user-friendly framework for social media data acquisition and analysis. We develop the Twitter data acquisition and analysis process, aiming to achieve the user's application goals through a setup file, where the user specifies the spatial area, temporal interval, tags, and other parameters. As a result, the PSS presents a set of visual analyses that provide a context overview, giving researchers an easy way to make decisions. A case study, Detection and Enrichment Service for Road Events Based on Heterogeneous Data Merger for VANETs, based on the PSS framework was published in the current conference.
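
The abstract describes a setup file in which the user gives a spatial area, a temporal interval, and tags. As a rough illustration only, and not the PSS implementation, such a setup could drive a filter like the one below. The configuration keys (`bbox`, `start`, `end`, `tags`), the post record layout, and the function `matches` are all hypothetical, and the posts are invented toy data.

```python
from datetime import datetime

# Hypothetical setup: spatial area (bounding box), temporal interval, and tags.
CONFIG = {
    "bbox": (-44.1, -20.1, -43.8, -19.7),  # (min_lon, min_lat, max_lon, max_lat)
    "start": datetime(2018, 1, 1),
    "end": datetime(2018, 1, 31),
    "tags": {"traffic", "accident"},
}

def matches(post, config):
    """Return True if a geotagged post lies inside the area, interval, and tag set."""
    min_lon, min_lat, max_lon, max_lat = config["bbox"]
    in_area = min_lon <= post["lon"] <= max_lon and min_lat <= post["lat"] <= max_lat
    in_time = config["start"] <= post["created_at"] <= config["end"]
    # Tags are compared case-insensitively.
    has_tag = bool(config["tags"] & {t.lower() for t in post["tags"]})
    return in_area and in_time and has_tag

# Invented toy posts: only the first one satisfies all three constraints.
posts = [
    {"lon": -43.95, "lat": -19.9, "created_at": datetime(2018, 1, 10), "tags": ["Traffic"]},
    {"lon": -46.6, "lat": -23.5, "created_at": datetime(2018, 1, 10), "tags": ["traffic"]},
    {"lon": -43.95, "lat": -19.9, "created_at": datetime(2018, 3, 1), "tags": ["accident"]},
]
selected = [p for p in posts if matches(p, CONFIG)]
```

Keeping the three constraints as separate booleans makes it easy to report why a post was rejected, which may help when tuning the setup file.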


10.2196/24889 ◽  
2021 ◽  
Vol 23 (1) ◽  
pp. e24889
Author(s):  
Shi Chen ◽  
Lina Zhou ◽  
Yunya Song ◽  
Qian Xu ◽  
Ping Wang ◽  
...  

Background Social media plays a critical role in health communications, especially during global health emergencies such as the current COVID-19 pandemic. However, there is a lack of a universal analytical framework to extract, quantify, and compare content features in public discourse of emerging health issues on different social media platforms across a broad sociocultural spectrum. Objective We aimed to develop a novel and universal content feature extraction and analytical framework and contrast how content features differ with sociocultural background in discussions of the emerging COVID-19 global health crisis on major social media platforms. Methods We sampled the 1000 most shared viral Twitter and Sina Weibo posts regarding COVID-19, developed a comprehensive coding scheme to identify 77 potential features across six major categories (eg, clinical and epidemiological, countermeasures, politics and policy, responses), quantified feature values (0 or 1, indicating whether or not the content feature is mentioned in the post) in each viral post across social media platforms, and performed subsequent comparative analyses. Machine learning dimension reduction and clustering analysis were then applied to harness the power of social media data and provide more unbiased characterization of web-based health communications. Results There were substantially different distributions, prevalence, and associations of content features in public discourse about the COVID-19 pandemic on the two social media platforms. Weibo users were more likely to focus on the disease itself and health aspects, while Twitter users engaged more about policy, politics, and other societal issues. Conclusions We extracted a rich set of content features from social media data to accurately characterize public discourse related to COVID-19 in different sociocultural backgrounds. 
In addition, this universal framework can be adopted to analyze social media discussions of other emerging health issues beyond the COVID-19 pandemic.
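
The binary coding step in the Methods (each feature is 0 or 1 per post) lends itself to a simple per-platform prevalence comparison. The sketch below is an assumption-laden illustration, not the authors' pipeline: the three feature names stand in for the 77 features of the study, the coded posts are invented, and `prevalence` is a hypothetical helper. The dimension reduction and clustering steps are not shown.

```python
from collections import defaultdict

# Hypothetical subset of features; the study codes 77 of them.
FEATURES = ["clinical", "countermeasures", "policy"]

# Invented toy data: (platform, {feature: 0 or 1}).
coded_posts = [
    ("twitter", {"clinical": 0, "countermeasures": 1, "policy": 1}),
    ("twitter", {"clinical": 0, "countermeasures": 0, "policy": 1}),
    ("weibo", {"clinical": 1, "countermeasures": 1, "policy": 0}),
    ("weibo", {"clinical": 1, "countermeasures": 0, "policy": 0}),
]

def prevalence(posts):
    """Return, per platform, the share of posts that mention each feature."""
    totals = defaultdict(lambda: dict.fromkeys(FEATURES, 0))
    counts = defaultdict(int)
    for platform, coding in posts:
        counts[platform] += 1
        for feature in FEATURES:
            totals[platform][feature] += coding[feature]
    return {
        platform: {f: totals[platform][f] / counts[platform] for f in FEATURES}
        for platform in counts
    }

by_platform = prevalence(coded_posts)
```

With real data, the resulting per-platform shares are the kind of quantity one would compare across Twitter and Weibo before moving on to clustering.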


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Francesco Bolici ◽  
Chiara Acciarini ◽  
Lucia Marchegiani ◽  
Luca Pirolo

Purpose Technological innovations provide huge opportunities to expand and revolutionize the scope of products and services offered. This is particularly true for tourism, which is undergoing significant changes due to the development of new technologies. The level of technology diffusion depends on several factors like the exchange of information among peers, and the attitude and shared perception among the contributors. The aim of the study is to explore the diffusion of technology in tourism with a specific focus on the social media discourse around new technologies. Thus, the paper investigates the level of interest in these new technologies analysing the information exchange occurring between individuals on Twitter in order to explore the influence of reciprocal networking. Design/methodology/approach To capture the attitudes expressed in the industry, the study analyses the ongoing discourse on Twitter as a proxy for the participants' interest in new technologies. Through a social network analysis of the tweets and retweets conducted over a period of nine months, the research maps the level of information exchange about the diffusion of new technologies. 
Moreover, the sentiment analysis provides an interesting overview of the individuals' attitudes towards the awareness or the adoption of new technologies. Findings Our analysis has provided several insights: (1) the information network on blockchain in tourism consists of participants who change very quickly over time (high turnover of accounts); (2) some contributors have an extremely important role in influencing the flow of information in the system (information centralization), they can have a generalist (discussing several topics) or a specialist (focusing on a specific topic) behaviour and this strategic choice influences their network's structure; (3) these central nodes also have an impact on the definition of positive and negative sentiment towards a topic (sentiment influencer). Research limitations/implications The paper contributes to the literature on technology diffusion, by focusing on one of the preconditions of diffusion that is the shared positive attitude towards technological innovation. More specifically, we adopt a network-based approach, which is useful to explain the level of information exchange and the public discourse that can impact the shared perception and attitude towards technological innovation. The study also highlights the role of knowledge brokers in influencing this public discourse. Future studies can deepen the association between positive perception, higher levels of information exchange and increasing usage of specific technologies. Our results also suggest further exploring the opportunity to combine social media data and other sources of information to shed more light on the technological innovation diffusion processes. Practical implications This paper shows how practitioners can benefit from the analysis of information exchange about new technologies in tourism adopting a network perspective with the aim of understanding the level of influence among contributors. 
Moreover, the increasing interest in blockchain technology and the potential combination between social media data and other sources of information can offer promising insights. Social implications The present study explores the level of technology diffusion through the analysis of information exchange on social media (Twitter). Furthermore, the dynamics of individual user behaviour offers a better understanding about media effects. Originality/value While previous research is focused on the users' perception towards the development of new technologies in tourism, the aim of this study is to investigate the dynamics behind the level of diffusion of information and awareness about these new technologies, which still represents an unexplored area of research.


2019 ◽  
Vol 3 (3) ◽  
pp. 38 ◽  
Author(s):  
Stefan Spettel ◽  
Dimitrios Vagianos

Social media are heavily used to shape political discussions. Thus, it is valuable for corporations and political parties to be able to analyze the content of those discussions. This is exemplified by the work of Cambridge Analytica, in support of the 2016 presidential campaign of Donald Trump. One of the most straightforward metrics is the sentiment of a message, whether it is considered as positive or negative. There are many commercial and/or closed-source tools available which make it possible to analyze social media data, including sentiment analysis (SA). However, to our knowledge, not many publicly available tools have been developed that allow for analyzing social media data and help researchers around the world to enter this quickly expanding field of study. In this paper, we provide a thorough description of implementing a tool that can be used for performing sentiment analysis on tweets. In an effort to underline the necessity for open tools and additional monitoring on the Twittersphere, we propose an implementation model based exclusively on publicly available open-source software. The resulting tool is capable of downloading Tweets in real-time based on hashtags or account names and stores the sentiment for replies to specific tweets. It is therefore capable of measuring the average reaction to one tweet by a person or a hashtag, which can be represented with graphs. Finally, we tested our open-source tool within a case study based on a data set of Twitter accounts and hashtags referring to the Syrian war, covering a short time window of one week in the spring of 2018. The results show that while high accuracy of commercial or other complicated tools may not be achieved, our proposed open source tool makes it possible to get a good overview of the overall replies to specific tweets, as well as a practical perception of tweets, related to specific hashtags, identifying them as positive or negative.
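
The "average reaction to one tweet" mentioned above can be illustrated with a minimal sketch. It assumes a toy word-list scorer in place of whatever sentiment model the described tool uses. The word lists, the `sentiment` function, the tweet identifiers, and the reply texts are all invented for illustration.

```python
from statistics import mean

# Hypothetical toy lexicons; a real tool would use a trained sentiment model.
POSITIVE = {"good", "great", "support"}
NEGATIVE = {"bad", "terrible", "against"}

def sentiment(text):
    """Return +1, 0, or -1 depending on the balance of positive and negative words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)

# Invented replies, grouped by the tweet they respond to.
replies = {
    "tweet_1": ["great news", "bad idea", "I support this"],
    "tweet_2": ["terrible", "against it"],
}

# Average reaction per tweet: the mean sentiment of its replies.
average_reaction = {
    tweet_id: mean([sentiment(text) for text in texts])
    for tweet_id, texts in replies.items()
}
```

Averaging per tweet rather than per hashtag keeps the reaction tied to a single message. The same grouping could be applied to hashtags by changing the dictionary keys.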


Author(s):  
Shalin Hai-Jew

Network analysis is widely used to mine social media. This involves both the study of structural metadata (information about information) and the related contents (the textual messaging, the related imagery, videos, URLs, and others). A semantic-based network analysis relies on the analysis of relationships between words and phrases (as meaningful concepts), and this approach may be applied effectively to social media data to extract insights. To gain a sense of how this might work, a trending topic of the day was chosen (namely, the free-information and data leakage movement) to see what might be illuminated using this semantic-based network analysis, an open-source technology, NodeXL, and access to multiple social media platforms. Three types of networks are extracted: (1) conversations (#hashtag microblogging networks on Twitter, #eventgraphs on Twitter, and keyword searches on Twitter); (2) contents (video networks on YouTube, related tags networks on Flickr, and article networks on Wikipedia); and (3) user accounts on Twitter, YouTube, Flickr, and Wikipedia.
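
One simple way to picture a semantic network of words is a co-occurrence graph, in which two terms are linked whenever they appear in the same message. The sketch below is only an illustration of that idea. It does not use NodeXL or reproduce the chapter's method, and the messages are invented.

```python
from collections import Counter
from itertools import combinations

# Invented toy messages standing in for harvested posts.
messages = [
    "data leak whistleblower",
    "whistleblower data privacy",
    "privacy leak",
]

# Edge weights: how many messages contain both terms of a pair.
edges = Counter()
for message in messages:
    terms = sorted(set(message.split()))  # sort so each pair has one canonical order
    for a, b in combinations(terms, 2):
        edges[(a, b)] += 1

# Node degree: the number of distinct terms each term co-occurs with.
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1
```

Heavier edges and higher-degree nodes point to the concepts that hold the conversation together, which is the kind of structure a network tool would then visualize.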


Author(s):  
Suppawong Tuarob ◽  
Conrad S. Tucker

Some of the challenges that designers face in getting broad external input from customers during and after product launch include geographic limitations and the need for physical interaction with the design artifact(s). Having to conduct such user-based studies would require huge amounts of time and financial resources. In the past decade, social media has emerged as an increasingly important medium of communication and information sharing. Being able to mine and harness product-relevant knowledge within such a massive, readily accessible collection of data would give designers an alternative way to learn customers' preferences in a timely and cost-effective manner. In this paper, we propose a data mining driven methodology that identifies product features and associated customer opinions favorably received in the market space which can then be integrated into the design of next generation products. Two unique product domains (smartphones and automobiles) are investigated to validate the proposed methodology and establish social media data as a viable source of large scale, heterogeneous data relevant to next generation product design and development. We demonstrate in our case studies that incorporating suggested features into next generation products can result in favorable sentiment from social media users.

