A survey of Twitter research: Data model, graph structure, sentiment analysis and attacks

2021 ◽  
Vol 164 ◽  
pp. 114006
Author(s):  
Despoina Antonakaki ◽  
Paraskevi Fragopoulou ◽  
Sotiris Ioannidis
Data ◽  
2019 ◽  
Vol 4 (2) ◽  
pp. 83 ◽  
Author(s):  
Timm Fitschen ◽  
Alexander Schlemmer ◽  
Daniel Hornung ◽  
Henrik tom Wörden ◽  
Ulrich Parlitz ◽  
...  

We present CaosDB, a Research Data Management System (RDMS) designed to ensure seamless integration of inhomogeneous data sources and repositories of legacy data in a FAIR way. Its primary purpose is the management of data from biomedical sciences, both from simulations and experiments, during the complete research data lifecycle. An RDMS for this domain faces particular challenges: research data arise in huge amounts, from a wide variety of sources, and traverse a highly branched path of further processing. To be accepted by its users, an RDMS must be built around the workflows and practices of the scientists and thus support changes in workflow and data structure. Nevertheless, it should encourage and support the development and observation of standards and furthermore facilitate the automation of data acquisition and processing with specialized software. The storage data model of an RDMS must reflect these complexities with appropriate semantics and ontologies while offering simple methods for finding, retrieving, and understanding relevant data. We show how CaosDB responds to these challenges and give an overview of its data model, the CaosDB Server and its easy-to-learn CaosDB Query Language. We briefly discuss the status of the implementation, how we currently use CaosDB, and how we plan to use and extend it.


2021 ◽  
Vol 20 ◽  
pp. 160940692110024
Author(s):  
Gisela Sender ◽  
Flavio Carvalho ◽  
Gustavo Guedes

Happiness at Work is considered the Holy Grail of organizational sciences. The belief that happier workers are more productive leads to a win-win situation for both individuals and organizations. Nevertheless, years of research have not brought a convergent conclusion about the topic, mainly due to the lack of a widely accepted measure. Usually, questionnaires and self-report surveys are used; however, these methods have shortcomings that allow studies’ results to be questioned. In order to overcome these shortcomings, the present study proposes a different approach to measure Happiness at Work, bringing mixed methods to encompass the complexity of the phenomenon. Based on work-life narratives and following Kahneman’s concepts, the proposed approach puts together Narrative Analysis and Sentiment Analysis. Although increasingly used to assess social media reviews, Sentiment Analysis has not yet been applied to narratives related to Happiness at Work. Four methods to calculate the Happy Level indicator were tested on actual research data: one manual, through traditional coding processes, and three automatic methods to provide scalability. An example of the Happy Level application is also provided to illustrate how the indicator could improve analyses. The present study concludes that, although the manual method currently yields better results, the automatic ones are promising. The results also indicate paths for improvement of these methods.


2017 ◽  
Vol 51 (1) ◽  
pp. 75-100 ◽  
Author(s):  
Adrian Burton ◽  
Hylke Koers ◽  
Paolo Manghi ◽  
Sandro La Bruzzo ◽  
Amir Aryani ◽  
...  

Purpose: Research data publishing is today widely regarded as crucial for reproducibility, proper assessment of scientific results, and as a way for researchers to get proper credit for sharing their data. However, several challenges need to be solved to fully realize its potential, one of them being the development of a global standard for links between research data and literature. Current linking solutions are mostly based on bilateral, ad hoc agreements between publishers and data centers. These operate in silos so that content cannot be readily combined to deliver a network graph connecting research data and literature in a comprehensive and reliable way. The Research Data Alliance (RDA) Publishing Data Services Working Group (PDS-WG) aims to address this issue of fragmentation by bringing together different stakeholders to agree on a common infrastructure for sharing links between datasets and literature. The paper aims to discuss these issues. Design/methodology/approach: This paper presents the synergic effort of the RDA PDS-WG and the OpenAIRE infrastructure toward enabling a common infrastructure for exchanging data-literature links by realizing and operating the Data-Literature Interlinking (DLI) Service. The DLI Service populates and provides access to a graph of data set-literature links (at the time of writing close to five million, and growing) collected from a variety of major data centers, publishers, and research organizations. Findings: To achieve its objectives, the Service proposes an interoperable exchange data model and format, based on which it collects and publishes links, thereby offering the opportunity to validate such a common approach on real-case scenarios, with real providers and consumers. Feedback from these actors will drive continuous refinement of both the data model and the exchange format, supporting the further development of the Service to become an essential part of a universal, open, cross-platform, cross-discipline solution for collecting and sharing data set-literature links. Originality/value: This realization of the DLI Service is the first technical, cross-community, and collaborative effort in the direction of establishing a common infrastructure for facilitating the exchange of data set-literature links. As a result of its operation and underlying community effort, a new activity, named Scholix, has been initiated involving technology-level stakeholders such as DataCite and CrossRef.


2021 ◽  
Vol 4 (3) ◽  
pp. 102-106
Author(s):  
Hendra Saputra Batubara ◽  
Ambiyar Ambiyar ◽  
Syahril Syahril ◽  
Fadhilah Fadhilah ◽  
Ronal Watrianthos

The use of restricted face-to-face learning during the epidemic in Indonesia was discussed not just by education and health professionals, but also on social media. The study used a Twitter dataset with the keywords 'school' and 'face-to-face' to examine public opinion about face-to-face learning. The research data was obtained from Twitter using Drone Emprit Academic and was then processed with the Naive Bayes method to perform sentiment analysis. For the period studied, the research revealed that 32% of people were positive, 54% were negative, and 14% were indifferent. Negative attitudes predominate because of worries about the dangers associated with face-to-face learning.
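As an illustration of the classification step described in this abstract, the following is a minimal sketch of a multinomial Naive Bayes sentiment classifier with add-one smoothing. It is not the authors' pipeline: the toy tweets, the labels, the whitespace tokenizer, and the function names `train_naive_bayes` and `classify` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labeled_texts):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_texts:
        class_counts[label] += 1
        for word in text.lower().split():  # naive whitespace tokenizer (assumption)
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def classify(text, model):
    """Return the label with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data, not taken from the study.
TOY_TWEETS = [
    ("happy that school is open again", "positive"),
    ("glad to see face-to-face classes return", "positive"),
    ("worried about infection risk at school", "negative"),
    ("face-to-face learning is too dangerous now", "negative"),
]
model = train_naive_bayes(TOY_TWEETS)
```

Calling `classify("worried about risk", model)` on this toy model selects the negative class, because those words occur only in the negative training tweets. A real study would need proper tokenization, a third class for neutral tweets, and a held-out evaluation set.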


2021 ◽  
Vol 5 (2) ◽  
pp. 92-96
Author(s):  
Irina E. Kalabikhina ◽  
Evgeny P. Banin

The database contains an upload of text comments in Russian from the social network VKontakte in .csv format (UTF-8 encoding). The comments are collected from communities that discuss pregnancy, childhood, motherhood, paternity, etc. The upload contains comments under the posts with which the interaction took place. The absolute number of likes is used as a criterion (comments are collected where the number of likes is greater than or equal to 5). The text data is processed (stemming and lemmatization). The data are suitable for thematic analysis (e.g. Latent Dirichlet Allocation, LDA), sentiment analysis of statements, modelling the graph structure of communities (the link_comment variable contains a unique identifier of the post, link_author contains a unique user identifier), and forming a dictionary of demographic connotation in Russian. Sentiment analysis of statements enables measuring the dynamics of «demographic temperature» in antinatalist communities. The database is a supplement to the publication Kalabikhina IE, Banin EP (2020) Database «Pro-family (pronatalist) communities in the social network VKontakte». Population and Economics 4(3): 98–130. https://doi.org/10.3897/popecon.4.e60915.
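The graph modelling mentioned in this abstract can be sketched as follows. The example reads rows with the `link_comment` (post identifier) and `link_author` (user identifier) columns named in the abstract and connects two users whenever they commented under the same post. The sample rows, the extra `text` column, and the function name `build_coauthor_graph` are hypothetical, and the real file may use a different column layout or delimiter.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

# Hypothetical sample rows; only the two column names come from the abstract.
SAMPLE_CSV = """link_comment,link_author,text
post_1,user_a,first comment
post_1,user_b,second comment
post_2,user_a,third comment
post_2,user_c,fourth comment
post_2,user_b,fifth comment
"""

def build_coauthor_graph(csv_file):
    """Return {(user, user): number of posts both users commented under}."""
    authors_by_post = defaultdict(set)
    for row in csv.DictReader(csv_file):
        authors_by_post[row["link_comment"]].add(row["link_author"])
    edge_weights = defaultdict(int)
    for authors in authors_by_post.values():
        # Sort so that each undirected edge has one canonical key.
        for first, second in combinations(sorted(authors), 2):
            edge_weights[(first, second)] += 1
    return dict(edge_weights)

graph = build_coauthor_graph(io.StringIO(SAMPLE_CSV))
```

For the real data, `io.StringIO(SAMPLE_CSV)` would be replaced by a file opened with `encoding="utf-8"`, matching the encoding stated in the abstract.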


Author(s):  
Valentina Bartalesi ◽  
Carlo Meghini ◽  
Costantino Thanos

2018 ◽  
Author(s):  
Maria J. Cruz ◽  
Jasmin K. Böhmer ◽  
Egbert Gramsbergen ◽  
Marta Teperek ◽  
Madeleine de Smaele ◽  
...  

Founded in 2008 as an initiative of the libraries of three of the four technical universities in the Netherlands, the 4TU.Centre for Research Data (4TU.Research Data) has provided since 2010 a fully operational, cross-institutional, long-term archive that stores data from all subjects in applied sciences and engineering. Presently, over 90% of the data in the archive is geoscientific data coded in netCDF (Network Common Data Form) – a data format and data model that, although generic, is mostly used in climate, ocean and atmospheric sciences. In this practice paper, we explore the question of how 4TU.Research Data can stay relevant and forward-looking in a rapidly evolving research data management landscape. In particular, we describe the motivation behind this question and how we propose to address it.


2021 ◽  
Vol 4 (2) ◽  
pp. 139-145
Author(s):  
Thalita Meisya Permata Aulia ◽  
Nur Arifin ◽  
Rini Mayasari

In early 2020, the first death from the COVID-19 virus was recorded in China [3]. WHO later declared that the COVID-19 virus had caused a pandemic. Various efforts were made to minimize the transmission of COVID-19, such as physical distancing and large-scale social restrictions. However, this paralyzed the economy: many factories and shops closed, eliminating the livelihoods of many people. Vaccines may be a solution, and various international research communities have conducted research on the COVID-19 vaccine. In early 2021 the Sinovac vaccine from China arrived in Indonesia and was declared a BPOM clinical trial, but the existence of the vaccine still raises pros and cons; some have responded well and others have not. For this reason, a sentiment analysis of the COVID-19 vaccine will be carried out by taking data from Twitter, which is then classified using the Support Vector Machine algorithm. The research data is nonlinear, so it requires a kernel space for the text mining process. Because there has been no specific research on which kernel suits sentiment analysis, a test will be carried out to find the best kernel among the linear, sigmoid, polynomial, and RBF kernels. The result is that the sigmoid and linear kernels score better, at 0.87, than the RBF and polynomial kernels, at 0.86.
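For reference, the four kernels compared in this abstract can be written as plain functions of two feature vectors. This is a sketch of the standard textbook definitions, not the authors' code; the hyperparameters `gamma`, `coef0`, and `degree` and their default values are assumptions, since the abstract does not report the settings used.

```python
import math

def dot(x, y):
    """Inner product of two equal-length numeric vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    """K(x, y) = <x, y>"""
    return dot(x, y)

def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    """K(x, y) = (gamma * <x, y> + coef0) ** degree"""
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """K(x, y) = tanh(gamma * <x, y> + coef0)"""
    return math.tanh(gamma * dot(x, y) + coef0)
```

In a text classification setting, `x` and `y` would typically be term-weight vectors (for example TF-IDF) of two tweets. A kernel comparison like the one described then trains one SVM per kernel on the same split and compares the resulting scores.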


Relational databases hold most of the data underpinning the web. They have an excellent record of convenience and efficiency in storage, optimized query execution, scalability, security and accuracy. Recently, graph databases have come to be seen as a good replacement for relational databases. Compared to the relational data model, the graph data model is more vivid and robust, and data expressed in it models relationships among data properly. An important requirement is to bring the vast quantities of data stored in relational databases (RDBs) onto the web. In this situation, migration from relational to graph format is very advantageous. Both kinds of database have advantages and limitations depending on the form of the queries. Thus, this paper converts a relational database to a graph database by utilizing the schema, in order to develop a dual database system through migration that merges the capabilities of both the relational and the graph database. Experimental results are provided to demonstrate the practicability of the method and the query response time over the target database. The proposed concept is validated by implementing it on MySQL and Neo4j.
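The idea of a schema-driven migration can be illustrated with a small in-memory sketch: each row becomes a node and each foreign-key value becomes a relationship. This is not the paper's algorithm and does not talk to MySQL or Neo4j. The table names, the relationship type `WRITTEN_BY`, the function name `migrate_to_graph`, and the assumption that every table has an `id` primary key are all hypothetical.

```python
def migrate_to_graph(tables, foreign_keys):
    """Turn rows into nodes and foreign-key references into edges.

    tables: {table_name: [row_dict, ...]}, each row having an "id" key.
    foreign_keys: {(table, column): (referenced_table, relationship_type)}.
    """
    nodes = {}
    edges = []
    for table, rows in tables.items():
        for row in rows:
            # The node key combines table name and primary key; the table
            # name doubles as the node label.
            nodes[(table, row["id"])] = {"label": table, **row}
    for (table, column), (ref_table, rel_type) in foreign_keys.items():
        for row in tables[table]:
            target_id = row.get(column)
            if target_id is not None:  # NULL foreign keys produce no edge
                edges.append(((table, row["id"]), rel_type, (ref_table, target_id)))
    return nodes, edges

# Hypothetical toy schema.
TABLES = {
    "author": [{"id": 1, "name": "Ada"}],
    "paper": [
        {"id": 10, "title": "Graphs", "author_id": 1},
        {"id": 11, "title": "Orphan", "author_id": None},
    ],
}
FOREIGN_KEYS = {("paper", "author_id"): ("author", "WRITTEN_BY")}

nodes, edges = migrate_to_graph(TABLES, FOREIGN_KEYS)
```

In an actual migration the `tables` and `foreign_keys` inputs would be read from the relational schema catalog, and the resulting nodes and edges would be written to the graph store. Join tables for many-to-many relations would need separate handling.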

