Semantic Annotation, Representation and Linking of Survey Data

Author(s):  
Felix Bensmann ◽  
Andrea Papenmeier ◽  
Dagmar Kern ◽  
Benjamin Zapilko ◽  
Stefan Dietze

Abstract Semantic technologies offer significant potential for improving data search applications. Ongoing work strives to equip data catalogs with new semantic search features to supplement existing keyword search and browsing capabilities. Within the social sciences in particular, searching for and reusing data is essential to foster efficient research. In this paper, we introduce an approach and experimental results aimed at improving the interoperability and findability of social sciences survey items. Our contributions include a conceptual model for semantically representing survey items and questions, detailing meaningful dimensions of items, as well as experimental results geared towards the automated prediction of such item features using state-of-the-art machine learning models. Dimensions of interest include, for instance, references to geolocation and time periods or the scope and style of particular questions. We define classification tasks using neural and traditional machine learning models combined with sentence structure features. Applications of our work include semantic and faceted search for questions as part of our GESIS Search. We also provide the lifted data as a knowledge graph via a SPARQL endpoint for further reuse and sharing.

Author(s):  
Jiaxing Zhang ◽  
Shuaishuai Feng

Improvements in big data and machine learning algorithms have helped AI technologies reach a new breakthrough and have provided a new opportunity for quantitative research in the social sciences. Traditional quantitative models rely heavily on theoretical hypotheses and statistics but fail to acknowledge the problem of overfitting, causing the research results to be less generalizable, and further leading to societal predictions in the social sciences being ignored when they should have been meaningful. Machine learning models that use cross validation and regularization can effectively solve the problem of overfitting, providing support for the societal predictions based on these models. This paper first discusses the sources and internal mechanisms of overfitting, and then introduces machine learning modeling by discussing its high-level ideas, goals, and concrete methods. Finally, we discuss the shortcomings and limiting factors of machine learning models. We believe that using machine learning in social sciences research is an opportunity and not a threat. Researchers should adopt an objective attitude and make sure that they know how to combine traditional methods with new methods in their research based on their needs.
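As a rough illustration of how cross validation and regularization work together against overfitting, the sketch below fits a one-parameter ridge model and scores each penalty strength on held-out folds. It is not taken from the paper: the model, the function names (`fit_ridge_1d`, `k_fold_mse`), the candidate penalties, and the toy data are all assumptions chosen for brevity.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge estimate of w in y ~ w * x (no intercept).

    The penalty lam shrinks w toward zero; lam = 0 is ordinary least squares.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def k_fold_mse(xs, ys, lam, k=5):
    """Mean squared error on held-out data, averaged over k folds."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        # Every k-th sample (starting at `fold`) is held out for testing.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        w = fit_ridge_1d(train_x, train_y, lam)
        total += sum((ys[i] - w * xs[i]) ** 2 for i in test_idx)
    return total / n


# Hypothetical toy data: y = 2x plus a small fixed disturbance.
xs = [float(i) for i in range(1, 11)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.0, 0.1]
ys = [2.0 * x + e for x, e in zip(xs, noise)]

# Pick the penalty with the lowest held-out error, not the lowest training error.
candidates = [0.0, 1.0, 10.0, 100.0]
best_lam = min(candidates, key=lambda lam: k_fold_mse(xs, ys, lam))
print("selected penalty:", best_lam)
```

The point of the design is that the penalty is chosen by error on data the model did not see during fitting, which is what makes the resulting predictions more likely to generalize.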


2019 ◽  
pp. 29-43
Author(s):  
Anastasiya A. Korepanova ◽  
◽  
Valerii D. Oliseenko ◽  
Maxim V. Abramov ◽  
Alexander L. Tulupyev ◽  
...  

The article describes an approach to the problem of comparing user profiles across different social networks and identifying those that belong to the same person. An appropriate method is proposed, based on a comparison of the social environment and the values of account profile attributes in two different social networks. The results of applying various machine learning models to this problem are compared. The novelty of the approach lies in the proposed combination of various methods and its application to new social networks. The practical significance of the study is that it automates the process of attributing profiles in various social networks to a single user. These results can be applied to the task of constructing a meta-profile of a user of an information system, for the subsequent construction of a profile of that user's vulnerabilities, as well as in other studies devoted to social networks.


Author(s):  
Jingying Wang ◽  
Baobin Li ◽  
Changye Zhu ◽  
Shun Li ◽  
Tingshao Zhu

Automatic emotion recognition is of great value in many applications; however, to fully realize this value, more portable, non-intrusive, and inexpensive technologies need to be developed. Besides facial expressions and voice, human gait can also reflect the walker's emotional state. Using gait data from 59 participants with emotion labels, the authors train machine learning models that are able to “sense” individual emotion. Experimental results show that these models work very well and prove that gait features are effective in characterizing and recognizing emotions.


2007 ◽  
Vol 16 (06) ◽  
pp. 1001-1014 ◽  
Author(s):  
PANAGIOTIS ZERVAS ◽  
IOSIF MPORAS ◽  
NIKOS FAKOTAKIS ◽  
GEORGE KOKKINAKIS

This paper presents and discusses the problem of emotion recognition from speech signals using features that carry intonational information. In particular, parameters extracted from Fujisaki's model of intonation are presented and evaluated. Machine learning models were built using the C4.5 decision tree inducer, an instance-based learner, and Bayesian learning. The datasets used to train the machine learning models were extracted from two emotional databases of acted speech. Experimental results showed the effectiveness of the Fujisaki model attributes: they enhanced recognition for most of the emotion categories and learning approaches, helping to separate the emotion categories.


2021 ◽  
Vol 2070 (1) ◽  
pp. 012079
Author(s):  
V Jagadishwari ◽  
A Indulekha ◽  
Kiran Raghu ◽  
P Harshini

Abstract Social media has in recent times become an arena for people to share their perspectives on a variety of topics. Most social interactions take place through social media. Although all online social networks allow users to express their views and opinions in many forms, such as audio, video, and text, the most popular forms of expression are text, emoticons, and emojis. The work presented in this paper aims at detecting the sentiments expressed in social media posts. Four machine learning models, namely Bernoulli Bayes, Multinomial Bayes, Regression, and SVM, were implemented. All of these models were trained and tested with Twitter data sets. Users on Twitter express their opinions in the form of tweets with a limited number of characters. Tweets also contain emoticons and emojis; therefore, Twitter data sets are best suited for sentiment analysis. The effect of the emoticons present in a tweet is also analyzed: the models are first trained with the text only and then with both the text and the emoticons in the tweet. The performance of all four models in both cases is tested, and the results are presented in the paper.


Author(s):  
Kacper Sokol ◽  
Peter Flach

Machine learning models have become pervasive in our everyday life; they decide on important matters influencing our education, employment and judicial system. Many of these predictive systems are commercial products protected by trade secrets, hence their decision-making is opaque. Therefore, in our research we address interpretability and explainability of predictions made by machine learning models. Our work draws heavily on human explanation research in social sciences: contrastive and exemplar explanations provided through a dialogue. This user-centric design, focusing on a lay audience rather than domain experts, applied to machine learning allows explainees to drive the explanation to suit their needs instead of being served a precooked template.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Qingchun Li ◽  
Yang Yang ◽  
Wanqiu Wang ◽  
Sanghyeon Lee ◽  
Xin Xiao ◽  
...  

Abstract The objective of this study was to investigate the importance of multiple county-level features in the trajectory of COVID-19. We examined feature importance across 2787 counties in the United States using data-driven machine learning models. Existing mathematical models of disease spread usually focus on case prediction with different infection rates, without incorporating the multiple heterogeneous features that could affect the spatial and temporal trajectory of COVID-19. Recognizing this, we trained a data-driven model using 23 features representing six key factors influencing the pandemic spread: social demographics of counties, population activities, mobility within counties, movement across counties, disease attributes, and social network structure. We also categorized counties into multiple groups according to their population densities, and we divided the trajectory of COVID-19 into three stages: the outbreak stage, the social distancing stage, and the reopening stage. The study aimed to answer two research questions: (1) to what extent the importance of heterogeneous features evolved across stages; and (2) to what extent the importance of heterogeneous features varied across counties with different characteristics. We fitted a set of random forest models to determine weekly feature importance. The results showed that: (1) social demographic features, such as gross domestic product, population density, and minority status, remained high-importance features throughout the stages of COVID-19 across the 2787 studied counties; (2) within-county mobility features had the highest importance in counties with higher population densities; (3) the feature reflecting the social network structure (Facebook social connectedness index) had higher importance for counties with higher population densities.
The results showed that the data-driven machine learning models could provide important insights to inform policymakers regarding feature importance for counties with various population densities and at different stages of a pandemic life cycle.


2021 ◽  
Author(s):  
Giancarlo Canales Barreto ◽  
Nicholas Lamb

We present a cache attack monitoring methodology that leverages statistical machine learning models to detect n-day hardware attacks by analyzing the electromagnetic emanations of a device. Experimental results from a Raspberry Pi 4 hosting Linux and a Jetson TX2 development board running a Linux guest hosted by seL4 demonstrate that our approach can sense Spectre attacks with a concordance statistic of 97% and 95%.
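For readers unfamiliar with the metric, the following is a minimal sketch of how a concordance statistic can be computed for a binary detector. The abstract does not say how the authors computed theirs. The sketch assumes the usual definition for binary outcomes: the fraction of (attack, benign) pairs in which the attack sample receives the higher score, with ties counted as one half. The function name and the sample scores are hypothetical.

```python
def concordance_statistic(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly by score.

    labels: iterable of 1 (positive, e.g. attack) or 0 (negative, e.g. benign).
    scores: detector outputs, where higher means "more likely positive".
    Tied pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical detector scores for two attack traces and two benign traces.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(concordance_statistic(labels, scores))  # 3 of 4 pairs are ordered correctly: 0.75
```

A value of 1.0 means every attack sample outscores every benign sample, while 0.5 corresponds to a detector that ranks no better than chance.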


Author(s):  
Jingying Wang ◽  
Baobin Li ◽  
Changye Zhu ◽  
Shun Li ◽  
Tingshao Zhu

Automatic emotion recognition is of great value in many applications; however, to fully realize this value, more portable, non-intrusive, and inexpensive technologies need to be developed. Besides facial expressions and voice, human gait can also reflect the walker's emotional state. Using gait data from 59 participants with emotion labels, we train machine learning models that are able to “sense” individual emotion. Experimental results show that these models work very well and prove that gait features are effective in characterizing and recognizing emotions.



