Machine Learning for Social Science: An Agnostic Approach

Author(s):  
Justin Grimmer ◽  
Margaret E. Roberts ◽  
Brandon M. Stewart

Social scientists are now in an era of data abundance, and machine learning tools are increasingly used to extract meaning from data sets both massive and small. We explain how the inclusion of machine learning in the social sciences requires us to rethink not only applications of machine learning methods but also best practices in the social sciences. In contrast to the traditional tasks for machine learning in computer science and statistics, when machine learning is applied to social scientific data, it is used to discover new concepts, measure the prevalence of those concepts, assess causal effects, and make predictions. The abundance of data and resources facilitates the move away from a deductive social science to a more sequential, interactive, and ultimately inductive approach to inference. We explain how an agnostic approach to machine learning methods focused on the social science tasks facilitates progress across a wide range of questions. Expected final online publication date for the Annual Review of Political Science, Volume 24 is May 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

Author(s):  
Jason M. Chin ◽  
Kathryn Zeiler

As part of a broader methodological reform movement, scientists are increasingly interested in improving the replicability of their research. Replicability allows others to perform replications to explore potential errors and statistical issues that might call the original results into question. Little attention, however, has been paid to the state of replicability in the field of empirical legal research (ELR). Quality is especially important in this field because empirical legal researchers produce work that is regularly relied upon by courts and other legal bodies. In this review, we summarize the current state of ELR relative to the broader movement toward replicability in the social sciences. As part of that aim, we summarize recent collective replication efforts in ELR and transparency and replicability guidelines adopted by journals that publish ELR. Based on this review, ELR seems to be lagging other fields in implementing reforms. We conclude with suggestions for reforms that might encourage improved replicability. Expected final online publication date for the Annual Review of Law and Social Science, Volume 17 is October 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Emir Kocer ◽  
Tsz Wai Ko ◽  
Jörg Behler

In the past two decades, machine learning potentials (MLPs) have reached a level of maturity that now enables applications to large-scale atomistic simulations of a wide range of systems in chemistry, physics, and materials science. Different machine learning algorithms have been used with great success in the construction of these MLPs. In this review, we discuss an important group of MLPs relying on artificial neural networks to establish a mapping from the atomic structure to the potential energy. In spite of this common feature, there are important conceptual differences among MLPs, which concern the dimensionality of the systems, the inclusion of long-range electrostatic interactions, global phenomena like nonlocal charge transfer, and the type of descriptor used to represent the atomic structure, which can be either predefined or learnable. A concise overview is given along with a discussion of the open challenges in the field. Expected final online publication date for the Annual Review of Physical Chemistry, Volume 73 is April 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


The present study analyzes attribute data on users of the social network VK. The general population, N = 52,614 users, is the intersection of the audiences of two communities for social media marketing. Based on the statistics collected on the “interests” attribute, one can compile a generalized portrait of an IT specialist and online marketer: a man about 30 years old, not married or describing his family status as “everything is complicated”. He speaks two languages on average, works for an organization or studies at a university, and has about 370 followers on VK. The result based on the “activities” field is very close to that based on the “interests” field and gives a similar generalized portrait of a specialist. As part of the study, the authors used machine learning methods to segment users into those who identify themselves as “IT specialists or online marketers” and “other” users.


Author(s):  
Oleksandr Dudin ◽  
Ozar Mintser ◽  
Oksana Sulaieva ◽  
...  

Introduction. Over the past few decades, thanks to advances in algorithm development, the availability of computing power, and the management of large data sets, machine learning methods have come into active use in various fields. Among them, deep learning occupies a special place: it is used in many areas of health care and is an integral part of, and a prerequisite for, the development of digital pathology. Objectives. The purpose of the review was to gather data on existing image analysis technologies and machine learning tools developed for whole-slide digital images in pathology. Methods. The literature was analyzed on machine learning methods used in pathology, the steps of automated image analysis, types of neural networks, and their application and capabilities in digital pathology. Results. To date, a wide range of deep learning strategies have been developed; they are actively used in digital pathology and have demonstrated excellent diagnostic accuracy. In addition to diagnostic solutions, the integration of artificial intelligence into the practice of the pathomorphological laboratory provides new tools for assessing prognosis and predicting sensitivity to different treatments. Conclusions. The synergy of artificial intelligence and digital pathology is a key tool for improving the accuracy of diagnostics and prognostication and for facilitating personalized medicine.


2020 ◽  
Author(s):  
Yaakov Ophir ◽  
Refael Tikochinski ◽  
Christa Asterhan ◽  
Itay Sisso ◽  
Roi Reichart

Background: Detection of suicide risk is a highly prioritized, yet complicated task. In fact, five decades of suicide research produced predictions that were only marginally better than chance (AUCs = 0.56 – 0.58). Advanced machine learning methods open up new opportunities for progress in mental health research. In the present study, Artificial Neural Network (ANN) models were constructed to predict externally valid suicide risk from everyday language of social media users. Method: The dataset included 83,292 postings authored by 1,002 authenticated, active Facebook users, alongside clinically valid psychosocial information about the users. Results: Using Deep Contextualized Word Embeddings (CWEs) for text representation, two models were constructed: A Single Task Model (STM), to predict suicide risk from Facebook postings directly (Facebook texts → suicide) and a Multi-Task Model (MTM), which included hierarchical, multilayered sets of theory-driven risk factors (Facebook texts → personality traits → psychosocial risks → psychiatric disorders → suicide). Compared with the STM predictions (.606 ≤ AUC ≤ .608), the MTM produced improved prediction accuracy (.690 ≤ AUC ≤ .759), with substantially larger effect sizes (.701 ≤ d ≤ .994). Subsequent content analyses suggest that predictions did not rely on explicit suicide-related themes, but on a wide range of content. Conclusions: Advanced machine learning methods can improve our ability to predict suicide risk from everyday social media activities. The knowledge generated by this research may eventually lead to the development of more accurate and objective detection tools and get individuals the help they need in time.
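
For readers less familiar with the AUC values quoted above, the following is a minimal sketch of how an AUC can be computed from model scores. It uses the standard pairwise interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, so that 0.5 corresponds to chance. The function name `auc` and the example scores are hypothetical and invented for illustration; they are not taken from the study.

```python
from itertools import product


def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical risk scores, invented for illustration only.
at_risk = [0.9, 0.7, 0.4]
not_at_risk = [0.8, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    print(f"AUC = {auc(at_risk, not_at_risk):.3f}")
```

A model whose scores carry no information about the outcome would give a value near 0.5 under this definition, which is why AUCs of 0.56 to 0.58 are described as only marginally better than chance.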


2020 ◽  
Vol 4 (1) ◽  
Author(s):  
Shadd Maruna ◽  
Marieke Liem

Over the past decade, a growing body of literature has emerged under the umbrella of narrative criminology. We trace the origins of this field to narrative scholarship in the social sciences more broadly and review the recent history of criminological engagement in this field. We then review contemporary developments, paying particular attention to research around desistance and victimology. Our review highlights the most important critiques and challenges for narrative criminology and suggests fruitful directions in moving forward. We conclude by making a case for the consolidation and integration of narrative criminology, in hopes that this movement becomes more than an isolated clique. Expected final online publication date for the Annual Review of Criminology, Volume 4 is January 13, 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Francisco Beneke ◽  
Mark-Oliver Mackenrodt

Abstract. There is growing evidence that tacit collusion can be autonomously achieved by machine learning technology, at least in some real-life examples identified in the literature and experimental settings. Although more work needs to be done to assess the competitive risks of widespread adoption of autonomous pricing agents, this is still an appropriate time to examine which possible remedies can be used in case competition law shifts towards the prohibition of tacit collusion. This is because outlawing such conduct is pointless unless there are suitable remedies that can be used to address the social harm. This article explores how fines and structural and behavioural remedies can serve to discourage collusive results while preserving the incentives to use efficiency-enhancing algorithms. We find that this could be achieved if fines and remedies can target structural conditions that facilitate collusion. In addition, the problem of unfeasibility of injunctions to remedy traditional price coordination changes with the use of pricing software, which in theory can be programmed to avoid collusive outcomes. Finally, machine-learning methods can be used by the authorities themselves as a tool to test the effects of any given combination of remedies and to estimate a more accurate competitive benchmark for the calculation of the appropriate fine.


2018 ◽  
Vol 226 (4) ◽  
pp. 259-273 ◽  
Author(s):  
Ranjith Vijayakumar ◽  
Mike W.-L. Cheung

Abstract. Machine learning tools are increasingly used in social sciences and policy fields due to their increase in predictive accuracy. However, little research has been done on how well the models of machine learning methods replicate across samples. We compare machine learning methods with regression on the replicability of variable selection, along with predictive accuracy, using an empirical dataset as well as simulated data with additive, interaction, and non-linear squared terms added as predictors. Methods analyzed include support vector machines (SVM), random forests (RF), multivariate adaptive regression splines (MARS), and the regularized regression variants, least absolute shrinkage and selection operator (LASSO), and elastic net. In simulations with additive and linear interactions, machine learning methods performed similarly to regression in replicating predictors; they also performed mostly equal or below regression on measures of predictive accuracy. In simulations with square terms, machine learning methods SVM, RF, and MARS improved predictive accuracy and replicated predictors better than regression. Thus, in simulated datasets, the gap between machine learning methods and regression on predictive measures foreshadowed the gap in variable selection. In replications on the empirical dataset, however, improved prediction by machine learning methods was not accompanied by a visible improvement in replicability in variable selection. This disparity is explained by the overall explanatory power of the models. When predictors have small effects and noise predominates, improved global measures of prediction in a sample by machine learning methods may not lead to the robust selection of predictors; thus, in the presence of weak predictors and noise, regression remains a useful tool for model building and replication.
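
One way to make "replicability of variable selection" concrete is to compare the sets of predictors a method selects in different samples. The sketch below uses the mean pairwise Jaccard overlap between selected sets. This metric is an assumption chosen for illustration, not necessarily the measure used by the authors, and the function names and predictor names (`x1`, `x2`, ...) are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two predictor sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections agree trivially
    return len(a & b) / len(a | b)


def mean_pairwise_overlap(selections):
    """Average Jaccard overlap over all pairs of samples."""
    pairs = list(combinations(selections, 2))
    if not pairs:
        raise ValueError("need selections from at least two samples")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical predictors selected by one method in three samples.
selected = [
    {"x1", "x2", "x3"},
    {"x1", "x2", "x4"},
    {"x1", "x2", "x3"},
]

if __name__ == "__main__":
    print(f"mean overlap = {mean_pairwise_overlap(selected):.3f}")
```

A value of 1 would mean the method selects the same predictors in every sample; values near 0 would mean the selections rarely agree.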

