Simulating systematic bias in attributed social networks and its effect on rankings of minority nodes

2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Leonie Neuhäuser ◽  
Felix I. Stamm ◽  
Florian Lemmerich ◽  
Michael T. Schaub ◽  
Markus Strohmaier

Abstract. Network analysis provides powerful tools to learn about a variety of social systems. However, most analyses implicitly assume that the relational data under consideration is error-free and reliable, and accurately reflects the system to be analysed. Especially if the network consists of multiple groups (e.g., genders, races), this assumption conflicts with a range of systematic biases, measurement errors and other inaccuracies that are well documented in the literature. To investigate the effects of such errors, we introduce a framework for simulating systematic bias in attributed networks. Our framework enables us to model erroneous edge observations that are driven by external node attributes or by errors arising from the (hidden) network structure itself. We exemplify how systematic inaccuracies distort conclusions drawn from network analyses on the task of minority representation in degree-based rankings. By analysing synthetic and real networks with varying homophily levels and group sizes, we find that the effect of introducing systematic edge errors depends on both the type of edge error and the level of homophily in the system: in heterophilic networks, minority representation in rankings is very sensitive to the type of systematic edge error, whereas in homophilic networks minorities are at a disadvantage regardless of the type of error present. We thus conclude that the implications of systematic bias in edge data depend on an interplay between network topology and the type of systematic error. This emphasises the need for an error-model framework as developed here, which provides a first step towards studying the effects of systematic edge uncertainty on various network analysis tasks.
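To make the idea concrete, here is a minimal sketch (Python with networkx) of one attribute-driven edge error and its effect on minority representation in a degree-based ranking. The dropout probabilities, group sizes, and function names are illustrative assumptions, not the authors' implementation.

```python
import random
import networkx as nx

def drop_edges_by_attribute(G, p_min, p_maj, seed=0):
    """Systematic edge omission: edges incident to a minority node are
    dropped with probability p_min, all other edges with p_maj."""
    rng = random.Random(seed)
    H = G.copy()
    for u, v in list(G.edges()):
        p = p_min if G.nodes[u]["minority"] or G.nodes[v]["minority"] else p_maj
        if rng.random() < p:
            H.remove_edge(u, v)
    return H

def minority_share_in_top_k(G, k=10):
    """Fraction of minority nodes among the top-k nodes by degree."""
    top = sorted(G.nodes, key=G.degree, reverse=True)[:k]
    return sum(G.nodes[n]["minority"] for n in top) / k

# Toy homophilic network: a 20% minority block with dense in-group ties.
G = nx.stochastic_block_model([20, 80], [[0.20, 0.02], [0.02, 0.10]], seed=1)
for n in G.nodes:
    G.nodes[n]["minority"] = n < 20
H = drop_edges_by_attribute(G, p_min=0.4, p_maj=0.1)
print(minority_share_in_top_k(G), minority_share_in_top_k(H))
```

Comparing the two printed shares shows how an attribute-correlated error model shifts the ranking outcome relative to the error-free network.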

2021 ◽  
Vol 30 (4) ◽  
pp. 441-455
Author(s):  
Rinat Aynulin ◽  
Pavel Chebotarev

Proximity measures on graphs are extensively used for solving various problems in network analysis, including community detection. Previous studies have considered proximity measures mainly for networks without attributes. However, attribute information, node attributes in particular, allows a more in-depth exploration of the network structure. This paper extends the definition of a number of proximity measures to the case of attributed networks. To take node attributes into account, attribute similarity is embedded into the adjacency matrix. The resulting attribute-aware proximity measures are studied numerically in the context of community detection in real-world networks.
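The general recipe — embedding attribute similarity into the adjacency matrix before computing a proximity measure — can be sketched as follows. The cosine-similarity blend with weight alpha and the Katz-type walk kernel are illustrative choices of this sketch, not necessarily the paper's exact constructions.

```python
import numpy as np
import networkx as nx

def attribute_aware_adjacency(G, X, alpha=0.5):
    """Blend structure with attributes: a convex combination of the
    adjacency matrix and pairwise cosine similarity of node attribute
    vectors X (one row per node)."""
    A = nx.to_numpy_array(G)
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T                           # cosine similarity matrix
    return (1 - alpha) * A + alpha * S

def walk_proximity(A, t=0.05):
    """A Katz-type 'walk' proximity kernel (I - tA)^-1, valid while
    t stays below 1 / spectral_radius(A)."""
    return np.linalg.inv(np.eye(A.shape[0]) - t * A)

G = nx.karate_club_graph()
# Toy one-hot attributes built from the faction labels stored in the graph.
X = np.array([[1.0, 0.0] if G.nodes[n]["club"] == "Mr. Hi" else [0.0, 1.0]
              for n in G.nodes])
P = walk_proximity(attribute_aware_adjacency(G, X, alpha=0.3))
print(P.shape)  # (34, 34) attribute-aware proximity matrix, ready for clustering
```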


2014 ◽  
Vol 14 (23) ◽  
pp. 12897-12914 ◽  
Author(s):  
J. S. Wang ◽  
S. R. Kawa ◽  
J. Eluszkiewicz ◽  
D. F. Baker ◽  
M. Mountain ◽  
...  

Abstract. Top–down estimates of the spatiotemporal variations in emissions and uptake of CO2 will benefit from the increasing measurement density brought by recent and future additions to the suite of in situ and remote CO2 measurement platforms. In particular, the planned NASA Active Sensing of CO2 Emissions over Nights, Days, and Seasons (ASCENDS) satellite mission will provide greater coverage in cloudy regions, at high latitudes, and at night than passive satellite systems, as well as high precision and accuracy. In a novel approach to quantifying the ability of satellite column measurements to constrain CO2 fluxes, we use a portable library of footprints (surface influence functions) generated by the Stochastic Time-Inverted Lagrangian Transport (STILT) model in combination with the Weather Research and Forecasting (WRF) model in a regional Bayesian synthesis inversion. The regional Lagrangian particle dispersion model framework is well suited to make use of ASCENDS observations to constrain weekly fluxes in North America at a high resolution, in this case at 1° latitude × 1° longitude. We consider random measurement errors only, modeled as a function of the mission and instrument design specifications along with realistic atmospheric and surface conditions. We find that the ASCENDS observations could potentially reduce flux uncertainties substantially at biome and finer scales. At the grid scale and weekly resolution, the largest uncertainty reductions, on the order of 50%, occur where and when there is good coverage by observations with low measurement errors and the a priori uncertainties are large. Uncertainty reductions are smaller for a 1.57 μm candidate wavelength than for a 2.05 μm wavelength, and are smaller for the higher of the two measurement error levels that we consider (1.0 ppm vs. 0.5 ppm clear-sky error at Railroad Valley, Nevada). Uncertainty reductions at the annual biome scale range from ~40% to ~75% across our four instrument design cases and from ~65% to ~85% for the continent as a whole. Tests suggest that the quantitative results are moderately sensitive to assumptions regarding a priori uncertainties and boundary conditions. The a posteriori flux uncertainties we obtain, ranging from 0.01 to 0.06 Pg C yr−1 across the biomes, would meet requirements for improved understanding of long-term carbon sinks suggested by a previous study.
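The linear-algebra core of such a Bayesian synthesis inversion is compact enough to sketch. Dimensions, covariances, and the random Jacobian below are toy stand-ins for the STILT/WRF footprint library and the ASCENDS measurement error model.

```python
import numpy as np

rng = np.random.default_rng(0)

n_flux, n_obs = 50, 200               # toy sizes: flux grid cells, CO2 observations
H = rng.normal(size=(n_obs, n_flux))  # Jacobian: footprints / influence functions
S_prior = np.diag(np.full(n_flux, 1.0))   # a priori flux error covariance
R = np.diag(np.full(n_obs, 0.5 ** 2))     # measurement error covariance (0.5 ppm)

# Posterior covariance of the Bayesian synthesis inversion:
#   S_post = (H^T R^-1 H + S_prior^-1)^-1
S_post = np.linalg.inv(H.T @ np.linalg.inv(R) @ H + np.linalg.inv(S_prior))

# Per-element uncertainty reduction, as reported in such studies:
#   UR = 1 - sigma_post / sigma_prior
ur = 1.0 - np.sqrt(np.diag(S_post)) / np.sqrt(np.diag(S_prior))
print(f"mean uncertainty reduction: {ur.mean():.0%}")
```

The same quantity, aggregated over grid cells within a biome or over the continent, yields the biome- and continental-scale reductions quoted in the abstract.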


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0256696
Author(s):  
Anna Keuchenius ◽  
Petter Törnberg ◽  
Justus Uitermark

Despite the prevalence of disagreement between users on social media platforms, studies of online debates typically only look at positive online interactions, represented as networks with positive ties. In this paper, we hypothesize that the systematic neglect of conflict in such network analyses leads to misleading results on polarized debates. We introduce an approach to bring in negative user-to-user interaction, by analyzing online debates using signed networks with positive and negative ties. We apply this approach to the Dutch Twitter debate on ‘Black Pete’—an annual Dutch celebration with racist characteristics. Using a dataset of 430,000 tweets, we apply natural language processing and machine learning to identify: (i) users’ stance in the debate; and (ii) whether the interaction between users is positive (supportive) or negative (antagonistic). Comparing the resulting signed network with its unsigned counterpart, the retweet network, we find that traditional unsigned approaches distort debates by conflating conflict with indifference, and that the inclusion of negative ties changes and enriches our understanding of coalitions and division within the debate. Our analysis reveals that some groups are attacking each other, while others instead seem to be located in fragmented Twitter spaces. Our approach identifies new network positions of individuals that correspond to roles in the debate, such as leaders and scapegoats. These findings show that representing the polarity of user interactions as signs of ties in networks substantively changes the conclusions drawn from polarized social media activity, which has important implications for various fields studying online debates using network analysis.
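A minimal signed-network construction in Python/networkx might look like the following. In the actual pipeline the tie signs come from NLP classifiers over the 430,000 tweets; here they are hand-labelled toy values, and the function names are illustrative.

```python
import networkx as nx

def build_signed_network(interactions):
    """interactions: iterable of (source, target, polarity) with
    polarity +1 (supportive) or -1 (antagonistic)."""
    G = nx.DiGraph()
    for u, v, s in interactions:
        G.add_edge(u, v, sign=s)
    return G

def positive_subgraph(G):
    """The unsigned view: keep only supportive ties (retweet-style)."""
    return nx.DiGraph((u, v) for u, v, d in G.edges(data=True) if d["sign"] > 0)

# Toy data standing in for classifier output on tweet pairs.
ties = [("a", "b", 1), ("b", "a", 1), ("a", "c", -1), ("c", "a", -1), ("c", "d", 1)]
G = build_signed_network(ties)
print(G.number_of_edges(), positive_subgraph(G).number_of_edges())
# In the unsigned view, a and c appear disconnected (indifferent);
# the signed view shows them in open conflict.
```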


2021 ◽  
Vol 4 ◽  
Author(s):  
Monica Billio ◽  
Roberto Casarin ◽  
Michele Costola ◽  
Matteo Iacopini

Networks are a useful tool to describe relationships among financial firms, and network analysis has been used extensively in recent years to study financial connectedness. An often neglected aspect is that network observations come with errors from different sources, such as estimation and measurement errors; a proper statistical treatment of the data is therefore needed before network analysis can be performed. We show that node centrality measures can be heavily affected by random errors, and we propose a flexible model based on the matrix-variate t distribution together with a Bayesian inference procedure to de-noise the data. We provide an application to a network among European financial institutions.
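The problem the model addresses — centrality rankings degrading under observation error — is easy to demonstrate. The sketch below (plain numpy, additive Gaussian noise) only illustrates that sensitivity; it does not implement the paper's matrix-variate t model or its Bayesian de-noising procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def eigenvector_centrality(A, iters=200):
    """Power iteration for the leading eigenvector of a symmetric,
    nonnegative weight matrix A."""
    x = np.ones(A.shape[0])
    for _ in range(iters):
        x = A @ x
        x /= np.linalg.norm(x)
    return x

n = 30
B = rng.random((n, n))
A = (B + B.T) / 2                         # a toy "true" weighted network
E = rng.normal(scale=0.2, size=(n, n))
noisy = A + (E + E.T) / 2                 # symmetric observation error

rank = lambda x: np.argsort(np.argsort(x))
r = np.corrcoef(rank(eigenvector_centrality(A)),
                rank(eigenvector_centrality(noisy)))[0, 1]
print(f"rank correlation of centralities under noise: {r:.2f}")
```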


2017 ◽  
Vol 43 (11) ◽  
pp. 1566-1581 ◽  
Author(s):  
Ralf Wölfer ◽  
Eva Jaspers ◽  
Danielle Blaylock ◽  
Clarissa Wigoder ◽  
Joanne Hughes ◽  
...  

Traditionally, studies of intergroup contact have relied primarily on self-reports, which constitute a valid method for studying intergroup contact but have limitations, especially if researchers are interested in negative or extended contact. In three studies, we apply social network analysis to generate alternative contact parameters. Studies 1 and 2 examine self-reported and network-based parameters of positive and negative contact using cross-sectional datasets (N = 291, N = 258), indicating that both methods help explain intergroup relations. Study 3 examines positive and negative direct and extended contact using the previously validated network-based contact parameters in a large-scale, international, and longitudinal dataset (N = 12,988), demonstrating that positive and negative direct and extended contact all uniquely predict intergroup relations (i.e., intergroup attitudes and future outgroup contact). The findings highlight the value of social network analysis for examining the full complexity of contact, including positive and negative forms of direct and extended contact.
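Network-based contact parameters of the basic kind can be derived directly from an attributed friendship network, as in the Python/networkx sketch below. The operationalisation in the studies, and the handling of tie valence for negative contact, are richer than this; the function names and toy graph are assumptions for illustration.

```python
import networkx as nx

def direct_contact(G, node):
    """Direct contact: number of outgroup ties of `node`."""
    g = G.nodes[node]["group"]
    return sum(1 for nb in G[node] if G.nodes[nb]["group"] != g)

def extended_contact(G, node):
    """Extended contact: number of ingroup friends who themselves
    have at least one outgroup tie."""
    g = G.nodes[node]["group"]
    return sum(1 for nb in G[node]
               if G.nodes[nb]["group"] == g and direct_contact(G, nb) > 0)

G = nx.Graph([("a", "b"), ("b", "c"), ("a", "d")])
nx.set_node_attributes(G, {"a": 0, "b": 0, "c": 1, "d": 1}, "group")
print(direct_contact(G, "a"), extended_contact(G, "a"))  # 1 direct (d), 1 extended (b)
```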


1999 ◽  
Vol 89 (4) ◽  
pp. 989-1003 ◽  
Author(s):  
István Bondár ◽  
Robert G. North ◽  
Gregory Beall

Abstract The prototype International Data Center (PIDC) in Arlington, Virginia, has been developing and testing software and procedures for use in the verification of the Comprehensive Test Ban Treaty. After three years of operation with a global network of array and three-component stations, it has been possible to characterize various systematic biases of those stations that are designated in the Treaty as part of the International Monitoring System (IMS). These biases include deviations of azimuth and slowness measurements from predicted values, caused largely by lateral heterogeneity. For events recorded by few stations, azimuth and slowness are used in addition to arrival-time data for location by the PIDC. Corrections to teleseismic azimuth and slowness observations have been empirically determined for most IMS stations providing data to the PIDC. Application of these corrections is shown to improve signal association and event location. At some stations an overall systematic bias can be ascribed to local crustal structure or to unreported instrumental problems. The corrections have been applied in routine operation of the PIDC since February 1998.
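The correction scheme described — an empirical, per-station offset estimated from historical residuals and subtracted from new observations — reduces to a few lines. The station codes, residual values, and function names below are invented for illustration, not PIDC software.

```python
import numpy as np

def station_corrections(residuals):
    """residuals: dict mapping station -> list of (observed - predicted)
    azimuth residuals in degrees. The empirical correction is a robust
    central value (median) of each station's residual history."""
    return {sta: float(np.median(r)) for sta, r in residuals.items()}

def apply_correction(observed_azimuth, correction):
    """Corrected observation, wrapped back into [0, 360) degrees."""
    return (observed_azimuth - correction) % 360.0

# Toy residual histories for two stations with systematic biases.
hist = {"ARCES": [4.8, 5.3, 5.1, 4.6], "YKA": [-2.1, -1.8, -2.4]}
corr = station_corrections(hist)
print(apply_correction(132.0, corr["ARCES"]))
```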


2021 ◽  
pp. 53-76
Author(s):  
Marie J. E. Charpentier ◽  
Marie Pelé ◽  
Julien P. Renoult ◽  
Cédric Sueur

Sampling accurate and quantitative behavioural data requires describing fine-grained patterns of social relationships and/or spatial associations, which is highly challenging, especially in natural environments. Although behavioural ecologists have conducted systematic studies of animal societies since the nineteenth century, new biologging technologies have the potential to revolutionise the sampling of animals’ social relationships. However, the tremendous quantity of data sampled and the diversity of biologgers currently available (such as proximity loggers), which allow the sampling of a large array of biological and physiological data, bring new analytical challenges. The high spatiotemporal resolution of the data needed to study social processes, such as disease or information diffusion, requires new analytical tools, such as social network analysis, developed to handle large data sets. The quantity and quality of the data now available on a large array of social systems bring unprecedented outputs, consistently opening new and exciting research avenues.


2019 ◽  
Vol 24 (1) ◽  
pp. 5-21 ◽  
Author(s):  
Claudia Colicchia ◽  
Alessandro Creazza ◽  
Carlo Noè ◽  
Fernanda Strozzi

Purpose: The purpose of this paper is to identify and discuss the most important research areas on information sharing in supply chains and the related risks, taking into account their evolution over time. The paper sheds light on what is happening today and on the trajectories for the future, with particular respect to the implications for supply chain management.
Design/methodology/approach: The dynamic literature review method called Systematic Literature Network Analysis (SLNA) was adopted. It combines the Systematic Literature Review approach with bibliographic network analyses, and it relies on objective measures and algorithms to perform quantitative, literature-based detection of emerging topics.
Findings: The focus of the literature seems to be on threats internal to the extended supply chain rather than on external attacks, such as viruses, traditionally related to information technology (IT). The main emerging risk appears to be the intentional or unintentional leakage of information. Papers also analyse the implications for information sharing of “soft” factors such as trust and collaboration among supply chain partners. Opportunities are highlighted as well, including how information sharing can be leveraged to confront disruptions and increase resilience.
Research limitations/implications: The adopted methodology provides an original perspective on the investigated topic, namely how information sharing in supply chains and the related risks are evolving over time because of the turbulent advances in technology.
Practical implications: Emergent and highly critical risks related to information sharing are highlighted to support the design of supply chain risk strategies. Critical areas for the development of “beyond-the-dyad” initiatives to manage information sharing risks also emerge, and opportunities from information sharing that are less known and exploited by companies are identified.
Originality/value: This paper adopts the supply chain perspective rather than the traditional IT-based view of information sharing and, from this perspective, provides a dynamic representation of the literature on the investigated topic. This is an important contribution, as the topic of information sharing in supply chains is continuously evolving and shaping new supply chain models.
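One ingredient of an SLNA-style pipeline — clustering a bibliographic network to surface candidate research areas — can be sketched as follows. The keyword co-occurrence graph and greedy modularity clustering below are illustrative stand-ins for the citation-network algorithms the method actually combines; the sample keywords are invented.

```python
import itertools
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def keyword_cooccurrence_network(papers):
    """papers: list of keyword lists, one per publication. Edges link
    keywords that co-occur in a paper; weights count co-occurrences."""
    G = nx.Graph()
    for kws in papers:
        for a, b in itertools.combinations(sorted(set(kws)), 2):
            w = G.edges[a, b]["weight"] + 1 if G.has_edge(a, b) else 1
            G.add_edge(a, b, weight=w)
    return G

papers = [
    ["information sharing", "supply chain", "trust"],
    ["information sharing", "supply chain", "leakage"],
    ["leakage", "risk", "supply chain"],
    ["resilience", "disruption", "information sharing"],
]
G = keyword_cooccurrence_network(papers)
clusters = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in clusters])   # candidate research areas
```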

