Online Sampling of Temporal Networks

2021 ◽  
Vol 15 (4) ◽  
pp. 1-27
Author(s):  
Nesreen K. Ahmed ◽  
Nick Duffield ◽  
Ryan A. Rossi

Temporal networks representing a stream of timestamped edges are seemingly ubiquitous in the real world. However, the massive size and continuous nature of these networks make them fundamentally challenging to analyze and leverage for descriptive and predictive modeling tasks. In this work, we propose a general framework for temporal network sampling with unbiased estimation. We develop online, single-pass sampling algorithms and unbiased estimators for temporal network sampling. The proposed algorithms enable fast, accurate, and memory-efficient statistical estimation of temporal network patterns and properties. In addition, we propose a temporally decaying sampling algorithm with unbiased estimators for studying networks that evolve in continuous time, where the strength of links is a function of time, and the motif patterns are temporally weighted. In contrast to the prior notion of a Δt-temporal motif, the proposed formulation and algorithms for counting temporally weighted motifs are useful for forecasting tasks in networks such as predicting future links, or a future time-series variable of nodes and links. Finally, extensive experiments on a variety of temporal networks from different domains demonstrate the effectiveness of the proposed algorithms. A detailed ablation study is provided to understand the impact of the various components of the proposed framework.
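As a rough illustration of single-pass sampling with an unbiased estimator, the sketch below keeps each streamed edge independently with probability p and rescales the sampled, exponentially decayed link strengths by 1/p (a Horvitz-Thompson style estimate). This is not the algorithm proposed in the paper: the Bernoulli sampling scheme, the exponential decay, and the names `decayed_weight`, `sample_stream`, and `estimate_total_strength` are all assumptions made for illustration.

```python
import math
import random


def decayed_weight(t, now, decay):
    """Assumed link strength: exponential decay with the age of the edge."""
    return math.exp(-decay * (now - t))


def sample_stream(edges, p, seed=0):
    """Single pass over (u, v, t) edges; keep each one with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    rng = random.Random(seed)
    return [edge for edge in edges if rng.random() < p]


def estimate_total_strength(sample, p, now, decay):
    """Horvitz-Thompson estimate of the total decayed strength of the stream.

    Each sampled edge is weighted by 1/p, its inverse inclusion probability,
    so the expectation over the sampling equals the exact total.
    """
    return sum(decayed_weight(t, now, decay) for _, _, t in sample) / p


edges = [("a", "b", 0.0), ("b", "c", 1.0), ("a", "c", 2.0)]
sample = sample_stream(edges, p=0.5, seed=1)
print(estimate_total_strength(sample, p=0.5, now=2.0, decay=0.5))
```

With p = 1 the sample is the whole stream and the estimate coincides with the exact total; smaller p trades memory for variance.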

2019 ◽  
Vol 7 (1) ◽  
pp. 52-69 ◽  
Author(s):  
Petter Holme ◽  
Luis E. C. Rocha

Abstract We investigate the impact of misinformation about the contact structure on the ability to predict disease outbreaks. We base our study on 31 empirical temporal networks and tune the frequencies of errors in the node identities or time stamps of contacts. We find that for both these spreading scenarios, the maximal misprediction of both the outbreak size and time to extinction follows a stretched exponential convergence as a function of the error frequency. We furthermore determine the temporal-network structural factors influencing the parameters of this convergence.


2020 ◽  
Vol 26 (2) ◽  
pp. 113-129
Author(s):  
Hamza M. Ruzayqat ◽  
Ajay Jasra

Abstract In the following article, we consider the non-linear filtering problem in continuous time and in particular the solution to Zakai’s equation or the normalizing constant. We develop a methodology to produce finite variance, almost surely unbiased estimators of the solution to Zakai’s equation. That is, given access to only a first-order discretization of the solution to the Zakai equation, we present a method which can remove this discretization bias. The approach, under assumptions, is proved to have finite variance and is numerically compared to using a particular multilevel Monte Carlo method.


2019 ◽  
Author(s):  
Aaron P. Ragsdale ◽  
Simon Gravel

Abstract Linkage disequilibrium is used to infer evolutionary history and to identify regions under selection or associated with a given trait. In each case, we require accurate estimates of linkage disequilibrium from sequencing data. Unphased data presents a challenge because the co-occurrence of alleles at different loci is ambiguous. Commonly used estimators for the common statistics r² and D² exhibit large and variable upward biases that complicate interpretation and comparison across cohorts. Here, we show how to find unbiased estimators for a wide range of two-locus statistics, including D², for both single and multiple randomly mating populations. These provide accurate estimates over three orders of magnitude in LD. We also use these estimators to construct an estimator for r² that is less biased than commonly used estimators, but nevertheless argue for using rather than r² for population size estimates.


Author(s):  
Kritika Jain ◽  
Ankit Garg ◽  
Somya Jain

In today's competitive world, organizations take advantage of widely available data to promote their products and increase their revenue. This is achieved by identifying readers' preferences for news genres and the patterns in the news spread network. Spreading news over the internet is a continuous process that eventually triggers the evolution of temporal networks. Such a temporal network comprises nodes and edges, where each node corresponds to a published article and similar articles are connected via edges. The main focus of this article is to reconstruct a susceptible-infected (SI) diffusion model to discover the spreading pattern of news articles for virality detection. For experimental analysis, a dataset of news articles from four domains (business, technology, entertainment, and health) is considered, and the articles' rates of diffusion are inferred and compared. This will help to build a recommendation system, i.e., one that recommends a particular domain for advertisement and marketing. Hence, it will help build strategies for effective product endorsement and sustainable profitability.
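A susceptible-infected (SI) process on a temporal network can be sketched in a few lines. The version below is a generic illustration, not the model reconstructed in the article: it assumes undirected timestamped contacts given as (u, v, t) tuples, a single seed node, and a per-contact transmission probability `beta`; the function name `si_spread` is hypothetical.

```python
import random


def si_spread(contacts, seed_node, beta=1.0, rng=None):
    """Run an SI process over timestamped contacts (u, v, t).

    Returns a dict mapping each infected node to its infection time.
    A node can only transmit through contacts that occur strictly after
    it became infected, so spreading respects the order of time.
    """
    rng = rng or random.Random(0)
    infected_at = {seed_node: float("-inf")}  # the seed is infected from the start
    for u, v, t in sorted(contacts, key=lambda c: c[2]):
        for src, dst in ((u, v), (v, u)):  # contacts are treated as undirected
            if (src in infected_at and dst not in infected_at
                    and infected_at[src] < t and rng.random() < beta):
                infected_at[dst] = t
    return infected_at


contacts = [("a", "b", 1), ("b", "c", 2), ("c", "d", 1)]
print(si_spread(contacts, "a"))
```

In this example node d is never infected: its only contact with c happens at time 1, before c becomes infected at time 2. Comparing how quickly the infected set grows for different seeds or subsets of nodes gives one simple notion of a rate of diffusion.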


Algorithms ◽  
2019 ◽  
Vol 12 (10) ◽  
pp. 211 ◽  
Author(s):  
Pierluigi Crescenzi ◽  
Clémence Magnien ◽  
Andrea Marino

Temporal networks are graphs in which edges have temporal labels, specifying their starting times and their traversal times. Several notions of distances between two nodes in a temporal network can be analyzed, by referring, for example, to the earliest arrival time or to the latest starting time of a temporal path connecting the two nodes. In this paper, we mostly refer to the notion of temporal reachability by using the earliest arrival time. In particular, we first show how the sketch approach, which has already been used in the case of classical graphs, can be applied to the case of temporal networks in order to approximately compute the sizes of the temporal cones of a temporal network. By making use of this approach, we subsequently show how we can approximate the temporal neighborhood function (that is, the number of pairs of nodes reachable from one another in a given time interval) of large temporal networks in a few seconds. Finally, we apply our algorithm in order to analyze and compare the behavior of 25 public transportation temporal networks. Our results can be easily adapted to the case in which we want to refer to the notion of distance based on the latest starting time.
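The earliest-arrival notion of temporal reachability can be made concrete with a small exact computation. The sketch below is not the sketch-based approximation described in the paper: it assumes directed edges given as (u, v, start, duration) tuples with strictly positive durations, and scans them once in order of starting time; the function name `earliest_arrival` is hypothetical.

```python
import math


def earliest_arrival(edges, source, t_start=0):
    """Earliest arrival time from `source` to every reachable node.

    `edges` holds (u, v, start, duration) tuples with duration > 0.
    An edge is usable only if we are already at u when it departs.
    """
    arrival = {source: t_start}
    for u, v, start, duration in sorted(edges, key=lambda e: e[2]):
        if arrival.get(u, math.inf) <= start:
            candidate = start + duration
            if candidate < arrival.get(v, math.inf):
                arrival[v] = candidate
    return arrival


edges = [
    ("a", "b", 1, 1),
    ("b", "c", 2, 1),
    ("a", "c", 0, 5),
    ("c", "d", 2, 1),
]
print(earliest_arrival(edges, "a"))
```

Here c is reached at time 3 via b rather than at time 5 via the direct edge, and d is unreachable because the only edge into it departs from c at time 2, before c can be reached. The nodes in the returned dictionary are those reachable from the source, and their count gives the exact size that a sketch-based method would approximate.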


2019 ◽  
Vol 35 (18) ◽  
pp. 3527-3529 ◽  
Author(s):  
David Aparício ◽  
Pedro Ribeiro ◽  
Tijana Milenković ◽  
Fernando Silva

Abstract Motivation: Network alignment (NA) finds conserved regions between two networks. NA methods optimize node conservation (NC) and edge conservation. Dynamic graphlet degree vectors are a state-of-the-art dynamic NC measure, used within the fastest and most accurate NA method for temporal networks: DynaWAVE. Here, we use graphlet-orbit transitions (GoTs), a different graphlet-based measure of temporal node similarity, as a new dynamic NC measure within DynaWAVE, resulting in GoT-WAVE. Results: On synthetic networks, GoT-WAVE improves DynaWAVE’s accuracy by 30% and speed by 64%. On real networks, when optimizing only dynamic NC, the methods are complementary. Furthermore, only GoT-WAVE supports directed edges. Hence, GoT-WAVE is a promising new temporal NA algorithm, which efficiently optimizes dynamic NC. We provide a user-friendly user interface and source code for GoT-WAVE. Availability and implementation: http://www.dcc.fc.up.pt/got-wave/ Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 220 (3) ◽  
pp. 1845-1856 ◽  
Author(s):  
W Marzocchi ◽  
I Spassiani ◽  
A Stallone ◽  
M Taroni

SUMMARY An unbiased estimation of the b-value and of its variability is essential to verify empirically its physical contribution to the earthquake generation process, and the capability to improve earthquake forecasting and seismic hazard. Notwithstanding the vast literature on the b-value estimation, we note that some potential sources of bias that may lead to non-physical b-value variations are too often ignored in seismological common practice. The aim of this paper is to discuss some of them in detail, when the b-value is estimated through the popular Aki’s formula. Specifically, we describe how a finite data set can lead to biased evaluations of the b-value and its uncertainty, which are caused by the correlation between the b-value and the maximum magnitude of the data set; we quantify analytically the bias on the b-value caused by the magnitude binning; we show how departures from the exponential distribution of the magnitude, caused by a truncated Gutenberg–Richter law and by catalogue incompleteness, can affect the b-value estimation and the search for statistically significant variations; we derive explicitly the statistical distribution of the magnitude affected by random symmetrical error, showing that the magnitude error does not induce any further significant bias, at least for reasonable amplitude of the measurement error. Finally, we provide some recipes to minimize the impact of these potential sources of bias.
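For reference, Aki's maximum-likelihood formula estimates the b-value as log10(e) divided by the difference between the mean magnitude and the completeness magnitude, with uncertainty b/sqrt(N). A common correction for binned magnitudes lowers the completeness magnitude by half the bin width. The sketch below implements only this textbook form, not the bias analysis of the paper; the function name `aki_b_value` and its parameters are hypothetical.

```python
import math


def aki_b_value(magnitudes, m_c, bin_width=0.0):
    """Aki's estimate of the b-value and its uncertainty.

    magnitudes: catalogue magnitudes
    m_c:        completeness magnitude; smaller events are discarded
    bin_width:  magnitude binning; 0.0 means continuous magnitudes
    """
    mags = [m for m in magnitudes if m >= m_c - 1e-9]  # tolerate rounding
    if len(mags) < 2:
        raise ValueError("need at least two events at or above m_c")
    mean_m = sum(mags) / len(mags)
    denom = mean_m - (m_c - bin_width / 2.0)  # half-bin correction
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected m_c")
    b = math.log10(math.e) / denom
    sigma_b = b / math.sqrt(len(mags))  # Aki's uncertainty estimate
    return b, sigma_b


b, sigma_b = aki_b_value([2.0, 2.5, 3.0, 2.5], m_c=2.0)
print(b, sigma_b)
```

The toy catalogue has only four events, far too few for a meaningful estimate, which is precisely the finite-sample regime where the biases discussed in the paper become important.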


2017 ◽  
Vol 5 (1) ◽  
pp. 1-21
Author(s):  
Steven K Thompson

Abstract In this paper, I discuss some of the wider uses of adaptive and network sampling designs. Three uses of sampling designs are to select units from a population to make inferences about population values, to select units to use in an experiment, and to distribute interventions to benefit a population. The most useful approaches for inference from adaptively selected samples are design-based methods and Bayesian methods. Adaptive link-tracing network sampling methods are important for sampling populations that are otherwise hard to reach. Sampling in changing populations involves temporal network or spatial sampling design processes with units selected both into and out of the sample over time. Averaging or smoothing fast-moving versions of these designs provides simple estimates of network-related characteristics. The effectiveness of intervention programs to benefit populations depends a great deal on the sampling and assignment designs used in spreading the intervention.


2021 ◽  
Vol 23 ◽  
Author(s):  
Caijun Qin

This paper proposes a novel, exploration-based network sampling algorithm called caterpillar quota walk sampling (CQWS) inspired by the caterpillar tree. Network sampling identifies a subset of nodes and edges from a network, creating an induced graph. Beginning from an initial node, exploration-based sampling algorithms grow the induced set by traversing and tracking unvisited neighboring nodes from the original network. Tunable and trainable parameters allow CQWS to maximize the sum of the degrees of the induced graph from multiple trials when sampling dense networks. A network spread model is useful in various applications, including tracking the spread of epidemics, visualizing information transmission through social media, and modeling the cell-to-cell spread of neurodegenerative diseases. CQWS generates a spread model as its sample by visiting the highest-degree neighbors of previously visited nodes. For each previously visited node, a top proportion of the highest-degree neighbors fulfills a quota and branches into a new caterpillar tree. Sampling more high-degree nodes is an objective in various applications. Many exploration-based sampling algorithms suffer drawbacks that limit the sum of degrees of visited nodes and thus the number of high-degree nodes visited. Furthermore, a strategy may not be adaptable to volatile degree frequencies throughout the original network architecture, which influences how deep into the original network an algorithm can sample. This paper analyzes CQWS in comparison to four other exploration-based network sampling algorithms in tackling these two problems by sampling sparse and dense randomly generated networks.


2019 ◽  
Vol 22 (03) ◽  
pp. 1950006
Author(s):  
ANDREW MELLOR

Recent advances in data collection and storage have allowed both researchers and industry alike to collect data in real time. Much of this data comes in the form of ‘events’, or timestamped interactions, such as email and social media posts, website clickstreams, or protein–protein interactions. This type of data poses new challenges for modeling, especially if we wish to preserve all temporal features and structure. We highlight several recent approaches in modeling higher-order temporal interaction and bring them together under the umbrella of event graphs. Through examples, we demonstrate how event graphs can be used to understand the higher-order topological-temporal structure of temporal networks and capture properties of the network that are unobservable when considering a static (or time-aggregated) model. We introduce new algorithms for temporal motif enumeration and provide a novel analysis of the communicability centrality for temporal networks. Furthermore, we show that by modeling a temporal network as an event graph our analysis extends easily to non-dyadic interactions, known as hyper-events.

