Toward cost-efficient sampling methods

2015 ◽  
Vol 26 (05) ◽  
pp. 1550050 ◽  
Author(s):  
Peng Luo ◽  
Yongli Li ◽  
Chong Wu ◽  
Guijie Zhang

Sampling methods have received much attention in the field of complex networks in general and statistical physics in particular. This paper proposes two new sampling methods based on the idea that a small fraction of vertices with high node degree can carry most of the structural information of a complex network. The two proposed methods sample high-degree nodes efficiently, so they remain useful even when the sampling rate is low, which makes them cost-efficient. The first method builds on the widely used stratified random sampling (SRS) method, and the second improves the well-known snowball sampling (SBS) method. To demonstrate the validity and accuracy of the two new methods, we compare them with existing sampling methods on three commonly used simulated networks (scale-free, random, and small-world) and on two real networks. The experimental results show that the two proposed methods perform much better than the existing methods at recovering the true structural characteristics of the network, as reflected by the clustering coefficient, Bonacich centrality, and average path length, especially when the sampling rate is low.
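The abstract does not spell out either algorithm, so the following is only an illustrative sketch of the general idea of steering a snowball-style sample toward high-degree nodes. The function name `degree_biased_snowball`, the toy graph, and the rule "always expand the highest-degree frontier node next" are assumptions made for this example, not the authors' method.

```python
import heapq


def degree_biased_snowball(adj, seed, budget):
    """Grow a sample from `seed`, always adding the highest-degree frontier node next.

    adj    -- dict mapping each node to a list of its neighbors (undirected graph)
    seed   -- starting node
    budget -- maximum number of nodes to sample
    """
    sampled = []
    seen = {seed}
    # heapq is a min-heap, so degrees are negated to pop the largest degree first.
    frontier = [(-len(adj[seed]), seed)]
    while frontier and len(sampled) < budget:
        _, node = heapq.heappop(frontier)
        sampled.append(node)
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                heapq.heappush(frontier, (-len(adj[nb]), nb))
    return sampled


# Hypothetical toy graph: node 0 is a hub, node 4 has degree 3, the rest are leaves.
adj = {
    0: [1, 2, 3, 4],
    1: [0],
    2: [0],
    3: [0],
    4: [0, 5, 6],
    5: [4],
    6: [4],
}
sample = degree_biased_snowball(adj, seed=1, budget=3)
print(sample)
```

Starting from a leaf, the sketch reaches the hub and then the next-best-connected node before spending any budget on other leaves, which is the behaviour the abstract describes as useful at low sampling rates.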

2014 ◽  
Vol 25 (05) ◽  
pp. 1440007 ◽  
Author(s):  
Qi Gao ◽  
Xintong Ding ◽  
Feng Pan ◽  
Weixing Li

Subnet sampling is an important topic in complex network research. The sampling method influences the structure and characteristics of the sampled subnet. This paper proposes random multiple snowball with Cohen (RMSC) process sampling, which combines the advantages of random sampling and snowball sampling. It can explore global information and discover local structure at the same time. The experiments indicate that this sampling method preserves the similarity between the sampled subnet and the original network in degree distribution, connectivity rate, and average shortest path. The method is applicable when prior knowledge about the degree distribution of the original network is insufficient.
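As a rough illustration of combining random sampling with snowball sampling, the sketch below picks several random seed nodes and then expands each by a fixed number of snowball waves. It is a generic multi-seed snowball, not the RMSC procedure itself (the Cohen process step is not described in the abstract and is not modelled here); the function name and the path-graph data are hypothetical.

```python
import random


def multi_seed_snowball(adj, n_seeds, waves, rng):
    """Pick n_seeds random start nodes, then add all unseen neighbors for `waves` rounds."""
    seeds = rng.sample(sorted(adj), n_seeds)  # random component: global coverage
    sampled = set(seeds)
    frontier = set(seeds)
    for _ in range(waves):  # snowball component: local structure
        next_frontier = set()
        for node in frontier:
            for nb in adj[node]:
                if nb not in sampled:
                    sampled.add(nb)
                    next_frontier.add(nb)
        frontier = next_frontier
    return seeds, sampled


# Hypothetical path graph 0-1-2-...-9.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
seeds, sampled = multi_seed_snowball(adj, n_seeds=2, waves=1, rng=random.Random(0))
print(seeds, sorted(sampled))
```

Passing an explicit `random.Random` instance keeps the sketch reproducible; with one wave, every sampled node is either a seed or a direct neighbor of one.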


Entropy ◽  
2020 ◽  
Vol 22 (8) ◽  
pp. 904
Author(s):  
Aldo Ramirez-Arellano

A complex network as an abstraction of a language system has attracted much attention during the last decade. Linguistic typological research using quantitative measures is a current research topic based on the complex network approach. This research aims at showing the node degree, betweenness, shortest path length, clustering coefficient, and nearest neighbourhoods’ degree, as well as more complex measures such as the fractal dimension, the complexity of a given network, the Area Under Box-covering, and the Area Under the Robustness Curve. The literary works of Mexican writers were classified according to their genre. Precisely 87% of the full word co-occurrence networks were classified as fractal. Also, empirical evidence is presented that supports the conjecture that lemmatisation of the original text is a renormalisation process of the networks that preserves their fractal property and reveals stylistic attributes by genre.


Author(s):  
Kryztopher D. Tung ◽  
Yoko E. Fukumura ◽  
Nancy A. Baker ◽  
Jane L. Forrest ◽  
Shawn C. Roll

Introduction: The Rapid Upper Limb Assessment (RULA) is an ergonomic assessment tool used to screen for risk of musculoskeletal injury due to working posture. The RULA is traditionally applied once during a work task to approximate overall risk. No method exists for estimating a RULA score for work requiring frequent shifts in posture across an extended period of time. Purpose: The goal of this study was to identify an optimal sampling method for applying the RULA across a long time-period that accurately represents overall risk. Methods: Four right-handed female dental hygiene students were video recorded from three angles while performing hand scaling during patient clinic visits (88.97 minutes on average). RULA was continuously scored across the entire session, updating the score when a significant postural shift lasting for more than 15 seconds occurred. A time-weighted average (TWA) RULA score was calculated. Three sampling methods were evaluated: equivalent interval samples, random samples, and random samples selection weighted within “clock positions.” Each method was compared to the TWA using a paired samples t-test and percent difference. Results: TWA RULA across the four students ranged from 3.4 to 4.3. Preliminary sampling averages using 10 samples were all within 0.2 of the TWA. Further iterations evaluating various sample sizes are ongoing. Discussion: Preliminary results suggest that all three sampling methods provide a reasonably accurate approximation of the TWA score at the sampling rate tested. Future iterations of this analysis will aim to identify the minimum required sampling rate to meet our TWA criterion.
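To make the comparison concrete, here is a minimal sketch of a time-weighted average and of one of the sampling schemes (equal-interval samples). The segment data are invented for illustration and are not the study's recordings; the function names and the choice to sample at interval midpoints are also assumptions.

```python
def time_weighted_average(segments):
    """segments: list of (score, duration_seconds); returns sum(score*duration)/sum(duration)."""
    total = sum(duration for _, duration in segments)
    return sum(score * duration for score, duration in segments) / total


def equal_interval_sample_mean(segments, n_samples):
    """Mean of the score in effect at n_samples evenly spaced time points.

    Sample points are taken at the midpoint of each of n_samples equal intervals.
    """
    total = sum(duration for _, duration in segments)
    step = total / n_samples
    scores = []
    for k in range(n_samples):
        t = (k + 0.5) * step
        elapsed = 0.0
        for score, duration in segments:
            elapsed += duration
            if t < elapsed:  # t falls inside this posture segment
                scores.append(score)
                break
    return sum(scores) / len(scores)


# Hypothetical session: score 3 for 600 s, score 5 for 300 s, score 4 for 900 s.
segments = [(3, 600), (5, 300), (4, 900)]
twa = time_weighted_average(segments)
sampled = equal_interval_sample_mean(segments, n_samples=6)
print(round(twa, 3), round(sampled, 3))
```

In this toy case six evenly spaced samples reproduce the TWA exactly because the sample spacing divides every segment length; with real posture data the two values would generally differ, which is the gap the study quantifies.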


Author(s):  
Georgiy Bobashev ◽  
R. Joey Morris ◽  
Elizabeth Costenbader ◽  
Kyle Vincent

Using data from an enumerated network of worldwide flight connections between airports, we examine how sampling designs and sample size influence network metrics. Specifically, we apply three types of sampling designs: simple random sampling, nonrandom strategic sampling (i.e., selection of the largest airports), and a variation of snowball sampling. For the latter sampling method, we design what we refer to as a controlled snowball sampling design, which selects nodes in a manner analogous to a respondent-driven sampling design. For each design, we evaluate five commonly used measures of network structure and examine the percentage of total air traffic accounted for by each design. The empirical application shows that (1) the random and controlled snowball sampling designs give rise to more efficient estimates of the true underlying structure, and (2) the strategic sampling method can account for a greater proportion of the total number of passenger movements occurring in the network.


2019 ◽  
Vol 11 (2) ◽  
pp. 523 ◽  
Author(s):  
Wei Yu ◽  
Jun Chen ◽  
Xingchen Yan

Many cities in China have opened subways, which have become an important part of urban public transport. How metro lines form a metro network, and how that network then changes urban traffic patterns, is a problem worthy of attention. From 2005 to 2018, 10 metro lines were opened in Nanjing, providing important reference data for studying the spatial and temporal evolution of the metro network. In this study, using the complex network method, space L and space P models are established according to the opening sequence of the 10 metro lines in Nanjing. To evaluate the evolution of the metro network, four parameters are proposed: network density, network centrality, network clustering coefficient, and network average distance. To examine changes in the spatial structure of the metro network, this study applies the concept of node degree from complex networks, analyzes the starting points, terminal points, and intersection points of metro lines, and puts forward the concepts of star structure and ring structure. The analysis of the space-time evolution of the Nanjing metro network shows that, as metro lines gradually open, the network takes on a more complex structure; line connections concentrate on important nodes and gradually outline the city's commercial space pattern.
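For readers unfamiliar with the two representations, the sketch below builds both from a list of lines, using the usual definitions: in space L two stations are linked if they are consecutive stops on some line, and in space P they are linked if they lie on the same line at all (reachable without a transfer). The line and station names are hypothetical, not Nanjing data, and the code assumes no station appears twice on one line.

```python
from itertools import combinations


def space_l_edges(lines):
    """Edges between consecutive stations on each line."""
    edges = set()
    for stops in lines.values():
        for a, b in zip(stops, stops[1:]):
            edges.add(frozenset((a, b)))
    return edges


def space_p_edges(lines):
    """Edges between every pair of stations that share a line."""
    edges = set()
    for stops in lines.values():
        for a, b in combinations(stops, 2):
            edges.add(frozenset((a, b)))
    return edges


def density(n_nodes, n_edges):
    """Undirected network density: 2E / (N (N - 1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# Hypothetical two-line network that intersects at station C.
lines = {"Line 1": ["A", "B", "C", "D"], "Line 2": ["E", "C", "F"]}
stations = {s for stops in lines.values() for s in stops}
l_edges = space_l_edges(lines)
p_edges = space_p_edges(lines)
print(len(l_edges), density(len(stations), len(l_edges)))
print(len(p_edges), density(len(stations), len(p_edges)))
```

Adding lines one at a time to `lines` and recomputing the edge sets gives a simple way to follow how density and similar parameters change with the opening sequence.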


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Yu Wei ◽  
Sun Ning

In recent years, many researchers have applied complex network theory to urban public transport networks, constructing complex networks and analyzing their performance. The original analysis method generally uses the Space L and Space R models to establish a simple link between public sites but ignores the organic link between the overall network system and the line subsystem. As an important part of the urban public transport system, the subway plays an important role in alleviating traffic pressure. In this paper, a supernetwork model of the Nanjing metro network is established using the supernetwork method. Three parameters, node-hyperedge degree, hyperedge-node degree, and hyperedge degree, are proposed to describe the model. The model is compared with the traditional Space L and Space P models. The study of the supernetwork model of the Nanjing metro network shows that the network density, network centrality, and network clustering coefficient are large, and the average network distance is small, which meets the requirements of traffic planning and design. In this study, each subway line is treated as a subsystem and further simplified to a node, so that complex network analysis can be applied to the new supernetwork model, broadening the scope of complex network research.


2014 ◽  
Vol 672-674 ◽  
pp. 2173-2177
Author(s):  
Yang Yang He ◽  
Ling Wang

The international coal trade data for 1996 to 2011 published by UN COMTRADE (UNSD) mainly cover international trade in raw coal and related coal products. Using complex network analysis, this paper characterizes the international coal trade network in terms of its density, node degree, centrality, point strength, and clustering coefficient. Based on these properties, the paper further analyzes how the international trade networks for raw coal, coal briquettes, and ovate coal evolved over these 16 years, as well as the differences between the periods before and after the financial crisis.


2021 ◽  
Author(s):  
Mayuri Gadhawe ◽  
Ravi Kumar Guntu ◽  
Ankit Agarwal

Complex network science is a relatively young, multidisciplinary field that aims to unravel the spatiotemporal interactions in natural processes. Though network theory has become a very important paradigm in many fields, its applications in hydrology are still at an emerging stage. In this study, we employed the Pearson correlation coefficient and the Spearman correlation coefficient as similarity measures with varying threshold ranges to construct the precipitation network of the Ganga River Basin (GRB). A ground-based observed dataset (IMD) and a satellite precipitation product (TRMM) are used. Different network properties such as node degree, degree distribution, clustering coefficient, and architecture were computed on each resultant precipitation network of the GRB. We also ranked influential grid points in the precipitation network by using weighted degree betweenness to identify the importance of each grid station in the network. Our results reveal that the choice of correlation method does not significantly affect the network measures and reconfirm that the thresholds significantly influence network construction and network properties for both datasets. The clustering coefficient decreases from the center of the basin to its boundary, and the degree shows the inverse pattern. In addition, there is a positive correlation between the average neighbor degree and node degree. We also analyzed the architecture of the precipitation networks and found that the network exhibits small-world and random-network behavior. Our results also indicated that both products have similar network measures and showed similar kinds of spatial patterns.
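The construction step described above can be sketched as follows: compute the pairwise Pearson correlation between grid-point series and link two grid points when the correlation reaches a threshold. The three short series are invented for illustration, and applying the threshold to the signed correlation (rather than its absolute value) is an assumption, since the abstract does not say which was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_network(series, threshold):
    """Link two grid points when their correlation is >= threshold (assumed rule)."""
    names = sorted(series)
    edges = set()
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if pearson(series[u], series[v]) >= threshold:
                edges.add((u, v))
    return edges


def node_degrees(series, edges):
    """Number of links attached to each grid point."""
    degree = {name: 0 for name in series}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


# Hypothetical precipitation series at three grid points.
series = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 11],
    "g3": [5, 3, 4, 1, 2],
}
edges = correlation_network(series, threshold=0.9)
degree = node_degrees(series, edges)
print(edges, degree)
```

Re-running `correlation_network` over a range of thresholds shows directly how the edge set, and therefore every downstream network measure, depends on that choice.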

