Comparing alternatives to the fixed degree sequence model for extracting the backbone of bipartite projections

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Zachary P. Neal ◽  
Rachel Domagalski ◽  
Bruce Sagan

Abstract Projections of bipartite or two-mode networks capture co-occurrences, and are used in diverse fields (e.g., ecology, economics, bibliometrics, politics) to represent unipartite networks. A key challenge in analyzing such networks is determining whether an observed number of co-occurrences between two nodes is significant, and therefore whether an edge exists between them. One approach, the fixed degree sequence model (FDSM), evaluates the significance of an edge’s weight by comparison to a null model in which the degree sequences of the original bipartite network are fixed. Although the FDSM is an intuitive null model, it is computationally expensive because it requires Monte Carlo simulation to estimate each edge’s p value, and therefore is impractical for large projections. In this paper, we explore four potential alternatives to FDSM: fixed fill model, fixed row model, fixed column model, and stochastic degree sequence model (SDSM). We compare these models to FDSM in terms of accuracy, speed, statistical power, similarity, and ability to recover known communities. We find that the computationally fast SDSM offers a statistically conservative but close approximation of the computationally impractical FDSM under a wide range of conditions, and that it correctly recovers a known community structure even when the signal is weak. Therefore, although each backbone model may have particular applications, we recommend SDSM for extracting the backbone of bipartite projections when FDSM is impractical.
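The FDSM procedure described above can be sketched in a few lines. This is an illustrative implementation only, not the authors' code: it randomizes the bipartite matrix with checkerboard (2×2) swaps, which preserve both degree sequences, and estimates an upper-tail p value for each projection edge weight; swap counts and sample sizes are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def checkerboard_swap(B, rng, attempts=2000):
    """Randomize a binary matrix while fixing its row and column sums
    (i.e., sample from the fixed degree sequence null model)."""
    B = B.copy()
    n, m = B.shape
    for _ in range(attempts):
        r = rng.choice(n, 2, replace=False)
        c = rng.choice(m, 2, replace=False)
        sub = B[np.ix_(r, c)]
        # Swap only 2x2 "checkerboards"; this preserves all margins.
        if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
            B[np.ix_(r, c)] = 1 - sub
    return B

def fdsm_pvalues(B, n_samples=500, rng=rng):
    """Monte Carlo upper-tail p value for each projection edge weight.
    (Diagonal entries are just row degrees and can be ignored.)"""
    obs = B @ B.T                      # observed co-occurrence counts
    exceed = np.zeros_like(obs, dtype=float)
    for _ in range(n_samples):
        Bs = checkerboard_swap(B, rng)
        exceed += (Bs @ Bs.T) >= obs
    return exceed / n_samples
```

The per-edge cost of this loop (many randomizations, each requiring many swaps) is exactly why the paper seeks faster alternatives such as SDSM.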

10.37236/3414 ◽  
2013 ◽  
Vol 20 (4) ◽  
Author(s):  
Sarah Behrens ◽  
Catherine Erbes ◽  
Michael Ferrara ◽  
Stephen G. Hartke ◽  
Benjamin Reiniger ◽  
...  

A sequence of nonnegative integers is $k$-graphic if it is the degree sequence of a $k$-uniform hypergraph. The only known characterization of $k$-graphic sequences is due to Dewdney in 1975. As this characterization does not yield an efficient algorithm, it is a fundamental open question to determine a more practical characterization. While several necessary conditions appear in the literature, there are few conditions that imply a sequence is $k$-graphic. In light of this, we present sharp sufficient conditions for $k$-graphicality based on a sequence's length and degree sum. Kocay and Li gave a family of edge exchanges (an extension of 2-switches) that could be used to transform one realization of a 3-graphic sequence into any other realization. We extend their result to $k$-graphic sequences for all $k \geq 3$. Finally we give several applications of edge exchanges in hypergraphs, including generalizing a result of Busch et al. on packing graphic sequences.
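The 2-switches that these edge exchanges generalize are easy to state concretely. The sketch below (an illustration of the $k=2$ case, not the paper's hypergraph exchanges) attempts one 2-switch on a simple graph: replace edges $\{a,b\},\{c,d\}$ with $\{a,d\},\{c,b\}$, which leaves every vertex degree unchanged.

```python
import random

def two_switch(edges, rng=random):
    """Attempt one 2-switch on an undirected simple graph, given as a
    list of edges. Degrees are preserved; the switch is skipped if it
    would create a loop or a multi-edge."""
    E = {frozenset(e) for e in edges}
    (a, b), (c, d) = rng.sample(sorted(tuple(sorted(e)) for e in E), 2)
    new1, new2 = frozenset((a, d)), frozenset((c, b))
    if a == d or c == b or new1 in E or new2 in E:
        return edges                    # switch would break simplicity
    E -= {frozenset((a, b)), frozenset((c, d))}
    E |= {new1, new2}
    return [tuple(sorted(e)) for e in E]
```

Repeated 2-switches connect the space of realizations of a graphic sequence; the Kocay–Li exchanges, and their extension here, play the analogous role for $k$-uniform hypergraphs.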


1998 ◽  
Vol 21 (2) ◽  
pp. 228-235 ◽  
Author(s):  
Siu L. Chow

Entertaining diverse assumptions about empirical research, commentators give a wide range of verdicts on the NHSTP defence in Statistical significance. The null-hypothesis significance-test procedure (NHSTP) is defended in a framework in which deductive and inductive rules are deployed in theory corroboration in the spirit of Popper's Conjectures and refutations (1968b). The defensible hypothetico-deductive structure of the framework is used to make explicit the distinctions between (1) substantive and statistical hypotheses, (2) statistical alternative and conceptual alternative hypotheses, and (3) making statistical decisions and drawing theoretical conclusions. These distinctions make it easier to show that (1) H0 can be true, (2) the effect size is irrelevant to theory corroboration, and (3) “strong” hypotheses make no difference to NHSTP. Reservations about statistical power, meta-analysis, and the Bayesian approach are still warranted.


2017 ◽  
Vol 5 (6) ◽  
pp. 839-857 ◽  
Author(s):  
Asma Azizi Boroojeni ◽  
Jeremy Dewar ◽  
Tong Wu ◽  
James M Hyman

Abstract We describe a class of new algorithms to construct bipartite networks that preserve a prescribed degree and joint-degree (degree–degree) distribution of the nodes. Bipartite networks are graphs that can represent real-world interactions between two disjoint sets, such as actor–movie networks, author–article networks, co-occurrence networks and heterosexual partnership networks. Often there is a strong correlation between the degree of a node and the degrees of the neighbours of that node that must be preserved when generating a network that reflects the structure of the underlying system. Our bipartite $2K$ ($B2K$) algorithms generate an ensemble of networks that preserve prescribed degree sequences for the two disjoint sets of nodes in the bipartite network, and the joint-degree distribution, that is, the distribution of the degrees of all neighbours of nodes with the same degree. We illustrate the effectiveness of the algorithms on a romance network using the NetworkX software environment to compare other properties of a target network that are not directly enforced by the $B2K$ algorithms. We observe that when the average degree of nodes is low, as is the case for romance and heterosexual partnership networks, the $B2K$ networks tend to preserve additional properties, such as the clustering coefficients, better than algorithms that do not preserve the joint-degree distribution of the original network.
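The quantity the $B2K$ algorithms must reproduce, beyond the two degree sequences, can be computed directly from an edge list. The helper below is an illustrative sketch (not the paper's implementation): for every edge $(u, v)$ it counts the pair (degree of $u$, degree of $v$).

```python
from collections import Counter

def bipartite_joint_degree(edges):
    """Joint-degree (degree-degree) distribution of a bipartite edge
    list: for each edge (u, v) with u in the top set and v in the
    bottom set, count the pair (deg(u), deg(v))."""
    dtop, dbot = Counter(), Counter()
    for u, v in edges:
        dtop[u] += 1
        dbot[v] += 1
    return Counter((dtop[u], dbot[v]) for u, v in edges)
```

A generator that matches only the degree sequences (e.g., a bipartite configuration model) will generally scramble these counts; matching them as well is what distinguishes a $2K$-type construction.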


2019 ◽  
Vol 85 (6) ◽  
Author(s):  
L. Hesslow ◽  
L. Unnerfelt ◽  
O. Vallhagen ◽  
O. Embreus ◽  
M. Hoppe ◽  
...  

Integrated modelling of electron runaway requires computationally expensive kinetic models that are self-consistently coupled to the evolution of the background plasma parameters. The computational expense can be reduced by using parameterized runaway generation rates rather than solving the full kinetic problem. However, currently available generation rates neglect several important effects; in particular, they are not valid in the presence of partially ionized impurities. In this work, we construct a multilayer neural network for the Dreicer runaway generation rate which is trained on data obtained from kinetic simulations performed for a wide range of plasma parameters and impurities. The neural network accurately reproduces the Dreicer runaway generation rate obtained by the kinetic solver. By implementing it in a fluid runaway-electron modelling tool, we show that the improved generation rates lead to significant differences in the self-consistent runaway dynamics as compared to the results using the previously available formulas for the runaway generation rate.
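The computational trade that motivates this surrogate can be sketched generically. The snippet below is a toy multilayer network with random, untrained weights; the layer sizes, inputs (electric field, temperature, impurity charge states, ...) and output scaling are placeholders standing in for the trained network described above.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative layer sizes only; the real surrogate is trained on
# kinetic-solver data over a wide range of plasma parameters.
sizes = [4, 16, 16, 1]
params = [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

def surrogate_rate(x, params=params):
    """Forward pass of a small MLP: a few matrix products and tanh
    activations, cheap enough to call inside every step of a fluid
    runaway-electron solver, unlike the full kinetic calculation."""
    h = np.asarray(x, dtype=float)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return float(h @ W + b)   # e.g., log of the generation rate
```

The point is the cost structure, not the numbers: once trained, evaluating the parameterized rate is a handful of dense matrix-vector products per call.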


2009 ◽  
Vol 18 (5) ◽  
pp. 775-801 ◽  
Author(s):  
MICHAEL KRIVELEVICH ◽  
BENNY SUDAKOV ◽  
DAN VILENCHIK

In this work we suggest a new model for generating random satisfiable k-CNF formulas. To generate such a formula, randomly permute all $2^k\binom{n}{k}$ possible clauses over the variables $x_1,\ldots,x_n$, and, starting from the empty formula, go over the clauses one by one, including each new clause if, after its addition, the formula remains satisfiable. We study the evolution of this process, namely the distribution over formulas obtained after scanning through the first m clauses (in the random permutation's order). Random processes conditioned on a certain property being respected are widely studied in the context of graph properties. This study was pioneered by Ruciński and Wormald in 1992 for graphs with a fixed degree sequence, and by Erdős, Suen and Winkler in 1995 for triangle-free and bipartite graphs. Since then many other graph properties have been studied, such as planarity and H-freeness. Thus our model is a natural extension of this approach to the satisfiability setting. Our main contribution is as follows. For m ≥ cn, with c = c(k) a sufficiently large constant, we are able to characterize the structure of the solution space of a typical formula in this distribution. Specifically, we show that typically all satisfying assignments are essentially concentrated in one cluster, and all but $e^{-\Omega(m/n)}n$ of the variables take the same value in all satisfying assignments. We also describe a polynomial-time algorithm that finds w.h.p. a satisfying assignment for such formulas.
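The generating process can be written down directly for tiny instances, using brute-force satisfiability checks in place of a real SAT solver. This is purely illustrative: the brute-force check is exponential in n, and the clause encoding (signed integers, as in DIMACS) is a convenience choice.

```python
import itertools, random

def satisfiable(clauses, n):
    """Brute-force SAT check over all 2^n assignments (tiny n only).
    A clause is a tuple of nonzero ints; literal l is true when
    variable |l| is assigned (l > 0)."""
    for bits in itertools.product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def grow_satisfiable_kcnf(n, k, m, seed=0):
    """The process described above: randomly permute all 2^k * C(n, k)
    signed k-clauses, then scan the first m, keeping each clause only
    if the formula remains satisfiable after its addition."""
    rng = random.Random(seed)
    clauses = [tuple(s * v for s, v in zip(signs, combo))
               for combo in itertools.combinations(range(1, n + 1), k)
               for signs in itertools.product([1, -1], repeat=k)]
    rng.shuffle(clauses)
    formula = []
    for c in clauses[:m]:
        if satisfiable(formula + [c], n):
            formula.append(c)
    return formula
```

By construction every formula this returns is satisfiable; the paper's results concern what the solution space of such a formula typically looks like once m is a large enough multiple of n.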


2017 ◽  
Vol 42 (6) ◽  
pp. 563-570 ◽  
Author(s):  
Martin J. MacInnis ◽  
Chris McGlory ◽  
Martin J. Gibala ◽  
Stuart M. Phillips

Direct sampling of human skeletal muscle using the needle biopsy technique can facilitate insight into the biochemical and histological responses resulting from changes in exercise or feeding. However, the muscle biopsy procedure is invasive, and analyses are often expensive, which places pragmatic restraints on sample sizes. The unilateral exercise model can serve to increase statistical power and reduce the time and cost of a study. With this approach, 2 limbs of a participant are randomized to 1 of 2 treatments that can be applied almost concurrently or sequentially depending on the nature of the intervention. Similar to a typical repeated measures design, comparisons are made within participants, which increases statistical power by reducing the amount of between-person variability. A washout period is often unnecessary, reducing the time needed to complete the experiment and the influence of potential confounding variables such as habitual diet, activity, and sleep. Variations of the unilateral exercise model have been employed to investigate the influence of exercise, diet, and the interaction between the 2, on a wide range of variables including mitochondrial content, capillary density, and skeletal muscle hypertrophy. Like any model, unilateral exercise has some limitations: it cannot be used to study variables that potentially transfer across limbs, and it is generally limited to exercises that can be performed in pairs of treatments. Where appropriate, however, the unilateral exercise model can yield robust, well-controlled investigations of skeletal muscle responses to a wide range of interventions and conditions including exercise, dietary manipulation, and disuse or immobilization.
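The power gain from within-participant comparison can be illustrated with a toy simulation. All numbers below are invented for illustration: each participant gets a large idiosyncratic baseline that cancels when the two limbs of the same person are compared, so the spread of within-person differences is much smaller than the spread across participants.

```python
import random, statistics

random.seed(1)

def simulate(n=12, person_sd=2.0, noise_sd=0.5, effect=0.6):
    """Toy unilateral-design simulation with made-up parameters:
    person_sd is between-person baseline variability, noise_sd is
    measurement noise, effect is the treatment effect on one limb.
    Returns (sd of within-person differences, pooled sd across limbs)."""
    base = [random.gauss(0, person_sd) for _ in range(n)]
    control = [b + random.gauss(0, noise_sd) for b in base]
    treated = [b + effect + random.gauss(0, noise_sd) for b in base]
    diffs = [t - c for t, c in zip(treated, control)]
    return statistics.stdev(diffs), statistics.stdev(control + treated)

within_sd, between_sd = simulate()
```

Because the paired differences have far smaller spread than the raw measurements, the same effect size is detectable with fewer biopsies, which is the pragmatic argument the paragraph above makes.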


2018 ◽  
Vol 20 (91) ◽  
pp. 51-56
Author(s):  
M. Dorosh-Kizym ◽  
O. Dadak ◽  
T. Gachek

In recent years, e-commerce has penetrated practically every sphere of life, including in Ukraine. The domestic e-commerce market is still at an early stage of development, but it has significant potential. The Internet and related technologies continue to reshape logistics: modern logistics is inextricably linked with intensive information exchange, and the timely receipt of information ensures the accuracy, speed, and consistency of goods exchange in logistics chains. Over the past few years, the structure of the logistics segment of the Internet has changed significantly, as reflected in the content of logistics-oriented sites: where advertising once dominated, organizational, reference, and design services are now offered, often in interactive mode. The penetration of Internet technologies into business and the economy is growing rapidly; the Internet has become a vast, multi-level market whose potential for solving business problems extends naturally to logistics, and logistics, as a modern scientific and practical approach to commodity distribution, is rapidly adopting these technologies. As a technology of global open networks, the Internet is the best means of reaching a wide range of logistics service users.
With its help, providers can offer: company advertising; lists of services and price lists; accounts for regular clients and partners; delivery of necessary documents to consumers on a paid or free basis; interactive advisory services; counterparty search; registries of logistics companies and databases of information and logistics resources; electronic freight exchange; monitoring of goods and vehicles; and virtual agency and forwarding services. The accumulation of logistics resources on the Internet has now reached a level at which one can speak of the formation of virtual logistics centres (commercial or quasi-commercial), which, with further development, could form a single logistics information space on the Internet.

