Structural Entropy of the Stochastic Block Models

Entropy ◽  
2022 ◽  
Vol 24 (1) ◽  
pp. 81
Author(s):  
Jie Han ◽  
Tao Guo ◽  
Qiaoqiao Zhou ◽  
Wei Han ◽  
Bo Bai ◽  
...  

With the rapid expansion of graphs and networks and the growing magnitude of data from all areas of science, effective treatment and compression schemes for context-dependent data are extremely desirable. A particularly interesting direction is to compress the data while keeping only the “structural information” and ignoring the concrete labelings. In this direction, Choi and Szpankowski introduced structures (unlabeled graphs), which allowed them to compute the structural entropy of the Erdős–Rényi random graph model. They also provided a compression algorithm that asymptotically achieves this entropy limit and runs in expected linear time. In this paper, we consider stochastic block models with an arbitrary number of parts. Specifically, we define a partitioned structural entropy for stochastic block models, which generalizes the structural entropy for unlabeled graphs and also encodes the partition information. We then compute the partitioned structural entropy of the stochastic block models and provide a compression scheme that asymptotically achieves this entropy limit.
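As a point of reference for the entropies discussed above, the sketch below computes the entropy of a labeled stochastic block model graph when the partition is fixed and known. It is not the partitioned structural entropy defined in the paper; it only uses the fact that independent edges contribute one binary entropy term per vertex pair. The function names and the `block_sizes` / `edge_probs` parameters are illustrative assumptions.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def sbm_labeled_entropy(block_sizes, edge_probs):
    """Entropy in bits of a labeled SBM graph with a fixed, known partition.

    block_sizes[i] is the number of vertices in part i (assumed input).
    edge_probs[i][j] is the edge probability between parts i and j
    (assumed symmetric; only entries with i <= j are read).
    """
    total = 0.0
    parts = len(block_sizes)
    for i in range(parts):
        # Vertex pairs inside part i.
        total += math.comb(block_sizes[i], 2) * binary_entropy(edge_probs[i][i])
        for j in range(i + 1, parts):
            # Vertex pairs with one endpoint in part i and one in part j.
            pairs = block_sizes[i] * block_sizes[j]
            total += pairs * binary_entropy(edge_probs[i][j])
    return total
```

With a single part this reduces to the labeled Erdős–Rényi entropy, the number of vertex pairs times h(p). A structural entropy is smaller than this labeled value because graphs that differ only in their labeling are not distinguished.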

2013 ◽  
Vol 23 (1) ◽  
pp. 29-49 ◽  
Author(s):  
YAEL DEKEL ◽  
ORI GUREL-GUREVICH ◽  
YUVAL PERES

We are given a graph G with n vertices, where a random subset of k vertices has been made into a clique, and the remaining edges are chosen independently with probability $\frac12$. This random graph model is denoted $G(n,\frac12,k)$. The hidden clique problem is to design an algorithm that finds the k-clique in polynomial time with high probability. An algorithm due to Alon, Krivelevich and Sudakov [3] uses spectral techniques to find the hidden clique with high probability when $k = c \sqrt{n}$ for a sufficiently large constant c > 0. Recently, an algorithm that solves the same problem was proposed by Feige and Ron [12]. It has the advantages of being simpler and more intuitive, and of an improved running time of $O(n^2)$. However, the analysis in [12] gives a success probability of only 2/3. In this paper we present a new algorithm for finding hidden cliques that both runs in time $O(n^2)$ (that is, linear in the size of the input) and has a failure probability that tends to 0 as n tends to ∞. We develop this algorithm in the more general setting where the clique is replaced by a dense random graph.
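To make the model $G(n,\frac12,k)$ concrete, here is a minimal sketch of a sampler. It only generates an instance; it does not implement any of the recovery algorithms mentioned in the abstract. The function name, the adjacency-set representation, and the `seed` parameter are assumptions made for illustration.

```python
import random
from itertools import combinations


def sample_hidden_clique(n, k, seed=None):
    """Sample a graph from G(n, 1/2, k).

    Returns (adj, clique): adj maps each vertex to its set of neighbours,
    and clique is the planted set of k vertices.
    """
    rng = random.Random(seed)
    clique = set(rng.sample(range(n), k))
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        # Pairs inside the planted set are always joined;
        # every other pair is joined independently with probability 1/2.
        if (u in clique and v in clique) or rng.random() < 0.5:
            adj[u].add(v)
            adj[v].add(u)
    return adj, clique
```

The sampler visits every vertex pair once, so it takes time quadratic in n, which is the size of the input that the $O(n^2)$ running time above is measured against.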


Author(s):  
Mark Newman

A discussion of the most fundamental of network models, the configuration model, which is a random graph model of a network with a specified degree sequence. Following a definition of the model, a number of basic properties are derived, including the probability of an edge, the expected number of multiedges, the excess degree distribution, the friendship paradox, and the clustering coefficient. This is followed by derivations of some more advanced properties, including the condition for the existence of a giant component, the size of the giant component, the average size of a small component, and the expected diameter. Generating function methods for network models are also introduced and used to perform some more advanced calculations, such as the calculation of the distribution of the number of second neighbors of a node and the complete distribution of sizes of small components. The chapter ends with a brief discussion of extensions of the configuration model to directed networks, bipartite networks, networks with degree correlations, networks with high clustering, and networks with community structure, among other possibilities.
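The configuration model is commonly generated by stub matching: each node of degree k receives k edge ends ("stubs"), and the stubs are paired uniformly at random. The sketch below illustrates that standard construction; it is not taken from the chapter, and the function name and `seed` parameter are assumptions. Self-loops and multiedges are kept, as in the usual definition of the model.

```python
import random


def configuration_model(degrees, seed=None):
    """Return a list of edges (u, v) realising the given degree sequence.

    Self-loops and multiedges may occur. A self-loop (u, u) uses two stubs
    of u, so it counts twice towards the degree of u.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of the degrees must be even")
    rng = random.Random(seed)
    # One stub per unit of degree, labelled by the node that owns it.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    # A uniform shuffle followed by pairing consecutive stubs
    # gives a uniformly random matching of the stubs.
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]
```

For example, `configuration_model([3, 2, 2, 1], seed=0)` returns four edges whose endpoints use each node exactly as many times as its degree.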


Author(s):  
Mark Newman

An introduction to the mathematics of the Poisson random graph, the simplest model of a random network. The chapter starts with a definition of the model, followed by derivations of basic properties such as the mean degree, degree distribution, and clustering coefficient. This is followed by a detailed derivation of the large-scale structural properties of random graphs, including the position of the phase transition at which a giant component appears, the size of the giant component, the average size of the small components, and the expected diameter of the network. The chapter ends with a discussion of some of the shortcomings of the random graph model.
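For the Poisson random graph with mean degree c, the fraction S of nodes in the giant component satisfies the standard self-consistency equation S = 1 − exp(−cS), which has a nonzero solution only when c > 1. The sketch below solves it by fixed-point iteration; the function name, tolerance, and iteration cap are illustrative assumptions, and convergence is slow close to the transition at c = 1.

```python
import math


def giant_component_fraction(c, tol=1e-12, max_iter=10000):
    """Solve S = 1 - exp(-c * S) for the giant component fraction S.

    Starts from S = 1 so that the iteration settles on the largest
    solution, which is the nonzero one when c > 1 and zero otherwise.
    """
    s = 1.0
    for _ in range(max_iter):
        s_next = 1.0 - math.exp(-c * s)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s
```

Calling the function for a mean degree below 1 returns a value that is numerically zero, while a mean degree above 1 returns a positive fraction, which reflects the phase transition described above.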


Biology ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 499
Author(s):  
Ali Andalibi ◽  
Naoru Koizumi ◽  
Meng-Hao Li ◽  
Abu Bakkar Siddique

Kanagawa and Hokkaido were affected by COVID-19 in the early stage of the pandemic. Japan’s initial response included contact tracing and PCR analysis on anyone who was suspected of having been exposed to SARS-CoV-2. In this retrospective study, we analyzed publicly available COVID-19 registry data from Kanagawa and Hokkaido (n = 4392). Exponential random graph model (ERGM) network analysis was performed to examine demographic and symptomological homophilies. Age, symptomatic, and asymptomatic status homophilies were seen in both prefectures. Symptom homophilies suggest that nuanced genetic differences in the virus may affect its epithelial cell type range and can result in the diversity of symptoms seen in individuals infected by SARS-CoV-2. Environmental variables such as temperature and humidity may also play a role in the overall pathogenesis of the virus. A higher level of asymptomatic transmission was observed in Kanagawa. Moreover, patients who contracted the virus through secondary or tertiary contacts were shown to be asymptomatic more frequently than those who contracted it from primary cases. Additionally, most of the transmissions stopped at the primary and secondary levels. As expected, significant viral transmission was seen in healthcare settings.


2018 ◽  
Vol 68 (9) ◽  
pp. 1547-1555
Author(s):  
David P Bui ◽  
Eyal Oren ◽  
Denise J Roe ◽  
Heidi E Brown ◽  
Robin B Harris ◽  
...  

Background: The majority of tuberculosis transmission occurs in community settings. Our primary aim in this study was to assess the association between exposure to community venues and multidrug-resistant (MDR) tuberculosis. Our secondary aim was to describe the social networks of MDR tuberculosis cases and controls. Methods: We recruited laboratory-confirmed MDR tuberculosis cases and community controls that were matched on age and sex. Whole-genome sequencing was used to identify genetically clustered cases. Venue tracing interviews (nonblinded) were conducted to enumerate community venues frequented by participants. Logistic regression was used to assess the association between MDR tuberculosis and person-time spent in community venues. A location-based social network was constructed, with respondents connected if they reported frequenting the same venue, and an exponential random graph model (ERGM) was fitted to model the network. Results: We enrolled 59 cases and 65 controls. Participants reported 729 unique venues. The mean number of venues reported was similar in both groups (P = .92). Person-time in healthcare venues (adjusted odds ratio [aOR] = 1.67, P = .01), schools (aOR = 1.53, P < .01), and transportation venues (aOR = 1.25, P = .03) was associated with MDR tuberculosis. Healthcare venues, markets, cinemas, and transportation venues were commonly shared among clustered cases. The ERGM indicated significant community segregation between cases and controls. Case networks were more densely connected. Conclusions: Exposure to healthcare venues, schools, and transportation venues was associated with MDR tuberculosis. Intervention across the segregated network of case venues may be necessary to effectively stem transmission.


2018 ◽  
Vol 39 (3) ◽  
pp. 443-464 ◽  
Author(s):  
Francesca P. Vantaggiato

The literature on transnational regulatory networks has identified interdependence as their main rationale, downplaying domestic factors. Typically, relevant contributions use the word “network” only metaphorically. Yet, informal ties between regulators constitute networked structures of collaboration, which can be measured and explained. Regulators choose their frequent, regular network partners. What explains those choices? This article develops an Exponential Random Graph Model of the network of European national energy regulators to identify the drivers of informal regulatory networking. The results show that regulators tend to network with peers who regulate similarly organised market structures. Geography and European policy frameworks also play a role. Overall, the British regulator is significantly more active and influential than its peers, and a divide emerges between regulators from the EU-15 and others. Therefore, formal frameworks of cooperation (i.e. a European Agency) were probably necessary to foster regulatory coordination across the EU.

