Privacy preserving social graphs for high precision community detection

Author(s):  
Himel Dev
2014 ◽  
Vol 17 (01) ◽  
pp. 1450001 ◽  
Author(s):  
Michel Crampes ◽  
Michel Plantié

With the widespread use of social networks on the Internet, community detection in social graphs has recently become an important research domain. Interest was initially limited to unipartite graph inputs and partitioned community outputs. More recently, bipartite graphs, directed graphs, and overlapping communities have all been investigated. Few contributions, however, have encompassed all three types of graphs simultaneously. In this paper, we present a method that unifies community detection for these three types of graphs while also merging partitioned and overlapping communities. Moreover, the results are visualized in a way that allows for analysis and semantic interpretation. For validation purposes, the method is first evaluated on well-known simple benchmarks and then applied to real data: photos and tags in Facebook, and Human Brain Tractography data. This last application suggests that community detection methods can be carried over to other fields, such as data analysis, with enhanced performance.


Information ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 199 ◽  
Author(s):  
Christos Makris ◽  
Georgios Pispirigos ◽  
Ioannis Orestis Rizos

Presently, due to the extended availability of gigantic information networks and the beneficial application of graph analysis in various scientific fields, the necessity for efficient and highly scalable community detection algorithms has never been more essential. Despite the significant amount of published research, existing methods, such as Girvan–Newman, random-walk edge betweenness, vertex centrality, InfoMap, and spectral clustering, have proven largely incapable of handling real-life social graphs due to intrinsic computational restrictions that lead to mediocre performance and poor scalability. The purpose of this article is to introduce a novel, distributed community detection methodology which, in accordance with the community prediction concept, leverages the reduced complexity and decreased variance of bagging ensemble methods to unveil the subjacent community hierarchy. The proposed approach has been thoroughly tested, meticulously compared against classic community detection algorithms, and shown to be exceptionally scalable, highly efficient, and promisingly accurate in unfolding the underlying community structure.
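The abstract does not give the ensemble's internals, but the bagging idea can be illustrated generically: run a randomized base partitioner several times, then keep a pair of adjacent nodes together only when a majority of runs agree. The sketch below is an illustration, not the authors' algorithm; randomized label propagation is a stand-in base learner, and the two-clique graph is a hypothetical toy example.

```python
import random
from collections import Counter, defaultdict

def label_propagation(adj, rng):
    """One randomized base partition: each node repeatedly adopts the
    majority label among its neighbours (ties broken at random)."""
    labels = {v: v for v in adj}
    nodes = list(adj)
    for _ in range(20):                          # fixed sweep budget
        rng.shuffle(nodes)
        changed = False
        for v in nodes:
            counts = Counter(labels[w] for w in adj[v])
            if not counts:
                continue
            top = max(counts.values())
            pick = rng.choice(sorted(l for l, c in counts.items() if c == top))
            if pick != labels[v]:
                labels[v], changed = pick, True
        if not changed:
            break
    return labels

def consensus_communities(adj, runs=15, seed=0):
    """Bagging-style aggregation: an adjacent pair stays together only if
    a majority of base partitions agree; communities are the connected
    components of the resulting majority-agreement graph."""
    rng = random.Random(seed)
    votes = defaultdict(int)
    for _ in range(runs):
        labels = label_propagation(adj, rng)
        for u in adj:
            for v in adj[u]:
                if u < v and labels[u] == labels[v]:
                    votes[(u, v)] += 1
    keep = {v: set() for v in adj}
    for (u, v), c in votes.items():
        if c > runs / 2:
            keep[u].add(v)
            keep[v].add(u)
    seen, comms = set(), []
    for s in keep:
        if s in seen:
            continue
        stack, comp = [s], set()
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in keep[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comms.append(comp)
    return comms

# Hypothetical toy graph: two 5-cliques joined by a single bridge edge.
adj = {v: set() for v in range(10)}
for group in (range(0, 5), range(5, 10)):
    for u in group:
        for v in group:
            if u != v:
                adj[u].add(v)
adj[4].add(5)
adj[5].add(4)
comms = consensus_communities(adj)
```

Averaging over base partitions is what reduces the variance mentioned in the abstract: a single unlucky run cannot split a tightly knit group on its own.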


Author(s):  
Christina Boura ◽  
Ilaria Chillotti ◽  
Nicolas Gama ◽  
Dimitar Jetchev ◽  
Stanislav Peceny ◽  
...  

Author(s):  
Rainer Schnell ◽  
Christian Borgs

Introduction
National mortality registers are essential for medical research, and most nations therefore operate such registers. Due to its administrative structure and data protection legislation, Germany has no such registry. We demonstrate that a national mortality registry is technically feasible under the given constraints using privacy-preserving record linkage (PPRL).

Objectives and Approach
Obtaining legal permission to operate a national mortality registry for research will be easier if the linkage can be done without revealing personal identifiers, by using PPRL. To estimate the precision and recall of different encodings, we used two settings: (1) matching a local mortality registry (n = 14,003) with mortality data of a university hospital (n = 2,466); (2) matching 1 million simulated records from a national database of names with a corrupted subset. This corresponds to matching all deceased persons against the deceased persons in the largest federal state (n = 205,000).

Results
Linkage results for clear-text identifiers show very high recall and precision. Bloom filter-based encodings yield comparable results: neither precision nor recall declines by more than 2%. Phonetic codes yield high precision but low recall. Some variants of Bloom filter-based encodings yield better results than probabilistic linkage on clear-text identifiers. This is mainly due to the rarely mentioned detail of using different passwords for different identifiers in the same Bloom filter; implementation details of Bloom filters are therefore more important than commonly thought. Overall, we recommend salted Bloom filter-based methods with different passwords for different identifiers to increase security and to prevent all known attacks on identifier encryptions.

Conclusion/Implications
Although most PPRL techniques would yield acceptable results in the given setting of a national register, salted Bloom filter encodings are more secure against attacks while still showing high precision and recall. We therefore consider a national mortality register using only encrypted identifiers of deceased persons to be feasible.
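A minimal sketch of the Bloom filter encoding idea, assuming bigram tokenization and HMAC-derived hash functions; the names and secrets are illustrative only. Keying the HMAC with a different secret per identifier field, as the abstract recommends, means identical values in different fields map to different bit patterns, which defeats cross-field frequency attacks (record-level salting would additionally mix a record-specific value into the key).

```python
import hashlib
import hmac

def bigrams(value):
    """Split an identifier into character bigrams, padded at both ends."""
    v = f"_{value.strip().lower()}_"
    return [v[i:i + 2] for i in range(len(v) - 1)]

def bloom_encode(value, secret, m=1000, k=20):
    """Map one identifier into an m-bit Bloom filter via k hash functions
    derived from an HMAC keyed with a field-specific secret (the
    'different passwords for different identifiers' of the abstract)."""
    bits = set()
    for gram in bigrams(value):
        for i in range(k):
            digest = hmac.new(secret.encode(), f"{gram}|{i}".encode(),
                              hashlib.sha256).hexdigest()
            bits.add(int(digest, 16) % m)
    return bits

def dice(a, b):
    """Dice coefficient between two filters; tolerant of typos, since a
    single wrong character changes only a few bigrams."""
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 0.0

# Field-specific secret; surnames and key are hypothetical examples.
name_a = bloom_encode("Mueller", secret="surname-key")
name_b = bloom_encode("Mueler",  secret="surname-key")   # typo variant
name_c = bloom_encode("Schmidt", secret="surname-key")
```

In this sketch `dice(name_a, name_b)` stays high despite the typo, while `dice(name_a, name_c)` stays low, which is the property that lets encrypted linkage approach clear-text precision and recall.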


Algorithms ◽  
2019 ◽  
Vol 12 (8) ◽  
pp. 175 ◽
Author(s):  
Konstantinos Georgiou ◽  
Christos Makris ◽  
Georgios Pispirigos

Nowadays, the amount of digitally available information has grown tremendously, with real-world data graphs reaching millions or even billions of vertices. Hence, community detection, where groups of vertices are formed according to a well-defined similarity measure, has never been more essential, affecting a vast range of scientific fields such as bioinformatics, sociology, discrete mathematics, nonlinear dynamics, digital marketing, and computer science. Even though an impressive amount of research has already been published to tackle this NP-hard problem, existing methods and algorithms have proven inefficient and poorly scalable. In this regard, the purpose of this manuscript is to combine the network topology properties expressed by loose similarity and local edge betweenness, a recently proposed alternative to the Girvan–Newman edge betweenness measure, with intrinsic user content information, in order to introduce a novel and highly distributed hybrid community detection methodology. The proposed approach has been thoroughly tested on various real social graphs, compared against classic divisive community detection algorithms that serve as baselines, and practically proven exceptionally scalable, highly efficient, and adequately accurate in revealing the subjacent network hierarchy.
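The local edge betweenness above is the authors' alternative measure; as a baseline for comparison, the classic divisive step it replaces can be sketched with Brandes' exact edge betweenness. This is plain Python on a toy graph, not the manuscript's distributed implementation.

```python
from collections import defaultdict, deque

def edge_betweenness(adj):
    """Brandes' algorithm: shortest-path betweenness of every edge of an
    unweighted undirected graph given as {node: set(neighbours)}."""
    eb = defaultdict(float)
    for s in adj:
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:                       # BFS counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):          # back-propagate dependencies
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                eb[frozenset((v, w))] += c
                delta[v] += c
    return {e: c / 2.0 for e, c in eb.items()}   # undirected: halve

def divisive_split(adj):
    """Girvan–Newman step: cut the highest-betweenness edge until the
    graph first disconnects, then return the connected components."""
    adj = {v: set(nb) for v, nb in adj.items()}
    while True:
        seen, comps = set(), []
        for s in adj:
            if s in seen:
                continue
            comp, stack = set(), [s]
            seen.add(s)
            while stack:
                v = stack.pop()
                comp.add(v)
                for w in adj[v]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            comps.append(comp)
        if len(comps) > 1:
            return comps
        eb = edge_betweenness(adj)
        u, w = max(eb, key=eb.get)         # the bottleneck edge
        adj[u].discard(w)
        adj[w].discard(u)

# Toy graph: two triangles joined by one bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
parts = divisive_split(dict(adj))
```

Recomputing exact betweenness after every cut is what makes the classic method unscalable; a local variant such as the one proposed above restricts this computation to a neighbourhood of the removed edge.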


Author(s):  
J. C. Russ ◽  
T. Taguchi ◽  
P. M. Peters ◽  
E. Chatfield ◽  
J. C. Russ ◽  
...  

Conventional SAD patterns as obtained in the TEM present difficulties for the identification of materials such as asbestiform minerals, although diffraction data are considered important for this purpose. The preferred orientation of the fibers and the spotty patterns that result do not readily lend themselves to measurement of the integrated intensity values for each d-spacing, and even the d-spacings may be hard to determine precisely, because the true center location of the broken rings must be estimated. We have implemented an automatic method for diffraction pattern measurement to overcome these problems. It automatically locates the center of the pattern with high precision, measures the radius of each ring of spots, and integrates the density of spots in that ring. The resulting spectrum of intensity vs. radius is then used just as a conventional X-ray diffractometer scan would be: to locate peaks and produce a list of d,I values suitable for search/match comparison against known or expected phases.
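The pipeline can be sketched in simplified form: estimate the pattern center, then integrate spot intensity into radius bins to obtain the intensity-vs-radius spectrum. The centroid-based center estimate below is a crude stand-in for the paper's high-precision method, and the synthetic ring is illustrative only.

```python
from collections import defaultdict

def centroid(img):
    """Intensity-weighted centroid as a simple stand-in for the paper's
    high-precision automatic center location."""
    total = sum(v for row in img for v in row)
    cy = sum(y * v for y, row in enumerate(img) for v in row) / total
    cx = sum(x * v for row in img for x, v in enumerate(row)) / total
    return cy, cx

def radial_profile(img, cy, cx):
    """Integrate spot intensity into integer-radius bins, turning a
    spotty ring pattern into an intensity-vs-radius spectrum."""
    bins = defaultdict(float)
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            r = round(((y - cy) ** 2 + (x - cx) ** 2) ** 0.5)
            bins[r] += v
    return dict(bins)

# Synthetic 21x21 pattern: a ring of spots at radius 5 around (10, 10).
img = [[0.0] * 21 for _ in range(21)]
for y in range(21):
    for x in range(21):
        if round(((y - 10) ** 2 + (x - 10) ** 2) ** 0.5) == 5:
            img[y][x] = 1.0
cy, cx = centroid(img)
profile = radial_profile(img, cy, cx)
```

Peak radii in the resulting profile convert to d-spacings through the standard camera equation d = λL/r, after which the d,I list for search/match can be assembled.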


Author(s):  
K. Z. Botros ◽  
S. S. Sheinin

The main features of weak beam images of dislocations were first described by Cockayne et al. using calculations of intensity profiles based on the kinematical and two-beam dynamical theories. The feature of weak beam images that is of particular interest in this investigation is that intensity profiles exhibit a sharp peak located very close to the position of the dislocation in the crystal. This property of weak beam images has an important application in determining the stacking fault energy of crystals, since the separation of the partial dislocations bounding a stacking fault ribbon can be measured with high precision, assuming of course that the weak beam relationship between the positions of the image and the dislocation is valid. To carry out such measurements in practice, the specimen must be tilted to "good" weak beam diffraction conditions, which implies using high values of the deviation parameter Sg.


Author(s):  
Klaus-Ruediger Peters

Differential hysteresis processing is a new image processing technology that provides a tool for displaying image information at any level of differential contrast resolution. This includes the maximum contrast resolution of the acquisition system, which may be 1,000 times higher than that of the visual system (16 bit versus 6 bit). All microscopes acquire high-precision contrasts at a level of <0.01-25% of the acquisition range in 16-bit to 8-bit data, but these contrasts are mostly invisible, or only partially visible, even in conventionally enhanced images. The processing principle of the differential hysteresis tool is based on hysteresis properties of intensity variations within an image.

Differential hysteresis image processing moves a cursor of selected intensity range (the hysteresis range) along lines through the image data, reading each successive pixel intensity. The midpoint of the cursor provides the output data. If the intensity value of the following pixel falls outside the current cursor endpoint values, the cursor follows the data with either its top or its bottom; but if the pixel's intensity value falls within the cursor range, the cursor maintains its intensity value.
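The cursor rule described above can be sketched for a single image line; the initial cursor position is an assumption, since the text does not specify it.

```python
def hysteresis_line(pixels, h):
    """Differential hysteresis along one image line: a cursor of height h
    follows the data with its top or bottom when a pixel falls outside
    it, holds its value otherwise, and outputs its midpoint."""
    lo = pixels[0] - h / 2.0       # start with the first pixel centred
    out = []
    for p in pixels:
        if p > lo + h:             # pixel above the cursor: follow with top
            lo = p - h
        elif p < lo:               # pixel below the cursor: follow with bottom
            lo = p
        # within the cursor range: the cursor keeps its value
        out.append(lo + h / 2.0)   # the midpoint provides the output datum
    return out

# Ripples smaller than the hysteresis range are flattened, while the
# large step is tracked: contrast at the chosen differential level.
line = hysteresis_line([0, 1, 0, 1, 50, 51, 50, 0], h=5)
```

Choosing the hysteresis range h selects which level of differential contrast survives into the output: variations smaller than h are suppressed, variations larger than h are preserved.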

