Link Prediction in Highly Fractional Data Sets

Author(s):  
Michael Fire ◽  
Rami Puzis ◽  
Yuval Elovici
2020 ◽  
Vol 2 (4) ◽  
pp. 672-704
Author(s):  
Ece C. Mutlu ◽  
Toktam Oghaz ◽  
Amirarsalan Rajabi ◽  
Ivan Garibay

Link prediction in complex networks has attracted considerable attention from interdisciplinary research communities, due to its ubiquitous applications in biological networks, social networks, transportation networks, telecommunication networks, and, recently, knowledge graphs. Numerous studies have used link prediction approaches to find missing links or to predict the likelihood of future links, as well as for network reconstruction, recommender systems, privacy control, etc. This work presents an extensive review of state-of-the-art methods and algorithms proposed on this subject and categorizes them into four main categories: similarity-based methods, probabilistic methods, relational models, and learning-based methods. Additionally, a collection of network data sets is presented in this paper, which can be used to study link prediction. We conclude this study with a discussion of recent developments and future research directions.
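To make the first of the four categories concrete, the following is a minimal sketch of two widely used similarity-based scores, the Jaccard coefficient and the Adamic-Adar index, applied to ranking unconnected node pairs. The toy graph and the helper names (`build_adjacency`, `rank_missing_links`) are illustrative assumptions and are not taken from the review itself.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Sum of 1/log(degree) over common neighbors.

    A common neighbor of two distinct nodes has degree >= 2,
    so the logarithm is always positive.
    """
    return sum(1.0 / math.log(len(adj[w])) for w in adj[u] & adj[v])


def rank_missing_links(adj, score):
    """Rank all currently unconnected pairs by the given score, best first."""
    candidates = [
        (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
    ]
    return sorted(candidates, key=lambda pair: score(adj, *pair), reverse=True)


# Hypothetical toy graph, used only for illustration.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(toy_edges)
ranking = rank_missing_links(adj, jaccard)
print(ranking[0])  # the pair judged most likely to be a missing link
```

Both scores use only local neighborhood information, which is why methods in this category are cheap to compute compared with probabilistic or learning-based approaches.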


Author(s):  
Valentina Franzoni ◽  
Andrea Chiancone ◽  
Alfredo Milani

Topological link prediction is the task of assessing the likelihood of new future links based on topological properties of entities in a network at a given time. In this paper, we introduce a multistrain bacterial diffusion model for link prediction, where the ranking of candidate links is based on the mutual transfer of bacteria strains via physical social contact. The model incorporates parameters such as the efficiency of the receiver surface, the reproduction rate, and the number of social contacts. The basic idea is that entities continuously infect their neighborhood with their own bacteria strains, and such infections are iteratively propagated on the social network over time. The probability of transmission can be evaluated in terms of strains, reproduction, previous transfer, surface transfer efficiency, number of direct social contacts (i.e., neighbors), and multiple paths between entities. The value of the mutual strains of infection between a pair of entities is used to rank the potential arcs joining the entity nodes. The proposed multistrain diffusion model and mutual-strain infection ranking technique have been implemented and tested on widely accepted social network data sets. Experiments show that the MSDM-LP and mutual-strain diffusion ranking technique outperform state-of-the-art algorithms for neighbor-based ranking.


Entropy ◽  
2019 ◽  
Vol 21 (3) ◽  
pp. 254 ◽  
Author(s):  
Shaokai Wang ◽  
Xutao Li ◽  
Yunming Ye ◽  
Shanshan Feng ◽  
Raymond Lau ◽  
...  

Presently, many users are involved in multiple social networks. Identifying the same user in different networks, also known as anchor link prediction, becomes an important problem, which can serve numerous applications, e.g., cross-network recommendation, user profiling, etc. Previous studies mainly use hand-crafted structure features, which, if not carefully designed, may fail to reflect the intrinsic structure regularities. Moreover, most of the methods neglect the attribute information of social networks. In this paper, we propose a novel semi-supervised network-embedding model to address the problem. In the model, each node of the multiple networks is represented by a vector for anchor link prediction, which is learnt with awareness of observed anchor links as semi-supervised information, and topology structure and attributes as input. Experimental results on the real-world data sets demonstrate the superiority of the proposed model compared to state-of-the-art techniques.


2018 ◽  
Vol 32 (01) ◽  
pp. 1850004 ◽  
Author(s):  
Hui-Min Cheng ◽  
Yi-Zi Ning ◽  
Zhao Yin ◽  
Chao Yan ◽  
Xin Liu ◽  
...  

Community detection and link prediction are both of great significance in network analysis, providing valuable insights into the topological structure of a network from different perspectives. In this paper, we propose a novel community detection algorithm that incorporates link prediction, motivated by the question of whether link prediction can improve the accuracy of community partition. For link prediction, we propose two novel indices to compute the similarity between each pair of nodes, one of which aims to add missing links, while the other tries to remove spurious edges. Extensive experiments are conducted on benchmark data sets, and the results of our proposed algorithm are compared with two classes of baselines. In conclusion, our proposed algorithm is competitive, revealing that link prediction does improve the precision of community detection.


2020 ◽  
Vol 10 (1) ◽  
pp. 33-41
Author(s):  
Srilatha Pulipati ◽  
Manjula Ramakrishnan

The link prediction problem has received remarkable interest in the recent past. In this paper, a firefly swarm intelligence algorithm is used to perform link prediction, exploiting the topological and node attribute features of a social network. Fireflies are made to traverse the nodes and edges of social networks, and the brightness of the fireflies plays a major role in their movement. The common neighbor method of link prediction is used to compute a similarity score at each iteration. The performance of the proposed algorithm was analyzed on standard data sets using a ten-fold validation method. The accuracy of the proposed work is measured in terms of area under the curve (AUC), recall, and precision. Experimental results showed that the proposed work outperforms the methods proposed in the literature.
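The common neighbor score and the AUC measure mentioned above can be sketched as follows. This is not the firefly algorithm from the paper; it only illustrates the scoring and evaluation steps, under the common convention that AUC in link prediction is the probability that a held-out true link scores higher than a non-existent link, with ties counted as one half. The toy graph, the held-out edge, and the function names are assumptions made for illustration.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v in the training graph."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def link_prediction_auc(adj, held_out, non_edges):
    """Compare every held-out link against every non-existent link.

    A comparison counts 1 if the held-out link scores higher,
    0.5 on a tie, and 0 otherwise.
    """
    wins = 0.0
    for pu, pv in held_out:
        pos = common_neighbors(adj, pu, pv)
        for nu, nv in non_edges:
            neg = common_neighbors(adj, nu, nv)
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(held_out) * len(non_edges))


# Hypothetical split: the edge (b, c) is hidden from the training graph.
train_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
held_out = [("b", "c")]
non_edges = [("a", "e"), ("b", "e"), ("a", "d")]

adj = build_adjacency(train_edges)
auc = link_prediction_auc(adj, held_out, non_edges)
print(round(auc, 3))
```

In a ten-fold setup, the same computation would be repeated with a different tenth of the edges held out each time and the resulting AUC values averaged.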


Author(s):  
John A. Hunt

Spectrum-imaging is a useful technique for comparing different processing methods on very large data sets which are identical for each method. This paper is concerned with comparing methods of electron energy-loss spectroscopy (EELS) quantitative analysis on the Al-Li system. The spectrum-image analyzed here was obtained from an Al-10at%Li foil aged to produce δ' precipitates that can span the foil thickness. Two 1024-channel EELS spectra offset in energy by 1 eV were recorded and stored at each pixel in the 80x80 spectrum-image (25 Mbytes). An energy range of 39-89 eV (20 channels/eV) is represented. During processing, the spectra are either subtracted to create an artifact-corrected difference spectrum, or the energy offset is numerically removed and the spectra are added to create a normal spectrum. The spectrum-images are processed into 2D floating-point images using methods and software described in [1].
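The quoted data volume and energy range can be checked with a short calculation. The abstract does not state the storage width per channel; the 2 bytes (16 bits) used below is an assumption, chosen because it reproduces the stated 25 Mbytes.

```python
# Figures taken from the abstract.
PIXELS = 80 * 80            # 80x80 spectrum-image
SPECTRA_PER_PIXEL = 2       # two spectra offset by 1 eV
CHANNELS = 1024             # channels per spectrum
CHANNELS_PER_EV = 20        # energy dispersion

# Assumption (not stated in the abstract): each channel is a 16-bit value.
BYTES_PER_CHANNEL = 2

total_bytes = PIXELS * SPECTRA_PER_PIXEL * CHANNELS * BYTES_PER_CHANNEL
total_mib = total_bytes / 2**20

# Energy window covered by one 1024-channel spectrum.
energy_span_ev = CHANNELS / CHANNELS_PER_EV

print(total_bytes, total_mib, energy_span_ev)
```

Under that assumption the total is exactly 25 MiB, and each spectrum spans 51.2 eV, which is consistent with the stated 39-89 eV range.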


Author(s):  
Mark Ellisman ◽  
Maryann Martone ◽  
Gabriel Soto ◽  
Eleizer Masliah ◽  
David Hessler ◽  
...  

Structurally-oriented biologists examine cells, tissues, organelles and macromolecules in order to gain insight into cellular and molecular physiology by relating structure to function. The understanding of these structures can be greatly enhanced by the use of techniques for the visualization and quantitative analysis of three-dimensional structure. Three projects from current research activities will be presented in order to illustrate both the present capabilities of computer-aided techniques and their limitations and future possibilities. The first project concerns the three-dimensional reconstruction of the neuritic plaques found in the brains of patients with Alzheimer's disease. We have developed a software package, “Synu”, for investigation of 3D data sets, which has been used in conjunction with laser confocal light microscopy to study the structure of the neuritic plaque. Tissue sections of autopsy samples from patients with Alzheimer's disease were double-labeled for tau, a cytoskeletal marker for abnormal neurites, and synaptophysin, a marker of presynaptic terminals.


Author(s):  
Douglas L. Dorset

The quantitative use of electron diffraction intensity data for the determination of crystal structures represents the pioneering achievement in the electron crystallography of organic molecules, an effort largely begun by B. K. Vainshtein and his co-workers. However, despite numerous representative structure analyses yielding results consistent with X-ray determination, this entire effort was viewed with considerable mistrust by many crystallographers. This was no doubt due to the rather high crystallographic R-factors reported for some structures and, more importantly, the failure to convince many skeptics that the measured intensity data were adequate for ab initio structure determinations. We have recently demonstrated the utility of these data sets for structure analyses by direct phase determination based on the probabilistic estimate of three- and four-phase structure invariant sums. Examples include the structure of diketopiperazine using Vainshtein's 3D data, a similar 3D analysis of the room temperature structure of thiourea, and a zonal determination of the urea structure, the latter also based on data collected by the Moscow group.


Author(s):  
W. Shain ◽  
H. Ancin ◽  
H.C. Craighead ◽  
M. Isaacson ◽  
L. Kam ◽  
...  

Neural prostheses have the potential to restore nervous system functions lost by trauma or disease. Nanofabrication extends this approach to implants for stimulating and recording from single or small groups of neurons in the spinal cord and brain; however, tissue compatibility is a major limitation to their practical application. We are using a cell culture method for quantitatively measuring cell attachment to surfaces designed for nanofabricated neural prostheses. Silicon wafer test surfaces composed of 50-μm bars separated by aliphatic regions were fabricated using methods similar to a procedure described by Kleinfeld et al. Test surfaces contained either a single or double positive charge/residue. Cyanine dyes (diIC18(3)) stained the background and cell membranes (Fig 1); however, identification of individual cells at higher densities was difficult (Fig 2). Nuclear staining with acriflavine allowed discrimination of individual cells and permitted automated counting of nuclei using 3-D data sets from the confocal microscope (Fig 3). For cell attachment assays, LRM55 astroglial cells and astrocytes in primary cell culture were plated at increasing cell densities on test substrates, incubated for 24 hr, fixed, stained, mounted on coverslips, and imaged with a 10x objective.


Author(s):  
Thomas W. Shattuck ◽  
James R. Anderson ◽  
Neil W. Tindale ◽  
Peter R. Buseck

Individual particle analysis involves the study of tens of thousands of particles using automated scanning electron microscopy and elemental analysis by energy-dispersive, x-ray emission spectroscopy (EDS). EDS produces large data sets that must be analyzed using multi-variate statistical techniques. A complete study uses cluster analysis, discriminant analysis, and factor or principal components analysis (PCA). The three techniques are used in the study of particles sampled during the FeLine cruise to the mid-Pacific ocean in the summer of 1990. The mid-Pacific aerosol provides information on long range particle transport, iron deposition, sea salt ageing, and halogen chemistry. Aerosol particle data sets suffer from a number of difficulties for pattern recognition using cluster analysis. There is a great disparity in the number of observations per cluster and the range of the variables in each cluster. The variables are not normally distributed, they are subject to considerable experimental error, and many values are zero, because of finite detection limits. Many of the clusters show considerable overlap, because of natural variability, agglomeration, and chemical reactivity.

