Link Prediction in Highly Fractional Data Sets

Review on Learning and Extracting Graph Features for Link Prediction

Machine Learning and Knowledge Extraction ◽

10.3390/make2040036 ◽

2020 ◽

Vol 2 (4) ◽

pp. 672-704

Author(s):

Ece C. Mutlu ◽

Toktam Oghaz ◽

Amirarsalan Rajabi ◽

Ivan Garibay

Keyword(s):

Biological Networks ◽

Link Prediction ◽

Telecommunication Networks ◽

Probabilistic Methods ◽

Future Research ◽

Data Sets ◽

Relational Models ◽

Research Directions ◽

Recent Developments ◽

Future Research Directions

Link prediction in complex networks has attracted considerable attention from interdisciplinary research communities, due to its ubiquitous applications in biological networks, social networks, transportation networks, telecommunication networks, and, recently, knowledge graphs. Numerous studies have used link prediction approaches to find missing links or to predict the likelihood of future links, and have applied them to network reconstruction, recommender systems, privacy control, etc. This work presents an extensive review of state-of-the-art methods and algorithms proposed on this subject and groups them into four main categories: similarity-based methods, probabilistic methods, relational models, and learning-based methods. Additionally, a collection of network data sets that can be used to study link prediction is presented. We conclude this study with a discussion of recent developments and future research directions.

A Multistrain Bacterial Diffusion Model for Link Prediction

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001417590248 ◽

2017 ◽

Vol 31 (11) ◽

pp. 1759024 ◽

Cited By ~ 11

Author(s):

Valentina Franzoni ◽

Andrea Chiancone ◽

Alfredo Milani

Keyword(s):

Social Network ◽

Diffusion Model ◽

Link Prediction ◽

Social Contact ◽

Reproduction Rate ◽

Data Sets ◽

Social Contacts ◽

Social Network Data ◽

Multiple Paths ◽

Ranking Technique

Topological link prediction is the task of assessing the likelihood of new future links based on topological properties of entities in a network at a given time. In this paper, we introduce a multistrain bacterial diffusion model for link prediction, where the ranking of candidate links is based on the mutual transfer of bacteria strains via physical social contact. The model incorporates parameters such as the efficiency of the receiver surface, the reproduction rate, and the number of social contacts. The basic idea is that entities continuously infect their neighborhood with their own bacteria strains, and such infections are iteratively propagated on the social network over time. The probability of transmission can be evaluated in terms of strains, reproduction, previous transfer, surface transfer efficiency, number of direct social contacts (i.e., neighbors), and multiple paths between entities. The value of the mutual strains of infection between a pair of entities is used to rank the potential arcs joining the entity nodes. The proposed multistrain diffusion model and mutual-strain infection ranking technique have been implemented and tested on widely accepted social network data sets. Experiments show that the MSDM-LP and mutual-strain diffusion ranking technique outperform state-of-the-art algorithms for neighbor-based ranking.

Anchor Link Prediction across Attributed Networks via Network Embedding

Entropy ◽

10.3390/e21030254 ◽

2019 ◽

Vol 21 (3) ◽

pp. 254 ◽

Cited By ~ 4

Author(s):

Shaokai Wang ◽

Xutao Li ◽

Yunming Ye ◽

Shanshan Feng ◽

Raymond Lau ◽

...

Keyword(s):

Social Networks ◽

Link Prediction ◽

State Of The Art ◽

User Profiling ◽

Data Sets ◽

Network Embedding ◽

Real World Data ◽

Intrinsic Structure ◽

Multiple Networks ◽

Proposed Model

Presently, many users are involved in multiple social networks. Identifying the same user in different networks, also known as anchor link prediction, becomes an important problem, which can serve numerous applications, e.g., cross-network recommendation, user profiling, etc. Previous studies mainly use hand-crafted structure features, which, if not carefully designed, may fail to reflect the intrinsic structure regularities. Moreover, most of the methods neglect the attribute information of social networks. In this paper, we propose a novel semi-supervised network-embedding model to address the problem. In the model, each node of the multiple networks is represented by a vector for anchor link prediction, which is learnt with awareness of observed anchor links as semi-supervised information, and topology structure and attributes as input. Experimental results on the real-world data sets demonstrate the superiority of the proposed model compared to state-of-the-art techniques.

Community detection in complex networks using link prediction

Modern Physics Letters B ◽

10.1142/s0217984918500045 ◽

2018 ◽

Vol 32 (01) ◽

pp. 1850004 ◽

Cited By ~ 9

Author(s):

Hui-Min Cheng ◽

Yi-Zi Ning ◽

Zhao Yin ◽

Chao Yan ◽

Xin Liu ◽

...

Keyword(s):

Network Analysis ◽

Complex Networks ◽

Community Detection ◽

Link Prediction ◽

Detection Algorithm ◽

The Other ◽

Data Sets ◽

Topological Structures ◽

Benchmark Data ◽

Community Detection Algorithm

Community detection and link prediction are both of great significance in network analysis, which provide very valuable insights into topological structures of the network from different perspectives. In this paper, we propose a novel community detection algorithm with inclusion of link prediction, motivated by the question whether link prediction can be devoted to improving the accuracy of community partition. For link prediction, we propose two novel indices to compute the similarity between each pair of nodes, one of which aims to add missing links, and the other tries to remove spurious edges. Extensive experiments are conducted on benchmark data sets, and the results of our proposed algorithm are compared with two classes of baselines. In conclusion, our proposed algorithm is competitive, revealing that link prediction does improve the precision of community detection.

Topological and Attribute Link Prediction using Firefly algorithm

Open Computer Science ◽

10.1515/comp-2020-0001 ◽

2020 ◽

Vol 10 (1) ◽

pp. 33-41

Author(s):

Srilatha Pulipati ◽

Manjula Ramakrishnan

Keyword(s):

Link Prediction ◽

Area Under The Curve ◽

Similarity Score ◽

Data Sets ◽

Prediction Problem ◽

Recent Past ◽

Common Neighbor ◽

Standard Data ◽

Node Attribute ◽

Swarm Intelligence Algorithm

The link prediction problem has received remarkable interest in the recent past. In this paper, a firefly swarm intelligence algorithm is used to perform link prediction by exploiting the topological and node attribute features of a social network. Fireflies are made to traverse the nodes and edges of social networks, and the brightness of the fireflies plays a major role in their movement. The common neighbor method of link prediction is used to compute a similarity score at each iteration. The performance of the proposed algorithm was analyzed over standard data sets using ten-fold validation. The accuracy of the proposed work is measured in terms of Area Under the Curve (AUC), recall, and precision. Experimental results showed that the proposed work outperforms the methods proposed in the literature.
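
The common neighbor score mentioned in this abstract can be illustrated with a minimal sketch. This shows only the standard common-neighbor index, not the authors' firefly algorithm; the function name `common_neighbor_scores` and the toy edge list are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score every non-adjacent node pair by its number of shared neighbors."""
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:  # only pairs that are not already linked
            scores[(u, v)] = len(adj[u] & adj[v])
    return scores


# Hypothetical toy graph (not from the paper).
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
scores = common_neighbor_scores(toy_edges)

# Rank candidate links from most to fewest shared neighbors.
ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)
```

In this toy graph the pair ("a", "d") shares two neighbors ("b" and "c") and is therefore ranked as the most likely missing link.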

An example of spectrum imaging used for comparison of EELS quantitative analysis techniques on Al-Li

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s042482010008794x ◽

1991 ◽

Vol 49 ◽

pp. 726-727

Author(s):

John A. Hunt

Keyword(s):

Quantitative Analysis ◽

Large Data ◽

Difference Spectrum ◽

Large Data Sets ◽

Foil Thickness ◽

Data Sets ◽

Analysis Techniques ◽

Spectrum Imaging ◽

Normal Spectrum ◽

Electron Energy Loss

Spectrum-imaging is a useful technique for comparing different processing methods on very large data sets which are identical for each method. This paper is concerned with comparing methods of electron energy-loss spectroscopy (EELS) quantitative analysis on the Al-Li system. The spectrum-image analyzed here was obtained from an Al-10at%Li foil aged to produce δ' precipitates that can span the foil thickness. Two 1024-channel EELS spectra offset in energy by 1 eV were recorded and stored at each pixel in the 80x80 spectrum-image (25 Mbytes). An energy range of 39-89 eV (20 channels/eV) is represented. During processing, the spectra are either subtracted to create an artifact-corrected difference spectrum, or the energy offset is numerically removed and the spectra are added to create a normal spectrum. The spectrum-images are processed into 2D floating-point images using methods and software described in [1].
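
The two processing paths described in this abstract can be sketched on toy data. This is an illustration under stated assumptions, not the author's software: the 2-byte channel size, the direction of the energy offset, the step-shaped signal, and the fixed per-channel pattern are all hypothetical. Only the 80x80 image size, 1024 channels, 1 eV offset, and 20 channels/eV come from the abstract.

```python
# Values stated in the abstract.
PIXELS = 80 * 80
CHANNELS = 1024
CHANNELS_PER_EV = 20
OFFSET_EV = 1
SHIFT = CHANNELS_PER_EV * OFFSET_EV  # a 1 eV offset spans 20 channels

# Assumption: 2 bytes per channel, which reproduces the stated 25 Mbytes.
BYTES_PER_CHANNEL = 2
size_bytes = PIXELS * 2 * CHANNELS * BYTES_PER_CHANNEL
print(size_bytes / (1024 * 1024))  # size in Mbytes (binary)


def difference_spectrum(s1, s2):
    """Subtract channel by channel; a fixed per-channel artifact cancels."""
    return [a - b for a, b in zip(s1, s2)]


def normal_spectrum(s1, s2, shift=SHIFT):
    """Remove the energy offset numerically, then add the overlapping channels."""
    return [s1[i] + s2[i + shift] for i in range(len(s1) - shift)]


def signal(channel):
    """Hypothetical edge-like signal: a step at channel 500."""
    return 100 if channel >= 500 else 10


# Hypothetical fixed detector pattern, identical in both recordings.
pattern = [(i * 7) % 5 for i in range(CHANNELS)]

# Assumption: the second spectrum sees the same signal shifted by SHIFT channels.
s1 = [signal(i) + pattern[i] for i in range(CHANNELS)]
s2 = [signal(i - SHIFT) + pattern[i] for i in range(CHANNELS)]

diff = difference_spectrum(s1, s2)
normal = normal_spectrum(s1, s2)
```

In the toy difference spectrum the fixed pattern drops out and only the 20 channels around the step remain nonzero, which is the sense in which subtraction is artifact-corrected; the added spectrum keeps the full signal but also the pattern.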

Computer-aided methods for 3-D visualization of serial sections and thick biological specimens

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100129930 ◽

1992 ◽

Vol 50 (2) ◽

pp. 1060-1061

Author(s):

Mark Ellisman ◽

Maryann Martone ◽

Gabriel Soto ◽

Eleizer Masliah ◽

David Hessler ◽

...

Keyword(s):

Alzheimer's Disease ◽

Three Dimensional ◽

Neuritic Plaque ◽

Dimensional Structure ◽

Data Sets ◽

Molecular Physiology ◽

Research Activities ◽

Computer Aided ◽

Dimensional Reconstruction

Structurally-oriented biologists examine cells, tissues, organelles and macromolecules in order to gain insight into cellular and molecular physiology by relating structure to function. The understanding of these structures can be greatly enhanced by the use of techniques for the visualization and quantitative analysis of three-dimensional structure. Three projects from current research activities will be presented in order to illustrate both the present capabilities of computer aided techniques as well as their limitations and future possibilities. The first project concerns the three-dimensional reconstruction of the neuritic plaques found in the brains of patients with Alzheimer's disease. We have developed a software package “Synu” for investigation of 3D data sets which has been used in conjunction with laser confocal light microscopy to study the structure of the neuritic plaque. Tissue sections of autopsy samples from patients with Alzheimer's disease were double-labeled for tau, a cytoskeletal marker for abnormal neurites, and synaptophysin, a marker of presynaptic terminals.

Direct phase determination in electron crystallography: small organic molecules

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100130468 ◽

1992 ◽

Vol 50 (2) ◽

pp. 1166-1167

Author(s):

Douglas L. Dorset

Keyword(s):

Organic Molecules ◽

Data Sets ◽

Temperature Structure ◽

3D Analysis ◽

Intensity Data ◽

Electron Crystallography ◽

Phase Determination ◽

Measured Intensity ◽

3D Data

The quantitative use of electron diffraction intensity data for the determination of crystal structures represents the pioneering achievement in the electron crystallography of organic molecules, an effort largely begun by B. K. Vainshtein and his co-workers. However, despite numerous representative structure analyses yielding results consistent with X-ray determination, this entire effort was viewed with considerable mistrust by many crystallographers. This was no doubt due to the rather high crystallographic R-factors reported for some structures and, more importantly, the failure to convince many skeptics that the measured intensity data were adequate for ab initio structure determinations. We have recently demonstrated the utility of these data sets for structure analyses by direct phase determination based on the probabilistic estimate of three- and four-phase structure invariant sums. Examples include the structure of diketopiperazine using Vainshtein's 3D data, a similar 3D analysis of the room temperature structure of thiourea, and a zonal determination of the urea structure, the latter also based on data collected by the Moscow group.

Automated cell counting of astrocytes on patterned substrates containing aliphatic and charged properties

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s042482010014124x ◽

1995 ◽

Vol 53 ◽

pp. 974-975

Author(s):

W. Shain ◽

H. Ancin ◽

H.C. Craighead ◽

M. Isaacson ◽

L. Kam ◽

...

Keyword(s):

Cell Culture ◽

Cell Attachment ◽

Culture Method ◽

Cell Counting ◽

Data Sets ◽

Nuclear Staining ◽

Double Positive ◽

A Cell ◽

Wafer Test ◽

Cell Densities

Neural prostheses have the potential to restore nervous system functions lost by trauma or disease. Nanofabrication extends this approach to implants for stimulating and recording from single or small groups of neurons in the spinal cord and brain; however, tissue compatibility is a major limitation to their practical application. We are using a cell culture method for quantitatively measuring cell attachment to surfaces designed for nanofabricated neural prostheses. Silicon wafer test surfaces composed of 50-μm bars separated by aliphatic regions were fabricated using methods similar to a procedure described by Kleinfeld et al. Test surfaces contained either a single or double positive charge/residue. Cyanine dyes (diIC18(3)) stained the background and cell membranes (Fig 1); however, identification of individual cells at higher densities was difficult (Fig 2). Nuclear staining with acriflavine allowed discrimination of individual cells and permitted automated counting of nuclei using 3-D data sets from the confocal microscope (Fig 3). For cell attachment assays, LRM55 astroglial cells and astrocytes in primary cell culture were plated at increasing cell densities on test substrates, incubated for 24 hr, fixed, stained, mounted on coverslips, and imaged with a 10x objective.

Cluster analysis for large data sets: applications to individual aerosol particles from the mid-pacific

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100132078 ◽

1992 ◽

Vol 50 (2) ◽

pp. 1488-1489

Author(s):

Thomas W. Shattuck ◽

James R. Anderson ◽

Neil W. Tindale ◽

Peter R. Buseck

Keyword(s):

Cluster Analysis ◽

Chemical Reactivity ◽

Large Data ◽

Large Data Sets ◽

Particle Analysis ◽

Data Sets ◽

Halogen Chemistry ◽

Complete Study ◽

Components Analysis ◽

Automated Scanning

Individual particle analysis involves the study of tens of thousands of particles using automated scanning electron microscopy and elemental analysis by energy-dispersive x-ray emission spectroscopy (EDS). EDS produces large data sets that must be analyzed using multivariate statistical techniques. A complete study uses cluster analysis, discriminant analysis, and factor or principal components analysis (PCA). The three techniques are used in the study of particles sampled during the FeLine cruise to the mid-Pacific ocean in the summer of 1990. The mid-Pacific aerosol provides information on long range particle transport, iron deposition, sea salt ageing, and halogen chemistry. Aerosol particle data sets suffer from a number of difficulties for pattern recognition using cluster analysis. There is a great disparity in the number of observations per cluster and the range of the variables in each cluster. The variables are not normally distributed, they are subject to considerable experimental error, and many values are zero because of finite detection limits. Many of the clusters show considerable overlap because of natural variability, agglomeration, and chemical reactivity.
