Variational Graph Embedding and Clustering with Laplacian Eigenmaps

Author(s):  
Zitai Chen ◽  
Chuan Chen ◽  
Zong Zhang ◽  
Zibin Zheng ◽  
Qingsong Zou

As a fundamental machine learning problem, graph clustering has facilitated various real-world applications, and tremendous efforts have been devoted to it in the past few decades. However, most existing methods, such as spectral clustering, suffer from sparsity, scalability, and robustness issues and struggle to handle high-dimensional raw information in clustering. To address these issues, we propose a deep probabilistic model, called Variational Graph Embedding and Clustering with Laplacian Eigenmaps (VGECLE), which learns node embeddings and assigns node clusters simultaneously. It represents each node as a Gaussian distribution, disentangling the node's true embedding position from the uncertainty arising from the graph. With a Mixture-of-Gaussians (MoG) prior, VGECLE is capable of learning an interpretable clustering through variational inference and the generative process. To better capture pairwise relationships, we propose a Teacher-Student mechanism that encourages each node to learn a better Gaussian from its immediate neighbors under stochastic gradient descent (SGD) training. By optimizing the graph embedding and graph clustering problems as a whole, our model can fully exploit their correlation. To the best of our knowledge, we are the first to tackle graph clustering from a deep probabilistic viewpoint. We perform extensive experiments on both synthetic and real-world networks to corroborate the effectiveness and efficiency of the proposed framework.
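The abstract's key ingredients are per-node Gaussians and a Mixture-of-Gaussians prior. The following numpy sketch is not the authors' implementation; all shapes, parameter values, and names are illustrative assumptions. It only shows how a one-sample Monte Carlo estimate of the KL term between a node's Gaussian and a MoG prior, together with soft cluster assignments, could be computed from such a representation.

```python
import numpy as np

# Hypothetical sizes: n nodes, d-dim embeddings, K clusters.
n, d, K = 5, 2, 3
rng = np.random.default_rng(0)

# Each node i is represented by a Gaussian N(mu[i], diag(sigma2[i])),
# separating its "true" position from its uncertainty.
mu = rng.normal(size=(n, d))
sigma2 = np.exp(rng.normal(size=(n, d)))   # variances > 0

# Mixture-of-Gaussians prior: weights pi, component means m, variances s2.
pi = np.full(K, 1.0 / K)
m = rng.normal(size=(K, d))
s2 = np.ones((K, d))

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian evaluated at x."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

# Reparameterised sample z_i ~ q(z_i) = N(mu_i, diag(sigma2_i)).
z = mu + np.sqrt(sigma2) * rng.normal(size=(n, d))
log_q = log_gaussian(z, mu, sigma2)                                    # log q(z_i)
log_p_comp = np.stack(
    [np.log(pi[k]) + log_gaussian(z, m[k], s2[k]) for k in range(K)], axis=1)
log_p = np.logaddexp.reduce(log_p_comp, axis=1)                        # log p(z_i) under the MoG
kl_mc = log_q - log_p                                                  # one-sample KL estimate per node

# Posterior responsibilities give soft (interpretable) cluster assignments.
responsibilities = np.exp(log_p_comp - log_p[:, None])
print(kl_mc, responsibilities.argmax(axis=1))
```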

Author(s):  
Deepali Virmani ◽  
Nikita Jain ◽  
Ketan Parikh ◽  
Shefali Upadhyaya ◽  
Abhishek Srivastav

This article describes how data becomes relevant when it can be organized, linked with other data, and grouped into clusters. Clustering is the process of organizing a given set of objects into a set of disjoint groups called clusters. There are a number of clustering algorithms, such as k-means, k-medoids, and normalized k-means, so the focus remains on the efficiency and accuracy of these algorithms, as well as on the time clustering takes and on reducing overlap between clusters. K-means is one of the simplest unsupervised learning algorithms that solves the well-known clustering problem. The k-means algorithm partitions data into K clusters with randomly chosen initial centroids, and its reliance on numeric values prevents it from clustering real-world data containing categorical values. Poor selection of initial centroids can also result in poor clustering. This article proposes a variant of k-means that selects the initial centres and normalizes the data, yielding better clustering, reduced overlap, and shorter clustering time.
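The abstract does not spell out the exact centre-selection rule, so the following numpy sketch is only an illustration of the two ingredients it emphasises, min-max normalisation and non-random, spread-out initial centres (here a farthest-point heuristic), plugged into standard k-means; it is not the proposed algorithm itself.

```python
import numpy as np

def normalize(X):
    """Min-max normalise each feature to [0, 1] so no attribute dominates the distance."""
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / np.where(mx > mn, mx - mn, 1.0)

def farthest_point_centres(X, k, rng):
    """Spread-out initial centres: start from a random point, then repeatedly
    pick the point farthest from the centres chosen so far."""
    centres = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d)])
    return np.array(centres)

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    X = normalize(X)
    centres = farthest_point_centres(X, k, rng)
    for _ in range(iters):
        # Assign each point to its nearest centre, then move centres to cluster means.
        labels = np.argmin(np.linalg.norm(X[:, None] - centres[None], axis=2), axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
                        for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return labels, centres

labels, centres = kmeans(np.random.default_rng(1).normal(size=(60, 2)), k=3)
print(np.bincount(labels))
```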


Author(s):  
Bayu Distiawan Trisedya ◽  
Jianzhong Qi ◽  
Rui Zhang

The task of entity alignment between knowledge graphs aims to find entities in two knowledge graphs that represent the same real-world entity. Recently, embedding-based models have been proposed for this task. Such models are built on top of a knowledge graph embedding model that learns entity embeddings to capture the semantic similarity between entities in the same knowledge graph. We propose to learn embeddings that can capture the similarity between entities in different knowledge graphs. Our proposed model helps align entities from different knowledge graphs and hence enables the integration of multiple knowledge graphs. Our model exploits the large numbers of attribute triples existing in the knowledge graphs and generates attribute character embeddings. The attribute character embedding shifts the entity embeddings from the two knowledge graphs into the same space by computing the similarity between entities based on their attributes. We use a transitivity rule to further enrich the number of attributes of an entity and thereby enhance the attribute character embedding. Experiments using real-world knowledge bases show that our proposed model achieves consistent improvements over the baseline models by over 50% in terms of hits@1 on the entity alignment task.
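As a rough illustration of the attribute character embedding idea (not the paper's actual encoder, which is more elaborate), the numpy sketch below encodes attribute values at the character level so that literal strings from two knowledge graphs land in a shared space and can be compared. The entity names, attribute keys, and the mean-of-characters encoder are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16
# Hypothetical character embedding table (ASCII only, for illustration).
char_emb = rng.normal(size=(128, dim))

def encode_attribute(value: str) -> np.ndarray:
    """Compositional character encoding of an attribute value:
    here simply the mean of its character embeddings."""
    idx = [ord(c) % 128 for c in value]
    return char_emb[idx].mean(axis=0)

def entity_signature(attributes: dict) -> np.ndarray:
    """Aggregate an entity's attribute-value encodings into one vector."""
    vecs = [encode_attribute(v) for v in attributes.values()]
    return np.mean(vecs, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two entities from different knowledge graphs describing the same place:
e1 = {"label": "Melbourne", "postcode": "3000"}
e2 = {"name": "City of Melbourne", "zip": "3000"}
print(cosine(entity_signature(e1), entity_signature(e2)))
```

Because the encoding depends only on the literal attribute strings, entities from two separately trained knowledge graphs can be compared in the same space, which is the role the attribute character embedding plays in the paper.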


2020 ◽  
Vol 90 (1) ◽  
pp. 54-74 ◽  
Author(s):  
MALEKA DONALDSON

In this portrait, Maleka Donaldson vividly illustrates how two teachers in real-world, public school settings convey their expectations for kindergarten student performance and set the tone for learning from mistakes and feedback. Research in psychology and education has established the benefits of corrective feedback on learning but has not closely examined how practicing teachers respond to mistakes made by young children during day-to-day instruction. Donaldson draws on extended observations of teacher-student interactions to juxtapose the two contexts and reveal divergent techniques that the participating teachers use to frame mistakes and correct answers during instruction. She compares these variations and considers how each teacher's pedagogical tools could be integrated into a mistake-response toolkit that could fundamentally reshape learning from mistakes for kindergarteners.


2016 ◽  
Author(s):  
Anna Navrotskaya ◽  
Victor Il’ev

2020 ◽  
Vol 8 (2) ◽  
Author(s):  
Leo Torres ◽  
Kevin S Chan ◽  
Tina Eliassi-Rad

Abstract Graph embedding seeks to build a low-dimensional representation of a graph $G$. This low-dimensional representation is then used for various downstream tasks. One popular approach is Laplacian Eigenmaps (LE), which constructs a graph embedding based on the spectral properties of the Laplacian matrix of $G$. The intuition behind LE, and behind many other embedding techniques, is that the embedding of a graph must respect node similarity: similar nodes must have embeddings that are close to one another. Here, we dispose of this distance-minimization assumption. Instead, we use the Laplacian matrix to find an embedding based on geometric rather than spectral properties, by leveraging the so-called simplex geometry of $G$. We introduce a new approach, Geometric Laplacian Eigenmap Embedding, and demonstrate that it outperforms various other techniques (including LE) in the tasks of graph reconstruction and link prediction.
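For readers unfamiliar with the Laplacian Eigenmaps baseline the paper starts from, here is a minimal numpy sketch of spectral LE on a toy graph; GLEE itself replaces the eigenvector-based (spectral) embedding shown here with a geometric, simplex-based construction, so this is background rather than the paper's method.

```python
import numpy as np

def laplacian_eigenmaps(A: np.ndarray, dim: int) -> np.ndarray:
    """Classical LE baseline: embed nodes using the eigenvectors of the graph
    Laplacian L = D - A associated with the smallest non-zero eigenvalues."""
    D = np.diag(A.sum(axis=1))
    L = D - A
    vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]          # skip the trivial constant eigenvector

# Tiny example: a 4-node path graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(laplacian_eigenmaps(A, dim=2))
```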


Author(s):  
Qikun Xiang ◽  
Jie Zhang ◽  
Ido Nevat ◽  
Pengfei Zhang

Data trustworthiness is a crucial issue in real-world participatory sensing applications. Without considering this issue, different types of worker misbehavior, especially the challenging collusion attacks, can result in biased and inaccurate estimation and decision making. We propose a novel trust-based mixture of Gaussian processes (GP) model for spatial regression to jointly detect such misbehavior and accurately estimate the spatial field. We develop a Markov chain Monte Carlo (MCMC)-based algorithm to efficiently perform Bayesian inference of the model. Experiments using two real-world datasets show the superior robustness of our model compared with existing approaches.
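As background only, the following numpy sketch shows the plain Gaussian process regression building block (posterior mean of a spatial field under an RBF kernel given worker reports); the paper's trust-based mixture of GPs and its MCMC inference are considerably richer, and the kernel parameters, noise level, and data below are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D locations."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior_mean(X_train, y_train, X_test, noise=0.1):
    """Posterior mean of a GP spatial field at test locations,
    given noisy observations at training locations."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, y_train)

x = np.linspace(0, 5, 20)
y = np.sin(x) + 0.1 * np.random.default_rng(0).normal(size=x.shape)
print(gp_posterior_mean(x, y, np.array([1.5, 2.5, 3.5])))
```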


2019 ◽  
Author(s):  
Sven Festag ◽  
Cord Spreckelsen

BACKGROUND: Collaborative privacy-preserving training methods allow for the integration of locally stored private data sets into machine learning approaches while ensuring confidentiality and nondisclosure. OBJECTIVE: In this work we assess the performance of a state-of-the-art neural network approach for the detection of protected health information in texts, trained in a collaborative privacy-preserving way. METHODS: The training adopts distributed selective stochastic gradient descent (i.e., it works by exchanging local learning results achieved on private data sets). Five networks were trained on separate real-world clinical data sets using the privacy-protecting protocol. In total, the data sets contain 1304 real longitudinal patient records for 296 patients. RESULTS: These networks reached a mean F1 value of 0.955. The gold-standard centralized training, which is based on the union of all data sets and does not take data security into consideration, reached a final value of 0.962. CONCLUSIONS: Using real-world clinical data, our study shows that detection of protected health information can be secured by collaborative privacy-preserving training. In general, the approach demonstrates the feasibility of deep learning on distributed and confidential clinical data while ensuring data protection.
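A toy numpy sketch of the selective-sharing idea behind distributed selective SGD, on a linear least-squares model rather than the study's neural network: each site computes a gradient on its private data and uploads only its largest components to a coordinating server. The sharing fraction, learning rate, and model are illustrative assumptions, not the study's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites, d, share_fraction, lr = 5, 10, 0.1, 0.05

# Global parameters held by a coordinating server; each site keeps its
# private data locally and only uploads a selected slice of its update.
w_global = np.zeros(d)
site_data = [(rng.normal(size=(50, d)), rng.normal(size=50)) for _ in range(n_sites)]

for _ in range(20):                                    # communication rounds
    for X, y in site_data:
        w = w_global.copy()                            # download current global parameters
        grad = 2 * X.T @ (X @ w - y) / len(y)          # local least-squares gradient
        k = max(1, int(share_fraction * d))
        top = np.argsort(np.abs(grad))[-k:]            # keep only the largest components
        w_global[top] -= lr * grad[top]                # selective upload to the server

print(w_global)
```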


2020 ◽  
Vol 2020 (12) ◽  
pp. 124010
Author(s):  
Sebastian Goldt ◽  
Madhu S Advani ◽  
Andrew M Saxe ◽  
Florent Krzakala ◽  
Lenka Zdeborová
