Details (Don't) Matter: Isolating Cluster Information in Deep Embedded Spaces

Author(s):  
Lukas Miklautz ◽  
Lena G. M. Bauer ◽  
Dominik Mautz ◽  
Sebastian Tschiatschek ◽  
Christian Böhm ◽  
...  

Deep clustering techniques combine representation learning with clustering objectives to improve their performance. Among existing deep clustering techniques, autoencoder-based methods are the most prevalent ones. While they achieve promising clustering results, they suffer from an inherent conflict between preserving details, as expressed by the reconstruction loss, and finding similar groups by ignoring details, as expressed by the clustering loss. This conflict leads to brittle training procedures, dependence on trade-off hyperparameters and less interpretable results. We propose our framework, ACe/DeC, that is compatible with Autoencoder Centroid based Deep Clustering methods and automatically learns a latent representation consisting of two separate spaces. The clustering space captures all cluster-specific information and the shared space explains general variation in the data. This separation resolves the above-mentioned conflict and allows our method to learn both detailed reconstructions and cluster-specific abstractions. We evaluate our framework with extensive experiments to show several benefits: (1) cluster performance – on various data sets we outperform relevant baselines; (2) no hyperparameter tuning – this improved performance is achieved without introducing new clustering-specific hyperparameters; (3) interpretability – isolating the cluster-specific information in a separate space is advantageous for data exploration and interpreting the clustering results; and (4) dimensionality of the embedded space – we automatically learn a low-dimensional space for clustering. Our ACe/DeC framework isolates cluster information and increases stability and interpretability, while improving cluster performance.
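
A minimal sketch of the split-latent-space idea described above (not the authors' ACe/DeC implementation): the decoder reconstructs from both sub-spaces, while a centroid-based clustering loss only sees the clustering sub-space. Layer sizes, the learnable centroids and the k-means-style loss are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SplitLatentAE(nn.Module):
    """Autoencoder whose latent code is split into a clustering space and a shared space."""
    def __init__(self, in_dim=784, cluster_dim=10, shared_dim=10, n_clusters=10):
        super().__init__()
        latent_dim = cluster_dim + shared_dim
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))
        self.cluster_dim = cluster_dim
        # learnable centroids live only in the clustering sub-space (assumption)
        self.centroids = nn.Parameter(torch.randn(n_clusters, cluster_dim))

    def forward(self, x):
        z = self.encoder(x)
        z_cluster = z[:, :self.cluster_dim]          # cluster-specific information
        x_hat = self.decoder(z)                      # reconstruction uses both sub-spaces
        d = torch.cdist(z_cluster, self.centroids)   # distances to centroids
        assign = d.argmin(dim=1)
        rec_loss = ((x - x_hat) ** 2).mean()                 # detail-preserving loss
        clu_loss = d.gather(1, assign[:, None]).mean()       # cluster-abstraction loss
        return rec_loss, clu_loss, assign
```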

Author(s):  
Fenxiao Chen ◽  
Yun-Cheng Wang ◽  
Bin Wang ◽  
C.-C. Jay Kuo

Research on graph representation learning has received great attention in recent years since most data in real-world applications come in the form of graphs. High-dimensional graph data are often in irregular forms. They are more difficult to analyze than image/video/audio data defined on regular lattices. Various graph embedding techniques have been developed to convert the raw graph data into a low-dimensional vector representation while preserving the intrinsic graph properties. In this review, we first explain the graph embedding task and its challenges. Next, we review a wide range of graph embedding techniques with insights. Then, we evaluate several state-of-the-art methods on small and large data sets and compare their performance. Finally, potential applications and future directions are presented.
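
A minimal illustration of the embedding task the review surveys, in the spirit of DeepWalk: random walks over a graph, treated as "sentences" and fed to skip-gram, give each node a low-dimensional vector. Walk length, walk count and dimension are arbitrary choices here, and the snippet assumes networkx and gensim (version 4 or later) are available.

```python
import random
import networkx as nx
from gensim.models import Word2Vec

def deepwalk_embeddings(G, walks_per_node=10, walk_length=20, dim=32):
    walks = []
    for _ in range(walks_per_node):
        for node in G.nodes():
            walk = [node]
            while len(walk) < walk_length:
                nbrs = list(G.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(random.choice(nbrs))
            walks.append([str(n) for n in walk])   # gensim expects string tokens
    # skip-gram over the walks yields one low-dimensional vector per node
    model = Word2Vec(walks, vector_size=dim, window=5, min_count=0, sg=1)
    return {node: model.wv[str(node)] for node in G.nodes()}
```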


2019 ◽  
Vol 15 (3) ◽  
pp. 346-358
Author(s):  
Luciano Barbosa

Purpose: Matching instances of the same entity, a task known as entity resolution, is a key step in the process of data integration. This paper aims to propose a deep learning network that learns different representations of Web entities for entity resolution. Design/methodology/approach: To match Web entities, the proposed network learns the following representations of entities: embeddings, which are vector representations of the words in the entities in a low-dimensional space; convolutional vectors from a convolutional layer, which capture short-distance patterns in word sequences in the entities; and bag-of-words vectors, created by a bow layer that learns weights for words in the vocabulary based on the task at hand. Given a pair of entities, the similarity between their learned representations is used as a feature by a binary classifier that identifies a possible match. In addition to those features, the classifier also uses a modification of inverse document frequency for pairs, which identifies discriminative words in pairs of entities. Findings: The proposed approach was evaluated on two commercial and two academic entity resolution benchmarking data sets. The results show that the proposed strategy outperforms previous approaches on the commercial data sets, which are more challenging, and achieves results similar to its competitors on the academic data sets. Originality/value: No previous work has used a single deep learning framework to learn different representations of Web entities for entity resolution.
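
A rough sketch of feeding similarities of several learned entity representations into a binary match classifier, as described above. It is not the paper's architecture: vocabulary size, dimensions, the use of plain cosine similarity per representation, and the simple handling of padding tokens are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EntityMatcher(nn.Module):
    def __init__(self, vocab_size=30000, emb_dim=100, conv_dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, conv_dim, kernel_size=3, padding=1)
        self.bow_weights = nn.Parameter(torch.ones(vocab_size))  # learned per-word weights
        self.clf = nn.Linear(3, 1)   # one similarity feature per representation

    def represent(self, tokens):                      # tokens: (batch, seq_len) word ids
        e = self.emb(tokens)                          # (batch, seq, emb_dim)
        emb_vec = e.mean(dim=1)                       # averaged word embeddings
        conv_vec = F.relu(self.conv(e.transpose(1, 2))).max(dim=2).values
        bow = torch.zeros(tokens.size(0), self.bow_weights.size(0), device=tokens.device)
        bow.scatter_add_(1, tokens, self.bow_weights[tokens])   # learned bag-of-words vector
        return emb_vec, conv_vec, bow

    def forward(self, left, right):
        feats = [F.cosine_similarity(a, b, dim=1)
                 for a, b in zip(self.represent(left), self.represent(right))]
        return torch.sigmoid(self.clf(torch.stack(feats, dim=1)))   # match probability
```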


2018 ◽  
Author(s):  
Damon H. May ◽  
Jeffrey Bilmes ◽  
William S. Noble

Despite an explosion of data in public repositories, peptide mass spectra are usually analyzed by each laboratory in isolation, treating each experiment as if it has no relationship to any others. This approach fails to exploit the wealth of existing, previously analyzed mass spectrometry data. Others have jointly analyzed many mass spectra, often using clustering. However, mass spectra are not necessarily best summarized as clusters, and although new spectra can be added to existing clusters, clustering methods previously applied to mass spectra do not allow new clusters to be defined without completely re-clustering. As an alternative, we propose to train a deep neural network, called “GLEAMS,” to learn an embedding of spectra into a low-dimensional space in which spectra generated by the same peptide are close to one another. We demonstrate empirically the utility of this learned embedding by propagating annotations from labeled to unlabeled spectra. We further use GLEAMS to detect groups of unidentified, proximal spectra representing the same peptide, and we show how to use these spectral communities to reveal misidentified spectra and to characterize frequently observed but consistently unidentified molecular species. We provide a software implementation of our approach, along with a tool to quickly embed additional spectra using a pre-trained model, to facilitate large-scale analyses.
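
A small illustration (not the GLEAMS implementation) of the annotation-propagation step: once spectra live in a space where same-peptide spectra are close, a nearest-neighbour vote transfers peptide labels to unlabeled spectra. The neighbour count and distance threshold are placeholders.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def propagate_annotations(emb_labeled, labels, emb_unlabeled, k=5, max_dist=0.5):
    """Assign each unlabeled embedded spectrum the majority peptide among its
    k nearest labeled neighbours, but only if those neighbours are close enough."""
    knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
    knn.fit(emb_labeled, labels)
    dists, _ = knn.kneighbors(emb_unlabeled)
    preds = knn.predict(emb_unlabeled).astype(object)
    # leave spectra unannotated when even the nearest labeled spectrum is far away
    preds[dists.min(axis=1) > max_dist] = None
    return preds
```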


Entropy ◽  
2020 ◽  
Vol 22 (3) ◽  
pp. 351
Author(s):  
Nezamoddin N. Kachouie ◽  
Meshal Shutaywi

Background: A common task in machine learning is clustering data into different groups based on similarities. Clustering methods can be divided into two groups: linear and nonlinear. A commonly used linear clustering method is K-means. Its extension, kernel K-means, is a nonlinear technique that utilizes a kernel function to project the data to a higher-dimensional space. The projected data are then clustered into different groups. Different kernels do not perform similarly when they are applied to different datasets. Methods: A kernel function might be relevant for one application but project data poorly for another application. In turn, choosing the right kernel for an arbitrary dataset is a challenging task. To address this challenge, a potential approach is aggregating the clustering results to obtain an impartial clustering result regardless of the selected kernel function. To this end, the main challenge is how to aggregate the clustering results. A potential solution is to combine the clustering results using a weight function. In this work, we introduce Weighted Mutual Information (WMI) for calculating the weights of different clustering methods based on their performance and combining their results. The performance of each method is evaluated using a training set with known labels. Results: We applied the proposed Weighted Mutual Information to four data sets that cannot be linearly separated. We also tested the method under different noise conditions. Conclusions: Our results show that the proposed Weighted Mutual Information method is impartial, does not rely on a single kernel, and performs better than each individual kernel, especially in high noise.
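
A hedged sketch of the aggregation idea: each kernel clustering is weighted by its (normalized) mutual information with the known training labels, and the weighted co-association matrix is re-clustered to give a consensus result. The final consensus step via spectral clustering is an assumption, not necessarily the paper's exact recipe.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import normalized_mutual_info_score

def weighted_consensus(clusterings, train_idx, train_labels, n_clusters):
    """clusterings: list of label arrays over all samples, one per kernel."""
    # weight each clustering by its agreement with the labeled training subset
    weights = np.array([normalized_mutual_info_score(train_labels, c[train_idx])
                        for c in clusterings])
    weights = weights / weights.sum()
    n = len(clusterings[0])
    coassoc = np.zeros((n, n))
    for w, c in zip(weights, clusterings):
        coassoc += w * (c[:, None] == c[None, :])   # weighted co-association matrix
    final = SpectralClustering(n_clusters=n_clusters, affinity="precomputed")
    return final.fit_predict(coassoc)
```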


2022 ◽  
Vol 40 (3) ◽  
pp. 1-28
Author(s):  
Surong Yan ◽  
Kwei-Jay Lin ◽  
Xiaolin Zheng ◽  
Haosen Wang

Explicit and implicit knowledge about users and items has been used to describe complex and heterogeneous side information for recommender systems (RSs). Many existing methods use knowledge graph embedding (KGE) to learn the representation of a user-item knowledge graph (KG) in low-dimensional space. In this article, we propose a lightweight end-to-end joint learning framework for fusing the tasks of KGE and RSs at the model level. Our method uses a lightweight KG embedding approach with bidirectional bijection relation-type modeling to enable scalability for large graphs, and self-adaptive negative sampling to optimize negative sample generation. Our method further generates integrated views for users and items based on relation types to explicitly model users’ preferences and items’ features, respectively. Finally, we add virtual “recommendation” relations between the integrated views of users and items to model the preferences of users on items, seamlessly integrating the RS with the user-item KG over a unified graph. Experimental results on multiple datasets and benchmarks show that our method achieves better recommendation accuracy than existing state-of-the-art methods. Complexity and runtime analysis suggests that our method has lower time and space complexity than most existing methods and improves scalability.
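
A schematic (not the paper's model) of the "virtual recommendation relation" idea: users and items are embedded as KG entities, and a user's preference for an item is scored by the same scoring function used for ordinary KG triples. The TransE-style scorer and the embedding dimension are assumptions for illustration.

```python
import torch
import torch.nn as nn

class JointKGRec(nn.Module):
    def __init__(self, n_entities, n_relations, dim=64):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations + 1, dim)   # last id = virtual "recommendation" relation
        self.rec_rel = n_relations

    def score(self, heads, rels, tails):
        # TransE-style score: smaller translation error = more plausible triple
        return -(self.ent(heads) + self.rel(rels) - self.ent(tails)).norm(p=1, dim=1)

    def recommend(self, users, items):
        rec = torch.full_like(users, self.rec_rel)
        # KG triples and user-item pairs share one scorer over a unified graph
        return self.score(users, rec, items)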


2017 ◽  
Vol 29 (4) ◽  
pp. 1053-1102 ◽  
Author(s):  
Hossein Soleimani ◽  
David J. Miller

Many classification tasks require both labeling objects and determining label associations for parts of each object. Example applications include labeling segments of images or determining relevant parts of a text document when the training labels are available only at the image or document level. This task is usually referred to as multi-instance (MI) learning, where the learner typically receives a collection of labeled (or sometimes unlabeled) bags, each containing several segments (instances). We propose a semisupervised MI learning method for multilabel classification. Most MI learning methods treat instances in each bag as independent and identically distributed samples. However, in many practical applications, instances are related to each other and should not be considered independent. Our model discovers a latent low-dimensional space that captures structure within each bag. Further, unlike many other MI learning methods, which are primarily developed for binary classification, we model multiple classes jointly, thus also capturing possible dependencies between different classes. We develop our model within a semisupervised framework, which leverages both labeled and, typically, a larger set of unlabeled bags for training. We develop several efficient inference methods for our model. We first introduce a Markov chain Monte Carlo method for inference, which can handle arbitrary relations between bag labels and instance labels, including the standard hard-max MI assumption. We also develop an extension of our model that uses stochastic variational Bayes methods for inference, and thus scales better to massive data sets. Experiments show that our approach outperforms several MI learning and standard classification methods on both bag-level and instance-level label prediction. All code for replicating our experiments is available from https://github.com/hsoleimani/MLTM .
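
The standard hard-max multi-instance assumption mentioned above, in a few lines: a bag is positive for a class exactly when at least one of its instances is positive for that class. The shapes here are purely illustrative.

```python
import numpy as np

def bag_labels_hard_max(instance_labels):
    """instance_labels: (n_instances, n_classes) binary matrix for one bag."""
    return instance_labels.max(axis=0)   # bag-level label per class

bag = np.array([[0, 1, 0],
                [0, 0, 0],
                [1, 1, 0]])
print(bag_labels_hard_max(bag))  # -> [1 1 0]
```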


2003 ◽  
Vol 2 (1) ◽  
pp. 68-77 ◽  
Author(s):  
Alistair Morrison ◽  
Greg Ross ◽  
Matthew Chalmers

The term ‘proximity data’ refers to data sets within which it is possible to assess the similarity of pairs of objects. Multidimensional scaling (MDS) is applied to such data and attempts to map high-dimensional objects onto low-dimensional space through the preservation of these similarity relations. Standard MDS techniques have in the past suffered from high computational complexity and, as such, could not feasibly be applied to data sets over a few thousand objects in size. Through a novel hybrid approach based upon stochastic sampling, interpolation and spring models, we have designed an algorithm running in O(N√N). Using Chalmers’ 1996 O(N²) spring model as a benchmark for the evaluation of our technique, we compare layout quality and run times using sets of synthetic and real data. Our algorithm executes significantly faster than Chalmers’ 1996 algorithm, while producing superior layouts. In reducing complexity and run time, we allow the visualisation of data sets of previously infeasible size. Our results indicate that our method is a solid foundation for interactive and visual exploration of data.
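
A toy spring-model step of the kind the hybrid algorithm builds on: each point is pulled toward or away from a small random sample of others so that layout distances approach the original high-dimensional distances. Sample size, learning rate and iteration count are arbitrary choices for illustration; the published method additionally interpolates non-sampled points to reach O(N√N).

```python
import numpy as np

def spring_layout(X, n_iter=100, sample_size=10, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    pos = rng.normal(size=(n, 2))                    # random initial 2-D layout
    for _ in range(n_iter):
        for i in range(n):
            js = rng.choice(n, size=sample_size, replace=False)
            d_high = np.linalg.norm(X[i] - X[js], axis=1)        # target distances
            diff = pos[i] - pos[js]
            d_low = np.linalg.norm(diff, axis=1) + 1e-9          # current layout distances
            force = ((d_low - d_high) / d_low)[:, None] * diff   # spring forces
            pos[i] -= lr * force.mean(axis=0)
    return pos
```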


2021 ◽  
Vol 21 (3) ◽  
pp. 1-15
Author(s):  
Guangwei Gao ◽  
Dong Zhu ◽  
Huimin Lu ◽  
Yi Yu ◽  
Heyou Chang ◽  
...  

Super-resolution methods for facial images based on representation learning have become very effective due to their efficiency. The key problem in facial image super-resolution is to reveal the latent relationship between the low-resolution (LR) and the corresponding high-resolution (HR) training patch pairs. To simultaneously utilize the contextual information of the target position and the manifold structure of the primitive HR space, in this work we design a robust context-patch facial image super-resolution scheme via kernel locality-constrained coupled-layer regression (KLC2LR) to obtain the desired HR version from the acquired LR image. Here, KLC2LR acquires contextual surrounding patches to represent the target patch and adds an HR-layer constraint to compensate for detail information. Additionally, KLC2LR acquires more high-frequency information by searching for nearest neighbors in the HR sample space. We also utilize a kernel function to map features from the original low-dimensional space into a high-dimensional one to capture potential nonlinear characteristics. Comparative experiments in both noisy and noiseless cases verify that the proposed method performs better than many existing predominant facial image super-resolution methods.


Author(s):  
Fuxiang Zhang ◽  
Xin Wang ◽  
Zhao Li ◽  
Jianxin Li

Representation learning of knowledge graphs aims to project both entities and relations as vectors in a continuous low-dimensional space. Relation Hierarchical Structure (RHS), which is constructed by a generalization relationship named subRelationOf between relations, can improve the overall performance of knowledge representation learning. However, most of the existing methods ignore this critical information, and a straightforward way of considering RHS may have a negative effect on the embeddings and thus reduce the model performance. In this paper, we propose a novel method named TransRHS, which is able to incorporate RHS seamlessly into the embeddings. More specifically, TransRHS encodes each relation as a vector together with a relation-specific sphere in the same space. Our TransRHS employs the relative positions among the vectors and spheres to model the subRelationOf, which embodies the inherent generalization relationships among relations. We evaluate our model on two typical tasks, i.e., link prediction and triple classification. The experimental results show that our TransRHS model significantly outperforms all baselines on both tasks, which verifies that the RHS information is significant to representation learning of knowledge graphs, and TransRHS can effectively and efficiently fuse RHS into knowledge graph embeddings.
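
One plausible reading of the geometric constraint described above, written as a hinge loss. This is my interpretation for illustration, not the exact TransRHS objective: when r1 subRelationOf r2 holds, the vector of the child relation r1 should fall inside the sphere attached to the parent relation r2 (here the sphere is centred at r2's vector, with a learned per-relation radius).

```python
import torch
import torch.nn as nn

class RelationSpheres(nn.Module):
    def __init__(self, n_relations, dim=100):
        super().__init__()
        self.rel_vec = nn.Embedding(n_relations, dim)
        self.rel_radius = nn.Embedding(n_relations, 1)   # per-relation sphere radius

    def sub_relation_loss(self, child, parent, margin=0.0):
        dist = (self.rel_vec(child) - self.rel_vec(parent)).norm(dim=1)
        radius = self.rel_radius(parent).squeeze(1).abs()
        # penalise child relation vectors that lie outside the parent sphere
        return torch.relu(dist - radius + margin).mean()
```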


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Tie-Jun Li ◽  
Chih-Cheng Chen ◽  
Jian-jun Liu ◽  
Gui-fang Shao ◽  
Christopher Chun Ki Chan

We apply terahertz (THz) time-domain spectroscopy imaging technology to perform nondestructive detection on three industrial ceramic matrix composite (CMC) samples and one silicon slice with defects. In terms of spectrum recognition, a low-resolution THz spectrum image results in ineffective recognition of sample defect features. Therefore, in this article, we propose a spectrum clustering recognition model based on t-distributed stochastic neighbor embedding (t-SNE) to address this ineffective sample defect recognition. Firstly, we propose a model that clusters the different spectra drawn from the imaging spectrum data sets in a reduced-dimensional space, in order to judge whether a sample contains a defect-indicating feature. Second, we improve computational efficiency by mapping spectrum data samples from the high-dimensional space to a low-dimensional space with a manifold learning algorithm (t-SNE). Finally, to achieve a visible observation of sample features in the low-dimensional space, we use a conditional probability distribution to measure the distance-invariant similarity. Comparative experiments indicate that our model can judge the existence of sample defect features through spectrum clustering, as a pre-detection step for image analysis.
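
A compact sketch of the pre-detection pipeline described above, using the scikit-learn t-SNE implementation: map high-dimensional spectra into a 2-D space and cluster them there to flag possible defect signatures. The number of clusters, the perplexity and the choice of K-means for the final grouping are illustrative assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

def cluster_spectra(spectra, n_clusters=2, perplexity=30, seed=0):
    """spectra: (n_pixels, n_frequencies) matrix of THz spectra."""
    low_dim = TSNE(n_components=2, perplexity=perplexity,
                   random_state=seed).fit_transform(spectra)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(low_dim)
    return low_dim, labels   # inspect the 2-D map to see which cluster marks defects
```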

