latent space
Recently Published Documents

Total documents: 821 (past five years: 629)
H-index: 28 (past five years: 10)

2022 · Vol 40 (4) · pp. 1-27
Author(s): Zhongwei Xie, Ling Liu, Yanzhao Wu, Luo Zhong, Lin Li

This article introduces a two-phase deep feature engineering framework for efficient learning of semantics-enhanced joint embeddings, which clearly separates deep feature engineering in data preprocessing from training the text-image joint embedding model. We use the Recipe1M dataset for the technical description and empirical validation. In preprocessing, we combine deep features with semantic context features derived from the raw text-image input data: an LSTM identifies key terms; deep NLP models from the BERT family, TextRank, or TF-IDF produce ranking scores for those terms; and Word2vec generates the vector representation of each key term. We also leverage Wide ResNet50 and Word2vec to extract and encode the image-category semantics of food images, which helps align the learned recipe and image embeddings in the joint latent space. In joint embedding learning, we optimize a batch-hard triplet loss with a soft margin and double negative sampling, also taking into account a category-based alignment loss and a discriminator-based alignment loss. Extensive experiments demonstrate that our SEJE approach with deep feature engineering significantly outperforms the state-of-the-art approaches.
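The batch-hard, soft-margin triplet objective described above can be illustrated in a few lines. This is a minimal NumPy sketch (function and variable names are hypothetical); it omits the double negative sampling and the category- and discriminator-based alignment losses:

```python
import numpy as np

def batch_hard_soft_margin_loss(emb_a, emb_b, labels):
    """Batch-hard triplet loss with soft margin across two modalities.

    emb_a: anchor embeddings (e.g., recipe text), shape (n, d)
    emb_b: candidate embeddings (e.g., images), shape (n, d)
    labels: shape (n,); entries with the same label are positive pairs.
    Each anchor must have at least one positive and one negative in the batch.
    """
    # Pairwise Euclidean distances between the two modalities.
    diff = emb_a[:, None, :] - emb_b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)

    same = labels[:, None] == labels[None, :]
    losses = []
    for i in range(len(labels)):
        hardest_pos = dist[i][same[i]].max()   # farthest positive
        hardest_neg = dist[i][~same[i]].min()  # closest negative
        # Soft margin replaces max(0, m + dp - dn) with log(1 + exp(dp - dn)).
        losses.append(np.log1p(np.exp(hardest_pos - hardest_neg)))
    return float(np.mean(losses))
```

Well-aligned batches (positives close, negatives far) drive the soft-margin term smoothly toward zero, unlike a hard margin which saturates exactly at zero.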


2022 · Vol 16 (2) · pp. 1-18
Author(s): Xueyuan Wang, Hongpo Zhang, Zongmin Wang, Yaqiong Qiao, Jiangtao Ma, ...

Cross-network anchor link discovery is an important research problem with many applications in heterogeneous social networks. Existing schemes can provide reasonable link discovery results, but the quality of those results depends on platform-specific features, so their stability has no theoretical guarantee. This article employs user embedding features to model the relationship between cross-platform accounts: the more similar the user embedding features, the more similar the two accounts, where similarity is measured by the distance between the user features in the latent space. Based on these features, this article proposes an embedding-representation-based method, Con&Net (Content and Network), to solve the cross-network anchor link discovery problem. Con&Net combines a user's profile features, user-generated content (UGC) features, and social structure features to measure the similarity of two user accounts. It first trains on the user's profile features to obtain a profile embedding, then trains on the network structure of the nodes to obtain a structure embedding. It concatenates the two embeddings into a single vector and computes the cosine similarity of the resulting vectors, which serves as the similarity of the user accounts. Finally, Con&Net predicts links for account pairs across the two networks based on this similarity. Extensive experiments on Sina Weibo and Twitter networks show that Con&Net outperforms state-of-the-art methods: the area under the receiver operating characteristic (ROC) curve (AUC) for anchor-link prediction is 11% higher than the baseline method, and Precision@30 is 25% higher.
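The similarity computation described above, concatenating the profile and structure embeddings and then taking cosine similarity, can be sketched as follows (names are illustrative, not from the paper):

```python
import numpy as np

def account_similarity(profile_u, structure_u, profile_v, structure_v):
    """Concatenate each account's profile and structure embeddings,
    then return the cosine similarity of the two concatenated vectors."""
    u = np.concatenate([profile_u, structure_u])
    v = np.concatenate([profile_v, structure_v])
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```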


2022 · Vol 41 (2) · pp. 1-15
Author(s): Chuankun Zheng, Ruzhang Zheng, Rui Wang, Shuang Zhao, Hujun Bao

In this article, we introduce a compact representation for measured BRDFs by leveraging Neural Processes (NPs). Unlike prior methods that express those BRDFs as discrete high-dimensional matrices or tensors, our technique considers measured BRDFs as continuous functions and works in the corresponding function spaces. Specifically, provided the evaluations of a set of BRDFs, such as those in the MERL and EPFL datasets, our method learns a low-dimensional latent space as well as a few neural networks that encode and decode measured or new BRDFs into and from this space in a non-linear fashion. Leveraging this latent space and the flexibility offered by the NP formulation, our encoded BRDFs are highly compact and offer better accuracy than prior methods. We demonstrate the practical usefulness of our approach via two important applications, BRDF compression and editing. Additionally, we design two alternative post-trained decoders to, respectively, achieve a better compression ratio for individual BRDFs and enable importance sampling of BRDFs.
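A toy sketch of the NP-style encode/decode idea: per-sample features are aggregated by mean pooling into one latent code (making the encoder invariant to sample order), and a decoder maps the code plus a query direction to a reflectance value. Fixed random weights stand in for the trained networks; everything here is illustrative, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in weights for the trained encoder/decoder networks.
W_enc = rng.normal(size=(4, 8))       # (direction, value) pair -> features
W_dec = rng.normal(size=(8 + 3,))     # (latent code, query direction) -> value

def encode(dirs, vals):
    """Aggregate BRDF samples into a single latent code by mean pooling."""
    pairs = np.concatenate([dirs, vals[:, None]], axis=1)  # shape (n, 4)
    return np.tanh(pairs @ W_enc).mean(axis=0)             # shape (8,)

def decode(z, query_dir):
    """Predict reflectance at a query direction from the latent code."""
    return float(np.concatenate([z, query_dir]) @ W_dec)
```

Because the aggregation is a mean, the encoder treats the BRDF samples as a set: permuting them leaves the latent code unchanged.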


2022 · Vol 3 (1) · pp. 1-26
Author(s): Omid Hajihassani, Omid Ardakanian, Hamzeh Khazaei

The abundance of data collected by sensors in Internet of Things devices and the success of deep neural networks in uncovering hidden patterns in time series data have led to mounting privacy concerns, because private and sensitive information can potentially be learned from sensor data by applications that have access to it. In this article, we examine the tradeoff between utility and privacy loss by learning low-dimensional representations that are useful for data obfuscation. We propose deterministic and probabilistic transformations in the latent space of a variational autoencoder to synthesize time series data such that intrusive inferences are prevented while desired inferences can still be made with sufficient accuracy. In the deterministic case, we use a linear transformation to move the representation of the input data in the latent space such that the reconstructed data is likely to have the same public attribute but a different private attribute than the original input. In the probabilistic case, we apply the linear transformation to the latent representation with some probability. We compare our technique with autoencoder-based anonymization techniques and additionally show that it can anonymize data in real time on resource-constrained edge devices.
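The deterministic and probabilistic variants can be sketched as one function. This sketch assumes the linear transformation is a shift between the latent-space means of two private-attribute clusters, which is one plausible reading of the description; names and the mean-shift choice are assumptions, not the paper's exact method:

```python
import numpy as np

def obfuscate(z, mu_private_src, mu_private_tgt, p=1.0, rng=None):
    """Shift a latent code from the source private-attribute cluster
    toward the target cluster. With p=1.0 this is the deterministic
    variant; with p<1.0 the shift is applied only with probability p."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < p:
        return z + (mu_private_tgt - mu_private_src)
    return z
```

Decoding the shifted code through the VAE decoder then yields a synthetic series whose private attribute resembles the target cluster.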


2022 · Vol 16 (2) · pp. 1-37
Author(s): Hangbin Zhang, Raymond K. Wong, Victor W. Chu

E-commerce platforms rely heavily on automatic personalized recommender systems, e.g., collaborative filtering models, to improve customer experience. Some hybrid models have recently been proposed to address the deficiencies of existing models, but their performance drops significantly when the dataset is sparse. Most recent works fail to fully address this shortcoming; at best, some alleviate the problem by considering content information from either the user side or the item side. In this article, we propose a novel recommender model called Hybrid Variational Autoencoder (HVAE) to improve performance on sparse datasets. Unlike existing approaches, we encode both user and item information into a latent space for semantic relevance measurement. In parallel, we utilize collaborative filtering to find the implicit factors of users and items, and combine the two outputs to deliver a hybrid solution. In addition, we compare Gaussian and multinomial distributions for learning representations of the textual data. Our experiment results show that HVAE significantly outperforms state-of-the-art models with robust performance.
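For the multinomial option mentioned above, the reconstruction term is the multinomial log-likelihood of a bag-of-words count vector under the decoder's logits. A minimal sketch (ignoring the count-dependent multinomial coefficient, which does not depend on the model parameters):

```python
import numpy as np

def multinomial_loglik(counts, logits):
    """Multinomial log-likelihood of a bag-of-words vector, up to the
    parameter-independent multinomial coefficient.

    counts: word counts over the vocabulary, shape (V,)
    logits: unnormalized decoder outputs, shape (V,)
    """
    log_probs = logits - np.log(np.exp(logits).sum())  # log-softmax
    return float((counts * log_probs).sum())
```

Compared with a Gaussian likelihood on raw counts, this treats each document as draws from a categorical distribution, which tends to suit sparse textual data.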


2022 · Vol 78 · pp. 20-39
Author(s): P.V. Arun, R. Sadeh, A. Avneri, Y. Tubul, C. Camino, ...

2022 · Vol 40 (1) · pp. 1-29
Author(s): Hanrui Wu, Qingyao Wu, Michael K. Ng

Domain adaptation aims at improving the performance of learning tasks in a target domain by leveraging the knowledge extracted from a source domain. To this end, one can perform knowledge transfer between these two domains. However, this problem becomes extremely challenging when the data of these two domains are characterized by different types of features, i.e., the feature spaces of the source and target domains are different, which is referred to as heterogeneous domain adaptation (HDA). To solve this problem, we propose a novel model called Knowledge Preserving and Distribution Alignment (KPDA), which learns an augmented target space by jointly minimizing information loss and maximizing domain distribution alignment. Specifically, we seek to discover a latent space, where the knowledge is preserved by exploiting the Laplacian graph terms and reconstruction regularizations. Moreover, we adopt the Maximum Mean Discrepancy to align the distributions of the source and target domains in the latent space. Mathematically, KPDA is formulated as a minimization problem with orthogonal constraints, which involves two projection variables. Then, we develop an algorithm based on the Gauss–Seidel iteration scheme and split the problem into two subproblems, which are solved by searching algorithms based on the Barzilai–Borwein (BB) stepsize. Promising results demonstrate the effectiveness of the proposed method.
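The Maximum Mean Discrepancy term used above to align the source and target distributions can be sketched with an RBF kernel (the bandwidth `sigma` is an illustrative choice, not the paper's setting):

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Squared Maximum Mean Discrepancy between samples X and Y,
    estimated with an RBF (Gaussian) kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return float(k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean())
```

Minimizing this quantity over the latent projections pushes the source and target feature distributions toward each other in the shared space.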


2022 · pp. 1-38
Author(s): William Paul, Armin Hadzic, Neil Joshi, Fady Alajaji, Philippe Burlina

We propose a novel method for enforcing AI fairness with respect to protected or sensitive factors. The method uses a dual strategy of training and representation alteration (TARA) to mitigate prominent causes of AI bias: (i) representation alteration via adversarial independence, which suppresses the bias-inducing dependence of the data representation on protected factors; and (ii) training-set alteration via intelligent augmentation, which addresses bias-causing data imbalance using generative models that allow fine control of sensitive factors related to underrepresented populations via domain adaptation and latent-space manipulation. When testing our methods on image analytics, experiments demonstrate that TARA significantly or fully debiases baseline models while outperforming competing debiasing methods given the same amount of information: for example, (% overall accuracy, % accuracy gap) = (78.8, 0.5) versus the baseline method's (71.8, 10.5) on EyePACS, and (73.7, 11.8) versus (69.1, 21.7) on CelebA. Furthermore, recognizing certain limitations in the metrics currently used for assessing debiasing performance, we propose novel conjunctive debiasing metrics, and our experiments demonstrate their ability to assess the Pareto efficiency of the proposed methods.
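The (overall accuracy, accuracy gap) pairs reported above can be computed from per-group accuracies. A minimal sketch, assuming the gap is the spread between the best- and worst-performing protected groups (names are illustrative):

```python
import numpy as np

def accuracy_and_gap(y_true, y_pred, group):
    """Return overall accuracy and the accuracy gap, i.e., the difference
    between the best and worst per-group accuracies."""
    correct = (y_true == y_pred)
    overall = correct.mean()
    per_group = [correct[group == g].mean() for g in np.unique(group)]
    return float(overall), float(max(per_group) - min(per_group))
```

A conjunctive view requires both numbers to improve at once: high overall accuracy with a small gap, rather than trading one for the other.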


2022
Author(s): Sabyasachi Bandyopadhyay, Catherine Dion, David J. Libon, Patrick Tighe, Catherine Price, ...

The Clock Drawing Test (CDT) is an inexpensive tool to screen for dementia. In this study, we examined whether a semi-supervised deep learning (DL) system using a Variational Autoencoder (VAE) can extract atypical clock features from a large dataset of unannotated CDTs (n=13,580) and use them to distinguish dementia patients (n=18) from non-dementia peers (n=20). A classification model built with the VAE latent-space features adequately separated dementia from non-dementia (area under the receiver operating characteristic curve, AUROC = 0.78). The VAE-identified atypical clock features were then reviewed by domain experts and compared with the existing literature on clock-drawing errors. This study shows that semi-supervised DL analysis of the CDT can extract important clock-drawing anomalies that are predictive of dementia.
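The AUROC reported above can be computed directly from score ranks via the Mann-Whitney U statistic. A minimal sketch (ignores tied scores; names are illustrative):

```python
import numpy as np

def auroc(scores, labels):
    """AUROC from classifier scores and binary labels (1 = positive),
    via the rank-sum (Mann-Whitney U) formulation. Assumes no ties."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    u = ranks[pos].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))
```

A value of 0.78, as in the study, means a randomly chosen dementia case outscores a randomly chosen non-dementia peer 78% of the time.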

