Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/306 ◽

2021 ◽

Author(s):

Wilka Carvalho ◽

Anthony Liang ◽

Kimin Lee ◽

Sungryull Sohn ◽

Honglak Lee ◽

...

Keyword(s):

Object Representation ◽

Ground Truth ◽

Representation Learning ◽

Classification Problem ◽

Task Completion ◽

Dynamics Model ◽

Object Interaction ◽

Correct Object ◽

Ground Truth Information ◽

Object Dynamics

Learning how to execute complex tasks involving multiple objects in a 3D world is challenging when there is no ground-truth information about the objects or any demonstration to learn from. When an agent receives a signal only from task completion, it is challenging to learn the object representations that support learning the correct object interactions needed to complete the task. In this work, we formulate learning an attentive object dynamics model as a classification problem, using random object-images to define incorrect labels for our object-dynamics model. We show empirically that this enables object-representation learning that captures an object's category (is it a toaster?), its properties (is it on?), and object-relations (is something inside of it?). With this, our core learner (a relational RL agent) receives the dense training signal it needs to rapidly learn object-interaction tasks. We demonstrate results in the 3D AI2Thor simulated kitchen environment with a range of challenging food preparation tasks. We compare our method's performance to several related approaches and against the performance of an oracle: an agent that is supplied with ground-truth information about objects in the scene. We find that our agent achieves performance closest to the oracle in terms of both learning speed and maximum success rate.
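
The classification framing described in this abstract can be sketched as a softmax cross-entropy in which the true next-step object embedding is the correct label and embeddings of randomly sampled object-images act as the incorrect labels. This is only an illustration under assumptions not stated in the abstract: dot-product scoring, plain Python lists as embeddings, and the hypothetical names `dot` and `contrastive_loss`. It is not the authors' implementation.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def contrastive_loss(prediction, positive, negatives):
    """Cross-entropy of selecting `positive` among positive + negatives.

    `prediction` is the model's predicted next-step embedding (assumed),
    `positive` is the true one, `negatives` come from random object-images.
    """
    scores = [dot(prediction, positive)] + [dot(prediction, n) for n in negatives]
    m = max(scores)  # subtract the max for a numerically stable log-sum-exp
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_norm - scores[0]

prediction = [1.0, 0.0]
positive = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

# Low loss when the prediction matches the true embedding ...
loss_good = contrastive_loss(prediction, positive, negatives)
# ... and higher loss when a random embedding is treated as the target.
loss_bad = contrastive_loss(prediction, negatives[1], [positive, negatives[0]])
```

Because every candidate label supplies a gradient at every step, a loss of this kind gives a dense training signal even when the task reward is sparse.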


Fully Data-Driven Pseudohealthy Synthesis for Planning Valve-Sparing Aortic Root Reconstruction using Conditional Variational Autoencoders

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2020-3072 ◽

2020 ◽

Vol 6 (3) ◽

pp. 284-287

Author(s):

Jannis Hagenah ◽

Mohamad Mehdi ◽

Floris Ernst

Keyword(s):

Aortic Root ◽

Similarity Index ◽

Ground Truth ◽

Representation Learning ◽

Patient Specific ◽

Ultrasound Images ◽

Specific Geometry ◽

The Individual ◽

Native Root ◽

Original Information

Aortic root aneurysm is treated by replacing the dilated root with a grafted prosthesis that mimics the native root morphology of the individual patient. The challenge in predicting the optimal prosthesis size arises from the highly patient-specific geometry as well as the absence of the original information on the healthy root. Therefore, the estimation is only possible based on the available pathological data. In this paper, we show that representation learning with Conditional Variational Autoencoders is capable of turning the distorted geometry of the aortic root into smoother shapes while the information on the individual anatomy is preserved. We evaluated this method using ultrasound images of the porcine aortic root alongside their labels. The observed results show highly realistic resemblance in shape and size to the ground truth images. Furthermore, the similarity index has noticeably improved compared to the pathological images. This provides a promising technique for planning individual aortic root replacement.


Classification of unlabeled online media

Scientific Reports ◽

10.1038/s41598-021-85608-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Sakthi Kumar Arul Prakash ◽

Conrad Tucker

Keyword(s):

Social Media ◽

Real World ◽

Graphical Model ◽

Ground Truth ◽

Classification Problem ◽

Machine Learning Algorithms ◽

Social Media Networks ◽

Online Social Media ◽

Wide Range

This work investigates the ability to classify misinformation in online social media networks in a manner that avoids the need for ground truth labels. Rather than approach the classification problem as a task for humans or machine learning algorithms, this work leverages user–user and user–media (i.e., media likes) interactions to infer the type of information (fake vs. authentic) being spread, without needing to know the actual details of the information itself. To study the inception and evolution of user–user and user–media interactions over time, we create an experimental platform that mimics the functionality of real-world social media networks. We develop a graphical model that considers the evolution of this network topology to model the uncertainty (entropy) propagation when fake and authentic media disseminate across the network. The creation of a real-world social media network enables a wide range of hypotheses to be tested pertaining to users, their interactions with other users, and with media content. The discovery that the entropy of user–user and user–media interactions approximates fake and authentic media likes enables us to classify fake media in an unsupervised learning manner.
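
The abstract does not say how the interaction entropy is computed. As a rough illustration only, the sketch below assumes plain Shannon entropy over the empirical distribution of which media items receive likes; the function name `shannon_entropy` and the sample data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def shannon_entropy(events):
    """Shannon entropy (in bits) of the empirical distribution of `events`."""
    counts = Counter(events)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical like logs: each entry is the media item that received a like.
concentrated = ["m1", "m1", "m1", "m1"]  # all likes go to one item
spread = ["m1", "m2", "m3", "m4"]        # likes spread evenly over four items

h_concentrated = shannon_entropy(concentrated)
h_spread = shannon_entropy(spread)
```

Under this assumption, evenly spread interactions carry the most uncertainty and fully concentrated interactions carry none, which is the kind of contrast an entropy-based unsupervised classifier could threshold on.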


A TWO-CHANNEL MODEL FOR REPRESENTATION LEARNING IN VIETNAMESE SENTIMENT CLASSIFICATION PROBLEM

Journal of Computer Science and Cybernetics ◽

10.15625/1813-9663/36/4/14829 ◽

2020 ◽

Vol 36 (4) ◽

pp. 305-323

Author(s):

Quan Hoang Nguyen ◽

Ly Vu ◽

Quang Uy Nguyen

Keyword(s):

Channel Model ◽

Rapid Development ◽

Representation Learning ◽

Classification Problem ◽

Sentiment Classification ◽

Complex Sentences ◽

Digital World ◽

Part Of Speech ◽

Proposed Model ◽

Important Research Topic

Sentiment classification (SC) aims to determine whether a document conveys a positive or negative opinion. Due to the rapid development of the digital world, SC has become an important research topic that affects many aspects of our life. In SC based on machine learning, the representation of the document strongly influences accuracy. Word Embedding (WE)-based techniques, such as Word2vec, have proved beneficial for the SC problem. However, Word2vec is often not enough to represent the semantics of Vietnamese documents with complex sentences. In this paper, we propose a new representation learning model called a two-channel vector to learn a higher-level feature of a document in SC. Our model uses two neural networks to learn the semantic feature, i.e., Word2vec, and the syntactic feature, i.e., Part of Speech (POS) tags. The two features are then combined and input to a Softmax function to make the final classification. We carry out intensive experiments on 4 recent Vietnamese sentiment datasets to evaluate the performance of the proposed architecture. The experimental results demonstrate that the proposed model can significantly enhance the accuracy of SC problems compared to two single models and a state-of-the-art ensemble method.
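
The final step described in this abstract, combining the two channel features and passing them to a Softmax function, can be sketched as follows. The sketch assumes the combination is a simple concatenation followed by one linear layer; the feature values, weights, and the names `softmax` and `classify` are hypothetical and stand in for whatever the two neural networks in the paper actually produce.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(semantic, syntactic, weights, bias):
    """Concatenate the two channel features, apply a linear layer, then softmax."""
    features = semantic + syntactic  # list concatenation (assumed combination rule)
    logits = [dot(row, features) + b for row, b in zip(weights, bias)]
    return softmax(logits)

semantic = [0.5, -0.2]   # stand-in for a Word2vec-based document feature
syntactic = [1.0]        # stand-in for a POS-based document feature
weights = [[1.0, 0.0, 1.0],     # row for the "positive" class
           [-1.0, 0.0, -1.0]]   # row for the "negative" class
bias = [0.0, 0.0]

probs = classify(semantic, syntactic, weights, bias)
```

Here `probs[0]` and `probs[1]` are the class probabilities for the positive and negative labels.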


BYANJON: A Ground Truth Preparation System for Online Handwritten Bangla Documents

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3464379 ◽

2021 ◽

Vol 20 (6) ◽

pp. 1-16

Author(s):

Shibaprasad Sen ◽

Ankan Bhattacharyya ◽

Ram Sarkar ◽

Kaushik Roy

Keyword(s):

Extraction Procedure ◽

Ground Truth ◽

Word Segmentation ◽

Text Line ◽

Line Extraction ◽

Manual Intervention ◽

Ground Truth Generation ◽

Class Labels ◽

Text Line Extraction ◽

Ground Truth Information

The work reported in this article deals with a ground truth generation scheme for online handwritten Bangla documents at the text-line, word, and stroke levels. The aim of the proposed scheme is twofold: first, to build a document-level database that future researchers can use in this field; second, to provide ground truth information that will help other researchers evaluate the performance of their algorithms for text-line extraction, word extraction, word segmentation, stroke recognition, and word recognition. The reported ground truth generation scheme starts with text-line extraction from the online handwritten Bangla documents, then word extraction from the text-lines, and finally segmentation of those words into basic strokes. After word segmentation, the basic strokes are assigned appropriate class labels by using a modified distance-based feature extraction procedure and an MLP (Multi-layer Perceptron) classifier. The Unicode for the words is then generated from the sequence of stroke labels. XML files are used to store the stroke-, word-, and text-line-level ground truth information for the corresponding documents. The proposed system is semi-automatic, and each step, such as text-line extraction, word extraction, word segmentation, and stroke recognition, has been implemented by using different algorithms. Thus, the proposed ground truth generation procedure greatly reduces manual intervention by reducing the number of mouse clicks required to extract text-lines and words from the document and to segment the words into basic strokes. The integrated stroke recognition module also helps to minimize the manual labor needed to assign appropriate stroke labels. The system is freely available and can be accessed at https://byanjon.herokuapp.com/ .


Learning Compositional Representations of Interacting Systems with Restricted Boltzmann Machines: Comparative Study of Lattice Proteins

Neural Computation ◽

10.1162/neco_a_01210 ◽

2019 ◽

Vol 31 (8) ◽

pp. 1671-1717 ◽

Cited By ~ 1

Author(s):

Jérôme Tubiana ◽

Simona Cocco ◽

Rémi Monasson

Keyword(s):

Graphical Model ◽

A Priori ◽

Protein Sequences ◽

Ground Truth ◽

Representation Learning ◽

Statistical Features ◽

Restricted Boltzmann Machines ◽

Interacting Systems ◽

Hidden Layer ◽

Stochastic Mapping

A restricted Boltzmann machine (RBM) is an unsupervised machine learning bipartite graphical model that jointly learns a probability distribution over data and extracts their relevant statistical features. RBMs were recently proposed for characterizing the patterns of coevolution between amino acids in protein sequences and for designing new sequences. Here, we study how the nature of the features learned by RBM changes with its defining parameters, such as the dimensionality of the representations (size of the hidden layer) and the sparsity of the features. We show that for adequate values of these parameters, RBMs operate in a so-called compositional phase in which visible configurations sampled from the RBM are obtained by recombining these features. We then compare the performance of RBM with other standard representation learning algorithms, including principal or independent component analysis (PCA, ICA), autoencoders (AE), variational autoencoders (VAE), and their sparse variants. We show that RBMs, due to the stochastic mapping between data configurations and representations, better capture the underlying interactions in the system and are significantly more robust with respect to sample size than deterministic methods such as PCA or ICA. In addition, this stochastic mapping is not prescribed a priori as in VAE, but learned from data, which allows RBMs to show good performance even with shallow architectures. All numerical results are illustrated on synthetic lattice protein data that share similar statistical features with real protein sequences and for which ground-truth interactions are known.


Identifying Doppler Velocity Contamination Caused by Migrating Birds. Part II: Bayes Identification and Probability Tests

Journal of Atmospheric and Oceanic Technology ◽

10.1175/jtech1758.1 ◽

2005 ◽

Vol 22 (8) ◽

pp. 1114-1121 ◽

Cited By ~ 38

Author(s):

Shun Liu ◽

Qin Xu ◽

Pengfei Zhang

Keyword(s):

Elevation Angle ◽

Ground Truth ◽

Single Parameter ◽

Statistical Decision Theory ◽

Doppler Velocity ◽

Statistical Decision ◽

Wrong Decision ◽

Migrating Birds ◽

Bayesian Statistical Decision Theory ◽

Ground Truth Information

Based on the Bayesian statistical decision theory, a probabilistic quality control (QC) technique is developed to identify and flag migrating-bird-contaminated sweeps of level II velocity scans at the lowest elevation angle using the QC parameters presented in Part I. The QC technique can use either each single QC parameter or all three in combination. The single-parameter QC technique is shown to be useful for evaluating the effectiveness of each QC parameter based on the smallness of the tested percentages of wrong decision by using the ground truth information (if available) or based on the smallness of the estimated probabilities of wrong decision (if there is no ground truth information). The multiparameter QC technique is demonstrated to be much better than any of the three single-parameter QC techniques, as indicated by the very small value of the tested percentages of wrong decision for no-flag decisions (not contaminated by migrating birds). Since the averages of the estimated probabilities of wrong decision are quite close to the tested percentages of wrong decision, they can provide useful information about the probability of wrong decision when the multiparameter QC technique is used for real applications (with no ground truth information).
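
The single-parameter case can be illustrated with the textbook two-class Bayes rule: compute the posterior probability of contamination from the class-conditional likelihoods of an observed QC parameter value, flag the sweep if the posterior exceeds one half, and read the estimated probability of a wrong decision off the posterior. The likelihood values, the equal prior, and the names `posterior` and `decide` below are assumptions made for illustration; the paper's actual QC parameters and probability densities are given in Part I and are not reproduced here.

```python
def posterior(likelihood_bird, likelihood_clear, prior_bird=0.5):
    """P(contaminated | observed QC value) by Bayes' rule for two classes."""
    numerator = likelihood_bird * prior_bird
    evidence = numerator + likelihood_clear * (1.0 - prior_bird)
    return numerator / evidence

def decide(post):
    """Flag when contamination is the more probable class.

    Returns the decision and the estimated probability that it is wrong,
    which is the posterior probability of the class that was not chosen.
    """
    flagged = post > 0.5
    p_wrong = 1.0 - post if flagged else post
    return flagged, p_wrong

# Hypothetical likelihoods of the observed QC value under each class.
post = posterior(likelihood_bird=0.8, likelihood_clear=0.2)
flagged, p_wrong = decide(post)
```

With these made-up numbers the sweep is flagged and the estimated probability of a wrong decision is about 0.2, a quantity that can be computed even when no ground truth is available.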


Improved Unsupervised Representation Learning of Spatial Transcriptomic Data with Sparse Filtering

10.1101/2021.10.11.464002 ◽

2021 ◽

Author(s):

Benjamin B Bartelle ◽

Mohammad Abbasi ◽

Connor Sanderford ◽

Narendian Raghu

Keyword(s):

Spatial Data ◽

Performance Metrics ◽

Ground Truth ◽

Representation Learning ◽

Brain Atlas ◽

Sparse Learning ◽

Sparse Filtering ◽

Quantitative Basis ◽

Global And Local ◽

Allen Mouse Brain Atlas

We have developed representation learning methods specifically to address the constraints and advantages of complex spatial data. Sparse filtering (SFt) uses principles of sparsity and mutual information to build representations from both global and local features from a minimal list of samples. Critically, the samples that comprise each representation are listed and ranked by informativeness. We used the Allen Mouse Brain Atlas gene expression data for prototyping and established performance metrics based on representation accuracy relative to labeled anatomy. SFt, implemented with the PyTorch machine learning libraries for Python, returned the most accurate reconstruction of anatomical ground truth of any method tested. SFt-generated gene lists could be further compressed, retaining 95% of informativeness with only 580 genes. Finally, we built classifiers capable of parsing anatomy with >95% accuracy using only 10 derived genes. Sparse learning is a powerful but underexplored means to derive biologically meaningful representations from complex datasets and a quantitative basis for compressed sensing of classifiable phenomena. SFt should be considered as an alternative to PCA or manifold learning for any high-dimensional dataset and as the basis for future spatial learning algorithms.


Social Media-based User Embedding: A Literature Review

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/881 ◽

2019 ◽

Author(s):

Shimei Pan ◽

Tao Ding

Keyword(s):

Social Media ◽

High Performance ◽

Large Scale ◽

Ground Truth ◽

Representation Learning ◽

Success Stories ◽

Recent Success ◽

User Data ◽

Low Dimensional ◽

And Behavior

Automated representation learning is behind many recent success stories in machine learning. It is often used to transfer knowledge learned from a large dataset (e.g., raw text) to tasks for which only a small number of training examples are available. In this paper, we review recent advances in learning to represent social media users in low-dimensional embeddings. The technology is critical for creating high-performance social media-based models of human traits and behavior, since the ground truth for assessing latent human traits and behavior is often expensive to acquire at a large scale. In this survey, we review typical methods for learning unified user embeddings from heterogeneous user data (e.g., combining social media texts with images to learn a unified user representation). Finally, we point out some current issues and future directions.


Gaussian Processes for Vegetation Parameter Estimation from Hyperspectral Data with Limited Ground Truth

Remote Sensing ◽

10.3390/rs11131614 ◽

2019 ◽

Vol 11 (13) ◽

pp. 1614 ◽

Cited By ~ 3

Author(s):

Utsav B. Gewali ◽

Sildomar T. Monteiro ◽

Eli Saber

Keyword(s):

Gaussian Processes ◽

Ground Truth ◽

Joint Modeling ◽

Hyperspectral Data ◽

Sample Collection ◽

Spectral Correlation ◽

Covariance Functions ◽

Vegetation Parameter ◽

Vegetation Parameters ◽

Ground Truth Information

An important application of airborne- and satellite-based hyperspectral imaging is the mapping of the spatial distribution of vegetation biophysical and biochemical parameters in an environment. Statistical models, such as Gaussian processes, have been very successful for modeling vegetation parameters from captured spectra; however, their performance is highly dependent on the amount of available ground truth. This is a problem because it is generally expensive to obtain ground truth information due to difficulties and costs associated with sample collection and analysis. In this paper, we present two Gaussian process-based approaches for improving the accuracy of vegetation parameter retrieval when ground truth is limited. The first is the adoption of covariance functions based on well-established metrics, such as spectral angle and spectral correlation, which are known to be better measures of similarity for spectral data owing to their resilience to spectral variabilities. The second is the joint modeling of related vegetation parameters by multitask Gaussian processes so that the prediction accuracy of the vegetation parameter of interest can be improved with the aid of related vegetation parameters for which a larger set of ground truth is available. We experimentally demonstrate the efficacy of the proposed methods against existing approaches on three real-world hyperspectral datasets and one synthetic dataset.
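
To make the first approach concrete, the sketch below computes the spectral angle between two spectra and plugs it into a simple exponential similarity function. The spectral angle itself is the standard arccosine of the normalized dot product. The exponential form, the `length_scale` parameter, and the function names are assumptions chosen for illustration; the abstract does not give the paper's actual covariance functions, and this sketch makes no claim that the form shown is the one the authors use.

```python
import math

def spectral_angle(a, b):
    """Angle (radians) between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos_theta = dot / (norm_a * norm_b)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

def angle_similarity(a, b, length_scale=1.0):
    """Illustrative similarity that decays with the spectral angle (assumed form)."""
    return math.exp(-spectral_angle(a, b) / length_scale)

spectrum = [0.10, 0.25, 0.40]
brighter = [2.0 * v for v in spectrum]   # same shape, doubled intensity
different = [0.40, 0.25, 0.10]           # reversed shape

angle_same_shape = spectral_angle(spectrum, brighter)
angle_different = spectral_angle(spectrum, different)
```

Scaling a spectrum by a constant leaves the angle at essentially zero, which illustrates why an angle-based measure is insensitive to overall brightness changes while still separating spectra of different shape.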


The Active Segmentation Platform for Microscopic Image Classification and Segmentation

Brain Sciences ◽

10.3390/brainsci11121645 ◽

2021 ◽

Vol 11 (12) ◽

pp. 1645

Author(s):

Sumit K. Vohra ◽

Dimiter Prodanov

Keyword(s):

Machine Learning ◽

Image Segmentation ◽

Image Classification ◽

Domain Knowledge ◽

Feature Space ◽

Ground Truth ◽

Classification Problem ◽

Data Sets ◽

Learning Approaches ◽

Data Set

Image segmentation still represents an active area of research since no universal solution can be identified. Traditional image segmentation algorithms are problem-specific and limited in scope. On the other hand, machine learning offers an alternative paradigm where predefined features are combined into different classifiers, providing pixel-level classification and segmentation. However, machine learning alone cannot address the question of which features are appropriate for a certain classification problem. The article presents an automated image segmentation and classification platform, called Active Segmentation, which is based on ImageJ. The platform integrates expert domain knowledge, providing partial ground truth, with geometrical feature extraction based on multi-scale signal processing combined with machine learning. The approach to image segmentation is exemplified on the ISBI 2012 image segmentation challenge data set. As a second application, we demonstrate whole-image classification functionality based on the same principles. The approach is exemplified using the HeLa and HEp-2 data sets. The results indicate that feature space enrichment, properly balanced with feature selection functionality, can achieve performance comparable to deep learning approaches. In summary, differential geometry can substantially improve the outcome of machine learning since it can enrich the underlying feature space with new geometrical invariant objects.
