Neural 3D Morphable Models: Spiral Convolutional Networks for 3D Shape Representation Learning and Generation

Learning a 3D shape representation from a collection of its rendered 2D images has been extensively studied. However, existing view-based techniques have not yet fully exploited the information shared across all projected views. In this paper, we propose a siamese CNN-BiLSTM network for 3D shape representation learning, which employs a recurrent neural network to efficiently capture features across different views. The proposed method minimizes a discriminative loss function to learn a deep nonlinear transformation that maps 3D shapes from the original space into a nonlinear feature space. In the transformed space, the distance between 3D shapes with the same label is minimized, while the distance between shapes with different labels is pushed beyond a large margin. Specifically, the 3D shapes are first projected into a group of 2D images from different views. Then a convolutional neural network (CNN) is adopted to extract features from the individual view images, followed by a bidirectional long short-term memory (LSTM) network that aggregates information across the views. Finally, we arrange the whole CNN-BiLSTM network in a siamese structure with a contrastive loss function. Our proposed method is evaluated on two benchmarks, ModelNet40 and SHREC 2014, demonstrating superiority over the state-of-the-art methods.
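
The pairwise objective described in this abstract can be illustrated with a minimal sketch of a margin-based contrastive loss on two embedding vectors. The function names, the 0.5 scaling, and the default margin are assumptions for illustration; the abstract does not give the exact formulation, and in the paper's setting the embeddings would come from the CNN-BiLSTM branches rather than being passed in directly.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(feat_a, feat_b, same_label, margin=1.0):
    """Contrastive loss for one pair of embeddings (hypothetical helper).

    same_label=True  -> penalize any distance (pull the pair together).
    same_label=False -> penalize only distances below `margin` (push apart).
    """
    d = euclidean(feat_a, feat_b)
    if same_label:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2
```

A pair with different labels contributes nothing once its distance exceeds the margin, so training effort goes to the pairs that are still confusable.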

Median-Shape Representation Learning for Category-Level Object Pose Estimation in Cluttered Environments

2020 25th International Conference on Pattern Recognition (ICPR) ◽ 10.1109/icpr48806.2021.9412318 ◽ 2021

Author(s): Hiroki Tatemichi ◽ Yasutomo Kawanishi ◽ Daisuke Deguchi ◽ Ichiro Ide ◽ Ayako Amma ◽ ...

Keyword(s): Pose Estimation ◽ Shape Representation ◽ Representation Learning ◽ Cluttered Environments ◽ Object Pose Estimation

Knowledge-aware Multi-modal Adaptive Graph Convolutional Networks for Fake News Detection

ACM Transactions on Multimedia Computing Communications and Applications ◽ 10.1145/3451215 ◽ 2021 ◽ Vol 17 (3) ◽ pp. 1-23

Author(s): Shengsheng Qian ◽ Jun Hu ◽ Quan Fang ◽ Changsheng Xu

Keyword(s): Social Media ◽ Visual Information ◽ Representation Learning ◽ Fake News ◽ Unified Framework ◽ Model Learning ◽ Convolutional Network ◽ Textual Information ◽ Convolutional Networks ◽ Real World Datasets

In this article, we focus on the fake news detection task and aim to automatically identify fake news among the vast amount of social media posts. To date, many approaches have been proposed to detect fake news, including traditional learning methods and deep learning-based models. However, three challenges remain: (i) how to represent social media posts effectively, since post content is varied and highly complicated; (ii) how to design a data-driven method that makes the model flexible enough to deal with samples in different contexts and news backgrounds; and (iii) how to fully utilize the auxiliary information of posts (background knowledge and multi-modal information) for better representation learning. To tackle these challenges, we propose a novel Knowledge-aware Multi-modal Adaptive Graph Convolutional Network (KMAGCN) that captures semantic representations by jointly modeling textual information, knowledge concepts, and visual information in a unified framework for fake news detection. We model posts as graphs and use a knowledge-aware multi-modal adaptive graph learning principle for effective feature learning. Compared with existing methods, the proposed KMAGCN addresses the challenges from three aspects: (1) it models posts as graphs to capture non-consecutive and long-range semantic relations; (2) it proposes a novel adaptive graph convolutional network to handle the variability of graph data; and (3) it leverages textual information, knowledge concepts, and visual information jointly for model learning. We have conducted extensive experiments on three public real-world datasets, and the superior results demonstrate the effectiveness of KMAGCN compared with other state-of-the-art algorithms.

Exploiting Graph Convolutional Networks for Representation Learning of Mobile App Usage

2019 IEEE International Conference on Big Data (Big Data) ◽ 10.1109/bigdata47090.2019.9006428 ◽ 2019

Author(s): Keiichi Ochiai ◽ Naoki Yamamoto ◽ Takashi Hamatani ◽ Yusuke Fukazawa ◽ Takayasu Yamaguchi

Keyword(s): Representation Learning ◽ Mobile App ◽ Convolutional Networks

MeshNet: Mesh Neural Network for 3D Shape Representation

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v33i01.33018279 ◽ 2019 ◽ Vol 33 ◽ pp. 8279-8286 ◽ Cited By ~ 18

Author(s): Yutong Feng ◽ Yifan Feng ◽ Haoxuan You ◽ Xibin Zhao ◽ Yue Gao

Keyword(s): Neural Network ◽ Computer Vision ◽ Point Cloud ◽ State Of The Art ◽ Shape Representation ◽ Shape Classification ◽ Retrieval Performance ◽ 3D Shape ◽ General Architecture ◽ 3D Shapes

Mesh is an important and powerful type of data for 3D shapes and is widely studied in computer vision and computer graphics. For the task of 3D shape representation, extensive research has concentrated on how to represent 3D shapes well using volumetric grids, multi-view images, and point clouds. However, there has been little work on mesh data in recent years, due to its complexity and irregularity. In this paper, we propose a mesh neural network, named MeshNet, to learn 3D shape representation from mesh data. In this method, face-unit and feature splitting are introduced, and a general architecture with available and effective blocks is proposed. In this way, MeshNet is able to handle the complexity and irregularity of mesh data and represent 3D shapes well. We have applied the proposed MeshNet method to 3D shape classification and retrieval. Experimental results and comparisons with state-of-the-art methods demonstrate that the proposed MeshNet achieves satisfying 3D shape classification and retrieval performance, which indicates the effectiveness of the proposed method for 3D shape representation.
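
To make the idea of treating each face as the unit of computation more concrete, here is a small sketch that derives two per-face quantities from a triangle mesh: a center (a position-type feature) and a unit normal (an orientation-type feature). The abstract does not say which features the face-unit carries or how they are split, so the choice of center and normal, and the function name `face_features`, are assumptions for illustration only.

```python
import math


def face_features(vertices, face):
    """Return (center, unit_normal) for one triangular face.

    vertices: list of (x, y, z) tuples
    face:     three indices into `vertices`
    """
    a, b, c = (vertices[i] for i in face)
    # Position-type feature: centroid of the three corners.
    center = tuple((a[k] + b[k] + c[k]) / 3.0 for k in range(3))
    # Orientation-type feature: normalized cross product of two edges.
    e1 = tuple(b[k] - a[k] for k in range(3))
    e2 = tuple(c[k] - a[k] for k in range(3))
    n = (
        e1[1] * e2[2] - e1[2] * e2[1],
        e1[2] * e2[0] - e1[0] * e2[2],
        e1[0] * e2[1] - e1[1] * e2[0],
    )
    length = math.sqrt(sum(x * x for x in n))
    if length == 0.0:
        raise ValueError("degenerate triangle has no normal")
    return center, tuple(x / length for x in n)
```

Keeping the two kinds of feature separate lets a network process position and local shape with different blocks, which is one reading of "feature splitting".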

3D shape representation with spatial probabilistic distribution of intrinsic shape keypoints

EURASIP Journal on Advances in Signal Processing ◽ 10.1186/s13634-017-0483-y ◽ 2017 ◽ Vol 2017 (1) ◽ Cited By ~ 3

Author(s): Vijaya K. Ghorpade ◽ Paul Checchin ◽ Laurent Malaterre ◽ Laurent Trassoudaine

Keyword(s): Shape Representation ◽ 3D Shape ◽ Probabilistic Distribution ◽ Intrinsic Shape

MLVCNN: Multi-Loop-View Convolutional Neural Network for 3D Shape Retrieval

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v33i01.33018513 ◽ 2019 ◽ Vol 33 ◽ pp. 8513-8520 ◽ Cited By ~ 10

Author(s): Jianwen Jiang ◽ Di Bao ◽ Ziqiang Chen ◽ Xibin Zhao ◽ Yue Gao

Keyword(s): Neural Network ◽ Convolutional Neural Network ◽ State Of The Art ◽ Shape Representation ◽ The State ◽ Shape Retrieval ◽ 3D Shape Retrieval ◽ 3D Shape ◽ Loop Level ◽ Art Methods

3D shape retrieval has attracted much attention and has recently become a hot topic in computer vision. With the development of deep learning, 3D shape retrieval has made great progress, and many view-based methods have been introduced in recent years. However, how to represent 3D shapes better is still a challenging problem. At the same time, the intrinsic hierarchical associations among views have not been well utilized. To tackle these problems, we propose a multi-loop-view convolutional neural network (MLVCNN) framework for 3D shape retrieval. In this method, multiple groups of views are first extracted from different loop directions. Given these multiple loop views, the proposed MLVCNN framework introduces a hierarchical view-loop-shape architecture, i.e., the view level, the loop level, and the shape level, to represent 3D shapes at different scales. At the view level, a convolutional neural network is first trained to extract view features. Then, the proposed Loop Normalization and an LSTM are applied to each loop of views to generate loop-level features, which take into account the intrinsic associations among the different views in the same loop. Finally, all the loop-level descriptors are combined into a shape-level descriptor for 3D shape representation, which is used for 3D shape retrieval. Our proposed method has been evaluated on the public 3D shape benchmark ModelNet40. Experiments and comparisons with state-of-the-art methods show that the proposed MLVCNN method achieves a significant performance improvement on 3D shape retrieval tasks, outperforming the state-of-the-art methods by 4.84% mAP. We have also evaluated the proposed method on the 3D shape classification task, where MLVCNN also achieves superior performance compared with recent methods.

Revisiting Graph Based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i01.5330 ◽ 2020 ◽ Vol 34 (01) ◽ pp. 27-34 ◽ Cited By ~ 5

Author(s): Lei Chen ◽ Le Wu ◽ Richang Hong ◽ Kun Zhang ◽ Meng Wang

Keyword(s): Collaborative Filtering ◽ Representation Learning ◽ Superior Performance ◽ Convolutional Network ◽ Convolutional Networks ◽ Proposed Model ◽ Non Linear ◽ Efficiency And Effectiveness ◽ Residual Graph ◽ Interaction Modeling

Graph Convolutional Networks (GCNs) are state-of-the-art graph-based representation learning models that iteratively stack multiple layers of convolution aggregation operations and non-linear activation operations. Recently, in Collaborative Filtering (CF) based Recommender Systems (RS), some researchers have modeled higher-layer collaborative signals with GCNs by treating the user-item interaction behavior as a bipartite graph. These GCN-based recommender models show superior performance compared to traditional works. However, they are difficult to train with non-linear activations on large user-item graphs. Besides, most GCN-based models cannot use deeper layers because of the over-smoothing effect of the graph convolution operation. In this paper, we revisit GCN-based CF models from two aspects. First, we empirically show that removing non-linearities enhances recommendation performance, which is consistent with the theory of simple graph convolutional networks. Second, we propose a residual network structure that is specifically designed for CF with user-item interaction modeling, which alleviates the over-smoothing problem of graph convolution aggregation on sparse user-item interaction data. The proposed model is linear, so it is easy to train, scales to large datasets, and yields better efficiency and effectiveness on two real datasets. We publish the source code at https://github.com/newlei/LR-GCCF.
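
As a rough illustration of the two ideas in this abstract (propagation without non-linear activations, and a residual path across layers), the following pure-Python sketch propagates embeddings with a symmetrically normalized adjacency matrix and scores a user-item pair by summing per-layer inner products. The function names, the omission of learned weight matrices, and the exact form of the residual are assumptions made for illustration; the released code at the URL above is the authoritative version.

```python
import math


def normalized_adjacency(edges, n_nodes):
    """Dense D^{-1/2} (A + I) D^{-1/2} for an undirected graph."""
    a = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        a[i][i] = 1.0  # self-loop keeps each node's own embedding
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n_nodes)]
        for i in range(n_nodes)
    ]


def matmul(m, x):
    """Multiply an n x n matrix by an n x d matrix (lists of lists)."""
    return [
        [sum(m[i][k] * x[k][j] for k in range(len(x))) for j in range(len(x[0]))]
        for i in range(len(m))
    ]


def propagate(s, emb, n_layers):
    """Linear propagation: each layer is s @ previous, with no activation."""
    layers = [emb]
    for _ in range(n_layers):
        layers.append(matmul(s, layers[-1]))
    return layers


def residual_score(layers, user, item):
    """Sum of per-layer inner products, so each layer adds to the previous score."""
    return sum(
        sum(a * b for a, b in zip(layer[user], layer[item])) for layer in layers
    )
```

Because every step is linear, deeper layers only add further smoothed terms to the score instead of replacing the earlier ones, which is the intuition behind using a residual to counter over-smoothing.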

ON THE REPRESENTATION, LEARNING AND TRANSFER OF SPATIO-TEMPORAL MOVEMENT CHARACTERISTICS

International Journal of Humanoid Robotics ◽ 10.1142/s0219843604000320 ◽ 2004 ◽ Vol 01 (04) ◽ pp. 613-636 ◽ Cited By ~ 33

Author(s): WINFRIED ILG ◽ GÖKHAN H. BAKIR ◽ JOHANNES MEZGER ◽ MARTIN A. GIESE

Keyword(s): Representation Learning ◽ Movement Sequences ◽ Movement Trajectories ◽ Movement Characteristics ◽ Human Movements ◽ Complex Sequences ◽ Movement Representation ◽ Spatio Temporal ◽ Morphable Models ◽ Complex Movement

In this paper we present a learning-based approach for the modeling of complex movement sequences. Based on the method of Spatio-Temporal Morphable Models (STMMs), we derive a hierarchical algorithm that, in a first step, automatically identifies movement elements in movement sequences based on a coarse spatio-temporal description and, in a second step, models these movement primitives by approximating them as linear combinations of learned example movement trajectories. We describe the different steps of the algorithm and show how it can be applied to the modeling and synthesis of complex sequences of human movements that contain movement elements with a variable style. The proposed method is demonstrated on different applications of movement representation relevant for imitation learning of movement styles in humanoid robotics.
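
The second step, modeling a movement primitive as a linear combination of example trajectories, can be sketched as a weighted sum over trajectories. This sketch assumes the examples have already been brought into temporal correspondence (same number of samples, aligned in time) and uses one scalar per time step; the function name and these simplifications are assumptions for illustration and do not cover the spatio-temporal alignment that STMMs perform.

```python
def blend_trajectories(examples, weights):
    """Weighted sum of time-aligned example trajectories.

    examples: list of trajectories, each a list of floats (one per time step)
    weights:  one coefficient per example trajectory
    """
    if len(examples) != len(weights):
        raise ValueError("need one weight per example trajectory")
    n_steps = len(examples[0])
    if any(len(traj) != n_steps for traj in examples):
        raise ValueError("trajectories must be aligned to the same length")
    return [
        sum(w * traj[t] for w, traj in zip(weights, examples))
        for t in range(n_steps)
    ]
```

Changing the weights moves the synthesized movement between the styles of the examples, for instance halfway between two recorded variants when both weights are 0.5.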
