MeshNet: Mesh Neural Network for 3D Shape Representation

Author(s):  
Yutong Feng ◽  
Yifan Feng ◽  
Haoxuan You ◽  
Xibin Zhao ◽  
Yue Gao

Mesh is an important and powerful type of data for 3D shapes and is widely studied in computer vision and computer graphics. For the task of 3D shape representation, extensive research has concentrated on representing 3D shapes with volumetric grids, multi-view images, and point clouds. However, little effort has been devoted to mesh data in recent years, owing to the complexity and irregularity of meshes. In this paper, we propose a mesh neural network, named MeshNet, to learn 3D shape representations from mesh data. In this method, face-unit and feature splitting are introduced, and a general architecture with effective building blocks is proposed. In this way, MeshNet is able to handle the complexity and irregularity of meshes and represent 3D shapes well. We have applied the proposed MeshNet method to 3D shape classification and retrieval. Experimental results and comparisons with state-of-the-art methods demonstrate that MeshNet achieves satisfactory classification and retrieval performance, which indicates the effectiveness of the proposed method for 3D shape representation.
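A minimal sketch (not the authors' implementation) of the "face-unit" idea: each triangular face is treated as the basic unit, and simple per-face features such as the center and unit normal are split out from the raw vertex data.

```python
# Per-face feature extraction from a triangle mesh, using plain tuples.

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def face_features(vertices, face):
    """Return (center, unit normal) for one triangular face.

    vertices: list of (x, y, z) tuples; face: (i, j, k) vertex indices.
    """
    a, b, c = (vertices[i] for i in face)
    center = tuple(sum(p) / 3.0 for p in zip(a, b, c))
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    ac = tuple(ci - ai for ai, ci in zip(a, c))
    n = cross(ab, ac)
    norm = sum(x * x for x in n) ** 0.5 or 1.0
    return center, tuple(x / norm for x in n)

# Unit triangle in the xy-plane: its normal should point along +z.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
center, normal = face_features(verts, (0, 1, 2))
```

In MeshNet itself these per-face quantities feed separate spatial and structural branches; the sketch above only shows the face-unit bookkeeping.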

Author(s):  
Jianwen Jiang ◽  
Di Bao ◽  
Ziqiang Chen ◽  
Xibin Zhao ◽  
Yue Gao

3D shape retrieval has attracted much attention and become a hot topic in the computer vision field. With the development of deep learning, 3D shape retrieval has also made great progress, and many view-based methods have been introduced in recent years. However, how to represent 3D shapes better is still a challenging problem, and the intrinsic hierarchical associations among views have not been well utilized. To tackle these problems, in this paper we propose a multi-loop-view convolutional neural network (MLVCNN) framework for 3D shape retrieval. In this method, multiple groups of views are first extracted from different loop directions. Given these loop views, the MLVCNN framework introduces a hierarchical view-loop-shape architecture, i.e., the view level, the loop level, and the shape level, to represent 3D shapes at different scales. At the view level, a convolutional neural network is trained to extract view features. Then, the proposed Loop Normalization and an LSTM are applied to each loop of views to generate loop-level features, which capture the intrinsic associations among the views in the same loop. Finally, all loop-level descriptors are combined into a shape-level descriptor used for 3D shape retrieval. The proposed method has been evaluated on the public 3D shape benchmark ModelNet40. Experiments and comparisons with state-of-the-art methods show that MLVCNN achieves significant performance improvements on 3D shape retrieval, outperforming the state-of-the-art methods by 4.84% mAP. We have also evaluated the proposed method on 3D shape classification, where MLVCNN likewise achieves superior performance compared with recent methods.
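A simplified sketch of the view → loop → shape hierarchy. Here a plain average stands in for the paper's Loop Normalization + LSTM, and element-wise max pooling fuses loop descriptors into one shape-level descriptor; feature vectors are plain Python lists.

```python
# Hierarchical aggregation: views -> loop descriptors -> shape descriptor.

def loop_descriptor(view_features):
    """Aggregate the view features of one loop (stand-in for the LSTM)."""
    dim = len(view_features[0])
    return [sum(v[d] for v in view_features) / len(view_features)
            for d in range(dim)]

def shape_descriptor(loops):
    """Max-pool loop-level descriptors into a shape-level descriptor."""
    descs = [loop_descriptor(loop) for loop in loops]
    return [max(d[i] for d in descs) for i in range(len(descs[0]))]

# Two loops, each holding two 3-D view features.
loops = [[[1.0, 0.0, 2.0], [3.0, 0.0, 0.0]],
         [[0.0, 4.0, 0.0], [0.0, 2.0, 2.0]]]
desc = shape_descriptor(loops)  # -> [2.0, 3.0, 1.0]
```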


Author(s):  
Guoxian Dai ◽  
Jin Xie ◽  
Yi Fang

Learning a 3D shape representation from a collection of its rendered 2D images has been extensively studied. However, existing view-based techniques have not yet fully exploited the information shared among all the projected views. In this paper, by employing a recurrent neural network to efficiently capture features across different views, we propose a siamese CNN-BiLSTM network for 3D shape representation learning. The proposed method minimizes a discriminative loss function to learn a deep nonlinear transformation that maps 3D shapes from the original space into a nonlinear feature space. In the transformed space, the distance between 3D shapes with the same label is minimized, while the distance between shapes with different labels is pushed beyond a large margin. Specifically, the 3D shapes are first projected into a group of 2D images from different views. A convolutional neural network (CNN) then extracts features from each view image, followed by a bidirectional long short-term memory (BiLSTM) network that aggregates information across the views. Finally, the whole CNN-BiLSTM network is assembled into a siamese structure trained with a contrastive loss function. The proposed method is evaluated on two benchmarks, ModelNet40 and SHREC 2014, demonstrating superiority over the state-of-the-art methods.
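A minimal sketch of the contrastive loss such a siamese structure minimizes: same-label pairs are pulled together, different-label pairs pushed apart up to a margin m (the function names here are illustrative).

```python
# Pairwise contrastive loss on embedded feature vectors.

def euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def contrastive_loss(x, y, same_label, margin=1.0):
    d = euclidean(x, y)
    if same_label:
        return d ** 2                    # pull same-label pairs together
    return max(0.0, margin - d) ** 2     # push others apart, up to margin

# A same-label pair pays its squared distance (~0.25 here); an
# identical different-label pair pays the full margin penalty.
loss_pos = contrastive_loss([0.0, 0.0], [0.3, 0.4], True)
loss_neg = contrastive_loss([0.0, 0.0], [0.0, 0.0], False)  # -> 1.0
```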


2021 ◽  
Vol 2042 (1) ◽  
pp. 012002
Author(s):  
Roberto Castello ◽  
Alina Walch ◽  
Raphaël Attias ◽  
Riccardo Cadei ◽  
Shasha Jiang ◽  
...  

Abstract The integration of solar technology into the built environment is realized mainly through rooftop-installed panels. In this paper, we leverage state-of-the-art machine learning and computer vision techniques applied to overhead images to geo-localize the rooftop surfaces available for solar panel installation. We further exploit a 3D building database to associate these surfaces with the corresponding roof geometries by means of a geospatial post-processing approach. The stand-alone convolutional neural network used to segment suitable rooftop areas reaches an intersection over union of 64% and an accuracy of 93%, while a post-processing step using the building database improves the rejection of false positives. The model is applied to a case-study area in the canton of Geneva, and the results are compared with another recent method from the literature for deriving the realistic available area.
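A small sketch of the two segmentation metrics reported above, computed on flat binary masks (1 = pixel predicted or labeled as suitable rooftop area).

```python
# Intersection over union and pixel accuracy for binary segmentation.

def iou(pred, target):
    inter = sum(p and t for p, t in zip(pred, target))
    union = sum(p or t for p, t in zip(pred, target))
    return inter / union if union else 1.0

def accuracy(pred, target):
    return sum(p == t for p, t in zip(pred, target)) / len(pred)

pred   = [1, 1, 0, 0, 1]
target = [1, 0, 0, 0, 1]
iou_val = iou(pred, target)        # 2 overlapping / 3 in the union
acc_val = accuracy(pred, target)   # 4 of 5 pixels agree -> 0.8
```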


2019 ◽  
Vol 2019 ◽  
pp. 1-15 ◽  
Author(s):  
Mau Tung Nguyen ◽  
Thanh Vu Dang ◽  
Minh Kieu Tran Thi ◽  
Pham The Bao

It is widely known that 3D shape models are commonly parameterized using point clouds and meshes. Point clouds in particular are much simpler to handle than meshes, while still capturing the shape information of a 3D model. In this paper, we introduce a new method for generating a 3D point cloud from a set of crucial measurements and the shapes of important positions. To find the correspondence between shapes and measurements, we introduce a representation of 3D data called the slice structure. A hierarchical learning model based on neural networks is presented to be compatible with this data representation. Primary slices are first generated by matching the measurement set, and the whole point cloud is then refined by a convolutional neural network. We conducted experiments on a 3D human dataset containing 1706 examples. Our results demonstrate the effectiveness of the proposed framework, with an average error of 7.72% and good visual quality. This study indicates that paying more attention to local features is worthwhile when dealing with 3D shapes.
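An illustrative sketch of a slice-style representation (the binning scheme is an assumption, not the authors' exact construction): points are grouped by height into horizontal slices, so each slice can be matched against a body measurement taken at that height.

```python
# Bin (x, y, z) points into horizontal slices along the z-axis.

def slice_points(points, n_slices):
    """Group points into n_slices equal-height bins along z."""
    zs = [p[2] for p in points]
    z_min, z_max = min(zs), max(zs)
    span = (z_max - z_min) or 1.0      # avoid division by zero
    slices = [[] for _ in range(n_slices)]
    for p in points:
        i = min(int((p[2] - z_min) / span * n_slices), n_slices - 1)
        slices[i].append(p)
    return slices

pts = [(0, 0, 0.0), (1, 0, 0.4), (0, 1, 0.6), (1, 1, 1.0)]
slices = slice_points(pts, 2)  # two points land in each slice
```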


2018 ◽  
Vol 28 (05) ◽  
pp. 1750056 ◽  
Author(s):  
Ezequiel López-Rubio ◽  
Miguel A. Molina-Cabello ◽  
Rafael Marcos Luque-Baena ◽  
Enrique Domínguez

One of the most important challenges in computer vision applications is background modeling, especially when the background is dynamic and the input distribution might not be stationary, i.e., the distribution of the input data can change over time (e.g., changing illumination, waving trees, water, etc.). In this work, an unsupervised learning neural network is proposed that is able to cope with progressive changes in the input distribution. It is based on a dual learning mechanism that manages changes in the input distribution separately from cluster detection. The proposal is suited to scenes where the background varies slowly. The performance of the method is tested against several state-of-the-art foreground detectors, both quantitatively and qualitatively, with favorable results.
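A toy sketch of the underlying idea (much simpler than the proposed network): the per-pixel background estimate adapts with a slow exponential learning rate so it tracks gradual distribution changes, while the foreground/background decision is made separately by thresholding the deviation.

```python
# Slowly adapting background model with a separate foreground test.

def update_background(bg, pixel, rate=0.05):
    """Exponentially move the background estimate toward the new pixel."""
    return (1.0 - rate) * bg + rate * pixel

def is_foreground(bg, pixel, threshold=30.0):
    """Flag pixels that deviate strongly from the background model."""
    return abs(pixel - bg) > threshold

bg = 100.0
for frame_pixel in [102.0, 98.0, 101.0]:   # slow illumination drift
    bg = update_background(bg, frame_pixel)
moving_object = is_foreground(bg, 200.0)   # large deviation -> foreground
```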


Author(s):  
Rishipal Singh ◽  
Rajneesh Rani ◽  
Aman Kamboj

Fruit classification is one of the influential applications of computer vision. Traditional classification models are trained on features such as color, shape, and texture, but these features are shared across varieties of the same fruit. Therefore, a new set of features is required to classify fruits belonging to the same class. In this paper, we propose an optimized method to classify intra-class fruits using deep convolutional layers. The proposed architecture is capable of addressing the challenges of a commercial tray-based system in the supermarket. As research on intra-class classification is still in its infancy, several challenges remain untackled, and the proposed method is specifically designed to overcome those related to intra-class fruit classification. The method shows impressive performance for intra-class classification, achieved with fewer parameters than existing methods. The proposed model consists of an Inception block, residual connections, and various other layers in a precise order. To validate its performance, the proposed method is compared with state-of-the-art models and performs best in terms of accuracy, loss, parameter count, and depth.
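A minimal sketch of the residual connection mentioned above: the block's output is added back to its input, so the block only has to learn a residual. The "inner block" here is a toy element-wise function standing in for real convolutional layers.

```python
# Residual (skip) connection: y = relu(block(x) + x), on plain lists.

def relu(v):
    return [max(0.0, x) for x in v]

def residual_block(x, block):
    """Apply an inner block, add the input back, then ReLU."""
    fx = block(x)
    return relu([a + b for a, b in zip(fx, x)])

# Toy inner block: scale inputs by 0.5.
out = residual_block([2.0, -4.0], lambda v: [0.5 * x for x in v])
# -> [3.0, 0.0]
```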


2019 ◽  
Vol 2019 (1) ◽  
pp. 143-148
Author(s):  
Futa Matsushita ◽  
Ryo Takahasshi ◽  
Mari Tsunomura ◽  
Norimichi Tsumura

3D reconstruction is used for the inspection of industrial products, and the demand for measuring 3D shapes has increased. There are many methods for 3D reconstruction from RGB images; however, it is difficult to reconstruct the 3D shape of glossy surfaces from such images. In this paper, we use a deep neural network to remove the gloss from an image group captured by an RGB camera, and reconstruct the 3D shape with higher accuracy than the conventional method. For the evaluation experiment, we use computer-generated models of simple shapes and create images under varied conditions such as different illumination directions. We removed the gloss from these images and corrected the defective parts left after gloss removal in order to estimate the 3D shape accurately. Finally, we compared 3D estimation by the proposed method against the conventional method based on photometric stereo. The results show that the proposed method estimates 3D shape more accurately than the conventional method.
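A compact sketch of the classical three-light photometric stereo step such a pipeline feeds its gloss-free images into: with three known light directions L and three observed intensities I, the surface normal (scaled by albedo) solves the 3×3 linear system L·n = I. The solver below uses Cramer's rule to stay dependency-free.

```python
# Three-light photometric stereo via a 3x3 linear solve.

def solve3(L, I):
    """Solve the 3x3 linear system L n = I by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(L)
    n = []
    for j in range(3):                 # replace column j with I
        M = [row[:] for row in L]
        for i in range(3):
            M[i][j] = I[i]
        n.append(det(M) / d)
    return n

# Lights along the axes: the intensities directly reveal the normal.
lights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intensities = [0.0, 0.6, 0.8]
normal = solve3(lights, intensities)  # -> [0.0, 0.6, 0.8]
```

This is why residual gloss is so damaging: specular highlights violate the Lambertian intensity model the linear system assumes.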


Author(s):  
Zhizhong Han ◽  
Mingyang Shang ◽  
Xiyang Wang ◽  
Yu-Shen Liu ◽  
Matthias Zwicker

Jointly learning representations of 3D shapes and text is crucial to support tasks such as cross-modal retrieval or shape captioning. A recent method employs 3D voxels to represent 3D shapes, but this limits the approach to low resolutions due to the cubic computational complexity of 3D voxels; hence the method suffers from a lack of detailed geometry. To resolve this issue, we propose Y2Seq2Seq, a view-based model, to learn cross-modal representations by joint reconstruction and prediction of view and word sequences. Specifically, the network architecture of Y2Seq2Seq bridges the semantic meaning embedded in the two modalities with two coupled “Y”-like sequence-to-sequence (Seq2Seq) structures. In addition, our novel hierarchical constraints further increase the discriminability of the cross-modal representations by employing more detailed discriminative information. Experimental results on cross-modal retrieval and 3D shape captioning show that Y2Seq2Seq outperforms the state-of-the-art methods.


Author(s):  
Zhanqing Chen ◽  
Kai Tang

This paper reports a new method for 3D shape classification. Given a 3D shape M, we first define a spectral function at every point on M that is a weighted summation of the geodesics from the point to a set of curvature-sensitive feature points on M. Based on this spectral field, a real-valued square matrix is defined that correlates the topology (the spectral field) with the geometry (the maximum geodesic) of M, and the eigenvalues of this matrix are then taken as the fingerprint of M. This fingerprint enjoys several favorable characteristics desired for 3D shape classification, such as high sensitivity to intrinsic features on M (because of the feature points and the correlation) and good immunity to geometric noise on M (because of the novel design of the weights and the overall integration of geodesics). As an integral part of the work, we finally apply the classical multidimensional scaling method to the fingerprints of the 3D shapes to be classified. In all, our classification algorithm maps 3D shapes into clusters in a Euclidean plane that possess high fidelity to intrinsic features—in both geometry and topology—of the original shapes. We demonstrate the versatility of our approach through various classification examples.
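An illustrative sketch of the spectral function described above: for a sample point, a weighted sum of its distances to a set of feature points. Plain Euclidean distance stands in for the geodesic distance used on the actual mesh surface, and the weights here are assumptions, not the paper's curvature-sensitive design.

```python
# Weighted sum of distances from a point to a set of feature points.

def dist(p, q):
    """Euclidean stand-in for the geodesic distance on the surface."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def spectral_value(point, feature_points, weights):
    """Spectral function: weighted sum of distances to feature points."""
    return sum(w * dist(point, f)
               for w, f in zip(weights, feature_points))

features = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
value = spectral_value((0.0, 0.0, 0.0), features, [0.5, 0.5])  # -> 0.5
```

In the full method, these values over the surface define the matrix whose eigenvalues serve as the shape fingerprint fed to multidimensional scaling.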

