MeshNet: Mesh Neural Network for 3D Shape Representation

Author(s):  
Yutong Feng ◽  
Yifan Feng ◽  
Haoxuan You ◽  
Xibin Zhao ◽  
Yue Gao

Mesh is an important and powerful type of data for 3D shapes and is widely studied in computer vision and computer graphics. For the task of 3D shape representation, extensive research has concentrated on representing 3D shapes with volumetric grids, multi-view images, and point clouds. However, little effort has been devoted to mesh data in recent years, owing to the complexity and irregularity of meshes. In this paper, we propose a mesh neural network, named MeshNet, to learn 3D shape representations from mesh data. In this method, face-unit and feature splitting are introduced, and a general architecture with effective building blocks is proposed. In this way, MeshNet is able to handle the complexity and irregularity of meshes and represent 3D shapes well. We have applied the proposed MeshNet method to 3D shape classification and retrieval. Experimental results and comparisons with state-of-the-art methods demonstrate that MeshNet achieves satisfactory classification and retrieval performance, which indicates the effectiveness of the proposed method for 3D shape representation.
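A minimal sketch (not the authors' implementation) of the "face-unit" idea: each triangular face is treated as the basic unit, and simple per-face features such as the center and unit normal are split out from the raw vertex data.

```python
# Per-face feature extraction from a triangle mesh, using plain tuples.

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def face_features(vertices, face):
    """Return (center, unit normal) for one triangular face.

    vertices: list of (x, y, z) tuples; face: (i, j, k) vertex indices.
    """
    a, b, c = (vertices[i] for i in face)
    center = tuple(sum(p) / 3.0 for p in zip(a, b, c))
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    ac = tuple(ci - ai for ai, ci in zip(a, c))
    n = cross(ab, ac)
    norm = sum(x * x for x in n) ** 0.5 or 1.0
    return center, tuple(x / norm for x in n)

# Unit triangle in the xy-plane: its normal should point along +z.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
center, normal = face_features(verts, (0, 1, 2))
```

In MeshNet itself these per-face quantities feed separate spatial and structural branches; the sketch above only shows the face-unit bookkeeping.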

Author(s):  
Jianwen Jiang ◽  
Di Bao ◽  
Ziqiang Chen ◽  
Xibin Zhao ◽  
Yue Gao

3D shape retrieval has attracted much attention and become a hot topic in the computer vision field. With the development of deep learning, 3D shape retrieval has also made great progress, and many view-based methods have been introduced in recent years. However, how to represent 3D shapes better is still a challenging problem, and the intrinsic hierarchical associations among views have not been well utilized. To tackle these problems, in this paper we propose a multi-loop-view convolutional neural network (MLVCNN) framework for 3D shape retrieval. In this method, multiple groups of views are first extracted from different loop directions. Given these loop views, the MLVCNN framework introduces a hierarchical view-loop-shape architecture, i.e., the view level, the loop level, and the shape level, to represent 3D shapes at different scales. At the view level, a convolutional neural network is trained to extract view features. Then, the proposed Loop Normalization and an LSTM are applied to each loop of views to generate loop-level features, which capture the intrinsic associations among the views in the same loop. Finally, all loop-level descriptors are combined into a shape-level descriptor used for 3D shape retrieval. The proposed method has been evaluated on the public 3D shape benchmark ModelNet40. Experiments and comparisons with state-of-the-art methods show that MLVCNN achieves significant performance improvements on 3D shape retrieval, outperforming the state-of-the-art methods by 4.84% mAP. We have also evaluated the proposed method on 3D shape classification, where MLVCNN likewise achieves superior performance compared with recent methods.
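A simplified sketch of the view → loop → shape hierarchy. Here a plain average stands in for the paper's Loop Normalization + LSTM, and element-wise max pooling fuses loop descriptors into one shape-level descriptor; feature vectors are plain Python lists.

```python
# Hierarchical aggregation: views -> loop descriptors -> shape descriptor.

def loop_descriptor(view_features):
    """Aggregate the view features of one loop (stand-in for the LSTM)."""
    dim = len(view_features[0])
    return [sum(v[d] for v in view_features) / len(view_features)
            for d in range(dim)]

def shape_descriptor(loops):
    """Max-pool loop-level descriptors into a shape-level descriptor."""
    descs = [loop_descriptor(loop) for loop in loops]
    return [max(d[i] for d in descs) for i in range(len(descs[0]))]

# Two loops, each holding two 3-D view features.
loops = [[[1.0, 0.0, 2.0], [3.0, 0.0, 0.0]],
         [[0.0, 4.0, 0.0], [0.0, 2.0, 2.0]]]
desc = shape_descriptor(loops)  # -> [2.0, 3.0, 1.0]
```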


Author(s):  
Guoxian Dai ◽  
Jin Xie ◽  
Yi Fang

Learning a 3D shape representation from a collection of its rendered 2D images has been extensively studied. However, existing view-based techniques have not yet fully exploited the information shared among all the projected views. In this paper, by employing a recurrent neural network to efficiently capture features across different views, we propose a siamese CNN-BiLSTM network for 3D shape representation learning. The proposed method minimizes a discriminative loss function to learn a deep nonlinear transformation that maps 3D shapes from the original space into a nonlinear feature space. In the transformed space, the distance between 3D shapes with the same label is minimized, while the distance between shapes with different labels is pushed beyond a large margin. Specifically, the 3D shapes are first projected into a group of 2D images from different views. A convolutional neural network (CNN) then extracts features from each view image, followed by a bidirectional long short-term memory (BiLSTM) network that aggregates information across the views. Finally, the whole CNN-BiLSTM network is assembled into a siamese structure trained with a contrastive loss function. The proposed method is evaluated on two benchmarks, ModelNet40 and SHREC 2014, demonstrating superiority over the state-of-the-art methods.
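A minimal sketch of the contrastive loss such a siamese structure minimizes: same-label pairs are pulled together, different-label pairs pushed apart up to a margin m (the function names here are illustrative).

```python
# Pairwise contrastive loss on embedded feature vectors.

def euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def contrastive_loss(x, y, same_label, margin=1.0):
    d = euclidean(x, y)
    if same_label:
        return d ** 2                    # pull same-label pairs together
    return max(0.0, margin - d) ** 2     # push others apart, up to margin

# A same-label pair pays its squared distance (~0.25 here); an
# identical different-label pair pays the full margin penalty.
loss_pos = contrastive_loss([0.0, 0.0], [0.3, 0.4], True)
loss_neg = contrastive_loss([0.0, 0.0], [0.0, 0.0], False)  # -> 1.0
```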


2021 ◽  
Vol 2042 (1) ◽  
pp. 012002
Author(s):  
Roberto Castello ◽  
Alina Walch ◽  
Raphaël Attias ◽  
Riccardo Cadei ◽  
Shasha Jiang ◽  
...  

Abstract The integration of solar technology into the built environment is realized mainly through rooftop-installed panels. In this paper, we leverage state-of-the-art machine learning and computer vision techniques applied to overhead images to geo-localize the rooftop surfaces available for solar panel installation. We further exploit a 3D building database to associate these surfaces with the corresponding roof geometries by means of a geospatial post-processing approach. The stand-alone convolutional neural network used to segment suitable rooftop areas reaches an intersection over union of 64% and an accuracy of 93%, while a post-processing step using the building database improves the rejection of false positives. The model is applied to a case-study area in the canton of Geneva, and the results are compared with another recent method from the literature for deriving the realistic available area.
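A small sketch of the two segmentation metrics reported above, computed on flat binary masks (1 = pixel predicted or labeled as suitable rooftop area).

```python
# Intersection over union and pixel accuracy for binary segmentation.

def iou(pred, target):
    inter = sum(p and t for p, t in zip(pred, target))
    union = sum(p or t for p, t in zip(pred, target))
    return inter / union if union else 1.0

def accuracy(pred, target):
    return sum(p == t for p, t in zip(pred, target)) / len(pred)

pred   = [1, 1, 0, 0, 1]
target = [1, 0, 0, 0, 1]
iou_val = iou(pred, target)        # 2 overlapping / 3 in the union
acc_val = accuracy(pred, target)   # 4 of 5 pixels agree -> 0.8
```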


2019 ◽  
Vol 2019 ◽  
pp. 1-15 ◽  
Author(s):  
Mau Tung Nguyen ◽  
Thanh Vu Dang ◽  
Minh Kieu Tran Thi ◽  
Pham The Bao

It is widely known that 3D shape models are commonly parameterized using point clouds and meshes. Point clouds in particular are much simpler to handle than meshes, while still capturing the shape information of a 3D model. In this paper, we introduce a new method for generating a 3D point cloud from a set of crucial measurements and the shapes of important positions. To find the correspondence between shapes and measurements, we introduce a representation of 3D data called the slice structure. A hierarchical learning model based on neural networks is presented to be compatible with this data representation. Primary slices are first generated by matching the measurement set, and the whole point cloud is then refined by a convolutional neural network. We conducted experiments on a 3D human dataset containing 1706 examples. Our results demonstrate the effectiveness of the proposed framework, with an average error of 7.72% and good visual quality. This study indicates that paying more attention to local features is worthwhile when dealing with 3D shapes.
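An illustrative sketch of a slice-style representation (the binning scheme is an assumption, not the authors' exact construction): points are grouped by height into horizontal slices, so each slice can be matched against a body measurement taken at that height.

```python
# Bin (x, y, z) points into horizontal slices along the z-axis.

def slice_points(points, n_slices):
    """Group points into n_slices equal-height bins along z."""
    zs = [p[2] for p in points]
    z_min, z_max = min(zs), max(zs)
    span = (z_max - z_min) or 1.0      # avoid division by zero
    slices = [[] for _ in range(n_slices)]
    for p in points:
        i = min(int((p[2] - z_min) / span * n_slices), n_slices - 1)
        slices[i].append(p)
    return slices

pts = [(0, 0, 0.0), (1, 0, 0.4), (0, 1, 0.6), (1, 1, 1.0)]
slices = slice_points(pts, 2)  # two points land in each slice
```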


2018 ◽  
Vol 28 (05) ◽  
pp. 1750056 ◽  
Author(s):  
Ezequiel López-Rubio ◽  
Miguel A. Molina-Cabello ◽  
Rafael Marcos Luque-Baena ◽  
Enrique Domínguez

One of the most important challenges in computer vision applications is background modeling, especially when the background is dynamic and the input distribution might not be stationary, i.e., the distribution of the input data can change over time (e.g., changing illumination, waving trees, water, etc.). In this work, an unsupervised learning neural network is proposed that is able to cope with progressive changes in the input distribution. It is based on a dual learning mechanism that manages changes in the input distribution separately from cluster detection. The proposal is suited to scenes where the background varies slowly. The performance of the method is tested against several state-of-the-art foreground detectors, both quantitatively and qualitatively, with favorable results.
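A toy sketch of the underlying idea (much simpler than the proposed network): the per-pixel background estimate adapts with a slow exponential learning rate so it tracks gradual distribution changes, while the foreground/background decision is made separately by thresholding the deviation.

```python
# Slowly adapting background model with a separate foreground test.

def update_background(bg, pixel, rate=0.05):
    """Exponentially move the background estimate toward the new pixel."""
    return (1.0 - rate) * bg + rate * pixel

def is_foreground(bg, pixel, threshold=30.0):
    """Flag pixels that deviate strongly from the background model."""
    return abs(pixel - bg) > threshold

bg = 100.0
for frame_pixel in [102.0, 98.0, 101.0]:   # slow illumination drift
    bg = update_background(bg, frame_pixel)
moving_object = is_foreground(bg, 200.0)   # large deviation -> foreground
```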


Author(s):  
Rishipal Singh ◽  
Rajneesh Rani ◽  
Aman Kamboj

Fruit classification is one of the influential applications of computer vision. Traditional classification models are trained on features such as color, shape, and texture, but these features are shared across varieties of the same fruit. Therefore, a new set of features is required to classify fruits belonging to the same class. In this paper, we propose an optimized method to classify intra-class fruits using deep convolutional layers. The proposed architecture is capable of addressing the challenges of a commercial tray-based system in the supermarket. As research on intra-class classification is still in its infancy, several challenges remain untackled, and the proposed method is specifically designed to overcome those related to intra-class fruit classification. The method shows impressive performance for intra-class classification, achieved with fewer parameters than existing methods. The proposed model consists of an Inception block, residual connections, and various other layers in a precise order. To validate its performance, the proposed method is compared with state-of-the-art models and performs best in terms of accuracy, loss, parameter count, and depth.
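A minimal sketch of the residual connection mentioned above: the block's output is added back to its input, so the block only has to learn a residual. The "inner block" here is a toy element-wise function standing in for real convolutional layers.

```python
# Residual (skip) connection: y = relu(block(x) + x), on plain lists.

def relu(v):
    return [max(0.0, x) for x in v]

def residual_block(x, block):
    """Apply an inner block, add the input back, then ReLU."""
    fx = block(x)
    return relu([a + b for a, b in zip(fx, x)])

# Toy inner block: scale inputs by 0.5.
out = residual_block([2.0, -4.0], lambda v: [0.5 * x for x in v])
# -> [3.0, 0.0]
```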


2019 ◽  
Vol 2019 (1) ◽  
pp. 143-148
Author(s):  
Futa Matsushita ◽  
Ryo Takahasshi ◽  
Mari Tsunomura ◽  
Norimichi Tsumura

3D reconstruction is used for the inspection of industrial products, and the demand for measuring 3D shapes has increased. There are many methods for 3D reconstruction from RGB images; however, it is difficult to reconstruct the 3D shape of glossy surfaces from such images. In this paper, we use a deep neural network to remove the gloss from an image group captured by an RGB camera, and reconstruct the 3D shape with higher accuracy than the conventional method. For the evaluation experiment, we use computer-generated models of simple shapes and create images under varied conditions such as different illumination directions. We removed the gloss from these images and corrected the defective parts left after gloss removal in order to estimate the 3D shape accurately. Finally, we compared 3D estimation by the proposed method against the conventional method based on photometric stereo. The results show that the proposed method estimates 3D shape more accurately than the conventional method.
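A compact sketch of the classical three-light photometric stereo step such a pipeline feeds its gloss-free images into: with three known light directions L and three observed intensities I, the surface normal (scaled by albedo) solves the 3×3 linear system L·n = I. The solver below uses Cramer's rule to stay dependency-free.

```python
# Three-light photometric stereo via a 3x3 linear solve.

def solve3(L, I):
    """Solve the 3x3 linear system L n = I by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(L)
    n = []
    for j in range(3):                 # replace column j with I
        M = [row[:] for row in L]
        for i in range(3):
            M[i][j] = I[i]
        n.append(det(M) / d)
    return n

# Lights along the axes: the intensities directly reveal the normal.
lights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intensities = [0.0, 0.6, 0.8]
normal = solve3(lights, intensities)  # -> [0.0, 0.6, 0.8]
```

This is why residual gloss is so damaging: specular highlights violate the Lambertian intensity model the linear system assumes.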


Author(s):  
Zhizhong Han ◽  
Mingyang Shang ◽  
Xiyang Wang ◽  
Yu-Shen Liu ◽  
Matthias Zwicker

Jointly learning representations of 3D shapes and text is crucial to support tasks such as cross-modal retrieval or shape captioning. A recent method employs 3D voxels to represent 3D shapes, but this limits the approach to low resolutions due to the cubic computational complexity of 3D voxels; hence the method suffers from a lack of detailed geometry. To resolve this issue, we propose Y2Seq2Seq, a view-based model, to learn cross-modal representations by joint reconstruction and prediction of view and word sequences. Specifically, the network architecture of Y2Seq2Seq bridges the semantic meaning embedded in the two modalities with two coupled “Y”-like sequence-to-sequence (Seq2Seq) structures. In addition, our novel hierarchical constraints further increase the discriminability of the cross-modal representations by employing more detailed discriminative information. Experimental results on cross-modal retrieval and 3D shape captioning show that Y2Seq2Seq outperforms the state-of-the-art methods.


Author(s):  
Zhanqing Chen ◽  
Kai Tang

This paper reports a new method for 3D shape classification. Given a 3D shape M, we first define a spectral function at every point on M that is a weighted summation of the geodesics from the point to a set of curvature-sensitive feature points on M. Based on this spectral field, a real-valued square matrix is defined that correlates the topology (the spectral field) with the geometry (the maximum geodesic) of M, and the eigenvalues of this matrix are then taken as the fingerprint of M. This fingerprint enjoys several favorable characteristics desired for 3D shape classification, such as high sensitivity to intrinsic features on M (because of the feature points and the correlation) and good immunity to geometric noise on M (because of the novel design of the weights and the overall integration of geodesics). As an integral part of the work, we finally apply the classical multidimensional scaling method to the fingerprints of the 3D shapes to be classified. In all, our classification algorithm maps 3D shapes into clusters in a Euclidean plane that possess high fidelity to intrinsic features—in both geometry and topology—of the original shapes. We demonstrate the versatility of our approach through various classification examples.
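An illustrative sketch of the spectral function described above: for a sample point, a weighted sum of its distances to a set of feature points. Plain Euclidean distance stands in for the geodesic distance used on the actual mesh surface, and the weights here are assumptions, not the paper's curvature-sensitive design.

```python
# Weighted sum of distances from a point to a set of feature points.

def dist(p, q):
    """Euclidean stand-in for the geodesic distance on the surface."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def spectral_value(point, feature_points, weights):
    """Spectral function: weighted sum of distances to feature points."""
    return sum(w * dist(point, f)
               for w, f in zip(weights, feature_points))

features = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
value = spectral_value((0.0, 0.0, 0.0), features, [0.5, 0.5])  # -> 0.5
```

In the full method, these values over the surface define the matrix whose eigenvalues serve as the shape fingerprint fed to multidimensional scaling.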

