Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild (Extended Abstract)

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/665 ◽

2021 ◽

Author(s):

Shangzhe Wu ◽

Christian Rupprecht ◽

Andrea Vedaldi

Keyword(s):

Input Image ◽

Shape Model ◽

Single View ◽

Model Code ◽

3D Objects ◽

Object Categories ◽

Symmetric Structure ◽

In The Wild ◽

Human Faces ◽

Prior Shape

We propose a method to learn 3D deformable object categories from raw single-view images, without external supervision. The method is based on an autoencoder that factors each input image into depth, albedo, viewpoint and illumination. In order to disentangle these components without supervision, we use the fact that many object categories have, at least approximately, a symmetric structure. We show that reasoning about illumination allows us to exploit the underlying object symmetry even if the appearance is not symmetric due to shading. Furthermore, we model objects that are probably, but not certainly, symmetric by predicting a symmetry probability map, learned end-to-end with the other components of the model. Our experiments show that this method can recover very accurately the 3D shape of human faces, cat faces and cars from single-view images, without any supervision or a prior shape model. Code and demo available at https://github.com/elliottwu/unsup3d.
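The role of illumination in the abstract above can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: it assumes a simple Lambertian shading model with hypothetical `ambient` and `diffuse` coefficients, and a one-dimensional "object" of four pixels. It shows that a mirror-symmetric albedo and shape give a symmetric image under frontal light but an asymmetric one under side light, which is why symmetry has to be enforced on the factored components rather than on raw appearance.

```python
def shade(albedo, normals, light, ambient=0.5, diffuse=0.5):
    """Lambertian shading (assumed model): pixel = albedo * (ambient + diffuse * max(0, n . l))."""
    image = []
    for a, n in zip(albedo, normals):
        dot = sum(ni * li for ni, li in zip(n, light))
        image.append(a * (ambient + diffuse * max(0.0, dot)))
    return image


def is_mirror_symmetric(values, tol=1e-9):
    """True if the sequence equals its left-right flip, up to tol."""
    return all(abs(v - w) <= tol for v, w in zip(values, reversed(values)))


# Hypothetical 4-pixel object: symmetric albedo, normals mirrored about the centre
# (flipping the position negates the x component of the normal).
albedo = [0.2, 0.8, 0.8, 0.2]
normals = [(-0.6, 0.0, 0.8), (-0.28, 0.0, 0.96), (0.28, 0.0, 0.96), (0.6, 0.0, 0.8)]

frontal_light = (0.0, 0.0, 1.0)
side_light = (0.6, 0.0, 0.8)

image_frontal = shade(albedo, normals, frontal_light)
image_side = shade(albedo, normals, side_light)

print(is_mirror_symmetric(image_frontal))  # symmetric appearance
print(is_mirror_symmetric(image_side))     # shading breaks the appearance symmetry
```

In this toy setting the albedo stays symmetric in both cases; only the shaded image changes, which is the situation the abstract describes.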

Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation

International Journal of Computer Vision ◽

10.1007/s11263-021-01494-4 ◽

2021 ◽

Author(s):

Mehdi Bahri ◽

Eimear O’Sullivan ◽

Shunwang Gong ◽

Feng Liu ◽

Xiaoming Liu ◽

...

Keyword(s):

Large Scale ◽

Training Time ◽

3D Face ◽

In The Wild ◽

Previous State ◽

Surface Translation ◽

Visual Attention Mechanism ◽

Diverse Data ◽

Human Faces ◽

Robust To Noise

Standard registration algorithms need to be independently applied to each surface to register, following careful pre-processing and hand-tuning. Recently, learning-based approaches have emerged that reduce the registration of new scans to running inference with a previously-trained model. The potential benefits are multifold: inference is typically orders of magnitude faster than solving a new instance of a difficult optimization problem, deep learning models can be made robust to noise and corruption, and the trained model may be re-used for other tasks, e.g. through transfer learning. In this paper, we cast the registration task as a surface-to-surface translation problem, and design a model to reliably capture the latent geometric information directly from raw 3D face scans. We introduce Shape-My-Face (SMF), a powerful encoder-decoder architecture based on an improved point cloud encoder, a novel visual attention mechanism, graph convolutional decoders with skip connections, and a specialized mouth model that we smoothly integrate with the mesh convolutions. Compared to the previous state-of-the-art learning algorithms for non-rigid registration of face scans, SMF only requires the raw data to be rigidly aligned (with scaling) with a pre-defined face template. Additionally, our model provides topologically-sound meshes with minimal supervision, offers faster training time, has orders of magnitude fewer trainable parameters, is more robust to noise, and can generalize to previously unseen datasets. We extensively evaluate the quality of our registrations on diverse data. We demonstrate the robustness and generalizability of our model with in-the-wild face scans across different modalities, sensor types, and resolutions. Finally, we show that, by learning to register scans, SMF produces a hybrid linear and non-linear morphable model. Manipulation of the latent space of SMF allows for shape generation, and morphing applications such as expression transfer in-the-wild. We train SMF on a dataset of human faces comprising 9 large-scale databases on commodity hardware.

Robust and Accurate Mandible Segmentation on Dental CBCT Scans Affected by Metal Artifacts Using a Prior Shape Model

Journal of Personalized Medicine ◽

10.3390/jpm11050364 ◽

2021 ◽

Vol 11 (5) ◽

pp. 364

Author(s):

Bingjiang Qiu ◽

Hylke van der Wel ◽

Joep Kraeima ◽

Haye Hendrik Glas ◽

Jiapan Guo ◽

...

Keyword(s):

State Of The Art ◽

Maxillofacial Surgery ◽

Cone Beam ◽

Oral And Maxillofacial Surgery ◽

Metal Artifacts ◽

Shape Model ◽

High Attenuation ◽

Mandible Shape ◽

Prior Shape ◽

Segmentation Models

Accurate mandible segmentation is important in the field of maxillofacial surgery to guide clinical diagnosis and treatment and to develop appropriate surgical plans. In particular, cone-beam computed tomography (CBCT) images containing metal parts, such as those used in oral and maxillofacial surgery (OMFS), are often degraded by metal artifacts, such as weak and blurred boundaries caused by high-attenuation material and a low radiation dose during image acquisition. To overcome this problem, this paper proposes a novel deep learning-based approach (SASeg) for automated mandible segmentation that perceives overall mandible anatomical knowledge. SASeg utilizes a prior shape feature extractor (PSFE) module based on a mean mandible shape, and recurrent connections that maintain the structural continuity of the mandible. The effectiveness of the proposed network is substantiated on a dental CBCT dataset from orthodontic treatment containing 59 patients. The experiments show that the proposed SASeg can be easily used to improve the prediction accuracy in a dental CBCT dataset corrupted by metal artifacts. In addition, the experimental results on the PDDCA dataset demonstrate that, compared with the state-of-the-art mandible segmentation models, our proposed SASeg can achieve better segmentation performance.

No-Reference Image Quality Assessment with Multi-Scale Orderless Pooling of Deep Features

Journal of Imaging ◽

10.3390/jimaging7070112 ◽

2021 ◽

Vol 7 (7) ◽

pp. 112

Author(s):

Domonkos Varga

Keyword(s):

Image Quality ◽

Quality Assessment ◽

Image Quality Assessment ◽

Multiple Scales ◽

Digital Images ◽

Input Image ◽

Perceptual Quality ◽

Reference Image ◽

Benchmark Datasets ◽

In The Wild

The goal of no-reference image quality assessment (NR-IQA) is to evaluate the perceptual quality of digital images without using the distortion-free, pristine counterparts. NR-IQA is an important part of multimedia signal processing since digital images can undergo a wide variety of distortions during storage, compression, and transmission. In this paper, we propose a novel architecture that extracts deep features from the input image at multiple scales to improve the effectiveness of feature extraction for NR-IQA using convolutional neural networks. Specifically, the proposed method extracts deep activations for local patches at multiple scales and maps them onto perceptual quality scores with the help of trained Gaussian process regressors. Extensive experiments demonstrate that the introduced algorithm performs favorably against the state-of-the-art methods on three large benchmark datasets with authentic distortions (LIVE In the Wild, KonIQ-10k, and SPAQ).
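The multi-scale, orderless pooling idea in this abstract can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's architecture: `patch_feature` is a hypothetical stand-in for CNN activations (here just the mean and range of a patch), the pooling statistics (min, mean, max) are an assumed choice, and the final Gaussian process regression step is omitted.

```python
def patches(image, size, stride):
    """Yield size x size patches of a 2D list, scanning with the given stride."""
    height, width = len(image), len(image[0])
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield [row[left:left + size] for row in image[top:top + size]]


def patch_feature(patch):
    """Hypothetical stand-in for a deep activation vector: [mean, range]."""
    flat = [v for row in patch for v in row]
    return [sum(flat) / len(flat), max(flat) - min(flat)]


def orderless_pool(features):
    """Pool min, mean and max per feature dimension, ignoring patch positions."""
    pooled = []
    for dim in zip(*features):
        pooled += [min(dim), sum(dim) / len(dim), max(dim)]
    return pooled


def multiscale_descriptor(image, sizes=(2, 4)):
    """Concatenate the pooled patch features obtained at each patch size."""
    descriptor = []
    for size in sizes:
        feats = [patch_feature(p) for p in patches(image, size, stride=size)]
        descriptor += orderless_pool(feats)
    return descriptor


# Small synthetic 4x4 "image" used only for illustration.
image = [[(r * 4 + c) % 7 for c in range(4)] for r in range(4)]
descriptor = multiscale_descriptor(image)
print(len(descriptor))
```

The resulting fixed-length descriptor is the kind of input a regressor could map to a quality score; because the pooling discards patch order, the descriptor length does not depend on how many patches the image yields.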

Unsupervised Learning of Probably Symmetric Deformable 3D Objects From Images in the Wild

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) ◽

10.1109/cvpr42600.2020.00008 ◽

2020 ◽

Cited By ~ 2

Author(s):

Shangzhe Wu ◽

Christian Rupprecht ◽

Andrea Vedaldi

Keyword(s):

Unsupervised Learning ◽

3D Objects ◽

In The Wild

Weakly-Supervised Reconstruction of 3D Objects with Large Shape Variation from Single In-the-Wild Images

Computer Vision – ACCV 2020 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-69525-5_1 ◽

2021 ◽

pp. 3-19

Author(s):

Shichen Sun ◽

Zhengbang Zhu ◽

Xiaowei Dai ◽

Qijun Zhao ◽

Jing Li

Keyword(s):

Shape Variation ◽

3D Objects ◽

In The Wild ◽

Weakly Supervised ◽

Large Shape

Rilievo: Artistic Scene Authoring via Interactive Height Map Extrusion in VR

Leonardo ◽

10.1162/leon_a_01933 ◽

2020 ◽

Vol 53 (4) ◽

pp. 438-441

Author(s):

Sevinc Eroglu ◽

Patric Schmitz ◽

Carlos Aguilera Martinez ◽

Jana Rusch ◽

Leif Kobbelt ◽

...

Keyword(s):

Spatial Arrangement ◽

Input Image ◽

Artistic Creation ◽

Image Structure ◽

3D Objects ◽

Input Material ◽

Authoring Environment ◽

2D Images

The authors present a virtual authoring environment for artistic creation in VR. It enables the effortless conversion of 2D images into volumetric 3D objects. Artistic elements in the input material are extracted with a convenient VR-based segmentation tool. Relief sculpting is then performed by interactively mixing different height maps. These are automatically generated from the input image structure and appearance. A prototype of the tool is showcased in an analog-virtual artistic workflow in collaboration with a traditional painter. It combines the expressiveness of analog painting and sculpting with the creative freedom of spatial arrangement in VR.
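Mixing height maps, as mentioned in this abstract, can be pictured as a weighted blend. The sketch below is only an assumption about what such mixing might look like: the two generators (`luminance_height`, `gradient_height`) and the normalised weighted average are hypothetical and are not taken from the Rilievo tool.

```python
def luminance_height(image):
    """Hypothetical 'appearance' map: height proportional to brightness (0..255)."""
    return [[v / 255.0 for v in row] for row in image]


def gradient_height(image):
    """Hypothetical 'structure' map: height from the horizontal intensity change."""
    result = []
    for row in image:
        last = len(row) - 1
        result.append([abs(row[min(c + 1, last)] - row[c]) / 255.0
                       for c in range(len(row))])
    return result


def mix_height_maps(maps, weights):
    """Blend equally sized height maps with a normalised weighted average."""
    total = sum(weights)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(w * m[r][c] for m, w in zip(maps, weights)) / total
             for c in range(cols)]
            for r in range(rows)]


image = [[0, 255], [255, 0]]
maps = [luminance_height(image), gradient_height(image)]
mixed = mix_height_maps(maps, weights=[1, 1])
print(mixed)
```

In an interactive setting, the weights would be the quantities a user adjusts; setting one weight to zero reduces the blend to the other map.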

Location of Eyes in Images of Human Faces through Analysis Variance Shine Intensity

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v11.i3.pp949-953 ◽

2018 ◽

Vol 11 (3) ◽

pp. 949

Author(s):

Hussein Ali Alhamzawi

Keyword(s):

Mathematical Morphology ◽

Center Of Mass ◽

Input Image ◽

Candidate Region ◽

Facial Features ◽

Visual Interpretation ◽

Recognition Process ◽

Face Images ◽

Human Faces ◽

Horizontal Band

Extraction of facial features is an important step in automatic visual interpretation and recognition of human faces. Among the facial features, the eyes play a major role in the recognition process. In this article, we present an approach to detect and locate the eyes in frontal face images. Eye regions are identified using the technique of voucher detection based on mathematical morphology. After this identification, a comparison is made between the variances of three different portions of each candidate eye region (the set of pixels belonging to the candidate region as a whole, the set of pixels contained in the minimum rectangle circumscribing the candidate region, and the set of pixels of the candidate region belonging to a horizontal band that crosses the center of mass of this region). The calculation of these variances also considers the R, G, and B channels, as well as the gray version of the input image.
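The three variances compared in this abstract can be computed as in the sketch below. It is a minimal illustration under assumptions: the candidate region is given as a set of `(row, col)` pixels, the horizontal band is taken to be a single row through the rounded center of mass (the abstract does not state the band height), and population variance is used. The same function could be applied to each of the R, G, B and gray channels in turn.

```python
from statistics import pvariance


def region_variances(channel, region):
    """Return variances over (whole region, bounding rectangle, central band).

    channel: 2D list of intensities for one channel.
    region:  set of (row, col) pixels forming the candidate region.
    """
    rows = [r for r, _ in region]
    cols = [c for _, c in region]

    # 1) All pixels that belong to the candidate region.
    whole = [channel[r][c] for r, c in region]

    # 2) All pixels inside the minimum rectangle enclosing the region.
    rect = [channel[r][c]
            for r in range(min(rows), max(rows) + 1)
            for c in range(min(cols), max(cols) + 1)]

    # 3) Region pixels on the row through the center of mass
    #    (a one-row band is an assumption of this sketch).
    center_row = round(sum(rows) / len(rows))
    band = [channel[r][c] for r, c in region if r == center_row]

    return pvariance(whole), pvariance(rect), pvariance(band)


# Hypothetical 3x3 gray channel with a plus-shaped candidate region.
gray = [[10, 200, 10],
        [200, 0, 200],
        [10, 200, 10]]
region = {(0, 1), (1, 0), (1, 1), (1, 2), (2, 1)}

var_whole, var_rect, var_band = region_variances(gray, region)
print(var_whole, var_rect, var_band)
```

How the three values are compared to accept or reject a candidate is not specified in the abstract, so the sketch stops at computing them.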

Single View 3D Face Reconstruction

Recent Advances in 3D Imaging, Modeling, and Reconstruction - Advances in Multimedia and Interactive Technologies ◽

10.4018/978-1-5225-5294-9.ch010 ◽

2020 ◽

pp. 215-227

Author(s):

Claudio Ferrari ◽

Stefano Berretti ◽

Alberto del Bimbo

Keyword(s):

3D Structure ◽

Three Dimensional ◽

3D Face Reconstruction ◽

3D Face ◽

Single View ◽

Face Reconstruction ◽

Current State ◽

Rgb Images ◽

Human Faces ◽

Rgb Image

3D face reconstruction from a single 2D image is a fundamental computer vision problem of extraordinary difficulty that dates back to the 1980s. Briefly, it is the task of recovering the three-dimensional geometry of a human face from a single RGB image. While the problem of automatically estimating the 3D structure of a generic scene from RGB images can be regarded as a general task, the particular morphology and non-rigid nature of human faces make it a challenging problem for which dedicated approaches are still being studied. This chapter aims at providing an overview of the problem, its evolution, the current state of the art, and future trends.

Robot Self-modeling of Rotational Symmetric 3D Objects Based on Generic Description of Object Categories

Recent Progress in Robotics: Viable Robotic Service to Human - Lecture Notes in Control and Information Sciences ◽

10.1007/978-3-540-76729-9_22 ◽

2007 ◽

pp. 283-298 ◽

Cited By ~ 2

Author(s):

Joon-Young Park ◽

Kyeong-Keun Baek ◽

Yeon-Chool Park ◽

Sukhan Lee

Keyword(s):

3D Objects ◽

Object Categories ◽

Generic Description

Scale-Invariant Multidirectional License Plate Detection with the Network Combining Indirect and Direct Branches

Sensors ◽

10.3390/s21041074 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1074

Author(s):

Song-Lu Chen ◽

Qi Liu ◽

Jia-Wei Ma ◽

Chun Yang

Keyword(s):

Direct Detection ◽

Direct Methods ◽

Input Image ◽

Indirect Detection ◽

License Plate ◽

Scale Invariant ◽

Refinement Method ◽

In The Wild ◽

License Plate Detection ◽

End To End

Because license plates appear at multiple scales and orientations in natural scene images, their detection is challenging in many applications. In this work, a novel network that combines indirect and direct branches is proposed for license plate detection in the wild. The indirect detection branch performs small-sized vehicle plate detection with high precision in a coarse-to-fine scheme using vehicle–plate relationships. The direct detection branch detects the license plate directly in the input image, reducing the false negatives that missed vehicle detections cause in the indirect detection branch. We propose a universal multidirectional license plate refinement method by localizing the four corners of the license plate. Finally, we construct an end-to-end trainable network for license plate detection by combining these two branches via post-processing operations. The network can effectively detect small-sized license plates and localize multidirectional license plates in real applications. To our knowledge, the proposed method is the first one that combines indirect and direct methods into an end-to-end network for license plate detection. Extensive experiments verify that our method significantly outperforms both the indirect and the direct methods.
