Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild (Extended Abstract)

Author(s):  
Shangzhe Wu ◽  
Christian Rupprecht ◽  
Andrea Vedaldi

We propose a method to learn 3D deformable object categories from raw single-view images, without external supervision. The method is based on an autoencoder that factors each input image into depth, albedo, viewpoint and illumination. In order to disentangle these components without supervision, we use the fact that many object categories have, at least approximately, a symmetric structure. We show that reasoning about illumination allows us to exploit the underlying object symmetry even if the appearance is not symmetric due to shading. Furthermore, we model objects that are probably, but not certainly, symmetric by predicting a symmetry probability map, learned end-to-end with the other components of the model. Our experiments show that this method can accurately recover the 3D shape of human faces, cat faces and cars from single-view images, without any supervision or a prior shape model. Code and demo are available at https://github.com/elliottwu/unsup3d.
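The role of the symmetry probability map can be illustrated with a small sketch. It is not the authors' implementation (that is in the repository linked above). It assumes a per-pixel confidence map sigma that scales a Laplacian negative log-likelihood, and it compares the input image with a reconstruction from the predicted depth and albedo and with a second reconstruction from their horizontally flipped versions. The names laplacian_nll, toy_render, symmetric_reconstruction_loss and the weight lam are hypothetical, and toy_render is a placeholder that ignores depth, viewpoint and illumination.

```python
import numpy as np


def laplacian_nll(recon, target, sigma):
    """Mean negative log-likelihood of a Laplacian with per-pixel scale sigma."""
    abs_err = np.abs(recon - target)
    per_pixel = np.sqrt(2.0) * abs_err / sigma + np.log(np.sqrt(2.0) * sigma)
    return float(per_pixel.mean())


def toy_render(depth, albedo):
    # Placeholder renderer (assumption): returns the albedo unchanged.
    # A real renderer would shade and reproject using depth, light and view.
    return albedo


def symmetric_reconstruction_loss(depth, albedo, target, sigma, sigma_flip,
                                  lam=0.5, render=toy_render):
    """Loss on the direct reconstruction plus a weighted loss on the
    reconstruction obtained from horizontally flipped depth and albedo."""
    recon = render(depth, albedo)
    recon_flip = render(depth[:, ::-1], albedo[:, ::-1])
    direct_term = laplacian_nll(recon, target, sigma)
    flipped_term = laplacian_nll(recon_flip, target, sigma_flip)
    return direct_term + lam * flipped_term
```

Under this formulation, a pixel where the flipped reconstruction disagrees with the input is penalized less when sigma_flip is larger there, which is how a learned confidence map can tolerate regions that are not symmetric.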

Author(s):  
Mehdi Bahri ◽  
Eimear O’Sullivan ◽  
Shunwang Gong ◽  
Feng Liu ◽  
Xiaoming Liu ◽  
...  

Standard registration algorithms need to be independently applied to each surface to register, following careful pre-processing and hand-tuning. Recently, learning-based approaches have emerged that reduce the registration of new scans to running inference with a previously-trained model. The potential benefits are multifold: inference is typically orders of magnitude faster than solving a new instance of a difficult optimization problem, deep learning models can be made robust to noise and corruption, and the trained model may be re-used for other tasks, e.g. through transfer learning. In this paper, we cast the registration task as a surface-to-surface translation problem, and design a model to reliably capture the latent geometric information directly from raw 3D face scans. We introduce Shape-My-Face (SMF), a powerful encoder-decoder architecture based on an improved point cloud encoder, a novel visual attention mechanism, graph convolutional decoders with skip connections, and a specialized mouth model that we smoothly integrate with the mesh convolutions. Compared to the previous state-of-the-art learning algorithms for non-rigid registration of face scans, SMF only requires the raw data to be rigidly aligned (with scaling) with a pre-defined face template. Additionally, our model provides topologically-sound meshes with minimal supervision, offers faster training time, has orders of magnitude fewer trainable parameters, is more robust to noise, and can generalize to previously unseen datasets. We extensively evaluate the quality of our registrations on diverse data. We demonstrate the robustness and generalizability of our model with in-the-wild face scans across different modalities, sensor types, and resolutions. Finally, we show that, by learning to register scans, SMF produces a hybrid linear and non-linear morphable model. Manipulation of the latent space of SMF allows for shape generation, and morphing applications such as expression transfer in-the-wild. We train SMF on a dataset of human faces comprising 9 large-scale databases on commodity hardware.


2021 ◽  
Vol 11 (5) ◽  
pp. 364
Author(s):  
Bingjiang Qiu ◽  
Hylke van der Wel ◽  
Joep Kraeima ◽  
Haye Hendrik Glas ◽  
Jiapan Guo ◽  
...  

Accurate mandible segmentation is important in maxillofacial surgery, where it guides clinical diagnosis and treatment and supports the development of appropriate surgical plans. In particular, cone-beam computed tomography (CBCT) images containing metal parts, such as those used in oral and maxillofacial surgery (OMFS), are often degraded by metal artifacts, showing weak and blurred boundaries caused by the high-attenuation material and the low radiation dose used in image acquisition. To overcome this problem, this paper proposes a novel deep learning-based approach (SASeg) for automated mandible segmentation that perceives overall mandible anatomical knowledge. SASeg utilizes a prior shape feature extractor (PSFE) module based on a mean mandible shape, and recurrent connections to maintain the structural continuity of the mandible. The effectiveness of the proposed network is substantiated on a dental CBCT dataset from orthodontic treatment containing 59 patients. The experiments show that the proposed SASeg can be easily used to improve the prediction accuracy in a dental CBCT dataset corrupted by metal artifacts. In addition, the experimental results on the PDDCA dataset demonstrate that, compared with the state-of-the-art mandible segmentation models, our proposed SASeg can achieve better segmentation performance.


2021 ◽  
Vol 7 (7) ◽  
pp. 112
Author(s):  
Domonkos Varga

The goal of no-reference image quality assessment (NR-IQA) is to evaluate the perceptual quality of digital images without using their distortion-free, pristine counterparts. NR-IQA is an important part of multimedia signal processing since digital images can undergo a wide variety of distortions during storage, compression, and transmission. In this paper, we propose a novel architecture that extracts deep features from the input image at multiple scales to improve the effectiveness of feature extraction for NR-IQA using convolutional neural networks. Specifically, the proposed method extracts deep activations for local patches at multiple scales and maps them onto perceptual quality scores with the help of trained Gaussian process regressors. Extensive experiments demonstrate that the introduced algorithm performs favorably against the state-of-the-art methods on three large benchmark datasets with authentic distortions (LIVE In the Wild, KonIQ-10k, and SPAQ).
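The two-stage idea described above (multi-scale patch features, then Gaussian process regression onto quality scores) can be sketched as follows. This is only an illustration under strong assumptions: simple patch statistics stand in for the deep CNN activations, downsampling is done by striding, and the regressor is a plain RBF-kernel Gaussian process posterior mean written in NumPy. All function names and parameters here are hypothetical and are not taken from the paper.

```python
import numpy as np


def multiscale_patch_features(image, scales=(1, 2), patch=8):
    """Stand-in for deep activations: average patch mean and patch standard
    deviation at each scale. `image` is a 2D grayscale array."""
    feats = []
    for s in scales:
        img = image[::s, ::s]  # crude downsampling by striding (assumption)
        h, w = img.shape
        h, w = h - h % patch, w - w % patch  # crop to a whole number of patches
        blocks = img[:h, :w].reshape(h // patch, patch, w // patch, patch)
        patches = blocks.transpose(0, 2, 1, 3).reshape(-1, patch * patch)
        feats += [patches.mean(axis=1).mean(), patches.std(axis=1).mean()]
    return np.array(feats)


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise=1e-6, length_scale=1.0):
    """Posterior mean of a GP with a constant mean equal to the training mean."""
    k = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_star = rbf_kernel(x_test, x_train, length_scale)
    y_mean = y_train.mean()
    alpha = np.linalg.solve(k, y_train - y_mean)
    return y_mean + k_star @ alpha
```

In use, one would stack multiscale_patch_features(img) for each training image into x_train, pass the subjective quality scores as y_train, and call gp_predict on the features of unseen images. A library regressor with tuned kernel hyperparameters would normally replace the hand-written gp_predict.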


Leonardo ◽  
2020 ◽  
Vol 53 (4) ◽  
pp. 438-441
Author(s):  
Sevinc Eroglu ◽  
Patric Schmitz ◽  
Carlos Aguilera Martinez ◽  
Jana Rusch ◽  
Leif Kobbelt ◽  
...  

The authors present a virtual authoring environment for artistic creation in VR. It enables the effortless conversion of 2D images into volumetric 3D objects. Artistic elements in the input material are extracted with a convenient VR-based segmentation tool. Relief sculpting is then performed by interactively mixing different height maps. These are automatically generated from the input image structure and appearance. A prototype of the tool is showcased in an analog-virtual artistic workflow in collaboration with a traditional painter. It combines the expressiveness of analog painting and sculpting with the creative freedom of spatial arrangement in VR.
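The mixing of height maps can be pictured as a weighted blend. The sketch below is an assumption about how such a blend might look, not a description of the authors' tool: it derives two candidate height maps from a grayscale image (brightness and gradient magnitude, both chosen here purely for illustration) and combines normalized maps with user-chosen weights. The names normalize, candidate_height_maps and mix_height_maps are hypothetical.

```python
import numpy as np


def normalize(height):
    """Rescale a height map to [0, 1]; a constant map becomes all zeros."""
    span = height.max() - height.min()
    if span == 0:
        return np.zeros_like(height)
    return (height - height.min()) / span


def candidate_height_maps(gray):
    """Two illustrative height maps: brightness (appearance) and
    gradient magnitude (structure)."""
    grad_y, grad_x = np.gradient(gray)
    return [gray, np.hypot(grad_x, grad_y)]


def mix_height_maps(height_maps, weights):
    """Blend normalized height maps with weights rescaled to sum to one."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack([normalize(h) for h in height_maps])
    return np.tensordot(weights, stacked, axes=1)
```

An interactive tool would expose the weights as controls, so that changing them updates the relief returned by mix_height_maps(candidate_height_maps(gray), weights).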


Author(s):  
Hussein Ali Alhamzawi

Extraction of facial features is an important step in automatic visual interpretation and recognition of human faces. Among the facial features, the eyes play a major role in the recognition process. In this article, we present an approach to detect and locate the eyes in frontal face images. Eye regions are identified using the technique of voucher detection based on mathematical morphology. After this identification, the variances of three different portions of each candidate eye region are compared: the set of pixels belonging to the candidate region as a whole, the set of pixels contained in the minimum rectangle circumscribing the candidate region, and the set of pixels of the candidate region belonging to a horizontal band that crosses the region's center of mass. The calculation of these variances also considers the R, G, and B channels, as well as the grayscale version of the input image.
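The three variances described above can be computed as in the following sketch. It assumes the candidate region is given as a boolean mask over a single-channel image, and it introduces a band_half_height parameter because the abstract does not state how tall the horizontal band is. The function name and the returned keys are hypothetical.

```python
import numpy as np


def region_variances(channel, mask, band_half_height=1):
    """Variances of one image channel over three pixel sets of a candidate region.

    channel: 2D array holding one channel (R, G, B or grayscale).
    mask: 2D boolean array, True for pixels of the candidate region.
    band_half_height: assumed half-height in rows of the horizontal band.
    """
    ys, xs = np.nonzero(mask)

    # 1. All pixels of the candidate region.
    region = channel[ys, xs]

    # 2. All pixels of the minimum axis-aligned rectangle around the region.
    rect = channel[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    # 3. Region pixels inside a horizontal band through the center of mass.
    center_row = int(round(float(ys.mean())))
    rows = np.arange(channel.shape[0])[:, None]
    in_band = np.abs(rows - center_row) <= band_half_height
    band = channel[mask & in_band]

    return {
        "region": float(region.var()),
        "rectangle": float(rect.var()),
        "band": float(band.var()),
    }
```

Calling the function once per channel (R, G, B and grayscale) yields the full set of values to compare. For a strongly concave region the band may contain no region pixels, in which case the band variance is undefined.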


Author(s):  
Claudio Ferrari ◽  
Stefano Berretti ◽  
Alberto del Bimbo

3D face reconstruction from a single 2D image is a fundamental computer vision problem of extraordinary difficulty that dates back to the 1980s. Briefly, it is the task of recovering the three-dimensional geometry of a human face from a single RGB image. While the problem of automatically estimating the 3D structure of a generic scene from RGB images can be regarded as a general task, the particular morphology and non-rigid nature of human faces make it a challenging problem for which dedicated approaches are still being studied. This chapter aims at providing an overview of the problem, its evolution, the current state of the art, and future trends.


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1074
Author(s):  
Song-Lu Chen ◽  
Qi Liu ◽  
Jia-Wei Ma ◽  
Chun Yang

Because license plates appear at multiple scales and orientations in natural scene images, their detection is challenging in many applications. In this work, a novel network that combines indirect and direct branches is proposed for license plate detection in the wild. The indirect detection branch performs small-sized vehicle plate detection with high precision in a coarse-to-fine scheme using vehicle–plate relationships. The direct detection branch detects the license plate directly in the input image, reducing the false negatives that the indirect detection branch produces when vehicles are missed. We propose a universal multidirectional license plate refinement method by localizing the four corners of the license plate. Finally, we construct an end-to-end trainable network for license plate detection by combining these two branches via post-processing operations. The network can effectively detect small-sized license plates and localize multidirectional license plates in real applications. To our knowledge, the proposed method is the first one that combines indirect and direct methods into an end-to-end network for license plate detection. Extensive experiments verify that our method significantly outperforms both the indirect and the direct methods.

