Diverse Motion Stylization for Multiple Style Domains via Spatial-Temporal Graph-Based Generative Model

Author(s):  
Soomin Park ◽  
Deok-Kyeong Jang ◽  
Sung-Hee Lee

This paper presents a novel deep learning-based framework for translating a motion into various styles within multiple domains. Our framework is a single set of generative adversarial networks that learns stylistic features from a collection of unpaired motion clips with style labels, supporting mappings between multiple style domains. We model a motion sequence as a spatio-temporal graph and employ spatial-temporal graph convolutional networks (ST-GCN) to extract stylistic properties along both the spatial and temporal dimensions. Through this spatial-temporal modeling, our framework produces improved style translation results between significantly different actions and on long motion sequences containing multiple actions. In addition, we develop, for the first time, a mapping network for motion stylization that maps random noise to styles, allowing diverse stylization results to be generated without reference motions. Through various experiments, we demonstrate the ability of our method to generate improved results in terms of visual quality, stylistic diversity, and content preservation.
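
To make the noise-to-style idea concrete, below is a minimal PyTorch sketch of a multi-domain mapping network in the spirit the abstract describes; the layer sizes, the per-domain output heads, and all hyperparameters are our own illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class StyleMappingNetwork(nn.Module):
    """Maps a random noise vector z to a style code for a chosen style domain.

    A sketch only: depth, widths, and per-domain heads are assumptions.
    """
    def __init__(self, z_dim=16, style_dim=64, num_domains=3, hidden=512):
        super().__init__()
        layers = [nn.Linear(z_dim, hidden), nn.ReLU()]
        for _ in range(3):
            layers += [nn.Linear(hidden, hidden), nn.ReLU()]
        self.shared = nn.Sequential(*layers)
        # one output head per style domain
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, style_dim) for _ in range(num_domains)])

    def forward(self, z, domain_idx):
        h = self.shared(z)
        # compute all domain codes, then select the requested domain's code
        codes = torch.stack([head(h) for head in self.heads], dim=1)
        idx = domain_idx.view(-1, 1, 1).expand(-1, 1, codes.size(-1))
        return codes.gather(1, idx).squeeze(1)

# two noise draws for the same domain -> two diverse style codes
net = StyleMappingNetwork()
z1, z2 = torch.randn(1, 16), torch.randn(1, 16)
domain = torch.tensor([1])
s1, s2 = net(z1, domain), net(z2, domain)
```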

2022 ◽  
Vol 8 ◽  
Author(s):  
Runnan He ◽  
Shiqi Xu ◽  
Yashu Liu ◽  
Qince Li ◽  
Yang Liu ◽  
...  

Medical imaging provides a powerful tool for medical diagnosis. In computer-aided diagnosis and treatment of liver cancer based on medical imaging, accurate segmentation of the liver region from abdominal CT images is an important step. However, due to defects of liver tissue and limitations of the CT imaging process, the gray level of the liver region in CT images is heterogeneous, and the boundaries between the liver and adjacent tissues and organs are blurred, which makes liver segmentation extremely difficult. In this study, aiming to solve the low segmentation accuracy of the original 3D U-Net network, an improved network based on the three-dimensional (3D) U-Net is proposed. Moreover, to address the problem of insufficient training data caused by the difficulty of acquiring labeled 3D data, the improved 3D U-Net network is embedded into the framework of generative adversarial networks (GAN), which establishes a semi-supervised 3D liver segmentation optimization algorithm. Finally, considering the poor quality of 3D abdominal fake images generated from random noise inputs, a deep convolutional neural network (DCNN) based on a feature-restoration method is designed to generate more realistic fake images. Testing the proposed algorithm on the LiTS-2017 and KiTS19 datasets shows that the proposed semi-supervised 3D liver segmentation method greatly improves liver segmentation performance, achieving a Dice score of 0.9424 and outperforming other methods.
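
The semi-supervised coupling of a segmentation network with a GAN discriminator can be sketched as below; this is an illustrative PyTorch toy (tiny stand-in networks, made-up shapes and loss weights), not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySeg3D(nn.Module):          # tiny stand-in for the improved 3D U-Net
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv3d(8, 1, 3, padding=1))
    def forward(self, x):
        return torch.sigmoid(self.net(x))

class SegDiscriminator(nn.Module):   # judges whether a mask looks "real"
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 8, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(8, 1, 4, stride=2, padding=1))
    def forward(self, m):
        return self.net(m).mean(dim=[1, 2, 3, 4])  # one score per volume

S, D = TinySeg3D(), SegDiscriminator()
opt_s = torch.optim.Adam(S.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

labeled_ct = torch.randn(2, 1, 16, 16, 16)
gt_mask = torch.rand(2, 1, 16, 16, 16).round()
unlabeled_ct = torch.randn(2, 1, 16, 16, 16)

# discriminator step: ground-truth masks vs predicted masks on unlabeled data
pred_u = S(unlabeled_ct).detach()
loss_d = F.binary_cross_entropy_with_logits(D(gt_mask), torch.ones(2)) + \
         F.binary_cross_entropy_with_logits(D(pred_u), torch.zeros(2))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# segmentation step: supervised loss on labeled data + adversarial term
opt_s.zero_grad()
loss_sup = F.binary_cross_entropy(S(labeled_ct), gt_mask)
loss_adv = F.binary_cross_entropy_with_logits(D(S(unlabeled_ct)), torch.ones(2))
(loss_sup + 0.1 * loss_adv).backward()   # 0.1 is an assumed loss weight
opt_s.step()
```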


2021 ◽  
Vol 12 (6) ◽  
pp. 1-20
Author(s):  
Fayaz Ali Dharejo ◽  
Farah Deeba ◽  
Yuanchun Zhou ◽  
Bhagwan Das ◽  
Munsif Ali Jatoi ◽  
...  

Single image super-resolution (SISR) produces a high-resolution image with fine spatial detail from a remotely sensed image with low spatial resolution. Recently, deep learning and generative adversarial networks (GANs) have made breakthroughs in this challenging task. However, the generated images still suffer from undesirable artifacts such as missing texture-feature representation and high-frequency information. We propose TWIST-GAN, a frequency-domain spatio-temporal remote sensing SISR technique that reconstructs the high-resolution (HR) image with generative adversarial networks operating on various frequency bands. Our method combines wavelet transform (WT) characteristics with a transferred generative adversarial network: the LR image is split into frequency bands using the WT, the transferred GAN predicts the high-frequency components via the proposed architecture, and the inverse wavelet transform finally produces the reconstructed super-resolved image. The model is first trained on the external DIV2K dataset and validated on the UC Merced Landsat remote sensing dataset and Set14, with each image of size 256 × 256. Transferred GANs are then used to process spatio-temporal remote sensing images to minimize differences in computation cost and improve texture information. The findings are compared qualitatively and quantitatively with current state-of-the-art approaches. In addition, by eliminating batch normalization layers, our simplified version saves about 43% of GPU memory during training and executes faster.
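
The wavelet split/predict/merge pipeline can be illustrated with PyWavelets; the sketch below uses a placeholder identity function where TWIST-GAN's transferred GAN would predict enhanced high-frequency coefficients.

```python
import numpy as np
import pywt  # PyWavelets

lr = np.random.rand(128, 128).astype(np.float32)  # stand-in LR image

# forward DWT: cA is the low-frequency band, (cH, cV, cD) are high-frequency
cA, (cH, cV, cD) = pywt.dwt2(lr, 'haar')

def predict_high_freq(band):
    # placeholder for the transferred GAN that predicts high-frequency detail;
    # identity here purely for illustration
    return band

cH2, cV2, cD2 = map(predict_high_freq, (cH, cV, cD))

# inverse DWT merges the (predicted) bands back into an image
sr = pywt.idwt2((cA, (cH2, cV2, cD2)), 'haar')
assert sr.shape == lr.shape
```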


2020 ◽  
Author(s):  
Numan Celik ◽  
Sam T. M. Ball ◽  
Elaheh Sayari ◽  
Lina Abdul Kadir ◽  
Fiona O’Brien ◽  
...  

Understanding and accurately quantifying ion channel gating in real time is vital for knowledge of cell membrane behaviour, drug discovery, and toxicity screening. Doing this with single-molecule resolution first requires detecting individual protein pore opening and closing transitions and constructing a so-called idealised record, which indicates sample-point by sample-point whether a given molecule is open or closed. Creating this record can be difficult, since patch-clamp electrophysiology data can be noisy or contain multiple ion channel molecules. We have recently developed a deep learning model called Deep-Channel to achieve this, but further development is limited by the massive datasets needed to train and validate models. In the past, this problem has been tackled by simulating single-molecule activity from Markov models with added pseudo-random noise. In the present report we develop a new method to synthesise raw data based on generative adversarial networks (GANs). The limitation to directly applying a GAN has been that, while there are methods to generate classified output image by image, there has been no method to generate an entire timeseries with a parallel idealisation, sample-point by sample-point. In this paper, we overcome this problem with DeepGANnel, a model that splits raw training data and the parallel idealised data into different rows of image windows and passes these data through a progressive GAN. This new methodology allows generation of realistic single-molecule patch-clamp data synchronised with its idealisation, without the biases inherent in pseudo-random simulation methods. The method will be useful for developing single-molecule analysis methods and may in future prove useful for generating biological models that include single-molecule-resolution stochastic data. The model is easily extendable to other timeseries data requiring parallel labelling, such as labelled ECG.
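
The row-stacking layout can be sketched as follows; the window length and the stand-in traces are illustrative assumptions.

```python
import numpy as np

n, win = 100_000, 256
raw = np.random.randn(n).astype(np.float32)           # stand-in raw current trace
ideal = (np.random.rand(n) > 0.5).astype(np.float32)  # stand-in 0/1 idealisation

windows = []
for start in range(0, n - win + 1, win):
    seg_raw = raw[start:start + win]
    seg_ideal = ideal[start:start + win]
    # row 0: raw samples; row 1: parallel idealisation, sample-point aligned
    windows.append(np.stack([seg_raw, seg_ideal], axis=0))

batch = np.stack(windows)  # shape: (num_windows, 2, win), ready for the GAN
print(batch.shape)         # (390, 2, 256)
```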


2019 ◽  
Author(s):  
Atin Sakkeer Hussain

Generative adversarial networks (GANs) are trained to generate images from random noise vectors, but these images often turn out poorly for any of several reasons, such as mode collapse, lack of proper training data, or insufficient training. To combat this issue, this paper makes use of a variational autoencoder (VAE). The VAE is trained on a combination of the training and generated data; afterwards, it can be used to map images generated by the GAN to better versions of them (similar to denoising, but with small variations in the image). In addition to improving quality, the proposed model is shown to work better than standard WGANs on sparse datasets with high variety, given an equal number of training epochs.
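
A minimal sketch of the refinement pass, assuming a small convolutional VAE; the architecture below is illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class RefineVAE(nn.Module):
    """Toy VAE: at inference, a GAN sample is encoded and decoded to 'clean' it."""
    def __init__(self, latent=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 16, 4, 2, 1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 4, 2, 1), nn.ReLU(),  # 32 -> 16
            nn.Flatten())
        self.mu = nn.Linear(32 * 16 * 16, latent)
        self.logvar = nn.Linear(32 * 16 * 16, latent)
        self.dec = nn.Sequential(
            nn.Linear(latent, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, 2, 1), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

vae = RefineVAE()                        # train on real + GAN-generated images
gan_sample = torch.rand(1, 3, 64, 64)    # stand-in for a (poor) GAN output
refined, _, _ = vae(gan_sample)          # reconstruction is the improved image
```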


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Bing Li ◽  
Yong Xian ◽  
Juan Su ◽  
Da Q. Zhang ◽  
Wei L. Guo

The making of infrared templates is of great significance for improving the accuracy and precision of infrared imaging guidance. However, collecting infrared images in the field is difficult, costly, and time-consuming. To address this problem, an infrared image generation method, infrared generative adversarial networks (I-GANs), based on the conditional generative adversarial network (CGAN) architecture is proposed. In I-GANs, visible images instead of random noise are used as the inputs, and the D-LinkNet network is utilized to build the generative model, enabling improved learning of rich image textures and identification of dependencies between images. Moreover, the PatchGAN architecture is employed to build the discriminant model, processing the high-frequency components of the images effectively while reducing the amount of computation required. In addition, batch normalization is used to optimize the training process, thereby alleviating the instability and mode collapse of generative adversarial network training. Finally, experimental verification is conducted on the produced infrared/visible-light dataset (IVFG). The experimental results reveal that high-quality and reliable infrared data are generated by the proposed I-GANs.
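
A PatchGAN-style conditional discriminator of the kind I-GANs employs can be sketched as below; channel counts follow the common 70 × 70 PatchGAN and are assumptions with respect to this paper.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Outputs a grid of real/fake logits, one per local patch, which focuses
    the model on high-frequency texture rather than global structure."""
    def __init__(self, in_ch=3 + 1):  # visible (3ch) + infrared (1ch) input
        super().__init__()
        def block(ci, co, stride):
            return [nn.Conv2d(ci, co, 4, stride, 1),
                    nn.BatchNorm2d(co), nn.LeakyReLU(0.2)]
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            *block(64, 128, 2), *block(128, 256, 2), *block(256, 512, 1),
            nn.Conv2d(512, 1, 4, 1, 1))  # one logit per receptive-field patch

    def forward(self, visible, infrared):
        # conditional GAN: the discriminator sees input and output together
        return self.net(torch.cat([visible, infrared], dim=1))

d = PatchDiscriminator()
scores = d(torch.randn(1, 3, 256, 256), torch.randn(1, 1, 256, 256))
print(scores.shape)  # torch.Size([1, 1, 30, 30]): a grid of patch scores
```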


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Richa Sharma ◽  
Manoj Sharma ◽  
Ankit Shukla ◽  
Santanu Chaudhury

Generation of synthetic data is a challenging task. There are only a few significant works on RGB video generation and no pertinent works on RGB-D data generation. In the present work, we focus on synthesizing RGB-D data, which can further be used as a dataset for various applications such as object tracking, gesture recognition, and action recognition. This paper proposes a novel architecture that uses conditional deep 3D-convolutional generative adversarial networks to synthesize RGB-D data, exploiting a 3D spatio-temporal convolutional framework. The proposed architecture can be used to generate virtually unlimited data. In this work, we present the architecture for generating RGB-D data conditioned on class labels. The architecture uses two parallel paths, one to generate the RGB data and the other to synthesize the depth map; the outputs of the two paths are combined to produce RGB-D data. The proposed model generates video at 30 fps (frames per second), where each frame is an RGB-D frame with a spatial resolution of 512 × 512.
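
The two-path generator can be sketched in PyTorch as below, with the spatial size scaled down from 512 × 512 for brevity; all layer shapes and the conditioning scheme are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RGBDGenerator(nn.Module):
    """Two parallel transposed-3D-conv paths: one for RGB, one for depth;
    outputs are concatenated channel-wise into RGB-D video."""
    def __init__(self, z_dim=100, num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(num_classes, z_dim)  # class conditioning
        def path(out_ch):
            return nn.Sequential(
                nn.ConvTranspose3d(z_dim, 128, (4, 4, 4)), nn.ReLU(),
                nn.ConvTranspose3d(128, 64, 4, 2, 1), nn.ReLU(),
                nn.ConvTranspose3d(64, 32, 4, 2, 1), nn.ReLU(),
                nn.ConvTranspose3d(32, out_ch, (3, 4, 4), (1, 2, 2), 1),
                nn.Tanh())
        self.rgb_path = path(3)    # parallel path 1: RGB frames
        self.depth_path = path(1)  # parallel path 2: depth frames

    def forward(self, z, labels):
        h = (z * self.embed(labels)).view(z.size(0), -1, 1, 1, 1)
        rgb, depth = self.rgb_path(h), self.depth_path(h)
        return torch.cat([rgb, depth], dim=1)  # RGB-D: 4 channels per frame

g = RGBDGenerator()
video = g(torch.randn(2, 100), torch.tensor([3, 7]))
print(video.shape)  # torch.Size([2, 4, 16, 32, 32])
```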


Author(s):  
Chaudhary Sarimurrab ◽  
Ankita Kesari Naman ◽  
Sudha Narang

Generative models have gained considerable attention in the field of unsupervised learning via a new and practical framework called generative adversarial networks (GANs), owing to their outstanding data generation capability. Many GAN models have been proposed, and several practical applications have emerged in various domains of computer vision and machine learning. Despite GANs' excellent success, there are still obstacles to stable training. In this work, we aim to generate human faces from unlabelled data with the help of deep convolutional generative adversarial networks. The applications of face generation are vast in image processing, entertainment, and other such industries. Our resulting model successfully generates human faces from the given unlabelled data and random noise.
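
A minimal sketch of a standard DCGAN generator mapping noise to 64 × 64 images; filter counts follow the original DCGAN paper and are assumptions with respect to this work.

```python
import torch
import torch.nn as nn

class DCGANGenerator(nn.Module):
    """Upsamples a noise vector to a 3-channel 64x64 image via transposed convs."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 512, 4, 1, 0), nn.BatchNorm2d(512), nn.ReLU(),
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

g = DCGANGenerator()
faces = g(torch.randn(8, 100))  # eight images from eight noise vectors
print(faces.shape)              # torch.Size([8, 3, 64, 64])
```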


2022 ◽  
Vol 13 (2) ◽  
pp. 1-23
Author(s):  
Divya Saxena ◽  
Jiannong Cao

Spatio-temporal (ST) data is a collection of multiple time series from different spatial locations and is inherently stochastic and unpredictable. Accurate prediction over such data is an important building block for several urban applications, such as taxi demand prediction and traffic flow prediction. Existing deep learning based approaches assume that the outcome is deterministic and that there is only one plausible future; they therefore cannot capture the multimodal nature of future contents and dynamics. In addition, existing approaches learn spatial and temporal data separately, as they assume a weak correlation between them. To handle these issues, in this article we propose a stochastic spatio-temporal generative model (named D-GAN) which adopts a generative adversarial networks (GANs)-based structure for more accurate ST prediction over multiple time steps. D-GAN consists of two components: (1) a spatio-temporal correlation network which models the spatio-temporal joint distribution of pixels and supports stochastic sampling of latent variables for multiple plausible futures; (2) a stochastic adversarial network to jointly learn generation and variational inference of data through implicit distribution modeling. D-GAN also supports fusion of external factors through an explicit objective to improve model learning. Extensive experiments performed on two real-world datasets show that D-GAN achieves significant improvements and outperforms baseline models.
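
The stochastic-sampling idea, where different latent draws give different plausible futures for the same observation, can be sketched as below; the toy encoder/decoder is purely illustrative and is not the authors' D-GAN.

```python
import torch
import torch.nn as nn

class StochasticSTPredictor(nn.Module):
    """Conditions on observed frames plus a sampled latent z; varying z yields
    multiple plausible futures instead of one deterministic outcome."""
    def __init__(self, z_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(  # encodes 4 observed frames (as channels)
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.z_proj = nn.Linear(z_dim, 32)
        self.decoder = nn.Conv2d(32, 4, 3, padding=1)  # predicts next 4 frames

    def forward(self, obs, z):
        h = self.encoder(obs)
        h = h + self.z_proj(z).view(z.size(0), -1, 1, 1)  # inject stochasticity
        return self.decoder(h)

model = StochasticSTPredictor()
obs = torch.randn(1, 4, 32, 32)  # 4 observed frames on a 32x32 spatial grid
futures = [model(obs, torch.randn(1, 16)) for _ in range(3)]
# three latent draws -> three distinct plausible futures for the same input
```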


2020 ◽  
Vol 10 (1) ◽  
pp. 370 ◽  
Author(s):  
Chih-Chung Hsu ◽  
Yi-Xiu Zhuang ◽  
Chia-Yen Lee

Generative adversarial networks (GANs) can generate a photo-realistic image from a low-dimensional random noise vector. Such a synthesized (fake) image with inappropriate content can be spread on social media networks, which can cause severe problems. To successfully detect fake images, an effective and efficient image forgery detector is necessary. However, conventional image forgery detectors fail to recognize fake images generated by GAN-based generators, since these images are generated and manipulated from a source image. Therefore, in this paper we propose a deep learning-based approach for detecting fake images using a contrastive loss. First, several state-of-the-art GANs are employed to generate fake–real image pairs. Next, a reduced DenseNet is developed into a two-stream network structure that accepts pairwise information as input. The proposed common fake feature network is then trained with pairwise learning to distinguish the features of fake images from those of real ones. Finally, a classification layer is concatenated to the common fake feature network to decide whether the input image is fake or real. The experimental results demonstrate that the proposed method significantly outperforms other state-of-the-art fake image detectors.
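
The pairwise contrastive loss at the heart of this training scheme can be sketched as below; the margin value and the random stand-in features are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(feat_a, feat_b, same_label, margin=1.0):
    """same_label = 1 if the pair shares a label (e.g. both real), else 0.
    Pairs with the same label are pulled together; mismatched pairs (real vs
    fake) are pushed apart beyond the margin."""
    dist = F.pairwise_distance(feat_a, feat_b)
    pos = same_label * dist.pow(2)                         # pull together
    neg = (1 - same_label) * F.relu(margin - dist).pow(2)  # push apart
    return (pos + neg).mean()

# stand-ins for features from the two streams of the reduced DenseNet
f_real = torch.randn(4, 128, requires_grad=True)
f_fake = torch.randn(4, 128)
labels = torch.zeros(4)  # real-vs-fake pairs -> dissimilar
loss = contrastive_loss(f_real, f_fake, labels)
loss.backward()
```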

