Fractional Wavelet-Based Generative Scattering Networks

Frontiers in Neurorobotics ◽

10.3389/fnbot.2021.752752 ◽

2021 ◽

Vol 15 ◽

Author(s):

Jiasong Wu ◽

Xiang Qiu ◽

Jing Zhang ◽

Fuzhi Wu ◽

Youyong Kong ◽

...

Keyword(s):

Dimensionality Reduction ◽

Reduction Method ◽

Gaussian White Noise ◽

Principal Component ◽

Experimental Results ◽

Generative Adversarial Networks ◽

Image Generation ◽

Adversarial Networks ◽

Dimensionality Reduction Method

Generative adversarial networks (GANs) and variational autoencoders (VAEs) provide impressive image generation from Gaussian white noise, but both are difficult to train, since they need a generator (or encoder) and a discriminator (or decoder) to be trained simultaneously, which can easily lead to unstable training. To solve or alleviate these synchronous training problems of GANs and VAEs, researchers recently proposed generative scattering networks (GSNs), which use wavelet scattering networks (ScatNets) as the encoder to obtain features (or ScatNet embeddings) and convolutional neural networks (CNNs) as the decoder to generate an image. The advantage of GSNs is that the parameters of ScatNets do not need to be learned, while the disadvantage is that the representational ability of ScatNets is slightly weaker than that of CNNs. In addition, the dimensionality reduction method of principal component analysis (PCA) can easily lead to overfitting in the training of GSNs and, therefore, affect the quality of generated images in the testing process. To further improve the quality of generated images while keeping the advantages of GSNs, this study proposes generative fractional scattering networks (GFRSNs), which use more expressive fractional wavelet scattering networks (FrScatNets) instead of ScatNets as the encoder to obtain features (or FrScatNet embeddings) and use CNNs similar to those of GSNs as the decoder to generate an image. Additionally, this study develops a new dimensionality reduction method named feature-map fusion (FMF) instead of performing PCA to better retain the information of FrScatNets; it also discusses the effect of image fusion on the quality of the generated image. The experimental results obtained on the CIFAR-10 and CelebA datasets show that the proposed GFRSNs can lead to better generated images than the original GSNs on testing datasets. The experimental results of the proposed GFRSNs with deep convolutional GAN (DCGAN), progressive GAN (PGAN), and CycleGAN are also given.
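
For readers unfamiliar with the PCA step that the abstract contrasts with feature-map fusion, the following is a minimal sketch of PCA-style dimensionality reduction on toy embeddings, using power iteration to estimate the first principal component. It is not the paper's FMF method or its GSN pipeline; the function names and the toy data are assumptions made for illustration.

```python
import math
import random

def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def first_principal_component(rows, iters=200, seed=0):
    """Estimate the leading eigenvector of the sample covariance by power iteration."""
    x = center(rows)
    d = len(x[0])
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        # Apply X^T (X v) without forming the covariance matrix explicitly.
        proj = [sum(row[j] * v[j] for j in range(d)) for row in x]
        w = [sum(proj[i] * x[i][j] for i in range(len(x))) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def project(rows, v):
    """Reduce each centered row to a single coordinate along v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in center(rows)]

# Toy 2-D "embeddings" lying close to the line y = 2x (illustrative data only).
embeddings = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
pc1 = first_principal_component(embeddings)
codes = project(embeddings, pc1)
```

Here `codes` holds one number per embedding instead of two, which is the kind of compression PCA performs on the much larger ScatNet embeddings.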


Reconstruction of Generative Adversarial Networks in Cross Modal Image Generation with Canonical Polyadic Decomposition

Wireless Communications and Mobile Computing ◽

10.1155/2021/8868781 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Ruixin Ma ◽

Junying Lou ◽

Peng Li ◽

Jing Gao

Keyword(s):

Mobile Terminal ◽

Experimental Results ◽

Generative Adversarial Networks ◽

Image Generation ◽

Adversarial Networks ◽

Speed Up ◽

Canonical Polyadic Decomposition

Generating pictures from text is an interesting, classic, and challenging task. Benefiting from the development of generative adversarial networks (GANs), the generation quality of this task has been greatly improved. Many excellent cross modal GAN models have been put forward. These models add extensive layers and constraints to produce impressive generated pictures. However, the complexity and computation of existing cross modal GANs are too high for deployment on mobile terminals. To solve this problem, this paper designs a compact cross modal GAN based on canonical polyadic decomposition. We replace an original convolution layer with three small convolution layers and use an autoencoder to stabilize and speed up training. The experimental results show that our model achieves 20% times of compression in both parameters and FLOPs without loss of quality on generated images.
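
To give a feel for why replacing one convolution with three small ones saves parameters, here is a rough parameter-count sketch. The three-layer shape used (a 1x1 convolution, a depthwise k x k convolution, and another 1x1 convolution) and the rank value are assumptions for illustration; the abstract does not state the exact layer shapes the authors use.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def factorized_params(c_in, c_out, k, rank):
    """Weights in a hypothetical three-layer replacement (bias ignored):
    1x1 conv (c_in -> rank), depthwise k x k conv on rank channels,
    1x1 conv (rank -> c_out)."""
    return c_in * rank + rank * k * k + rank * c_out

# Illustrative layer sizes, not taken from the paper.
full = conv_params(256, 256, 3)
small = factorized_params(256, 256, 3, rank=64)
ratio = full / small
```

With these illustrative numbers the factorized layers hold 33,344 weights against 589,824 for the original layer. The achievable ratio depends strongly on the chosen rank.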


A model for spectroscopic food sample analysis using data sonification

International Journal of Speech Technology ◽

10.1007/s10772-020-09794-9 ◽

2021 ◽

Author(s):

Hsein Kew

Keyword(s):

Dimensionality Reduction ◽

Classification Accuracy ◽

Reduction Method ◽

Real Life ◽

Principal Component ◽

Relevant Information ◽

Analysis Model ◽

Linear Discriminant ◽

Audio Output ◽

Dimensionality Reduction Method

In this paper, we propose a method to generate an audio output based on spectroscopy data in order to discriminate two classes of data, based on the features of our spectral dataset. To do this, we first perform spectral pre-processing, and then extract features, followed by machine learning, for dimensionality reduction. The features are then mapped to the parameters of a sound synthesiser, as part of the audio processing, so as to generate audio samples in order to compute statistical results and identify important descriptors for the classification of the dataset. To optimise the process, we compare Amplitude Modulation (AM) and Frequency Modulation (FM) synthesis, as applied to two real-life datasets, to evaluate the performance of sonification as a method for discriminating data. FM synthesis provides a higher subjective classification accuracy compared with AM synthesis. We then further compare the dimensionality reduction methods of Principal Component Analysis (PCA) and Linear Discriminant Analysis in order to optimise our sonification algorithm. The classification results using FM synthesis as the sound synthesiser and PCA as the dimensionality reduction method yield mean classification accuracies of 93.81% and 88.57% for the coffee dataset and the fruit puree dataset respectively, and indicate that this spectroscopic analysis model is able to provide relevant information on the spectral data and, most importantly, is able to discriminate accurately between the two spectra and thus provides a complementary tool to supplement current methods.
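
The idea of mapping a feature to synthesiser parameters can be sketched with a basic FM tone generator. This is a minimal illustration only: the linear feature-to-frequency mapping, the 220 to 880 Hz range, and the function names are assumptions, not the mapping used in the paper.

```python
import math

def fm_tone(carrier_hz, modulator_hz, index, duration_s=0.5, sample_rate=8000):
    """Return samples of sin(2*pi*fc*t + index * sin(2*pi*fm*t))."""
    n = int(duration_s * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate
        phase = 2 * math.pi * carrier_hz * t
        phase += index * math.sin(2 * math.pi * modulator_hz * t)
        samples.append(math.sin(phase))
    return samples

def feature_to_carrier(value, lo, hi):
    """Map a feature value in [lo, hi] linearly onto 220-880 Hz (hypothetical mapping)."""
    frac = (value - lo) / (hi - lo)
    frac = min(1.0, max(0.0, frac))  # clamp out-of-range features
    return 220.0 + frac * (880.0 - 220.0)

carrier = feature_to_carrier(0.25, 0.0, 1.0)
samples = fm_tone(carrier, modulator_hz=110.0, index=2.0)
```

Under this mapping, two samples whose reduced features differ produce tones with different carrier frequencies, which a listener can then try to tell apart.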


Utilizing Amari-Alpha Divergence to Stabilize the Training of Generative Adversarial Networks

Entropy ◽

10.3390/e22040410 ◽

2020 ◽

Vol 22 (4) ◽

pp. 410 ◽

Cited By ~ 2

Author(s):

Likun Cai ◽

Yanjie Chen ◽

Ning Cai ◽

Wei Cheng ◽

Hao Wang

Keyword(s):

State Of The Art ◽

Generative Adversarial Networks ◽

Image Generation ◽

Significant Progress ◽

Trade Off ◽

Adversarial Networks ◽

Leibler Divergence ◽

The Stability ◽

Hellinger Divergence

Generative Adversarial Nets (GANs) are one of the most popular architectures for image generation and have achieved significant progress in generating high-resolution, diverse image samples. The normal GANs are supposed to minimize the Kullback–Leibler divergence between distributions of natural and generated images. In this paper, we propose the Alpha-divergence Generative Adversarial Net (Alpha-GAN), which adopts the alpha divergence as the minimization objective function of generators. The alpha divergence can be regarded as a generalization of the Kullback–Leibler divergence, Pearson χ² divergence, Hellinger divergence, etc. Our Alpha-GAN employs the power function as the form of adversarial loss for the discriminator with two-order indexes. These hyper-parameters make our model more flexible to trade off between the generated and target distributions. We further give a theoretical analysis of how to select these hyper-parameters to balance the training stability and the quality of generated images. Extensive experiments of Alpha-GAN are performed on SVHN and CelebA datasets, and evaluation results show the stability of Alpha-GAN. The generated samples are also competitive compared with the state-of-the-art approaches.
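
The claim that the alpha divergence generalizes other divergences can be checked numerically on small discrete distributions. The sketch below uses one common parameterization, D_alpha(p||q) = (sum_i p_i^alpha q_i^(1-alpha) - 1) / (alpha (alpha - 1)); the paper may use a different parameterization, and this is not the Alpha-GAN loss itself. The distributions `p` and `q` are illustrative.

```python
import math

def alpha_divergence(p, q, alpha):
    """Alpha divergence between discrete distributions, for alpha not in {0, 1}."""
    if alpha in (0, 1):
        raise ValueError("alpha = 0 and alpha = 1 are defined only as KL limits")
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))
    return (s - 1.0) / (alpha * (alpha - 1.0))

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def pearson_chi2(p, q):
    """Pearson chi-squared divergence."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
near_kl = alpha_divergence(p, q, 1.0 + 1e-6)   # approaches KL(p || q) as alpha -> 1
half_chi2 = alpha_divergence(p, q, 2.0)        # equals half the Pearson chi-squared
```

In this parameterization, alpha = 0.5 gives a multiple of the squared Hellinger distance, which is why the family is described as covering all three divergences.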


JPEG Steganalysis Based on Locality Preserving Projection Dimensionality Reduction Method

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.411-414.1185 ◽

2013 ◽

Vol 411-414 ◽

pp. 1185-1188 ◽

Cited By ~ 1

Author(s):

Ting Ting Zhu ◽

Li Na Wang ◽

Yu Fu ◽

Yan Zhen Ren

Keyword(s):

Dimensionality Reduction ◽

Reduction Method ◽

Experimental Results ◽

Generalization Capability ◽

Locality Preserving Projection ◽

Locality Preserving ◽

Dimensionality Reduction Method

In this paper, a JPEG steganalysis algorithm based on the locality preserving projection (LPP) dimensionality reduction method is proposed for detecting unseen stego algorithms. The co-occurrence features are extracted from the DCT-DWT domain and their dimension is reduced using the LPP method. To improve the generalization capability of the algorithm, SVDD is used as the classifier. Experimental results show that our scheme has better generalization capability and is more effective than others.


Content-Based Attention Network for Person Image Generation

Journal of Circuits System and Computers ◽

10.1142/s0218126620502503 ◽

2020 ◽

Vol 29 (15) ◽

pp. 2050250

Author(s):

Xiongfei Liu ◽

Bengao Li ◽

Xin Chen ◽

Haiyan Zhang ◽

Shu Zhan

Keyword(s):

Major Part ◽

State Of The Art ◽

Attention Mechanism ◽

Experimental Results ◽

Generative Adversarial Networks ◽

Image Generation ◽

Attention Network ◽

Adversarial Networks ◽

Proposed Model ◽

Novel Method

This paper proposes a novel method for person image generation with arbitrary target pose. Given a person image and an arbitrary target pose, our proposed model can synthesize images with the same person but different poses. Generative Adversarial Networks (GANs) are the major part of the proposed model. Different from traditional GANs, we add an attention mechanism to the generator in order to generate realistic-looking images. We also use content reconstruction with a pretrained VGG16 Net to keep the content consistent between generated images and target images. Furthermore, we test our model on the DeepFashion and Market-1501 datasets. The experimental results show that the proposed network performs favorably against state-of-the-art methods.


Least squares regression principal component analysis: A supervised dimensionality reduction method

Numerical Linear Algebra with Applications ◽

10.1002/nla.2411 ◽

2021 ◽

Author(s):

Hector Pascual ◽

Xin C. Yee

Keyword(s):

Principal Component Analysis ◽

Dimensionality Reduction ◽

Least Squares ◽

Reduction Method ◽

Principal Component ◽

Component Analysis ◽

Least Squares Regression ◽

Dimensionality Reduction Method


Inhibition of Occluded Facial Regions for Distance-Based Face Recognition

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/746 ◽

2018 ◽

Cited By ~ 1

Author(s):

Daniel López Sánchez ◽

Juan M. Corchado ◽

Angelica González Arrieta

Keyword(s):

Face Recognition ◽

Dimensionality Reduction ◽

Reduction Method ◽

Computational Cost ◽

Experimental Results ◽

Classification Method ◽

Partial Occlusion ◽

Classical Distance ◽

Dimensionality Reduction Method ◽

Dissimilarity Function

This work focuses on the design and validation of a case-based reasoning (CBR) system for efficient face recognition under partial occlusion conditions. The proposed CBR system is based on a classical distance-based classification method, modified to increase its robustness to partial occlusion. This is achieved by using a novel dissimilarity function which discards features coming from occluded facial regions. In addition, we explore the integration of an efficient dimensionality reduction method into the proposed framework to reduce computational cost. We present experimental results showing that the proposed CBR system outperforms classical methods of similar computational requirements in the task of face recognition under partial occlusion.
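
The general idea of a dissimilarity function that discards occluded features can be sketched as a masked distance inside a nearest-neighbour classifier. This is an assumption-laden illustration, not the paper's dissimilarity function: the visibility mask is supplied by hand here, and the feature vectors and labels are made up.

```python
import math

def masked_distance(a, b, visible):
    """Root-mean-square difference over features flagged as visible."""
    kept = [(x - y) ** 2 for x, y, v in zip(a, b, visible) if v]
    if not kept:
        raise ValueError("no visible features to compare")
    # Divide by the count so distances under different masks stay comparable.
    return math.sqrt(sum(kept) / len(kept))

def nearest_label(query, visible, gallery):
    """Return the label of the gallery entry closest to the query.
    gallery is a list of (feature_vector, label) pairs."""
    best = min(gallery, key=lambda item: masked_distance(query, item[0], visible))
    return best[1]

# Hypothetical gallery of two identities with four features each.
gallery = [([0.1, 0.2, 0.9, 0.8], "alice"), ([0.7, 0.6, 0.1, 0.2], "bob")]
query = [0.1, 0.2, 0.0, 0.0]            # last two features corrupted by an occluder
mask = [True, True, False, False]       # hand-supplied visibility flags

with_mask = nearest_label(query, mask, gallery)
without_mask = nearest_label(query, [True] * 4, gallery)
```

In this toy case the unmasked distance is dominated by the corrupted features and picks the wrong identity, while the masked distance recovers the right one.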


A SURVEY OF METHODS OF TEXT-TO-IMAGE TRANSLATION

Bionics of Intelligence ◽

10.30837/bi.2019.2(93).11 ◽

2019 ◽

Vol 2 (93) ◽

pp. 64-68

Author(s):

I. Konarieva ◽

D. Pydorenko ◽

O. Turuta

Keyword(s):

Generative Adversarial Networks ◽

Text Compression ◽

Image Generation ◽

Adversarial Networks ◽

Image Translation ◽

Different Types ◽

The Given

The given work considers the existing methods of text compression (finding keywords or creating a summary) using the RAKE, LexRank, Luhn, LSA, and TextRank algorithms; image generation; and text-to-image and image-to-image translation, including GANs (generative adversarial networks). Different types of GANs are described, such as StyleGAN, GauGAN, Pix2Pix, CycleGAN, BigGAN, and AttnGAN. This work aims to show ways to create illustrations for a text. First, key information should be obtained from the text. Second, this key information should be transformed into images. Several ways to transform keywords into images are proposed: generating images, or selecting them from a dataset with further transformation, such as generating new images based on the selected ones or combining selected images, e.g., by applying the style of one image to another. Based on the results, possibilities for further improving the quality of image generation are also outlined: combining image generation with selecting images from a dataset, and limiting the topics of image generation.


S2I-Bird: Sound-to-Image Generation of Bird Species using Generative Adversarial Networks

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9412721 ◽

2021 ◽

Author(s):

Joo Yong Shim ◽

Joongheon Kim ◽

Jong-Kook Kim

Keyword(s):

Bird Species ◽

Generative Adversarial Networks ◽

Image Generation ◽

Adversarial Networks


A Multi-Resolution Approach to GAN-Based Speech Enhancement

Applied Sciences ◽

10.3390/app11020721 ◽

2021 ◽

Vol 11 (2) ◽

pp. 721

Author(s):

Hyung Yong Kim ◽

Ji Won Yoon ◽

Sung Jun Cheon ◽

Woo Hyun Kang ◽

Nam Soo Kim

Keyword(s):

Speech Enhancement ◽

Optimal Solution ◽

Experimental Results ◽

Generative Adversarial Networks ◽

The Real ◽

Multi Scale ◽

Adversarial Networks ◽

Speech Characteristics ◽

Conventional Methods ◽

Convex Property

Recently, generative adversarial networks (GANs) have been successfully applied to speech enhancement. However, there still remain two issues that need to be addressed: (1) GAN-based training is typically unstable due to its non-convex property, and (2) most of the conventional methods do not fully take advantage of the speech characteristics, which could result in a sub-optimal solution. In order to deal with these problems, we propose a progressive generator that can handle the speech in a multi-resolution fashion. Additionally, we propose a multi-scale discriminator that discriminates the real and generated speech at various sampling rates to stabilize GAN training. The proposed structure was compared with the conventional GAN-based speech enhancement algorithms using the VoiceBank-DEMAND dataset. Experimental results showed that the proposed approach can make the training faster and more stable, which improves the performance on various metrics for speech enhancement.
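
Presenting the same signal at several sampling rates, as a multi-scale discriminator would require, can be sketched with a simple resolution pyramid. The pair-averaging downsampler below is an assumption chosen for brevity; the abstract does not say how the authors resample, and no generator or discriminator is modelled here.

```python
def downsample_by_two(signal):
    """Halve the sampling rate by averaging adjacent pairs (a crude low-pass filter)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]

def resolution_pyramid(signal, levels):
    """Return the signal at the original rate followed by `levels` halved-rate copies."""
    pyramid = [list(signal)]
    for _ in range(levels):
        pyramid.append(downsample_by_two(pyramid[-1]))
    return pyramid

# Illustrative 16-sample ramp standing in for a speech frame.
frame = [float(i) for i in range(16)]
pyramid = resolution_pyramid(frame, levels=2)
lengths = [len(level) for level in pyramid]
```

Each level of `pyramid` could then be passed to a separate discriminator so that real and generated speech are compared at each rate.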
