Novel Adaptive Generative Adversarial Network for Voice Conversion

Voice conversion (VC) transforms the speaking style of a source speaker into that of a target speaker while keeping the linguistic information unchanged. Traditional VC techniques rely on parallel recordings of multiple speakers uttering the same sentences. Earlier approaches mainly learn a mapping between given source–target speaker pairs from data containing the same utterances spoken by different speakers. However, parallel data are computationally expensive and difficult to collect, and non-parallel VC remains an interesting but challenging speech processing task. To address this limitation, we propose a method that enables non-parallel many-to-many voice conversion using a generative adversarial network. To the best of the authors’ knowledge, our study is the first to employ a sinusoidal model with continuous parameters to generate converted speech signals. Our method requires only several minutes of training examples, without parallel utterances or time alignment procedures, and the source–target speakers are entirely unseen in the training dataset. An empirical study is carried out on the publicly available CSTR VCTK corpus. Our results indicate that the proposed method reaches state-of-the-art speaker similarity to utterances produced by the target speaker, and they point to important structural aspects for experts to analyze further.

Cross-Lingual Voice Conversion With Controllable Speaker Individuality Using Variational Autoencoder and Star Generative Adversarial Network

IEEE Access ◽

10.1109/access.2021.3063519 ◽

2021 ◽

Vol 9 ◽

pp. 47503-47515

Author(s):

Tuan Vu Ho ◽

Masato Akagi

Keyword(s):

Voice Conversion ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Variational Autoencoder ◽

Cross Lingual

Neutral-to-emotional voice conversion with cross-wavelet transform F0 using generative adversarial networks

APSIPA Transactions on Signal and Information Processing ◽

10.1017/atsip.2019.3 ◽

2019 ◽

Vol 8 ◽

Author(s):

Zhaojie Luo ◽

Jinhui Chen ◽

Tetsuya Takiguchi ◽

Yasuo Ariki

Keyword(s):

Wavelet Transform ◽

Training Model ◽

Voice Conversion ◽

Generative Adversarial Networks ◽

Limiting Factor ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Cross Wavelet Transform ◽

Voice Data ◽

Cross Wavelet

In this paper, we propose a novel neutral-to-emotional voice conversion (VC) model that can effectively learn a mapping from neutral to emotional speech with limited emotional voice data. Although conventional VC techniques have achieved tremendous success in spectral conversion, the lack of representations in fundamental frequency (F0), which explicitly represents prosody information, is still a major limiting factor for emotional VC. To overcome this limitation, in our proposed model, we outline the practical elements of the cross-wavelet transform (XWT) method, highlighting how such a method is applied in synthesizing diverse representations of F0 features in emotional VC. The idea is (1) to decompose F0 into different temporal level representations using continuous wavelet transform (CWT); (2) to use XWT to combine different CWT-F0 features to synthesize interaction XWT-F0 features; and (3) to use both the CWT-F0 and corresponding XWT-F0 features to train the emotional VC model. Moreover, to better measure similarities between the converted and real F0 features, we applied a VA-GAN training model, which combines a variational autoencoder (VAE) with a generative adversarial network (GAN). In the VA-GAN model, the VAE learns the latent representations of high-dimensional features (CWT-F0, XWT-F0), while the discriminator of the GAN can use the learned feature representations as a basis for a VAE reconstruction objective.
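Steps (1) and (2) of the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Mexican hat wavelet, the dyadic scales, the synthetic log-F0 contours, and the helper names `mexican_hat`, `cwt`, and `xwt` are all assumptions made for illustration, and the abstract does not say which pairs of CWT-F0 features the authors combine.

```python
import numpy as np

def mexican_hat(t, scale):
    """Mexican hat (Ricker) wavelet at times t, dilated by `scale` (assumed wavelet)."""
    x = t / scale
    norm = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return norm * (1.0 - x ** 2) * np.exp(-0.5 * x ** 2) / np.sqrt(scale)

def cwt(signal, scales):
    """CWT by direct convolution; returns an array of shape (n_scales, n_frames)."""
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Truncate the wavelet support so the kernel is never longer than the signal.
        half = min(int(np.ceil(5 * s)), (n - 1) // 2)
        t = np.arange(-half, half + 1, dtype=float)
        out[i] = np.convolve(signal, mexican_hat(t, s), mode="same")
    return out

def xwt(w_a, w_b):
    """Cross-wavelet transform of two CWT maps: elementwise W_a * conj(W_b)."""
    return w_a * np.conj(w_b)

def normalize(x):
    """Zero-mean, unit-variance normalization of a contour."""
    return (x - x.mean()) / x.std()

# Synthetic log-F0 contours standing in for neutral and emotional speech (hypothetical data).
frames = np.arange(200, dtype=float)
neutral_f0 = 5.0 + 0.10 * np.sin(2 * np.pi * frames / 50)
emotional_f0 = 5.2 + 0.25 * np.sin(2 * np.pi * frames / 50) + 0.05 * np.sin(2 * np.pi * frames / 7)

scales = 2.0 ** np.arange(0, 5)  # assumed dyadic scales: 1, 2, 4, 8, 16 frames
cwt_neutral = cwt(normalize(neutral_f0), scales)
cwt_emotional = cwt(normalize(emotional_f0), scales)
xwt_features = xwt(cwt_neutral, cwt_emotional)
```

Each row of `cwt_neutral` describes the contour at one temporal scale, and `xwt_features` is large where the two contours carry energy at the same scale and time. With a real-valued wavelet the conjugate has no effect; it is kept so the helper also works with complex wavelets.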

Depth from stacked light field images using generative adversarial network

Electronic Imaging ◽

10.2352/issn.2470-1173.2019.11.ipas-270 ◽

2019 ◽

Vol 2019 (11) ◽

pp. 270-1-270-8

Author(s):

Ji-Hun Mun ◽

Yo-Sung Ho

Keyword(s):

Light Field ◽

Generative Adversarial Network ◽

Adversarial Network

Flight Trajectory Pattern Generalization and Abnormal Flight Detection with Generative Adversarial Network

AIAA Scitech 2021 Forum ◽

10.2514/6.2021-0775 ◽

2021 ◽

Author(s):

Muhammet Aksoy ◽

Orhan Ozdemir ◽

Guney Guner ◽

Baris Baspinar ◽

Emre Koyuncu

Keyword(s):

Flight Trajectory ◽

Generative Adversarial Network ◽

Pattern Generalization ◽

Adversarial Network ◽

Trajectory Pattern

ORGANIC (1).pdf

10.26434/chemrxiv.5309668.v1 ◽

2017 ◽

Author(s):

Benjamin Sanchez-Lengeling ◽

Carlos Outeiral ◽

Gabriel L. Guimaraes ◽

Alan Aspuru-Guzik

Keyword(s):

Machine Learning ◽

Learning Community ◽

Chemical Species ◽

Material Design ◽

Organic Photovoltaic ◽

Generative Adversarial Networks ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Adversarial Networks ◽

Photovoltaic Material

Molecular discovery seeks to generate chemical species tailored to very specific needs. In this paper, we present ORGANIC, a framework based on Objective-Reinforced Generative Adversarial Networks (ORGAN), capable of producing a distribution over molecular space that matches a certain set of desirable metrics. This methodology combines two successful techniques from the machine learning community: a Generative Adversarial Network (GAN), to create non-repetitive, sensible molecular species, and Reinforcement Learning (RL), to bias this generative distribution towards certain attributes. We explore several applications, from optimization of random physicochemical properties to candidates for drug discovery and organic photovoltaic material design.
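One simple way to combine a GAN signal with an RL objective, as described above, is to interpolate linearly between the discriminator score and a domain-specific objective score. The sketch below assumes that form; the function name `mixed_reward`, the parameter `lam`, and the example scores are hypothetical and are not taken from the abstract.

```python
def mixed_reward(d_score, objective_score, lam=0.5):
    """Blend a discriminator score with an objective score (assumed linear mixing).

    lam = 1.0 uses only the discriminator (pure GAN signal);
    lam = 0.0 uses only the objective (pure RL signal).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * d_score + (1.0 - lam) * objective_score

# Hypothetical scores for one generated molecule, both scaled to [0, 1].
reward = mixed_reward(d_score=0.8, objective_score=0.2, lam=0.5)
```

Under this assumption, lowering `lam` pushes the generator towards molecules that score well on the chosen metric, while raising it keeps samples closer to the training distribution.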

Completely Unsupervised Phoneme Recognition by a Generative Adversarial Network Harmonized with Iteratively Refined Hidden Markov Models

10.21437/interspeech.2019-2068 ◽

2019 ◽

Author(s):

Kuan-Yu Chen ◽

Che-Ping Tsai ◽

Da-Rong Liu ◽

Hung-Yi Lee ◽

Lin-shan Lee

Keyword(s):

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Phoneme Recognition ◽

Generative Adversarial Network ◽

Adversarial Network

Role of General Adversarial Networks in Mammogram Analysis: A Review

Current Medical Imaging Formerly Current Medical Imaging Reviews ◽

10.2174/1573405614666191115102318 ◽

2020 ◽

Vol 16 (7) ◽

pp. 863-877

Author(s):

Annapoorani Gopal ◽

Lathaselvi Gandhimaruthian ◽

Javid Ali

Keyword(s):

Breast Tumor ◽

Deep Neural Networks ◽

Training Data ◽

Learning Technology ◽

Breast Cancers ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Adversarial Networks ◽

Tumor Extraction

Deep neural networks have gained prominence in the biomedical domain, becoming the most commonly used networks after machine learning technology. Mammograms can be used to detect breast cancers with high precision with the help of a Convolutional Neural Network (CNN), which is a deep learning technology. Exhaustive labeled data are required to train a CNN from scratch. This can be overcome by deploying a Generative Adversarial Network (GAN), which needs comparatively less training data during mammogram screening. In this study, the applications of GANs in estimating breast density, high-resolution mammogram synthesis for clustered microcalcification analysis, effective segmentation of breast tumors, analysis of breast tumor shape, feature extraction, and image augmentation during mammogram classification are extensively reviewed.
