Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition

Abstract: Voice conversion (VC) is a technique that converts only the speaker-specific information in source speech while preserving the associated phonemic information. Non-negative matrix factorization (NMF)-based VC has been widely researched because it produces more natural-sounding speech than conventional Gaussian mixture model-based VC. Conventional NMF-VC trains its models on parallel data, so the speech data require elaborate pre-processing to generate that parallel data. NMF-VC also tends to be a large model because the dictionary matrix holds several parallel exemplars, leading to a high computational cost. In this study, a parallel-data-free dictionary-learning method using non-negative Tucker decomposition (NTD) is proposed. The proposed method uses tensor decomposition to decompose an input observation into a set of mode matrices and one core tensor. The proposed NTD-based dictionary-learning method estimates the dictionary matrix for NMF-VC without using parallel data. The experimental results show that the proposed method outperforms other methods in both parallel and non-parallel settings.
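The conversion step of dictionary-based NMF-VC can be illustrated with a minimal sketch. It assumes the usual formulation in which activations are estimated against a fixed source dictionary and then applied to a paired target dictionary. The function names, the Euclidean multiplicative update, and the random toy dictionaries are assumptions for illustration. This is not the NTD-based dictionary learning proposed in the paper.

```python
import numpy as np

def estimate_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H so that V ~= W @ H, with W held fixed.

    Uses the standard multiplicative update for the Euclidean NMF objective.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def convert(V_src, W_src, W_tgt, n_iter=200):
    """Reuse the source activations with the target dictionary."""
    H = estimate_activations(V_src, W_src, n_iter=n_iter)
    return W_tgt @ H

# Toy data (hypothetical): 20 spectral bins, 8 dictionary atoms, 30 frames.
rng = np.random.default_rng(1)
W_src = rng.random((20, 8))
W_tgt = rng.random((20, 8))
V_src = W_src @ rng.random((8, 30))
V_conv = convert(V_src, W_src, W_tgt)
```

The sketch relies on the two dictionaries being aligned atom by atom. Conventional NMF-VC obtains that alignment from parallel data, which is the requirement the abstract sets out to remove.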

Parallel-Data-Free Dictionary Learning for Voice Conversion Using Non-Negative Tucker Decomposition

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2018.8462569 ◽

2018 ◽

Author(s):

Yuki Takashima ◽

Hajime Yano ◽

Toru Nakashika ◽

Tetsuya Takiguchi ◽

Yasuo Ariki

Keyword(s):

Dictionary Learning ◽

Voice Conversion ◽

Tucker Decomposition ◽

Parallel Data

Retrieving the leaked signals from noise using a fast dictionary learning method

Geophysics ◽

10.1190/geo2021-0243.1 ◽

2021 ◽

pp. 1-86

Author(s):

Wei Chen ◽

Omar M. Saad ◽

Yapo Abolé Serge Innocent Oboué ◽

Liuqing Yang ◽

Yangkang Chen

Keyword(s):

Dictionary Learning ◽

Seismic Data ◽

State Of The Art ◽

Computational Cost ◽

Learning Method ◽

Global Parameter ◽

Radius Parameter ◽

Orthogonalization Method ◽

Value Decomposition ◽

Learned Features

Most traditional seismic denoising algorithms damage useful signals. This damage is visible in the removed noise profiles and is known as signal leakage. The local signal-and-noise orthogonalization method is an effective way to retrieve the leaked signals from the removed noise. However, its ability to retrieve leaked signals while rejecting noise is compromised by the smoothing radius parameter. The smoothing radius is inconvenient to adjust because it is a global parameter, whereas seismic data are highly variable locally. To retrieve the leaked signals adaptively, we propose a new dictionary learning method. Because dictionary learning is patch-based, it can adapt to the local features of seismic data. We train a dictionary of atoms that represent the features of the useful signals from the initially denoised data. Based on the learned features, we retrieve the weak leaked signals from the noise via a sparse coding step. Considering the large computational cost of training a dictionary from high-dimensional seismic data, we leverage a fast dictionary updating algorithm in which the singular value decomposition (SVD) is replaced by an algebraic mean to update each dictionary atom. We test the performance of the proposed method on several synthetic and field data examples and compare it with that of the state-of-the-art local orthogonalization method.

Fast dictionary learning for noise attenuation of multidimensional seismic data

Geophysical Journal International ◽

10.1093/gji/ggaa184 ◽

2020 ◽

Vol 222 (3) ◽

pp. 1717-1727 ◽

Cited By ~ 1

Author(s):

Yangkang Chen

Keyword(s):

Dictionary Learning ◽

Computational Efficiency ◽

Seismic Data ◽

Learning Algorithm ◽

Computational Cost ◽

Noise Attenuation ◽

Arithmetic Average ◽

Sparse Dictionary Learning ◽

Singular Value Decompositions ◽

High Computational Cost

SUMMARY The K-SVD algorithm has been successfully utilized for adaptively learning the sparse dictionary in 2-D seismic denoising. Because of the high computational cost of the many singular value decompositions (SVDs) in the K-SVD algorithm, it is not applicable in practical situations, especially in 3-D or 5-D problems. In this paper, I extend the dictionary-learning-based denoising approach from 2-D to 3-D. To address the computational efficiency problem in K-SVD, I propose a fast dictionary learning approach based on the sequential generalized K-means (SGK) algorithm for denoising multidimensional seismic data. The SGK algorithm updates each dictionary atom by taking an arithmetic average of several training signals instead of calculating an SVD as in the K-SVD algorithm. I summarize the sparse dictionary learning algorithm using K-SVD and introduce the SGK algorithm together with its detailed mathematical implications. 3-D synthetic, 2-D and 3-D field data examples are used to demonstrate the performance of both the K-SVD and SGK algorithms. It has been shown that the SGK algorithm can significantly increase the computational efficiency while only slightly degrading the denoising performance.
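The contrast between the two atom updates can be sketched as follows. This is a simplified illustration, not the paper's implementation. It assumes `E_k` is the residual matrix restricted to the training signals that currently use atom k, and that the averaging update is simply the normalized column mean. The function names are hypothetical.

```python
import numpy as np

def ksvd_atom_update(E_k):
    """K-SVD style update: leading left singular vector of the residual matrix."""
    U, _, _ = np.linalg.svd(E_k, full_matrices=False)
    return U[:, 0]

def mean_atom_update(E_k):
    """Averaging update: normalized arithmetic mean of the residual columns."""
    d = E_k.mean(axis=1)
    norm = np.linalg.norm(d)
    return d / norm if norm > 0 else d

# Toy residual (hypothetical): ten noisy copies of one unit-norm pattern.
rng = np.random.default_rng(0)
d_true = rng.normal(size=16)
d_true /= np.linalg.norm(d_true)
E_k = np.outer(d_true, np.ones(10)) + 0.01 * rng.normal(size=(16, 10))

atom_svd = ksvd_atom_update(E_k)
atom_mean = mean_atom_update(E_k)
alignment = abs(atom_svd @ atom_mean)  # the sign of a singular vector is arbitrary
```

On this toy residual the two atoms are nearly parallel, while the mean avoids the SVD entirely. That is consistent with the trade-off reported in the summary: much lower cost for a slight loss in accuracy.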

Compression of hyper-spectral images using an accelerated nonnegative tensor decomposition

Open Physics ◽

10.1515/phys-2017-0123 ◽

2017 ◽

Vol 15 (1) ◽

pp. 992-996 ◽

Cited By ~ 1

Author(s):

Jin Li ◽

Zilong Liu

Keyword(s):

Computational Cost ◽

Tensor Decomposition ◽

Low Complexity ◽

Transform Domain ◽

Spatial Correlations ◽

Compression Method ◽

Nonnegative Tensor ◽

Hyper Spectral ◽

High Computational Cost ◽

Very High

Abstract: Nonnegative tensor Tucker decomposition (NTD) in a transform domain (e.g., 2D-DWT) has been used in the compression of hyper-spectral images because it can remove redundancies between spectral bands and also exploit the spatial correlations within each band. However, NTD has a very high computational cost. In this paper, we propose a low-complexity NTD-based compression method for hyper-spectral images. The method uses a pair-wise multilevel grouping approach for the NTD to overcome its high computational cost. The proposed method has low complexity at the price of a slight decrease in coding performance compared with conventional NTD. Our experiments confirm that the method requires less processing time and retains better coding performance than when NTD is not used. The proposed approach has a potential application in the lossy compression of hyper-spectral or multi-spectral images.
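The structure of a Tucker model, and why it compresses, can be shown with a short sketch. It only rebuilds a third-order tensor from a given non-negative core and factor matrices and counts the stored values. It does not fit an NTD to data, and the shapes, ranks, and function name are assumptions chosen for illustration.

```python
import numpy as np

def tucker_reconstruct(core, factors):
    """Rebuild X = core x1 A x2 B x3 C for a third-order Tucker model."""
    A, B, C = factors
    return np.einsum('abc,ia,jb,kc->ijk', core, A, B, C)

# Hypothetical sizes: a 32 x 32 x 16 data cube with multilinear ranks (8, 8, 4).
rng = np.random.default_rng(0)
shape, ranks = (32, 32, 16), (8, 8, 4)
core = rng.random(ranks)                                      # non-negative core tensor
factors = [rng.random((n, r)) for n, r in zip(shape, ranks)]  # non-negative mode matrices

X = tucker_reconstruct(core, factors)
stored = core.size + sum(f.size for f in factors)  # values kept by the model
ratio = X.size / stored                            # full tensor size vs. stored size
```

Because the core and the mode matrices are all non-negative, every entry of the reconstruction is non-negative as well. The storage saving comes from keeping the ranks much smaller than the tensor dimensions.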

An Alternate Algorithm for (3x3) Median Filtering of Digital Images

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v1i1.6732 ◽

2012 ◽

Vol 2 (1) ◽

pp. 7-9 ◽

Cited By ~ 2

Author(s):

Satinderjit Singh

Keyword(s):

Median Filter ◽

Computational Cost ◽

Spatial Coherence ◽

General Purpose ◽

Median Filtering ◽

Basic Algorithm ◽

Temporal Complexity ◽

Filter Kernel ◽

One Step ◽

High Computational Cost

Median filtering is a commonly used technique in image processing. The main problem of the median filter is its high computational cost (for sorting N pixels, the temporal complexity is O(N·log N), even with the most efficient sorting algorithms). When the median filter must be carried out in real time, a software implementation on general-purpose processors does not usually give good results. This paper presents an efficient algorithm for median filtering with a 3x3 filter kernel with only about 9 comparisons per pixel using spatial coherence between neighboring filter computations. The basic algorithm calculates two medians in one step and reuses sorted slices of three vertical neighboring pixels. An extension of this algorithm for 2D spatial coherence is also examined, which calculates four medians per step.
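The idea of reusing sorted vertical slices can be sketched with a well-known identity: once each 3-pixel column is sorted, the median of the nine values equals the median of (the largest of the column minima, the median of the column medians, the smallest of the column maxima). The vectorized sketch below is an illustration under that assumption. It does not reproduce the paper's exact comparison schedule or its two-medians-per-step trick, and it returns interior pixels only.

```python
import numpy as np

def med3(a, b, c):
    """Element-wise median of three arrays using min/max only."""
    return np.maximum(np.minimum(a, b), np.minimum(np.maximum(a, b), c))

def median3x3(img):
    """3x3 median of interior pixels; output shape is (H - 2, W - 2)."""
    img = np.asarray(img)
    # Sort every vertical triple once; each sorted column is shared by
    # the three horizontally adjacent windows that contain it.
    lo, mid, hi = np.sort(np.stack([img[:-2], img[1:-1], img[2:]]), axis=0)
    max_lo = np.maximum(np.maximum(lo[:, :-2], lo[:, 1:-1]), lo[:, 2:])
    med_mid = med3(mid[:, :-2], mid[:, 1:-1], mid[:, 2:])
    min_hi = np.minimum(np.minimum(hi[:, :-2], hi[:, 1:-1]), hi[:, 2:])
    return med3(max_lo, med_mid, min_hi)

# Toy image (hypothetical data).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(7, 9))
filtered = median3x3(img)
```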

Voice Conversion Based on Matrix Variate Gaussian Mixture Model Using Multiple Frame Features

10.21437/interspeech.2016-705 ◽

2016 ◽

Author(s):

Yi Yang ◽

Hidetsugu Uchida ◽

Daisuke Saito ◽

Nobuaki Minematsu

Keyword(s):

Gaussian Mixture Model ◽

Mixture Model ◽

Gaussian Mixture ◽

Voice Conversion ◽

Multiple Frame

Average Modeling Approach to Voice Conversion with Non-Parallel Data

10.21437/odyssey.2018-32 ◽

2018 ◽

Cited By ~ 7

Author(s):

Xiaohai Tian ◽

Junchao Wang ◽

Haihua Xu ◽

Eng-Siong Chng ◽

Haizhou Li

Keyword(s):

Voice Conversion ◽

Modeling Approach ◽

Parallel Data

Methods for studying dissolved oxygen levels in coastal and estuarine waters receiving combined sewer overflows

Water Science & Technology ◽

10.2166/wst.1995.0081 ◽

1995 ◽

Vol 32 (2) ◽

pp. 95-103

Author(s):

José A. Revilla ◽

Kalin N. Koev ◽

Rafael Díaz ◽

César Álvarez ◽

Antonio Roldán

Keyword(s):

Dissolved Oxygen ◽

Coastal Waters ◽

Computational Cost ◽

Oxygen Deficit ◽

Alternative Methods ◽

Coastal Zones ◽

Combined Sewer Overflows ◽

Sewer Systems ◽

Combined Sewer ◽

High Computational Cost

One factor in determining the transport capacity of coastal interceptors in Combined Sewer Systems (CSS) is the reduction of Dissolved Oxygen (DO) in coastal waters originating from the overflows. The study of the evolution of DO in coastal zones is complex. The high computational cost of mathematical models makes it impractical to undertake the required probabilistic analysis. Alternative methods, based on such mathematical modelling but applied to a limited number of cases, are therefore needed. In this paper two alternative methods are presented for the study of the oxygen deficit resulting from CSS overflows. In the first, statistical analyses focus on the causes of the deficit (the volume discharged). The second concentrates on the effects (the concentrations of oxygen in the sea). Both methods have been applied in a study of the coastal interceptor at Pasajes Estuary (Guipúzcoa, Spain) with similar results.

Visualizing Profiles of Large Datasets of Weighted and Mixed Data

Mathematics ◽

10.3390/math9080891 ◽

2021 ◽

Vol 9 (8) ◽

pp. 891

Author(s):

Aurea Grané ◽

Alpha A. Sow-Barry

Keyword(s):

Multidimensional Scaling ◽

Random Sample ◽

Simulation Study ◽

Clustering Algorithm ◽

Computational Cost ◽

Interpolation Formula ◽

Large Datasets ◽

Mixed Data ◽

Multivariate Techniques ◽

High Computational Cost

This work provides a procedure with which to construct and visualize profiles, i.e., groups of individuals with similar characteristics, for weighted and mixed data by combining two classical multivariate techniques, multidimensional scaling (MDS) and the k-prototypes clustering algorithm. The well-known drawback of classical MDS in large datasets is circumvented by selecting a small random sample of the dataset, whose individuals are clustered by means of an adapted version of the k-prototypes algorithm and mapped via classical MDS. Gower’s interpolation formula is used to project the remaining individuals onto the previous configuration. Throughout the process, Gower’s distance is used to measure the proximity between individuals. The methodology is illustrated on a real dataset, obtained from the Survey of Health, Ageing and Retirement in Europe (SHARE), which was carried out in 19 countries and represents over 124 million aged individuals in Europe. The performance of the method was evaluated through a simulation study, whose results indicate that the new proposal overcomes the high computational cost of classical MDS with low error.
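The sample-then-interpolate step can be sketched as follows. The sketch maps a small sample with classical MDS and then places a new individual with Gower's interpolation formula, y = ½ Λ⁻¹ Xᵀ (b − d²), where b holds the squared norms of the sample coordinates and d the distances from the new individual to the sample. For simplicity it assumes plain Euclidean distances on toy data rather than Gower's distance for weighted and mixed data, and it omits the clustering step. The function names are hypothetical.

```python
import numpy as np

def classical_mds(D, k):
    """Classical MDS: principal coordinates X (n x k) and their eigenvalues."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # doubly centered inner products
    evals, evecs = np.linalg.eigh(B)
    idx = np.argsort(evals)[::-1][:k]             # k largest eigenvalues
    lam = evals[idx]
    return evecs[:, idx] * np.sqrt(lam), lam

def gower_interpolate(X, lam, d_new):
    """Place a new point given its distances d_new to the n mapped sample points."""
    b = np.sum(X ** 2, axis=1)
    return 0.5 * (X.T @ (b - d_new ** 2)) / lam

# Toy data (hypothetical): 10 sampled individuals and one held-out individual in 2-D.
rng = np.random.default_rng(0)
pts = rng.normal(size=(11, 2))
sample, new_pt = pts[:10], pts[10]
D = np.linalg.norm(sample[:, None, :] - sample[None, :, :], axis=2)
X, lam = classical_mds(D, k=2)
d_new = np.linalg.norm(sample - new_pt, axis=1)
y = gower_interpolate(X, lam, d_new)
```

Only the sample needs an eigendecomposition. Each remaining individual costs one matrix-vector product, which is where the saving over running classical MDS on the full dataset comes from.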

Reliability and reliability-based sensitivity analysis of self-centering buckling restrained braces using meta-models

Journal of Intelligent Material Systems and Structures ◽

10.1177/1045389x211026382 ◽

2021 ◽

pp. 1045389X2110263

Author(s):

Seyede Vahide Hashemi ◽

Mahmoud Miri ◽

Mohsen Rashki ◽

Sadegh Etedali

Keyword(s):

Failure Probability ◽

Limit State ◽

Computational Cost ◽

Sensitivity Analyses ◽

State Function ◽

Reliability Indices ◽

Buckling Restrained Brace ◽

Polynomial Response Surface ◽

Nonlinear Dynamic Analyses ◽

High Computational Cost

This paper carries out sensitivity analyses to study the effect of each design variable on the performance of a self-centering buckling restrained brace (SC-BRB) and the corresponding buckling restrained brace (BRB) without shape memory alloy (SMA) rods. Furthermore, reliability analyses of the BRB and SC-BRB are performed. Considering the high computational cost of simulation methods, three meta-models, namely Kriging, radial basis function (RBF), and polynomial response surface (PRSM), are used to construct the surrogate models. To this end, nonlinear dynamic analyses are conducted on both the BRB and SC-BRB using OpenSees software. The results show that the SMA area, SMA length ratio, and BRB core area have the greatest effect on the failure probability of the SC-BRB. It is concluded that Kriging-based Monte Carlo Simulation (MCS) gives the best performance in estimating the limit state function (LSF) of the BRB and SC-BRB in the reliability analysis procedures. Considering the effects of changing the maximum cyclic loading on the failure probability computation, and comparing the failure probability for different LSFs, it is also found that the reliability indices of the SC-BRB were always higher than the corresponding indices determined for the BRB, which confirms the superior performance of the SC-BRB over the BRB.
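The surrogate-plus-Monte-Carlo workflow can be illustrated with a toy sketch. It fits a quadratic polynomial response surface to a small number of evaluations of a limit state function, then runs Monte Carlo simulation on the cheap surrogate to estimate the failure probability and the reliability index β = −Φ⁻¹(P_f). The limit state g = r − s, the normal distributions, and all names are assumptions for illustration. They stand in for the brace models, OpenSees analyses, and Kriging surrogate used in the paper.

```python
import numpy as np
from statistics import NormalDist

def g_true(r, s):
    """Toy limit state (assumed): failure when capacity r does not exceed demand s."""
    return r - s

def quad_features(r, s):
    """Full quadratic basis in two variables for the response surface."""
    return np.column_stack([np.ones_like(r), r, s, r ** 2, s ** 2, r * s])

rng = np.random.default_rng(0)

# Fit the surrogate from a small number of "expensive" evaluations.
r_fit, s_fit = rng.normal(5.0, 1.0, 50), rng.normal(2.0, 1.0, 50)
coef, *_ = np.linalg.lstsq(quad_features(r_fit, s_fit), g_true(r_fit, s_fit), rcond=None)

# Monte Carlo simulation on the surrogate instead of the expensive model.
n = 200_000
r, s = rng.normal(5.0, 1.0, n), rng.normal(2.0, 1.0, n)
g_hat = quad_features(r, s) @ coef
pf = float(np.mean(g_hat <= 0))          # estimated failure probability
beta = -NormalDist().inv_cdf(pf)         # reliability index

# Closed-form reference for this toy case: r - s ~ N(3, sqrt(2)).
pf_exact = NormalDist().cdf(-3.0 / np.sqrt(2.0))
```

The expensive model is called only 50 times for the fit, while the 200,000 Monte Carlo samples are evaluated on the polynomial. A surrogate is only as good as its fit near the failure boundary, which is why the paper compares several meta-models.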
