Unsupervised Training for Deep Speech Source Separation with Kullback-Leibler Divergence Based Probabilistic Loss Function

Author(s):  
Masahito Togami ◽  
Yoshiki Masuyama ◽  
Tatsuya Komatsu ◽  
Yu Nakagome
Sensors ◽  
2018 ◽  
Vol 18 (5) ◽  
pp. 1371 ◽  
Author(s):  
Wai Lok Woo ◽  
Bin Gao ◽  
Ahmed Bouridane ◽  
Bingo Wing-Kuen Ling ◽  
Cheng Siong Chin

This paper presents an unsupervised learning algorithm for sparse nonnegative matrix factor time–frequency deconvolution with optimized fractional β-divergence. The β-divergence is a family of cost functions parametrized by a single parameter β. The Itakura–Saito divergence, the Kullback–Leibler divergence and the least-squares distance are the special cases β = 0, 1 and 2, respectively. The paper generalizes the algorithm to a flexible range of β that includes fractional values. It describes a majorization–minimization (MM) algorithm that leads to a fast multiplicative update algorithm with guaranteed convergence. The proposed model operates in the time–frequency domain and decomposes an information-bearing matrix into a two-dimensional deconvolution of factor matrices that represent the spectral dictionary and temporal codes. The deconvolution process is optimized to yield sparse temporal codes by maximizing the likelihood of the observations. The paper also presents a method to estimate the fractional β value. The method is demonstrated on separating audio mixtures recorded from a single channel. The paper shows that the proposed algorithm extracts the spectral dictionary and temporal codes significantly more efficiently, which in turn leads to better source separation performance. Experimental tests and comparisons with other factorization methods have been conducted to verify its efficacy.
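As an illustration of how a single parameter β selects the cost function, the following is a minimal sketch of the standard element-wise β-divergence, assuming strictly positive inputs. The function names are hypothetical and the code is not taken from the paper; it covers only the divergence itself, not the deconvolution model or its multiplicative updates.

```python
import math

def beta_divergence(x, y, beta):
    """Element-wise beta-divergence d_beta(x | y) for positive scalars x and y."""
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be strictly positive")
    if beta == 0:
        # Limit beta -> 0: Itakura-Saito divergence
        return x / y - math.log(x / y) - 1.0
    if beta == 1:
        # Limit beta -> 1: generalized Kullback-Leibler divergence
        return x * math.log(x / y) - x + y
    # General case, valid for any other real beta, including fractional values.
    # For beta = 2 this reduces to 0.5 * (x - y) ** 2 (least squares).
    numerator = x**beta + (beta - 1.0) * y**beta - beta * x * y ** (beta - 1.0)
    return numerator / (beta * (beta - 1.0))

def total_beta_divergence(X, Y, beta):
    """Sum the element-wise divergence over two equally shaped nested lists."""
    return sum(
        beta_divergence(x, y, beta)
        for row_x, row_y in zip(X, Y)
        for x, y in zip(row_x, row_y)
    )

# Compare the three named special cases and one fractional value.
for b in (0, 0.5, 1, 2):
    print(b, beta_divergence(3.0, 1.0, b))
```

In this form the least-squares case carries a factor of 1/2, which is a common convention and does not change the minimizer.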


Author(s):  
F. Boso ◽  
D. M. Tartakovsky

Hyperbolic balance laws with uncertain (random) parameters and inputs are ubiquitous in science and engineering. Quantification of uncertainty in predictions derived from such laws, and reduction of predictive uncertainty via data assimilation, remain an open challenge. That is due to nonlinearity of governing equations, whose solutions are highly non-Gaussian and often discontinuous. To ameliorate these issues in a computationally efficient way, we use the method of distributions, which here takes the form of a deterministic equation for spatio-temporal evolution of the cumulative distribution function (CDF) of the random system state, as a means of forward uncertainty propagation. Uncertainty reduction is achieved by recasting the standard loss function, i.e. discrepancy between observations and model predictions, in distributional terms. This step exploits the equivalence between minimization of the square error discrepancy and the Kullback–Leibler divergence. The loss function is regularized by adding a Lagrangian constraint enforcing fulfilment of the CDF equation. Minimization is performed sequentially, progressively updating the parameters of the CDF equation as more measurements are assimilated.
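One familiar setting in which squared error and Kullback–Leibler divergence coincide up to a constant factor is that of two Gaussians with equal variance. The sketch below checks this numerically. It is an assumption chosen for illustration only and may differ from the CDF-based formulation used in the paper; the function name is hypothetical.

```python
import math

def kl_gaussians(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL( N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2) )."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p**2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q**2)
        - 0.5
    )

# With equal standard deviations the KL divergence reduces to
# (mu_p - mu_q)^2 / (2 * sigma^2), i.e. a scaled squared error.
observation, prediction, sigma = 1.0, 4.0, 2.0
kl = kl_gaussians(observation, sigma, prediction, sigma)
squared_error = (observation - prediction) ** 2
print(kl, squared_error / (2.0 * sigma**2))
```

Under this assumption, minimizing the squared discrepancy and minimizing the KL divergence select the same parameters, since the two differ only by the fixed factor 1/(2σ²).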


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Zhen Yang ◽  
Xuefei Xu ◽  
Keke Wang ◽  
Xin Li ◽  
Chi Ma

To accurately identify targets such as insulators, shock hammers, bird nests, and spacers on high-voltage transmission lines, this paper proposes a multitarget detection model for transmission lines based on DANet and YOLOv4. First, DANet and YOLOv4 are fused to address the difficulty of scene understanding and pixel discrimination caused by the complex and diverse scenes in UAV (unmanned aerial vehicle) aerial images (lighting, viewing angle, scale, occlusion, and so on), so as to improve the significance of the detection target. A Gaussian function and the KL (Kullback–Leibler) divergence are used to improve the nonmaximum suppression in YOLOv4 and thereby the recognition rate of occluded targets. The focal loss function and the balanced cross-entropy function are used to improve the loss function of YOLOv4, reducing the impact of both the imbalance between the background and the detection target and the imbalance among the samples, with the aim of improving detection accuracy. Then, a data set is built for the experiment from UAV inspection images provided by a power grid company in Eastern Inner Mongolia. Finally, the proposed algorithm is compared with other target detection algorithms. Experimental results show that the average detection accuracy of the proposed algorithm can reach 94.7% and that the detection time per image is 0.05 seconds. The method offers good accuracy, real-time performance, and robustness.
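To make the loss-function idea concrete, here is a minimal sketch of a class-balanced focal loss for a single binary prediction. The abstract does not state how the two terms are combined in YOLOv4, so the form, the function name, and the defaults `alpha=0.25` and `gamma=2.0` (common choices in the focal loss literature) are assumptions.

```python
import math

def balanced_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Focal loss for one prediction.

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 (target) or 0 (background)
    alpha: class-balancing weight for the positive class
    gamma: focusing parameter; gamma = 0 gives balanced cross entropy
    """
    p = min(max(p, eps), 1.0 - eps)           # avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy positive (p = 0.9) contributes far less than a hard one (p = 0.1).
print(balanced_focal_loss(0.9, 1), balanced_focal_loss(0.1, 1))
```

The `alpha` weight addresses the imbalance between background and target, while the `gamma` factor addresses the imbalance between easy and hard samples.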


Author(s):  
Lei Luo ◽  
Jian Pei ◽  
Heng Huang

This paper introduces a novel Robust Regression (RR) model, named Sinkhorn regression, which imposes Sinkhorn distances on both the loss function and the regularization. Traditional RR methods aim to find an element-wise loss function (e.g., Lp-norm) to characterize the errors such that outlying data have a relatively smaller influence on the regression estimator. Because they neglect geometric information, they often lead to suboptimal results in practical applications. To address this problem, we use a cross-bin distance function, i.e., Sinkhorn distances, to capture the geometric knowledge of real data. Sinkhorn distances are invariant to movement, rotation and zoom. Thus, our method is more robust to variations in the data than traditional regression models. Meanwhile, we leverage the Kullback-Leibler divergence to relax the proposed model with marginal constraints into its unbalanced formulation, so that it adapts to more types of features. In addition, we propose an efficient algorithm to solve the relaxed model and establish its complete statistical guarantees under mild conditions. Experiments on five publicly available microarray data sets and one mass spectrometry data set demonstrate the effectiveness and robustness of our method.
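For readers unfamiliar with Sinkhorn distances, the following is a minimal sketch of the balanced, entropy-regularized version computed by alternating scaling iterations. It is not the paper's regression model or its unbalanced relaxation; the function names, the regularization strength `reg`, and the fixed iteration count are assumptions made for illustration.

```python
import math

def sinkhorn_plan(a, b, cost, reg=0.5, n_iters=500):
    """Entropy-regularized transport plan between histograms a and b.

    a, b: lists of positive weights, each summing to 1
    cost: nested list, cost[i][j] = cost of moving mass from bin i to bin j
    reg: entropic regularization strength (larger means a smoother plan)
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def sinkhorn_distance(a, b, cost, reg=0.5, n_iters=500):
    """Transport cost <P, C> under the regularized plan P."""
    plan = sinkhorn_plan(a, b, cost, reg, n_iters)
    return sum(
        plan[i][j] * cost[i][j] for i in range(len(a)) for j in range(len(b))
    )

# Two-bin example: most of the mass must move from bin 0 to bin 1.
a, b = [0.9, 0.1], [0.1, 0.9]
cost = [[0.0, 1.0], [1.0, 0.0]]
print(sinkhorn_distance(a, b, cost))
```

Because the cost matrix encodes how far apart bins are, the distance compares histograms across bins, which is the "cross-bin" property that element-wise losses such as the Lp-norm lack.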


Author(s):  
A. Howie ◽  
D.W. McComb

The bulk loss function Im(-1/ε(ω)), a well established tool for the interpretation of valence loss spectra, is being progressively adapted to the wide variety of inhomogeneous samples of interest to the electron microscopist. Proportionality between n, the local valence electron density, and ε - 1 (Sellmeyer's equation) has sometimes been assumed but may not be valid even in homogeneous samples. Figs. 1 and 2 show the experimentally measured bulk loss functions for three pure silicates of different specific gravity ρ: quartz (ρ = 2.66), coesite (ρ = 2.93) and a zeolite (ρ = 1.79). Clearly, despite the substantial differences in density, the shift of the prominent loss peak is very small and far less than that predicted by scaling ε for quartz with Sellmeyer's equation, or even the somewhat smaller shift given by the Clausius-Mossotti (CM) relation, which assumes proportionality between n (or ρ in this case) and (ε - 1)/(ε + 2). Both theories overestimate the rise in the peak height for coesite and underestimate the increase at high energies.
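The two density-scaling rules compared above can be sketched as follows. The densities are the specific gravities quoted in the abstract; the complex dielectric value `eps_quartz` is a made-up placeholder and not measured data, and the function names are hypothetical.

```python
def scale_linear(eps, density_ratio):
    """Scale eps assuming (eps - 1) is proportional to density."""
    return 1.0 + density_ratio * (eps - 1.0)

def scale_clausius_mossotti(eps, density_ratio):
    """Scale eps assuming (eps - 1)/(eps + 2) is proportional to density."""
    x = density_ratio * (eps - 1.0) / (eps + 2.0)
    # Invert x = (eps' - 1)/(eps' + 2) for the scaled value eps'
    return (1.0 + 2.0 * x) / (1.0 - x)

def bulk_loss(eps):
    """Bulk loss function Im(-1/eps) for a complex dielectric value."""
    return (-1.0 / eps).imag

rho_quartz, rho_coesite = 2.66, 2.93      # specific gravities from the text
ratio = rho_coesite / rho_quartz
eps_quartz = complex(2.0, 1.0)            # hypothetical value at one energy

print(bulk_loss(eps_quartz))
print(bulk_loss(scale_linear(eps_quartz, ratio)))
print(bulk_loss(scale_clausius_mossotti(eps_quartz, ratio)))
```

Applying either rule at every energy of a measured quartz spectrum would give the predicted coesite spectrum that the abstract compares against experiment.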

