permutation problem
Recently Published Documents


Total documents: 90 (five years: 8)

H-index: 10 (five years: 0)

Author(s):  
Rui Wang ◽  
Dong Liang ◽  
Xiaochun Cao ◽  
Yuanfang Guo

This article studies the correspondence problem for semantically similar images, which is challenging because of joint visual and geometric deformations. We introduce the Flip-aware Distance Ratio method (FDR) to solve this problem from the perspective of geometric structure analysis. First, a distance ratio constraint is introduced to enforce geometric consistency between images with large visual variations, whereas local geometric jitters are tolerated via a smoothness term. For challenging cases with symmetric structures, our proposed method exploits Curl to suppress the mismatches. Subsequently, image correspondence is formulated as a permutation problem, for which we propose a Gradient Guided Simulated Annealing (GGSA) algorithm to perform a robust discrete optimization. Experiments on simulated and real-world datasets, where both visual and geometric deformations are present, indicate that our method significantly improves on the baselines for both visually and semantically similar images.
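To make the idea of optimizing over permutations concrete, here is a minimal sketch of plain simulated annealing with swap moves. It is not the GGSA algorithm from the abstract (the gradient guidance is not described there); the function name `anneal_permutation`, the geometric cooling schedule, and all parameter defaults are assumptions chosen for illustration.

```python
import math
import random


def anneal_permutation(cost, n, steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Minimise cost(perm) over permutations of range(n), n >= 2, with swap moves.

    Generic simulated annealing; NOT the gradient-guided variant (GGSA).
    """
    rng = random.Random(seed)
    perm = list(range(n))
    current = cost(perm)
    best, best_cost = perm[:], current
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end (assumed schedule).
        t = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]  # propose a swap
        candidate = cost(perm)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if candidate <= current or rng.random() < math.exp((current - candidate) / t):
            current = candidate
            if current < best_cost:
                best, best_cost = perm[:], current
        else:
            perm[i], perm[j] = perm[j], perm[i]  # reject: undo the swap
    return best, best_cost
```

A caller supplies any cost defined on a permutation, for example a matching cost between two sets of keypoints. The returned permutation is the best one visited, which is not guaranteed to be the global optimum.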


Author(s):  
Daichi Kitamura ◽  
Kohei Yatabe

Abstract Independent low-rank matrix analysis (ILRMA) is the state-of-the-art algorithm for blind source separation (BSS) in the determined situation (the number of microphones is greater than or equal to that of source signals). ILRMA achieves a great separation performance by modeling the power spectrograms of the source signals via the nonnegative matrix factorization (NMF). Such a highly developed source model can solve the permutation problem of the frequency-domain BSS to a large extent, which is the reason for the excellence of ILRMA. In this paper, we further improve the separation performance of ILRMA by additionally considering the general structure of spectrograms, which is called consistency, and hence, we call the proposed method Consistent ILRMA. Since a spectrogram is calculated by an overlapping window (and a window function induces spectral smearing called main- and side-lobes), the time-frequency bins depend on each other. In other words, the time-frequency components are related to each other via the uncertainty principle. Such co-occurrence among the spectral components can function as an assistant for solving the permutation problem, which has been demonstrated by a recent study. On the basis of these facts, we propose an algorithm for realizing Consistent ILRMA by slightly modifying the original algorithm. Its performance was extensively evaluated through experiments performed with various window lengths and shift lengths. The results indicated several tendencies of the original and proposed ILRMA that include some topics not fully discussed in the literature. For example, the proposed Consistent ILRMA tends to outperform the original ILRMA when the window length is sufficiently long compared to the reverberation time of the mixing system.


2020 ◽  
Author(s):  
Daichi Kitamura ◽  
Kohei Yatabe

Abstract Independent low-rank matrix analysis (ILRMA) is the state-of-the-art algorithm for blind source separation (BSS) in the determined situation (the number of microphones is greater than or equal to that of source signals). ILRMA achieves a great separation performance by modeling the power spectrograms of the source signals via the nonnegative matrix factorization (NMF). Such a highly developed source model can effectively solve the permutation problem of the frequency-domain BSS, which should be the reason for the excellence of ILRMA. In this paper, we further improve the separation performance of ILRMA by additionally considering the general structure of spectrograms, called consistency, and hence we call the proposed method Consistent ILRMA. Since a spectrogram is calculated by an overlapping window (and a window function induces spectral smearing called main- and side-lobes), the time-frequency bins depend on each other. In other words, the time-frequency components are related to each other via the uncertainty principle. Such co-occurrence among the spectral components can assist in solving the permutation problem, which has been demonstrated by a recent study. Based on these facts, we propose an algorithm for realizing Consistent ILRMA by slightly modifying the original algorithm. Its performance was extensively studied through experiments performed with various window lengths and shift lengths. The results indicated several tendencies of the original and proposed ILRMA, including some topics that have not been discussed well in the literature. For example, the proposed Consistent ILRMA tends to outperform the original ILRMA when the window length is sufficiently long compared to the reverberation time of the mixing system.


Author(s):  
Monjed H. Samuh ◽  
Ridwan A. Sanusi

In this paper, the permutation test for comparing two independent samples is investigated in the context of extreme ranked set sampling (ERSS). Three test statistics are proposed. The statistical power of these new test statistics is evaluated numerically. The results are compared with the statistical power of the classical independent two-sample $t$-test, the Mann-Whitney $U$ test, and the usual two-sample permutation test under simple random sampling (SRS). In addition, the method of computing a confidence interval for the two-sample permutation problem under ERSS is explained. The performance of this method is compared with the intervals obtained by the SRS and Mann-Whitney procedures in terms of empirical coverage probability and expected length. The comparison shows that the proposed statistics outperform their counterparts. Finally, the application of the proposed statistics is illustrated using a real-life example.
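For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of the usual two-sample permutation test under simple random sampling. It does not implement the ERSS statistics proposed in the paper, which the abstract does not specify; the function name `permutation_test` and the choice of the absolute difference of means as the test statistic are assumptions.

```python
from itertools import combinations
from statistics import mean


def permutation_test(x, y):
    """Exact two-sided permutation p-value for the difference of sample means.

    Enumerates every way of splitting the pooled data into groups of the
    original sizes, so it is only practical for small samples.
    """
    pooled = list(x) + list(y)
    indices = range(len(pooled))
    observed = abs(mean(x) - mean(y))
    extreme = 0
    total = 0
    for chosen in combinations(indices, len(x)):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance so ties with the observed statistic are counted.
        if abs(mean(group_x) - mean(group_y)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total
```

With samples `[1, 2, 3]` and `[4, 5, 6]` there are 20 possible splits, and only the observed split and its mirror image are at least as extreme, so the p-value is 2/20 = 0.1. For larger samples, the full enumeration is usually replaced by random resampling of the group labels.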


2020 ◽  
Vol 10 (2) ◽  
pp. 617
Author(s):  
Jo ◽  
Moon

In this paper, a Collision Grid Map (CGM) built from 3D point cloud data is proposed to predict collisions between cattle and the end effector of a manipulator in a barn environment. The Collision Grid Map, generated from the x-y plane and depth (z) data of the 3D point cloud, is fed to a Convolutional Neural Network to predict a collision situation. Point cloud data poses a permutation-invariance problem: when 3D point cloud data is applied directly to a Convolutional Neural Network, the same points presented in different orders are not learned efficiently. The Collision Grid Map is generated from the point cloud data using a probability-based method and consists of two channels. The first channel is constructed from location data in the x-y plane. The second channel is composed of depth data in the z-direction. 3D point clouds were measured in a barn environment and used to create Collision Grid Maps. The generated Collision Grid Map is then applied to the Convolutional Neural Network to predict collisions with cattle. The experimental results show that the proposed scheme is reliable and robust in a barn environment.
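The sketch below shows one way a two-channel grid can be built from an unordered point cloud so that the result does not depend on point order. The abstract does not give the exact probability method, so the choices here are assumptions: channel one holds the fraction of points falling in each x-y cell, channel two holds the mean z of the points in that cell, and the name `collision_grid_map` and its parameters are hypothetical.

```python
def collision_grid_map(points, size=4, extent=1.0):
    """Build a 2-channel size x size grid from (x, y, z) points.

    Only points with 0 <= x, y < extent are used. Returns (occupancy, depth):
    occupancy[row][col] is the fraction of kept points in the cell (assumed
    stand-in for the paper's probability method), depth[row][col] is their mean z.
    """
    counts = [[0] * size for _ in range(size)]
    depth_sum = [[0.0] * size for _ in range(size)]
    kept = 0
    for x, y, z in points:
        if not (0 <= x < extent and 0 <= y < extent):
            continue  # ignore points outside the mapped area
        col = min(size - 1, int(x / extent * size))
        row = min(size - 1, int(y / extent * size))
        counts[row][col] += 1
        depth_sum[row][col] += z
        kept += 1
    occupancy = [[c / kept if kept else 0.0 for c in row] for row in counts]
    depth = [
        [depth_sum[r][c] / counts[r][c] if counts[r][c] else 0.0 for c in range(size)]
        for r in range(size)
    ]
    return occupancy, depth
```

Because each cell only accumulates counts and sums, reordering the input points yields the same grid up to floating-point rounding of the sums, which is what lets a CNN consume it as a fixed-layout image.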


Despite the recent success of deep learning in many speech processing tasks, single-microphone, speaker-independent speech separation remains difficult for two main reasons. The first is the arbitrary order of the target and masker speakers in the mixture (the permutation problem), and the second is the unknown number of speakers in the mixture (the output problem). We propose a novel deep learning framework for speech separation that addresses both issues. We use a neural network to project the time-frequency representation of the mixed signal into a high-dimensional embedding space. The time-frequency embeddings of each speaker are then made to cluster around a corresponding attractor point, which is used to determine the time-frequency assignment of that speaker, so that a speaker can be identified within a mixture of speakers. The objective function for the network is the standard signal reconstruction error, which allows end-to-end operation during both the training and test phases. We evaluated our system on two- and three-speaker mixtures and report comparable or better performance than other state-of-the-art deep learning approaches to speech separation.
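The permutation problem named above can be illustrated with a small sketch: when a separator emits its outputs in arbitrary order, the error must be measured against the best matching of outputs to reference speakers. The code below computes such a permutation-invariant mean squared error by brute force. This is an illustration of the problem, not the attractor-based method of the abstract, and the names `mse` and `permutation_invariant_loss` are assumptions.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def permutation_invariant_loss(estimates, references):
    """Return (loss, perm) minimising the average MSE over all output orderings.

    perm[i] is the index of the reference matched to estimates[i]. The search
    is factorial in the number of speakers, so it suits only small counts.
    """
    best_loss, best_perm = None, None
    for perm in permutations(range(len(references))):
        loss = sum(mse(estimates[i], references[j]) for i, j in enumerate(perm))
        loss /= len(perm)
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

For two references `[1.0, 0.0]` and `[0.0, 1.0]` and estimates that are the same signals in swapped order, the fixed-order error is 1.0 while the permutation-invariant loss is 0.0 with matching `(1, 0)`.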


Author(s):  
Sabine Storandt ◽  
Stefan Funke

In this paper, we study a problem from the realm of multicriteria decision making in which the goal is to select from a given set S of d-dimensional objects a minimum-sized subset S0 with bounded regret. Here, regret measures the unhappiness of users who would like to select their favorite object from the set S but can now only select their favorite object from the subset S0. Previous work focused on bounding the maximum regret, which is determined by the most unhappy user. We propose to consider the average regret instead, which is determined by the sum of the (un)happiness of all possible users. We show that this regret measure comes with desirable properties, such as supermodularity, which allows the construction of approximation algorithms. Furthermore, we introduce the regret minimizing permutation problem and discuss extensions of our algorithms to the recently proposed k-regret measure. Our theoretical results are accompanied by experiments on a variety of inputs with d up to 7.
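The difference between maximum and average regret can be sketched on a finite sample of users. The abstract does not define regret formally, so the code assumes a common convention: each user has a nonnegative linear utility, and a user's regret is the relative loss in best achievable utility when restricted to the subset. The names `best_utility` and `regrets` are hypothetical, and the average over a finite user sample stands in for the average over all possible users.

```python
def best_utility(weights, objects):
    """Highest linear utility a user with the given weights can get from objects."""
    return max(sum(w * v for w, v in zip(weights, obj)) for obj in objects)


def regrets(users, full_set, subset):
    """Relative regret of each user when restricted from full_set to subset.

    Assumes every user can reach strictly positive utility in full_set.
    """
    return [
        1.0 - best_utility(u, subset) / best_utility(u, full_set)
        for u in users
    ]
```

Taking `max(...)` of the returned list gives the maximum regret, driven by the single most unhappy user, while `sum(...) / len(...)` gives the sample average that the abstract proposes to bound instead.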


2018 ◽  
Vol 268 (2) ◽  
pp. 463-472
Author(s):  
Pascale Bendotti ◽  
Pierre Fouilhoux ◽  
Safia Kedad-Sidhoum

Author(s):  
Jing Shi ◽  
Jiaming Xu ◽  
Guangcan Liu ◽  
Bo Xu

Recent deep learning methods have made significant progress in multi-talker mixed speech separation. However, most existing models adopt a driftless strategy to separate all the speech channels rather than selectively attending to the target one. As a result, those frameworks may fail to offer a satisfactory solution in complex auditory scenes, where the number of input sounds is usually uncertain and even dynamic. In this paper, we present a novel neural-network-based structure motivated by the top-down attention behavior of humans when facing complicated acoustical scenes. Unlike previous work, our method constructs an inference-attention structure to predict candidates of interest and extract the speech channel of each of them. Our work removes the limitation that the number of channels must be given, as well as the high computational complexity of the label permutation problem. We evaluated our model on the WSJ0 mixed-speech tasks. In all the experiments, our model is highly competitive with the baselines and even outperforms them.

