frequency warping
Recently Published Documents


TOTAL DOCUMENTS

131
(FIVE YEARS 10)

H-INDEX

16
(FIVE YEARS 1)

Author(s):  
Christian Kexel ◽  
Jochen Moll

Active piezoelectric transducers have been successfully deployed in recent years for structural health monitoring (SHM) using guided elastic waves or the electro-mechanical impedance (EMI). In both domains, damage detection can be hampered by operational/environmental conditions and low-power constraints, and processing can be divided into approaches that (i) take baselines of the pristine structure as a reference, (ii) ingest an extensive measurement history for clustering to detect anomalies, or (iii) incorporate additional information to label a state. The latter approach requires data from complementary sensors, learning from laboratory/field experiments, or knowledge from simulations, which may be infeasible for complex structures. Semi-supervised approaches are therefore gaining popularity: only a few initial annotations are needed, because labels emerge through clustering and are subsequently used for state classification. In our work, bending and combined bending/torsion studies on rudder stocks are considered for EMI-based damage detection in the presence of load. We discuss the underpinnings of our processing, then follow strategy (i) by introducing frequency warping to derive an improved damage indicator (DI). Finally, in a semi-supervised manner, we develop simple rules which, even in the presence of varying loads, need only two frequency points for reliable damage detection. This sparsity-enforcing, low-complexity approach is particularly beneficial in energy-aware SHM scenarios.
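To make the strategy-(i) idea concrete, below is a minimal sketch of frequency-warped EMI comparison: the measured impedance spectrum is linearly stretched onto the pristine baseline before an RMSD-style damage indicator is computed, so that load-induced frequency shifts are compensated rather than mistaken for damage. The linear warp model, the grid search over warp factors, and the normalized-RMSD form are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def warped_damage_indicator(freq, baseline, measurement,
                            alphas=np.linspace(0.98, 1.02, 81)):
    """Illustrative EMI damage indicator with linear frequency warping.

    A load-induced shift is modeled as a linear stretch f -> alpha * f of the
    measurement's frequency axis. The alpha minimizing the residual against
    the pristine baseline is taken as the load compensation; the remaining
    normalized RMSD serves as the damage indicator (DI).
    """
    best_di = np.inf
    for alpha in alphas:
        # Resample the measured spectrum onto the warped frequency grid.
        warped = np.interp(freq, alpha * freq, measurement)
        # Normalized RMSD between warped measurement and baseline.
        di = np.sqrt(np.mean((warped - baseline) ** 2) / np.mean(baseline ** 2))
        best_di = min(best_di, di)
    return best_di

# Toy usage: synthetic "impedance" spectrum on a 10-100 kHz grid.
f = np.linspace(10e3, 100e3, 1024)
base = np.sin(2 * np.pi * f / 7e3) + 0.1 * np.cos(2 * np.pi * f / 23e3)
loaded = np.interp(f, f / 1.02, base)                # 2% stretch, undamaged
print(warped_damage_indicator(f, base, loaded))      # small DI after warping
print(warped_damage_indicator(f, base, base + 0.3))  # additive change: large DI
```

In this toy example, a pure 2% frequency stretch (a load effect) yields a near-zero DI once warped, whereas an additive spectral change survives the warp and keeps the DI large.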


Author(s):  
Masoud Geravanchizadeh ◽  
Elnaz Forouhandeh ◽  
Meysam Bashirpour

The performance of speech recognition systems trained on neutral utterances degrades significantly when these systems are tested on emotional speech. Since anyone may speak emotionally in a real-world environment, the emotional state of speech must be taken into account in the performance of an automatic speech recognition (ASR) system. Little work has been done in the field of emotion-affected speech recognition, and most research so far has focused on the classification of speech emotions. In this paper, the vocal tract length normalization (VTLN) method is employed to enhance the robustness of the emotion-affected speech recognition system. For this purpose, two structures of the speech recognition system are used, based on hybrids of a hidden Markov model (HMM) with either a Gaussian mixture model (GMM) or a deep neural network (DNN). To achieve this goal, frequency warping is applied to the filterbank and/or discrete-cosine-transform domain(s) in the feature-extraction stage of the ASR system. The warping is conducted so as to normalize the emotional feature components and bring them close to their corresponding neutral feature components. The performance of the proposed system is evaluated under neutrally trained/emotionally tested conditions for different speech features and emotional states (i.e., anger, disgust, fear, happiness, and sadness), with frequency warping employed for the different acoustic features. The emotion-affected speech recognition system is built on the Kaldi ASR toolkit, with the Persian emotional speech database and the crowd-sourced emotional multimodal actors dataset as input corpora. The experimental simulations reveal that, in general, the warped emotional features yield better performance of the emotion-affected speech recognition system than their unwarped counterparts. Moreover, the DNN-HMM system outperforms the GMM-HMM hybrid.
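As an illustration of the filterbank-domain warping, the sketch below applies the common piecewise-linear VTLN warp to mel filter center frequencies; a warp factor alpha below 1 compresses the frequency axis, moving emotional feature components toward their neutral counterparts. The warp convention, cutoff ratio, and filterbank parameters are generic VTLN defaults assumed here, not necessarily the paper's exact configuration.

```python
import numpy as np

def piecewise_linear_warp(f, alpha, f_max, f_cut_ratio=0.85):
    """Common piecewise-linear VTLN warp: scale by alpha below a cutoff
    frequency, then interpolate linearly so that f_max maps onto itself."""
    f = np.asarray(f, dtype=float)
    f_cut = f_cut_ratio * f_max
    return np.where(
        f <= f_cut,
        alpha * f,
        alpha * f_cut + (f_max - alpha * f_cut) * (f - f_cut) / (f_max - f_cut),
    )

def warped_mel_centers(n_filters=26, f_max=8000.0, alpha=1.0):
    """Mel-spaced filter center frequencies with a VTLN warp applied."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mels = np.linspace(0.0, hz_to_mel(f_max), n_filters + 2)
    centers = mel_to_hz(mels[1:-1])
    return piecewise_linear_warp(centers, alpha, f_max)

# alpha < 1 compresses the spectrum; alpha > 1 expands it.
print(warped_mel_centers(alpha=0.92)[:5])
print(warped_mel_centers(alpha=1.00)[:5])
```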


2019 ◽  
Vol 146 (4) ◽  
pp. 2879-2879
Author(s):  
Ching-Hua Lee ◽  
Kuan-Lin Chen ◽  
Fred Harris ◽  
Bhaskar D. Rao ◽  
Harinath Garudadri

2019 ◽  
Author(s):  
Ching-Hua Lee ◽  
Kuan-Lin Chen ◽  
Fred Harris ◽  
Bhaskar D. Rao ◽  
Harinath Garudadri

2019 ◽  
Vol 28 (5) ◽  
pp. 054302
Author(s):  
Yu-Bo Qi ◽  
Shi-Hong Zhou ◽  
Meng-Xiao Yu ◽  
Shu-Yuan Du ◽  
Mei Sun ◽  
...  

Author(s):  
Shuhua Gao ◽  
Xiaoling Wu ◽  
Cheng Xiang ◽  
Dongyan Huang

Voice conversion aims to change a source speaker's voice so that it sounds like that of a target speaker while preserving the linguistic information. Despite the rapid advance of voice conversion algorithms in the last decade, most remain too complicated to be accessible to the public. With the popularity of mobile devices, especially smartphones, mobile voice conversion applications are highly desirable, so that everyone can enjoy the pleasure of high-quality voice mimicry and people with speech disorders can also potentially benefit. Given the limited computing resources on mobile phones, the major concern is the time efficiency of such an application, so as to guarantee a positive user experience. In this paper, we detail the development of a mobile voice conversion system based on the Gaussian mixture model (GMM) and weighted frequency warping. We boost computational efficiency by exploiting the hardware characteristics of today's mobile phones, such as parallel computing on multiple cores and advanced vectorization support. Experimental evaluation indicates that our system achieves acceptable voice conversion performance, while converting a five-second sentence takes only slightly more than one second on an iPhone 7.
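A minimal per-frame sketch of weighted frequency warping is given below: each GMM mixture contributes its own warping curve, the curves are combined using the frame's posterior probabilities, and the source spectrum is resampled along the combined warp. The helper name, the toy linear warp curves, and the fixed posteriors are hypothetical; a real system would derive the per-mixture warps from aligned source/target spectra.

```python
import numpy as np

def apply_weighted_warp(freq, spectrum, warp_funcs, posteriors):
    """Weighted frequency warping for one frame (illustrative).

    Each GMM mixture m carries its own warping function W_m; the frame-level
    warp is the posterior-weighted combination W(f) = sum_m p_m * W_m(f),
    and the source magnitude spectrum is resampled along the warped axis.
    """
    # Posterior-weighted combination of the per-mixture warp curves.
    warped_axis = sum(p * w(freq) for p, w in zip(posteriors, warp_funcs))
    # Resample the source spectrum onto the combined warp.
    return np.interp(freq, warped_axis, spectrum)

# Toy usage with two hypothetical mixtures: one slight compression and
# one slight expansion of the frequency axis.
f = np.linspace(0.0, 8000.0, 512)
spec = np.exp(-((f - 1000.0) / 300.0) ** 2)         # single "formant" bump
warps = [lambda x: 0.95 * x, lambda x: 1.08 * x]    # per-mixture warp curves
post = np.array([0.7, 0.3])                         # GMM posteriors, sum to 1
converted = apply_weighted_warp(f, spec, warps, post)
print(f[np.argmax(spec)], f[np.argmax(converted)])  # formant peak moves
```

Because the per-frame work is a weighted sum and a single interpolation, this step vectorizes naturally, which is consistent with the paper's emphasis on multi-core and SIMD-friendly processing on mobile hardware.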

