Sound Recognition
Recently Published Documents

Total documents: 342 (111 in the last five years)
H-index: 24 (4 in the last five years)

2022, Vol 185, pp. 108389
Author(s): Liping Xie, Chihua Lu, Zhien Liu, Lirong Yan, Tao Xu

2021, Vol 11 (24), pp. 11663
Author(s): Eugenio Brusa, Cristiana Delprete, Luigi Gianpio Di Maggio

Today’s deep learning strategies require ever-increasing computational effort and very large amounts of labelled data. Providing such expensive resources for machine diagnosis is highly challenging. Transfer learning recently emerged as a valuable approach to address these issues: the knowledge learned by deep architectures in different scenarios can be reused for machine diagnosis, minimizing data collection effort. Existing research provides evidence that networks pre-trained for image recognition can classify machine vibrations in the time-frequency domain by means of transfer learning. So far, however, there has been little discussion of the potential of networks pre-trained for sound recognition, which are inherently suited to time-frequency tasks. This work argues that deep architectures trained for music recognition and sound detection can perform machine diagnosis. The YAMNet convolutional network was designed for highly efficient mobile sound detection applications and was originally trained on millions of audio clips extracted from YouTube. That framework is employed here to detect bearing faults on the CWRU dataset. It is shown that transferring knowledge from sound and music recognition to bearing fault detection is successful, and that maximum accuracy is achieved with only a few hundred samples for fine-tuning the fault diagnosis model.
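As a rough illustration of this kind of transfer, the sketch below extracts YAMNet embeddings from vibration signals and trains a small classifier head on top of them. This is a minimal sketch, not the authors' pipeline: the 16 kHz resampling, the four fault classes, the placeholder data, and the single dense layer are all assumptions for illustration.

```python
# Minimal sketch: transfer learning from YAMNet to bearing-fault classification.
# Assumptions (not from the paper): signals resampled to 16 kHz, 4 fault classes,
# and a single dense layer trained on the frozen 1024-d YAMNet embeddings.
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

yamnet = hub.load("https://tfhub.dev/google/yamnet/1")  # pre-trained on AudioSet

def embed(waveform_16k: np.ndarray) -> np.ndarray:
    """Average YAMNet's per-frame 1024-d embeddings over one signal."""
    _, embeddings, _ = yamnet(tf.constant(waveform_16k, dtype=tf.float32))
    return embeddings.numpy().mean(axis=0)

# X_raw: 1-D float arrays (vibration snippets at 16 kHz); y: integer fault labels.
X_raw = [np.random.randn(16000).astype(np.float32) for _ in range(8)]  # placeholder data
y = np.random.randint(0, 4, size=len(X_raw))                           # placeholder labels

X = np.stack([embed(w) for w in X_raw])

head = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1024,)),
    tf.keras.layers.Dense(4, activation="softmax"),  # one unit per fault class
])
head.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
             metrics=["accuracy"])
head.fit(X, y, epochs=10, batch_size=4)
```

Freezing the pre-trained network and training only a small head is what keeps the labelled-data requirement down to a few hundred samples.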


2021, Vol 72 (2), pp. 510-519
Author(s): Richard Holaj, Petr Pořízka

Abstract In this paper, we provide a brief overview of the current state of pronunciation teaching in e-learning and demonstrate a new approach to building tools for automatic pronunciation feedback, based on the most frequent or typical errors in speech production made by non-native speakers. We illustrate this through the process of designing an annotation scheme for a sound recognition tool that provides feedback on pronunciation. At the end of the paper, we also present how we applied this annotation to the tool, the caveats we found, and our plans for future work.


2021, Vol 2078 (1), pp. 012066
Author(s): Rui Cai, Qian Wang, Yucheng Hou, Haorui Liu

Abstract This paper investigates the operation inspection and anomaly diagnosis of transformers in substations, and presents an application study of artificial-intelligence-based sound recognition for transformer discharge diagnosis, aiming to improve the timeliness and diagnostic capability of intelligent monitoring of substation equipment. In this study, sound parameterization techniques from the field of sound recognition are used to implement automatic detection of discharge sounds. The sound samples are pre-processed, Mel-frequency cepstral coefficients (MFCCs) are extracted as features, and the features are used to train Gaussian mixture models (GMMs). Finally, the trained GMMs are used to detect discharge sounds at transformer sites in substations. The test results demonstrate that audio anomaly detection based on MFCCs and GMMs can effectively recognize anomalous discharge of transformers.
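A minimal sketch of the MFCC-plus-GMM detection scheme described above is given below, using librosa for feature extraction and scikit-learn for the mixture models. The two-model (discharge vs. background) likelihood comparison, the 13 MFCCs, the 8 mixture components, and the file names are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of MFCC + GMM discharge-sound detection (illustrative parameters).
import librosa
import numpy as np
from sklearn.mixture import GaussianMixture

def mfcc_frames(path: str, sr: int = 16000, n_mfcc: int = 13) -> np.ndarray:
    """Load audio and return per-frame MFCC feature vectors (frames x n_mfcc)."""
    y, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

# Train one GMM per class on pooled training frames (hypothetical file lists).
discharge_feats = np.vstack([mfcc_frames(p) for p in ["discharge_01.wav"]])
background_feats = np.vstack([mfcc_frames(p) for p in ["background_01.wav"]])

gmm_discharge = GaussianMixture(n_components=8, covariance_type="diag").fit(discharge_feats)
gmm_background = GaussianMixture(n_components=8, covariance_type="diag").fit(background_feats)

def is_discharge(path: str) -> bool:
    """Classify a recording by comparing average log-likelihood under each GMM."""
    feats = mfcc_frames(path)
    return gmm_discharge.score(feats) > gmm_background.score(feats)
```

The GMM trained on discharge sounds assigns higher likelihood to MFCC frames resembling its training data, so a simple likelihood comparison suffices for detection.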


2021
Author(s): Mei Zhang, Lina Yan, Guilan Luo, Gang Li, Wenzhi Liu, ...

2021
Author(s): Gwangsu Kim, Dong-Kyum Kim, Hawoong Jeong

Music exists in almost every society, has universal acoustic features, and is processed by distinct neural circuits in humans, even in those with no musical training. These characteristics suggest that the sense of music is innate to our brain, but it is unclear how this innateness emerges and what functions it serves. Here, using an artificial deep neural network that models the auditory information processing of the brain, we show that units tuned to music can emerge spontaneously from learning natural sound detection, even without learning music. By simulating the responses of network units to 35,487 natural sounds in 527 categories, we found that various subclasses of music are strongly clustered in the embedding space, and that this clustering arises from the music-selective responses of the network units. The music-selective units encoded the temporal structure of music on multiple timescales, following the population-level response characteristics observed in the brain. Generalization was critical for the emergence of music selectivity: such properties were absent when the labels of the training data were randomized to prevent generalization. We confirmed that music selectivity can serve as a functional basis for the generalization of natural sound detection, thereby elucidating its origin. These findings suggest that our sense of music can be innate, universally shaped by evolutionary adaptation to the processing of natural sound.
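The notion of a "music-selective unit" can be made concrete with a simple selectivity index. The sketch below is purely illustrative and not the authors' analysis code: it scores each unit by the separation (a d'-style statistic) between its mean response to music and to non-music sounds, using placeholder response data.

```python
# Illustrative music-selectivity index for network units (not the paper's code).
# responses: (n_sounds, n_units) unit activations; is_music: boolean mask per sound.
import numpy as np

def music_selectivity(responses: np.ndarray, is_music: np.ndarray) -> np.ndarray:
    """d'-style index per unit: (mean_music - mean_other) / pooled std."""
    music, other = responses[is_music], responses[~is_music]
    pooled = np.sqrt(0.5 * (music.var(axis=0) + other.var(axis=0))) + 1e-9
    return (music.mean(axis=0) - other.mean(axis=0)) / pooled

# Placeholder data: 1000 sounds, 256 units, ~20% of sounds labelled as music.
rng = np.random.default_rng(0)
responses = rng.standard_normal((1000, 256))
is_music = rng.random(1000) < 0.2

d_prime = music_selectivity(responses, is_music)
print("top 5 music-selective units:", np.argsort(d_prime)[-5:])
```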


Electronics, 2021, Vol 10 (21), pp. 2622
Author(s): Jurgen Vandendriessche, Nick Wouters, Bruno da Silva, Mimoun Lamrini, Mohamed Yassin Chkouri, ...

In recent years, Environmental Sound Recognition (ESR) has become a relevant capability for urban monitoring applications. Techniques for automated sound recognition often rely on machine learning approaches, which have grown in complexity in order to achieve higher accuracy. Nonetheless, such machine learning techniques often have to be deployed on resource- and power-constrained embedded devices, which has become a challenge with the adoption of deep learning approaches based on Convolutional Neural Networks (CNNs). Field-Programmable Gate Arrays (FPGAs) are power-efficient and highly suitable for computationally intensive algorithms such as CNNs; by fully exploiting their parallel nature, they have the potential to accelerate inference compared to other embedded devices. Similarly, dedicated architectures for accelerating Artificial Intelligence (AI), such as Tensor Processing Units (TPUs), promise to deliver high accuracy while achieving high performance. In this work, we evaluate existing tool flows for deploying CNN models on FPGA as well as TPU platforms. We propose and adapt several CNN-based sound classifiers to be embedded on such hardware accelerators. The results demonstrate the maturity of the existing tools and show how FPGAs can be exploited to outperform TPUs.
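Deployment on a TPU-style edge accelerator typically starts from a small CNN that is quantized to 8-bit integers. The sketch below shows that step with TensorFlow Lite's post-training quantization; the tiny network, the 64x64 log-mel input shape, and the random representative dataset are illustrative assumptions, not the classifiers evaluated in the paper.

```python
# Sketch: quantize a small CNN sound classifier for an edge accelerator
# (illustrative model; the paper's actual classifiers are not reproduced here).
import numpy as np
import tensorflow as tf

# Tiny CNN over log-mel spectrogram patches (assumed 64x64x1 inputs, 10 classes).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

def representative_data():
    # A handful of sample inputs drives the int8 quantization calibration.
    for _ in range(100):
        yield [np.random.rand(1, 64, 64, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("esr_int8.tflite", "wb") as f:
    f.write(converter.convert())
```

Full-integer quantization is what makes the model eligible for fixed-point accelerators; FPGA tool flows impose analogous quantization constraints through their own compilers.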


2021, Vol 11 (18), pp. 8394
Author(s): Lancelot Lhoest, Mimoun Lamrini, Jurgen Vandendriessche, Nick Wouters, Bruno da Silva, ...
...  

Environmental Sound Recognition has become a relevant application for smart cities. Such an application, however, demands trained machine learning classifiers to categorize a limited set of audio categories. Although classical machine learning solutions have been proposed in the past, most of the latest solutions for automated and accurate sound classification are based on deep learning. Deep learning models tend to be large, which is problematic given that sound classifiers often have to be embedded in resource-constrained devices. In this paper, a classical machine-learning-based classifier called MosAIc and a lighter Convolutional Neural Network model for environmental sound recognition are proposed to compete directly, in terms of accuracy, with the latest deep learning solutions. Both approaches are evaluated on an embedded system in order to identify the key parameters when placing such applications on constrained devices. The experimental results show that classical machine learning classifiers can be combined to achieve results similar to deep learning models, and even to outperform them in accuracy; the cost, however, is a longer classification time.
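The idea of combining classical classifiers can be sketched with a simple soft-voting ensemble over per-clip summary features. This is an illustrative stand-in for the combination idea, not the MosAIc implementation; the feature dimensionality and the three base classifiers are assumptions.

```python
# Illustrative ensemble of classical classifiers for sound recognition
# (a stand-in for the combination idea, not the MosAIc implementation).
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# X: per-clip feature vectors (e.g., MFCC means/stds); y: class labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 26))   # placeholder features
y = rng.integers(0, 5, size=200)     # placeholder labels (5 classes)

ensemble = VotingClassifier(
    estimators=[
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True))),
        ("rf", RandomForestClassifier(n_estimators=100)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    voting="soft",  # average predicted probabilities across classifiers
)
ensemble.fit(X, y)
print("train accuracy:", ensemble.score(X, y))
```

Running every base classifier at inference time is also where the reported trade-off shows up: the ensemble can match or beat a single deep model's accuracy, but each prediction costs the sum of the members' classification times.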


Cognition, 2021, Vol 214, pp. 104627
Author(s): James Traer, Sam V. Norman-Haignere, Josh H. McDermott
