Cough Recognition Based on Mel-Spectrogram and Convolutional Neural Network

2021 ◽  
Vol 8 ◽  
Author(s):  
Quan Zhou ◽  
Jianhua Shan ◽  
Wenlong Ding ◽  
Chengyin Wang ◽  
Shi Yuan ◽  
...  

Daily life involves a variety of complex sound sources, and in some situations it is important to detect certain sounds effectively. With the outbreak of COVID-19, it has become necessary to distinguish the sound of coughing in order to help identify suspected patients in the population. In this paper, we propose a method for cough recognition based on a Mel-spectrogram and a Convolutional Neural Network, called the Cough Recognition Network (CRN), which can effectively distinguish cough sounds.
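The abstract above does not give the Mel-spectrogram parameters, so the following is only a minimal sketch of the frequency warping that a Mel-spectrogram relies on. It assumes the common HTK-style mel formula; the function names, the filter count of 8, and the 0 to 8000 Hz range are illustrative choices, not values from the paper.

```python
import math
from typing import List


def hz_to_mel(f_hz: float) -> float:
    """Convert frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, f_min: float, f_max: float) -> List[float]:
    """Centre frequencies (Hz) of n_mels filters spaced evenly on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]


# Illustrative parameters only: 8 filters between 0 Hz and 8000 Hz.
centers = mel_center_frequencies(n_mels=8, f_min=0.0, f_max=8000.0)
for i, fc in enumerate(centers):
    print(f"filter {i}: centre at {fc:.1f} Hz")
```

The centres are evenly spaced in mels but increasingly far apart in Hz, which is why a Mel-spectrogram gives finer resolution at low frequencies than at high ones.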

PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0243758
Author(s):  
Hye-kyung Shin ◽  
Sang Hee Park ◽  
Kyoung-woo Kim

In apartment houses, noise between floors can disturb pleasant living environments and cause disputes between neighbors. As a means of resolving disputes caused by inter-floor noise, noise is recorded for 24 hours in a household to verify whether the inter-floor noise exceeds the legal standards. If it does, the recording is listened to in order to check whether the noise comes from neighboring households. When done manually, this process is time-consuming and costly, and it is unclear whether the listener's judgments of the sound source are consistent. This study aims to classify inter-floor noise according to noise source by using a convolutional neural network model. A total of 1,515 sound sources recorded for 24 h from three households were annotated, and 40 4-s audio clips of six noise sources, including “Footsteps,” “Dragging furniture,” “Hammering,” “Instant impact (dropping a heavy item),” “Vacuum cleaner,” and “Public announcement system,” were identified. Moreover, datasets of 16 classes using ESC50’s urban sound category audio were used to distinguish the inter-floor noise heard indoors from external noise. Although DenseNet, ResNet, Inception, and EfficientNet are models designed for the image domain, they showed an accuracy of 91.43–95.27% when classifying the inter-floor noise dataset. Among the reviewed models, ResNet showed an accuracy of 95.27±2.30% as well as the highest performance in the F1 score, precision, and recall metrics. Additionally, ResNet showed the shortest inference time. This paper concludes by suggesting that the present findings can be extended in future research to monitor acoustic elements of indoor soundscapes.
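The abstract above reports accuracy, precision, recall, and F1 score. As a rough sketch of how those four metrics relate for a multi-class classifier, the example below computes them from true and predicted labels. The six toy predictions are invented purely for illustration and are not data from the study, and the macro averaging shown is an assumption, since the abstract does not say how per-class scores were combined.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of clips whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_scores(
    y_true: Sequence[str], y_pred: Sequence[str], labels: List[str]
) -> Dict[str, Tuple[float, float, float]]:
    """Return {label: (precision, recall, f1)} using one-vs-rest counts."""
    scores = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1)
    return scores


def macro_f1(scores: Dict[str, Tuple[float, float, float]]) -> float:
    """Unweighted mean of the per-class F1 scores (macro average)."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Toy labels, invented for illustration only (class names taken from the abstract).
labels = ["Footsteps", "Hammering", "Vacuum cleaner"]
y_true = ["Footsteps", "Footsteps", "Hammering", "Hammering",
          "Vacuum cleaner", "Vacuum cleaner"]
y_pred = ["Footsteps", "Hammering", "Hammering", "Hammering",
          "Vacuum cleaner", "Footsteps"]

scores = per_class_scores(y_true, y_pred, labels)
print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(scores))
```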


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Makoto Sanada ◽  
Tadashi Matsuo ◽  
Nobutaka Shimada ◽  
Yoshiaki Shirai

Abstract: In this study, a method for a robot to recall multiple grasping methods for a given object is proposed. The aim of this study was for robots to learn grasping methods for new objects by observing the grasping activities of humans in daily life without special instructions. In this setting, only one grasping motion was observed for an object at a time, and it was never known whether other grasping methods were possible for the object, although supervised learning generally requires all possible answers for each training input. The proposed method offers a solution for such learning situations by employing a convolutional neural network with automatic clustering of the observed grasping methods. In the proposed method, the grasping methods are clustered while the grasping position is being learned. The method first recalls grasping positions: the network estimates a multi-channel heatmap in which each channel indicates one grasping position, and the graspability of each estimated position is then checked. Finally, the method recalls the hand shapes based on the estimated grasping position and the object’s shape. This paper describes the results of recalling multiple grasping methods and demonstrates the effectiveness of the proposed method.


Author(s):  
Kavita Saxena

Abstract: The COVID-19 epidemic has affected our daily lives, disrupting world trade and transport. Wearing a face mask has become a new necessity for safety. In the near future, many institutions will ask customers to wear masks in order to use their services. Face mask detection has therefore become necessary to help society. This paper presents a simplified approach to this task using packages such as TensorFlow, Keras, OpenCV and Scikit-Learn. The method detects the face in an image frame and then identifies whether it is wearing a mask. As in a surveillance task, it can also detect a masked face in motion through image processing. The method attains accuracy of up to 93% and 91.2%, respectively, on two datasets. We explore optimized parameter values using the Sequential CNN (Convolutional Neural Network) model to detect the presence of masks correctly. Keywords: Face Mask Detection, Convolutional Neural Network, TensorFlow, Keras, Image Processing


Author(s):  
Wei Li ◽  
Guang Dai ◽  
Yali Wang ◽  
Feifei Long

Acoustic emission (AE) technology is mostly used for corrosion detection in the bottoms of atmospheric vertical storage tanks, but the evaluation results are always affected by complex sound sources. In this paper, a wavelet neural network is used to identify the acoustic emission signals from different types of tanks. The detection signals are denoised using the wavelet transform with threshold denoising; after wavelet packet decomposition, each node’s energy distribution and the feature vectors of the extracted tank-floor corrosion signals are selected as the input. Finally, a compact-type wavelet neural network is chosen to recognize the different AE signals. The results of the magnetic flux leakage test confirm that this method can improve the precision of acoustic emission signal analysis and achieve accurate corrosion evaluation based on AE technology.
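The abstract above uses the energy distribution over wavelet packet nodes as a feature vector. The following is a minimal sketch of that idea, assuming a Haar wavelet and a three-level decomposition; the paper does not state which wavelet or depth was used, and the function names and test signal are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def haar_step(x: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_nodes(signal: Sequence[float], level: int) -> List[List[float]]:
    """Split every node (approximation AND detail) at each level.

    Returns the 2**level terminal nodes in natural (not frequency) order.
    """
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def energy_features(signal: Sequence[float], level: int = 3) -> List[float]:
    """Share of total signal energy held by each terminal node."""
    energies = [sum(v * v for v in node) for node in wavelet_packet_nodes(signal, level)]
    total = sum(energies)
    return [e / total for e in energies] if total > 0 else [0.0] * len(energies)


# Hypothetical test signal: a 5-cycle sine sampled at 64 points.
signal = [math.sin(2.0 * math.pi * 5.0 * n / 64.0) for n in range(64)]
features = energy_features(signal, level=3)
print([round(f, 4) for f in features])
```

Because the Haar step is orthonormal, the node energies sum to the energy of the original signal, so the normalized vector can be read as a distribution over sub-bands and fed to a classifier as a fixed-length input.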


2020 ◽  
Author(s):  
S Kashin ◽  
D Zavyalov ◽  
A Rusakov ◽  
V Khryashchev ◽  
A Lebedev

2020 ◽  
Vol 2020 (10) ◽  
pp. 181-1-181-7
Author(s):  
Takahiro Kudo ◽  
Takanori Fujisawa ◽  
Takuro Yamaguchi ◽  
Masaaki Ikehara

Image deconvolution has been an important issue recently. It has two kinds of approaches: non-blind and blind. Non-blind deconvolution is a classic problem of image deblurring, which assumes that the PSF is known and spatially invariant. Recently, Convolutional Neural Networks (CNNs) have been used for non-blind deconvolution. Though CNNs can deal with complex changes for unknown images, some conventional CNN-based methods can only handle small PSFs and do not consider the large PSFs that occur in the real world. In this paper we propose a non-blind deconvolution framework based on a CNN that can remove large-scale ringing in a deblurred image. Our method has three key points. The first is that our network architecture is able to preserve both large and small features in the image. The second is that the training dataset is created to preserve the details. The third is that we extend the images to minimize the effects of large ringing on the image borders. In our experiments, we used three kinds of large PSFs and were able to observe high-precision results from our method both quantitatively and qualitatively.
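The third key point above, extending the image so that ringing near the borders falls outside the region of interest, can be sketched as a pad, process, crop pattern. The abstract does not say how the images are extended, so the mirror reflection used here is an assumption, as are the function names; the deconvolution step itself is omitted.

```python
from typing import List

Image = List[List[float]]


def reflect_index(i: int, n: int) -> int:
    """Map any integer index onto [0, n) by mirroring about the edges.

    The edge sample itself is not repeated (e.g. -1 maps to 1).
    """
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def extend_image(image: Image, pad: int) -> Image:
    """Return the image extended by `pad` mirrored pixels on every side."""
    h, w = len(image), len(image[0])
    return [
        [image[reflect_index(r, h)][reflect_index(c, w)] for c in range(-pad, w + pad)]
        for r in range(-pad, h + pad)
    ]


def crop_image(image: Image, pad: int) -> Image:
    """Undo extend_image by removing `pad` pixels from every side."""
    return [row[pad:len(row) - pad] for row in image[pad:len(image) - pad]]


# Tiny illustrative image; in practice `pad` would be chosen relative to the PSF size.
image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]
extended = extend_image(image, pad=1)
restored = crop_image(extended, pad=1)
print(extended)
```

In a full pipeline the deblurring would run on `extended`, and `crop_image` would then discard the border band where ringing is strongest.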

