Reduction of Computational Cost Using Two-Stage Deep Neural Network for Training for Denoising and Sound Source Identification

Abstract: Passengers' demands for riding comfort have risen steadily as high-speed railways have developed, so scientific methods for analyzing the interior noise of high-speed trains are needed. The operational transfer path analysis (OTPA) method provides a theoretical basis and guidance for train noise control. It overcomes the shortcomings of the traditional method, offers high test efficiency, and can be carried out while the targeted machine is operating. The OTPA model is established from two perspectives: "path reference point-target point" and "sound source reference point-target point". For the mechanism of the noise transmission path, direct sound propagation is assumed to be negligible, and symmetric sound sources and symmetric paths are merged. Using operational test data and the OTPA method, combined with the results of spherical array sound source identification, the path contributions and sound source contributions to the interior noise are analyzed in terms of both total value and spectrum. The results show that the OTPA agrees with the results of the spherical array sound source identification. At low speed, the floor path and the bogie sources contribute most. When the speed exceeds 300 km/h, the roof path is dominant. Moreover, for a carriage with a pantograph, the lifted pantograph is an obvious source. Noise from the exterior sources of the train is transferred into the interior mainly through structural excitation, and the contribution of air excitation is not significant. The analyses of individual train parts provide guidance for interior noise control.
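As a rough illustration of the "path reference point-target point" idea in the abstract above, the sketch below fits a transmissibility vector from reference signals to a target signal and then splits the target response into per-path contributions. It is a minimal sketch under stated assumptions, not the paper's procedure: the data are invented, real-valued amplitudes stand in for complex spectra, the function names are hypothetical, and plain least squares via normal equations replaces the SVD-based noise suppression that OTPA implementations commonly apply.

```python
# Hypothetical OTPA-style sketch; every name and number here is illustrative.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_transmissibility(refs, target):
    """Least-squares fit of target ~ refs * t using the normal equations."""
    n = len(refs[0])
    ata = [[sum(row[i] * row[j] for row in refs) for j in range(n)]
           for i in range(n)]
    atb = [sum(row[i] * y for row, y in zip(refs, target)) for i in range(n)]
    return solve_linear(ata, atb)


def path_contributions(ref_row, t):
    """Contribution of each path = reference amplitude * transmissibility."""
    return [x * ti for x, ti in zip(ref_row, t)]


# Rows: operating conditions. Columns: assumed floor, roof, and sidewall
# reference amplitudes at a single frequency (made-up values).
refs = [
    [1.0, 0.2, 0.1],
    [0.8, 0.5, 0.2],
    [0.6, 0.9, 0.3],
    [0.5, 1.2, 0.4],
]
true_t = [0.5, 0.3, 0.1]  # assumed "true" transmissibilities for the demo
target = [sum(x * t for x, t in zip(row, true_t)) for row in refs]

t_hat = estimate_transmissibility(refs, target)
contrib = path_contributions(refs[-1], t_hat)
print("estimated transmissibility:", t_hat)
print("path contributions in last condition:", contrib)
```

The per-path contributions sum to the fitted target response for that operating condition, which is what allows paths to be ranked against each other.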


Sound Source Identification Using Coherence- and Intensity-Based Methods

IEEE Transactions on Instrumentation and Measurement ◽

10.1109/tim.2007.908246 ◽

2007 ◽

Vol 56 (6) ◽

pp. 2478-2485 ◽

Cited By ~ 4

Author(s):

Giovanni Moschioni ◽

Bortolino Saggin ◽

Marco Tarabini

Keyword(s):

Sound Source ◽

Source Identification ◽

Sound Source Identification


An optimized two-stage cascaded deep neural network for adrenal segmentation on CT images

Computers in Biology and Medicine ◽

10.1016/j.compbiomed.2021.104749 ◽

2021 ◽

pp. 104749

Author(s):

Guoting Luo ◽

Qing Yang ◽

Tao Chen ◽

Tao Zheng ◽

Wei Xie ◽

...

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

CT Images ◽

Two Stage


A Two-Stage Approach for Automated Prostate Lesion Detection and Classification with Mask R-CNN and Weakly Supervised Deep Neural Network

Artificial Intelligence in Radiation Therapy - Lecture Notes in Computer Science ◽

10.1007/978-3-030-32486-5_6 ◽

2019 ◽

pp. 43-51

Author(s):

Zhiyu Liu ◽

Wenhao Jiang ◽

Kit-Hang Lee ◽

Yat-Long Lo ◽

Yui-Lun Ng ◽

...

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

Lesion Detection ◽

Two Stage ◽

Weakly Supervised


Enhancing direct‐path relative transfer function using deep neural network for robust sound source localization

CAAI Transactions on Intelligence Technology ◽

10.1049/cit2.12024 ◽

2021 ◽

Author(s):

Bing Yang ◽

Runwei Ding ◽

Yutong Ban ◽

Xiaofei Li ◽

Hong Liu

Keyword(s):

Neural Network ◽

Transfer Function ◽

Source Localization ◽

Sound Source ◽

Deep Neural Network ◽

Sound Source Localization ◽

Direct Path ◽

Relative Transfer ◽

Relative Transfer Function


Deep neural network based wake-up-word speech recognition with two-stage detection

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2017.7952659 ◽

2017 ◽

Cited By ~ 3

Author(s):

Fengpei Ge ◽

Yonghong Yan

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Deep Neural Network ◽

Two Stage


Web application to support evidence of individual emotional impact evoked by COVID-19 pandemic restrictions (Preprint)

10.2196/preprints.33021 ◽

2021 ◽

Author(s):

Hugo Mitre-Hernandez ◽

Rodolfo Ferro-Perez ◽

Francisco Gonzalez-Hernandez

Keyword(s):

Neural Network ◽

Mental Health ◽

Neural Networks ◽

Emotion Recognition ◽

Web Application ◽

Deep Neural Network ◽

Data Transfer ◽

Computational Cost ◽

Low Computational Cost ◽

The Web

BACKGROUND Mental health effects during COVID-19 quarantine need to be addressed because patients, relatives, and healthcare workers are living with negative emotional behaviors. The clinical disorders of depression and anxiety evoke anger, fear, sadness, and disgust, and reduce happiness. Tracking emotions with the help of psychologists in online consultations (to reduce the risk of contagion) would therefore go a long way toward supporting mental health. Human micro-expressions can reveal people's genuine emotions and can be captured by Deep Neural Network (DNN) models. The challenge is to deploy such models despite the low-performance computers and slow internet connections available to part of society.

OBJECTIVE This study aimed to create a useful and usable web application that records emotions in a patient's card in real time, with a small data transfer and a Convolutional Neural Network (CNN) model with a low computational cost.

METHODS To validate the low computational cost premise, we first compare the results of DNN architectures, collecting the floating-point operations per second (FLOPS), the Number of Parameters (NP), and the accuracy of MobileNet, PeleeNet, Extended Deep Neural Network (EDNN), Inception-Based Deep Neural Network (IDNN), and our proposed Residual mobile-based Network (ResmoNet) model. Second, we compare the trained models in terms of Main Memory Utilization (MMU) and Response Time to complete the Emotion recognition (RTE). Finally, we design a data transfer that includes the raw emotion data and the patient's basic text information. The web application was evaluated by psychologists and psychiatrists (experts) using the System Usability Scale (SUS) and a utility questionnaire.

RESULTS All CNN models were trained and tested for 150 epochs, and the results for each variable in ResmoNet were compared with those of the best model. ResmoNet has 115,976 fewer NP and 243,901 fewer FLOPS than MobileNet, and 5% lower accuracy than EDNN (95%). Moreover, ResmoNet used less MMU than any other model, and only EDNN outperformed it on RTE, by 0.01 seconds. With our model, we developed a web application that collects emotions in real time during a psychological consultation. For data transfer, the patient's card and raw emotional data occupy approximately 2 kb in UTF-8 encoding. According to the experts, the web application has good usability (73.8 of 100) and utility (3.94 of 5).

CONCLUSIONS A usable and useful web application for psychologists and psychiatrists is presented. This tool includes an efficient and lightweight facial emotion recognition model. Its purpose is to be a complementary tool for diagnostic processes.
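For readers unfamiliar with how a System Usability Scale value such as the 73.8 of 100 reported above is obtained, the following is a minimal sketch of the standard SUS scoring rule for a single respondent. The function name and the sample responses are hypothetical, and the abstract does not say how the study aggregated scores across experts; averaging per-respondent scores is the usual practice but is an assumption here.

```python
def sus_score(responses):
    """Score one SUS questionnaire (10 items, each answered 1 to 5).

    Odd-numbered items contribute (answer - 1), even-numbered items
    contribute (5 - answer); the sum is scaled by 2.5 to the 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("each answer must be between 1 and 5")
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


# Hypothetical respondent: neutral (3) on every item.
neutral = sus_score([3] * 10)
# Hypothetical respondent: best possible answer on every item.
best = sus_score([5, 1] * 5)
print(neutral, best)
```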


Filter-and-sum Beamforming Sound Source Identification Algorithm for Spherical Microphone Arrays Based on Pressure Contribution

Journal of Mechanical Engineering ◽

10.3901/jme.2018.04.238 ◽

2018 ◽

Vol 54 (4) ◽

pp. 238

Author(s):

Zhigang CHU

Keyword(s):

Sound Source ◽

Source Identification ◽

Microphone Arrays ◽

Identification Algorithm ◽

Sound Source Identification ◽

Spherical Microphone Arrays
