Speech Dereverberation Using Deep Learning Algorithm

Author(s):  
Dr. S. Saraswathi ◽  
S. Ramya

This paper focuses on speech dereverberation using a single microphone. Motivated by the recent success of fully convolutional networks (FCN) in many image processing applications, we investigate their applicability to enhancing a speech signal represented as short-time Fourier transform (STFT) images. We present two variants: a "U-Net," which is an encoder-decoder network with skip connections, and a generative adversarial network (GAN) with the U-Net as the generator, which produces a more intuitive cost function for training. To assess our method, we used data from the REVERB challenge and compared our results to those of other methods tested under the same conditions. In most cases, our method outperformed the competing methods.
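To make the "STFT image" representation concrete, here is a minimal sketch of a magnitude STFT in pure Python. The frame length, hop size, and Hann window are illustrative assumptions, not values from the paper, and the function name `stft_magnitude` is hypothetical.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of DFT magnitudes (bins 0..frame_len/2).

    frame_len and hop are assumed values chosen for illustration only.
    """
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames
```

Stacking the returned frames as columns gives the time-frequency image that an image-to-image network such as a U-Net could take as input. A practical pipeline would use an FFT routine rather than this direct DFT.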

2021 ◽  
Author(s):  
Tian Xiang Gao ◽  
Jia Yi Li ◽  
Yuji Watanabe ◽  
Chi Jung Hung ◽  
Akihiro Yamanaka ◽  
...  

Sleep-stage classification is essential for sleep research. Various automatic judgment programs, including deep learning algorithms using artificial intelligence (AI), have been developed, but they have limitations in data format compatibility, human interpretability, cost, and technical requirements. We developed a novel program called GI-SleepNet, generative adversarial network (GAN)-assisted image-based sleep staging for mice that is accurate, versatile, compact, and easy to use. In this program, electroencephalogram and electromyography data are first visualized as images and then classified into three stages (wake, NREM, and REM) by a supervised image learning algorithm. To increase the accuracy, we adopted a GAN and artificially generated fake REM sleep data to equalize the number of samples per stage. This resulted in improved accuracy, and data from as few as one mouse yielded significant accuracy. Because of its image-based nature, it is easy to apply to data of different formats, to different animal species, and even outside of sleep research. Image data can be easily understood by humans, so experts can readily confirm results when prediction anomalies occur. Because deep learning of images is one of the leading fields in AI, numerous algorithms are also available.


2020 ◽  
Vol 27 (2) ◽  
pp. 486-493 ◽  
Author(s):  
Xiaogang Yang ◽  
Maik Kahnt ◽  
Dennis Brückner ◽  
Andreas Schropp ◽  
Yakub Fam ◽  
...  

This paper presents a deep learning algorithm for tomographic reconstruction (GANrec). The algorithm uses a generative adversarial network (GAN) to solve the inverse of the Radon transform directly. It works for independent sinograms without additional training steps. The GAN has been developed to fit the input sinogram with the model sinogram generated from the predicted reconstruction. Good quality reconstructions can be obtained during the minimization of the fitting errors. The reconstruction is a self-training procedure based on the physics model, instead of on training data. The algorithm showed significant improvements in the reconstruction accuracy, especially for missing-wedge tomography acquired at less than 180° rotational range. It was also validated by reconstructing a missing-wedge X-ray ptychographic tomography (PXCT) data set of a macroporous zeolite particle, for which only 51 projections over 70° could be collected. The GANrec recovered the 3D pore structure with reasonable quality for further analysis. This reconstruction concept can work universally for most of the ill-posed inverse problems if the forward model is well defined, such as phase retrieval of in-line phase-contrast imaging.
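The idea of fitting the input sinogram with a model sinogram computed from the predicted reconstruction can be sketched with a toy forward model. This is not GANrec's implementation: the forward model below projects at only 0° and 90° (column and row sums), no GAN is involved, and the names `forward_project` and `sinogram_fit_error` are hypothetical.

```python
def forward_project(image):
    """Toy forward model: parallel-beam projections at 0 and 90 degrees only."""
    col_sums = [sum(col) for col in zip(*image)]  # projection along rows
    row_sums = [sum(row) for row in image]        # projection along columns
    return [col_sums, row_sums]


def sinogram_fit_error(measured_sinogram, predicted_image):
    """Mean squared difference between measured and model sinograms."""
    model_sinogram = forward_project(predicted_image)
    diffs = [(m - p) ** 2
             for m_row, p_row in zip(measured_sinogram, model_sinogram)
             for m, p in zip(m_row, p_row)]
    return sum(diffs) / len(diffs)
```

In a self-training scheme of the kind the abstract describes, a quantity like this fitting error would be minimized with respect to the network that produces `predicted_image`, so no separate training data set is needed.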


2021 ◽  
Vol 3 (4) ◽  
pp. 581-597
Author(s):  
Tianxiang Gao ◽  
Jiayi Li ◽  
Yuji Watanabe ◽  
Chijung Hung ◽  
Akihiro Yamanaka ◽  
...  

Sleep-stage classification is essential for sleep research. Various automatic judgment programs, including deep learning algorithms using artificial intelligence (AI), have been developed, but have limitations with regard to data format compatibility, human interpretability, cost, and technical requirements. We developed a novel program called GI-SleepNet, generative adversarial network (GAN)-assisted image-based sleep staging for mice that is accurate, versatile, compact, and easy to use. In this program, electroencephalogram and electromyography data are first visualized as images, and then classified into three stages (wake, NREM, and REM) by a supervised image learning algorithm. To increase its accuracy, we adopted GAN and artificially generated fake REM sleep data to equalize the number of stages. This resulted in improved accuracy, and as little as one mouse’s data yielded significant accuracy. Due to its image-based nature, the program is easy to apply to data of different formats, different species of animals, and even outside sleep research. Image data can be easily understood; thus, confirmation by experts is easily obtained, even when there are prediction anomalies. As deep learning in image processing is one of the leading fields in AI, numerous algorithms are also available.
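As a rough illustration of equalizing the number of samples per stage, the sketch below counts how many synthetic samples each stage would need to match the most frequent one. The abstract only states that fake REM data were generated; the function name `synthetic_samples_needed` and the example counts are hypothetical.

```python
from collections import Counter


def synthetic_samples_needed(labels):
    """Return, per stage, how many extra samples would balance the classes."""
    counts = Counter(labels)
    target = max(counts.values())  # size of the most frequent stage
    return {stage: target - n for stage, n in counts.items()}


# Made-up label counts in which REM is the rare stage.
labels = ["wake"] * 5 + ["NREM"] * 4 + ["REM"] * 1
print(synthetic_samples_needed(labels))
```

The shortfall for the rare stage is the number of images a generator would have to supply before training the classifier.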


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Zhangguo Tang ◽  
Junfeng Wang ◽  
Huanzhou Li ◽  
Jian Zhang ◽  
Junhao Wang

In the intelligent era of human-computer symbiosis, the use of machine learning methods for covert communication confrontation has become a hot topic in network security. Existing covert communication technology focuses on the statistical abnormality of traffic behavior and does not consider the sensory abnormality perceived by security censors, so it faces the core problem of a lack of cognitive ability. To further improve the concealment of communication, a game method of “cognitive deception” is proposed, which aims to eliminate traffic anomalies in both the behavioral and cognitive dimensions. Accordingly, a Wasserstein Generative Adversarial Network of Covert Channel (WCCGAN) model is established. The model uses constraint sampling of cognitive priors to construct the constraint mechanisms of “functional equivalence” and “cognitive equivalence” and is trained by a dynamic strategy updating learning algorithm. The generative module adopts joint expression learning, which integrates network protocol knowledge to improve the expressiveness and discriminability of traffic cognitive features. The equivalent module guides the discriminant module to learn pragmatic relevance features through the activity loss function of traffic and the application loss function of the protocol for end-to-end training. The experimental results show that WCCGAN can directly synthesize traffic with comprehensive concealment ability, and its behavior concealment and cognitive deception reach 86.2% and 96.7%, respectively. Moreover, the model has good convergence and generalization ability and does not depend on specific assumptions or specific covert algorithms, which realizes a new paradigm of cognitive game in covert communication.


Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 717 ◽  
Author(s):  
Gang Li ◽  
Biao Ma ◽  
Shuanhai He ◽  
Xueli Ren ◽  
Qiangwei Liu

Regular crack inspection of tunnels is essential to guarantee their safe operation. At present, the manual detection method is time-consuming, subjective and even dangerous, while the automatic detection method is relatively inaccurate. Detecting tunnel cracks is a challenging task since cracks are tiny, and there are many noise patterns in the tunnel images. This study proposes a deep learning algorithm based on U-Net and a convolutional neural network with alternately updated clique (CliqueNet), called U-CliqueNet, to separate cracks from background in the tunnel images. A consumer-grade DSC-WX700 camera (SONY, Wuxi, China) was used to collect 200 original images; cracks were then manually marked and the images divided into sub-images with a resolution of 496 × 496 pixels. A total of 60,000 sub-images were obtained in the dataset of tunnel cracks, among which 50,000 were used for training and 10,000 were used for testing. The proposed framework was trained and tested on this dataset; the mean pixel accuracy (MPA), mean intersection over union (MIoU), precision and F1-score are 92.25%, 86.96%, 86.32% and 83.40%, respectively. We compared the U-CliqueNet with fully convolutional networks (FCN), U-Net, the encoder–decoder network SegNet and the multi-scale fusion crack detection (MFCD) algorithm using hypothesis testing, which showed that the MIoU predicted by U-CliqueNet was significantly higher than that of the other four algorithms. The area, length and mean width of cracks can be calculated, and the relative error between the detected mean crack width and the actual mean crack width ranges from −11.20% to 18.57%. The results show that this framework can be used for fast and accurate crack semantic segmentation of tunnel images.
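For readers unfamiliar with the reported segmentation metrics, the following sketch computes MPA and MIoU from a confusion matrix using their commonly used definitions. The abstract does not state the exact formulas the authors used, so treat these as assumptions; the function names are hypothetical.

```python
def confusion_matrix(truth, pred, num_classes):
    """cm[t][p] counts pixels with true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """Average over classes of correctly labeled pixels / pixels of that class."""
    accs = [row[c] / sum(row) for c, row in enumerate(cm) if sum(row)]
    return sum(accs) / len(accs)


def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        if tp + fp + fn:
            ious.append(tp / (tp + fp + fn))
    return sum(ious) / len(ious)
```

Here `truth` and `pred` are flattened per-pixel label sequences, for example 0 for background and 1 for crack. Because crack pixels are rare, class-averaged metrics like these are more informative than overall pixel accuracy.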


2020 ◽  
Author(s):  
Xinzheng Lu ◽  
Wenjie Liao ◽  
Yuli Huang ◽  
Zhe Zheng ◽  
Yuanqing Lin

Artificial intelligence is transforming many industries and reshaping building design processes to be smarter and automated. While a large number of studies on automated building design have been carried out recently, they focused on architectural aspects, leaving a gap in its application to structural design. Considering the increasingly wide application of shear wall systems in high-rise buildings and envisioning the massive benefit of automated structural design, this paper proposes a shear-wall design automation model based on a generative adversarial network (GAN). Its goal is to learn from existing shear wall design documents and then perform structural design intelligently and swiftly. To this end, a database of representative architectural and structural design documents was developed. Then, datasets were prepared via abstraction, semanticization, classification, and parameterization in terms of building height and seismic design category. The GAN model improved its shear wall design proficiency through adversarial training supported by data and hyper-parametric analytics. The performance of the trained GAN model was appraised against the metrics based on the confusion matrix and the intersection-over-union approach. Finally, case studies were conducted to evaluate the applicability, effectiveness, and appropriateness of the innovative GAN-based structural design method.


10.29007/gl3b ◽  
2020 ◽  
Author(s):  
Hoang Son Nguyen ◽  
Yu Takahata ◽  
Masaaki Goto ◽  
Tetsuo Tanaka ◽  
Akihiko Ohsuga ◽  
...  

In this study, we build a system that estimates the concentration degree of students while they are working with computers. The purpose of learning is to gain knowledge of a subject and to reach a sufficient level of performance in it. Concentration is key to a successful learning process. However, the concept of concentration is ambiguous and lacks a clear definition from an engineering point of view, and its degree is difficult to measure by outside observation. This paper begins with a discussion of the concept of concentration, followed by a discussion of how to measure it using standard devices and sensors. The proposed system analyzes facial images of students recorded by the webcams attached to their computers to infer their concentration degree. In this study, we define the concentration degree over a short time interval. It takes a continuous value from 0 to 1 and is determined by the efficiency of simple work performed over the interval. We convert the continuous values into three discrete values: low, middle, and high. In the first approach, we apply a deep learning algorithm to the facial images alone. In the second, we obtain face movement data as a set of time series and run the learning algorithm on both kinds of data. We outline the methods and the system and present several experimental results.
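The conversion from a continuous concentration value to three discrete labels can be sketched as below. The abstract does not give the cut points, so the equal-width thresholds at 1/3 and 2/3 are assumptions, and `discretize_concentration` is a hypothetical name.

```python
def discretize_concentration(value, low_cut=1 / 3, high_cut=2 / 3):
    """Map a concentration degree in [0, 1] to 'low', 'middle', or 'high'.

    low_cut and high_cut are assumed thresholds, not values from the paper.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("concentration degree must lie in [0, 1]")
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "middle"
    return "high"
```

Discretizing in this way turns the estimation task into a three-class classification problem, which matches the low, middle, and high labels described in the abstract.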


2020 ◽  
Author(s):  
Zekun Chen ◽  
Linning Peng ◽  
Aiqun Hu ◽  
Hua Fu

With the dramatic development of the internet of things (IoT), security issues such as identity authentication have received serious attention. The radio frequency (RF) fingerprint of an IoT device is an inherent feature that can hardly be imitated. In this paper, we propose a rogue device identification technique via RF fingerprinting using a deep learning-based generative adversarial network (GAN). Unlike traditional classification problems in RF fingerprint identification, this work focuses on recognizing unknown accessing devices without prior information. A differential constellation trace figure (DCTF) generation process is first employed to transform RF fingerprint features from time-domain waveforms into 2-dimensional (2D) figures. Then, by using a GAN, which is a kind of unsupervised learning algorithm, we can discriminate rogue devices without any prior information. An experimental verification system is built with 54 ZigBee devices regarded as recognized devices and accessing devices. A USRP receiver is used to capture the signal and identify the accessing devices. Experimental results show that the proposed rogue device identification method can achieve 95% identification accuracy in a real environment.


2021 ◽  
Vol 263 (3) ◽  
pp. 3643-3648
Author(s):  
Gyuwon Kim ◽  
Seungchul Lee

Detecting bearing faults in advance is critical for mechanical and electrical systems to prevent economic loss and safety hazards. As part of the recent interest in artificial intelligence, deep learning (DL)-based principles have gained much attention in intelligent fault diagnostics and have mainly been developed in a supervised manner. While these works have shown promising results, several technical setbacks are inherent in a supervised learning setting. Data imbalance is a critical problem as faulty data is scarce in many cases, data labeling is tedious, and unseen cases of faults cannot be detected in a supervised framework. Herein, a generative adversarial network (GAN) is proposed to achieve unsupervised bearing fault diagnostics by utilizing only the normal data. The proposed method first adopts the short-time Fourier transform (STFT) to convert the 1-D vibration signals into 2-D time-frequency representations to use as the input to our DL framework. Subsequently, a GAN-based latent mapping is constructed using only the normal data, and faulty signals are detected using an anomaly metric composed of a discriminator error and an image reconstruction error. The performance of our method is verified using a classic rotating machinery dataset (Case Western Reserve bearing dataset), and the experimental results demonstrate that our method can not only detect the faults but can also cluster the faults in the latent space with high accuracy.
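One plausible form for an anomaly metric built from a reconstruction error and a discriminator error is a weighted sum, sketched below. The abstract does not specify how the two terms are combined, so the weighted sum, the mean-absolute-difference reconstruction error, the default weight, and both function names are assumptions.

```python
def reconstruction_error(image, reconstruction):
    """Mean absolute pixel difference between an image and its reconstruction."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(image, reconstruction)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def anomaly_score(image, reconstruction, discriminator_error, weight=0.1):
    """Weighted sum of reconstruction and discriminator errors.

    `discriminator_error` is supplied by the caller (for example a feature
    distance from a trained discriminator); `weight` is an assumed parameter.
    """
    rec = reconstruction_error(image, reconstruction)
    return (1 - weight) * rec + weight * discriminator_error
```

Because the model is fitted on normal data only, a faulty time-frequency image should reconstruct poorly and receive a high score, and a threshold on the score would then separate normal from faulty signals.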

