Audio-Based Drone Detection and Identification Using Deep Learning Techniques with Dataset Enhancement through Generative Adversarial Networks

Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 4953
Author(s):  
Sara Al-Emadi ◽  
Abdulla Al-Ali ◽  
Abdulaziz Al-Ali

Drones are becoming increasingly popular not only for recreational purposes but in day-to-day applications in engineering, medicine, logistics, security and others. In addition to their useful applications, an alarming concern in regard to the physical infrastructure security, safety and privacy has arisen due to the potential of their use in malicious activities. To address this problem, we propose a novel solution that automates the drone detection and identification processes using a drone’s acoustic features with different deep learning algorithms. However, the lack of acoustic drone datasets hinders the ability to implement an effective solution. In this paper, we aim to fill this gap by introducing a hybrid drone acoustic dataset composed of recorded drone audio clips and artificially generated drone audio samples using a state-of-the-art deep learning technique known as the Generative Adversarial Network. Furthermore, we examine the effectiveness of using drone audio with different deep learning algorithms, namely, the Convolutional Neural Network, the Recurrent Neural Network and the Convolutional Recurrent Neural Network in drone detection and identification. Moreover, we investigate the impact of our proposed hybrid dataset in drone detection. Our findings prove the advantage of using deep learning techniques for drone detection and identification while confirming our hypothesis on the benefits of using the Generative Adversarial Networks to generate real-like drone audio clips with an aim of enhancing the detection of new and unfamiliar drones.

2021 ◽  
Author(s):  
Ali Q. Saeed ◽  
Siti Norul Huda Sheikh Abdullah ◽  
Jemaima Che-Hamzah ◽  
Ahmad Tarmizi Abdul Ghani

BACKGROUND Glaucoma can cause irreversible blindness. Globally, it is the second leading cause of blindness, preceded only by cataract. Therefore, there is a great need to prevent the silent progression of this disease using the recently developed Generative Adversarial Networks (GANs). OBJECTIVE This paper aims to introduce GAN technology for the diagnosis of eye disorders, particularly glaucoma. This paper illustrates deep adversarial learning as a potential diagnostic tool and the challenges involved in its implementation. This study describes and analyzes many of the pitfalls and problems that researchers will need to overcome in order to implement this kind of technology. METHODS To organize this review comprehensively, we used the keywords ("Glaucoma", "optic disc", "blood vessels") and ("receptive field", "loss function", "GAN", "Generative Adversarial Network", "Deep learning", "CNN", "convolutional neural network" OR encoder), in different variations, to gather all the relevant articles from five highly reputed databases: IEEE Xplore, Web of Science, Scopus, ScienceDirect, and PubMed. These libraries broadly cover technical and medical literature. We included only publications from the last five years. Studies that used OCT or visual fields were excluded. However, papers that used 2D images were included. A large-scale systematic analysis was performed, then a summary was generated. The study was conducted between March 2020 and November 2020. RESULTS We found 59 articles after a comprehensive survey of the literature. Among the 59 articles, 29 present actual attempts to synthesize images and provide accurate segmentation/classification using single/multiple landmarks or share certain experiences. Twenty-nine journal articles discuss recent advances in generative adversarial networks, practical experiments, and analytical studies of retinal disease.
CONCLUSIONS A recent deep learning technique, the generative adversarial network, has shown encouraging performance in retinal disease detection. Although this methodology requires an extensive computing budget and optimization process, it satisfies the data-hungry nature of deep learning techniques by synthesizing images and addresses major medical issues. To the best of our knowledge, there is no existing systematic review paper on retinal disease utilizing generative adversarial networks. Two sets of papers were identified: the first comprises surveys on the recent development of GANs or overviews of papers reported in the literature applying machine learning techniques to retinal diseases; in the second, researchers have sought to establish and enhance the detection process by generating synthetic images that are as realistic as possible with the assistance of GANs. This paper contributes to this research field by offering a thorough analysis of existing works, highlighting current limitations, and suggesting alternatives to support other researchers and participants in further improving and strengthening future work. Finally, new directions for this research have been identified.


Author(s):  
Iqbal H. Sarker

Deep learning (DL), which originated from the artificial neural network (ANN), is one of the major technologies that enable today's smart cybersecurity systems or policies to function in an intelligent manner. Popular deep learning techniques, such as Multi-layer Perceptron (MLP), Convolutional Neural Network (CNN or ConvNet), Recurrent Neural Network (RNN) or Long Short-Term Memory (LSTM), Self-organizing Map (SOM), Auto-Encoder (AE), Restricted Boltzmann Machine (RBM), Deep Belief Networks (DBN), Generative Adversarial Network (GAN), Deep Transfer Learning (DTL or Deep TL), Deep Reinforcement Learning (DRL or Deep RL), or their ensembles and hybrid approaches can be used to intelligently tackle diverse cybersecurity issues. In this paper, we aim to present a comprehensive overview from the perspective of these neural networks and deep learning techniques according to today's diverse needs. We also discuss the applicability of these techniques in various cybersecurity tasks such as intrusion detection, identification of malware or botnets, phishing, predicting cyber-attacks, e.g. denial of service (DoS), fraud detection or cyber-anomalies, etc. Finally, we highlight several research issues and future directions within the scope of our study in the field. Overall, the ultimate goal of this paper is to serve as a reference point and guideline for academia and professionals in the cyber industries, especially from the deep learning point of view.


2021 ◽  
Author(s):  
Thiago Abdo ◽  
Fabiano Silva

The purpose of this paper is to analyze the use of different machine learning approaches and algorithms as automated assistance in a tool that aids the creation of new annotated datasets. We evaluate how they scale in an environment without dedicated machine learning hardware. In particular, we study the impact on a dataset with few examples and on one that is still being constructed. We experiment with deep learning algorithms (BERT) and classical learning algorithms with a lower computational cost (W2V and GloVe combined with RF and SVM). Our experiments show that deep learning algorithms have a performance advantage over classical techniques. However, deep learning algorithms have a high computational cost, making them unsuitable for an environment with reduced hardware resources. We conduct simulations using Active and Iterative machine learning techniques to assist the creation of new datasets. For these simulations, we use the classical learning algorithms because of their lower computational cost. The knowledge gathered in our experimental evaluation is intended to support the creation of a tool for building new text datasets.


Sensors ◽  
2019 ◽  
Vol 19 (15) ◽  
pp. 3269 ◽  
Author(s):  
Hongmin Gao ◽  
Dan Yao ◽  
Mingxia Wang ◽  
Chenming Li ◽  
Haiyun Liu ◽  
...  

Hyperspectral remote sensing images (HSIs) have great research and application value. Deep learning has become an important method in image processing. The Generative Adversarial Network (GAN) is a typical deep learning model developed in recent years, and it can also be used to classify HSIs. However, some problems remain in HSI classification. On the one hand, because different objects can share the same spectrum, generating samples from spectral samples with the original GAN model alone produces incorrect detailed characteristic information. On the other hand, the original GAN model suffers from vanishing gradients, and the scoring ability of a single discriminator limits the quality of the generated samples. To solve these problems, we introduce a scoring mechanism based on multi-discriminator collaboration and perform semi-supervised classification on three hyperspectral data sets. Compared with the original GAN model with a single discriminator, the adjusted criterion is more rigorous and accurate, and the generated samples show more accurate characteristics. To address the mode collapse and lack of diversity of the original single-discriminator GAN, this paper proposes multi-discriminator generative adversarial networks (MDGANs) and studies the influence of the number of discriminators on the classification results. The experimental results show that introducing multiple discriminators improves the judgment ability of the model, ensures the quality of the generated samples, solves the problem of noise in generated spectral samples, and can improve the classification of HSIs. At the same time, the number of discriminators has different effects on different data sets.


2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Yirui Wu ◽  
Dabao Wei ◽  
Jun Feng

With the development of fifth-generation networks and artificial intelligence technologies, new threats and challenges have emerged for wireless communication systems, especially in cybersecurity. In this paper, we review attack detection methods that leverage the strengths of deep learning techniques. Specifically, we first summarize fundamental problems of network security and attack detection and introduce several successful related applications using deep learning structures. Based on a categorization of deep learning methods, we pay special attention to attack detection methods built on different kinds of architectures, such as autoencoders, generative adversarial networks, recurrent neural networks, and convolutional neural networks. Afterwards, we present some benchmark datasets with descriptions and compare the performance of representative approaches to show the current state of attack detection methods with deep learning structures. Finally, we summarize this paper and discuss some ways to improve the performance of attack detection using deep learning structures.


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1576 ◽  
Author(s):  
Li Zhu ◽  
Lianghao Huang ◽  
Linyu Fan ◽  
Jinsong Huang ◽  
Faming Huang ◽  
...  

Landslide susceptibility prediction (LSP) modeling is an important and challenging problem. Landslide features are generally uncorrelated or nonlinearly correlated, resulting in limited LSP performance when leveraging conventional machine learning models. In this study, a deep-learning-based model using the long short-term memory (LSTM) recurrent neural network and conditional random field (CRF) in cascade-parallel form was proposed for making LSPs based on remote sensing (RS) images and a geographic information system (GIS). The RS images are the main data sources of landslide-related environmental factors, and a GIS is used to analyze, store, and display spatial big data. The cascade-parallel LSTM-CRF consists of frequency ratio values of environmental factors in the input layers, cascade-parallel LSTM for feature extraction in the hidden layers, and cascade-parallel fully connected layers for classification and CRF for landslide/non-landslide state modeling in the output layers. The cascade-parallel form of LSTM can extract features from different layers and merge them into concrete features. The CRF is used to calculate the energy relationship between two grid points, and the extracted features are further smoothed and optimized. As a case study, the cascade-parallel LSTM-CRF was applied to Shicheng County of Jiangxi Province in China. A total of 2709 landslide grid cells were recorded and 2709 non-landslide grid cells were randomly selected from the study area. The results show that, compared with mainstream traditional machine learning algorithms, such as multilayer perceptron, logistic regression, and decision tree, the proposed cascade-parallel LSTM-CRF had a higher landslide prediction rate (positive predictive rate: 72.44%, negative predictive rate: 80%, total predictive rate: 75.67%). In conclusion, the proposed cascade-parallel LSTM-CRF is a novel data-driven deep learning model that overcomes the limitations of traditional machine learning algorithms and achieves promising results for making LSPs.
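
The frequency ratio values that feed the input layers can be illustrated with a short sketch. The abstract does not give the formula, so the code below assumes the definition commonly used in landslide susceptibility work: the share of landslide cells that fall in a factor class divided by the share of all cells in that class. The function name and the example counts are hypothetical; only the total of 2709 landslide cells comes from the abstract.

```python
def frequency_ratio(landslide_in_class, landslide_total, cells_in_class, cells_total):
    """Frequency ratio of one class of an environmental factor.

    Assumed definition (not stated in the abstract):
    (landslide cells in class / all landslide cells)
    divided by (cells in class / all cells in the study area).
    """
    if landslide_total <= 0 or cells_in_class <= 0 or cells_total <= 0:
        raise ValueError("totals and class size must be positive")
    landslide_share = landslide_in_class / landslide_total
    area_share = cells_in_class / cells_total
    return landslide_share / area_share


# Hypothetical slope class: 600 of the 2709 landslide cells fall in a class
# that covers 50,000 of 500,000 grid cells in the study area.
fr = frequency_ratio(
    landslide_in_class=600,
    landslide_total=2709,
    cells_in_class=50_000,
    cells_total=500_000,
)
print(f"frequency ratio: {fr:.2f}")
```

Under this definition, a value above 1 indicates that landslides are over-represented in the class relative to its area, and a value below 1 indicates the opposite.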


Author(s):  
Arash Shilandari ◽  
Hossein Marvi ◽  
Hossein Khosravi

With the mechanization of life, speech processing has become crucial for the interaction between humans and machines. Deep neural networks require a database with enough data for training. The more features are extracted from the speech signal, the more samples are needed to train these networks. Adequate training of these networks can be ensured when there is access to sufficient and varied data in each class. If there is not enough data, data augmentation methods can be used to obtain a database with enough samples. One of the obstacles to developing speech emotion recognition systems is the data sparsity problem in each class for neural network training. The current study focuses on building a cycle generative adversarial network for data augmentation in a speech emotion recognition system. For each of the five emotions employed, a generative adversarial network is designed to generate data that is very similar to the original data in that class while remaining distinguishable from the emotions of the other classes. These networks are trained adversarially to produce feature vectors resembling each class in the original feature space; the generated vectors are then added to the existing training sets in the database to train the classifier network. To avoid the vanishing gradient problem, Wasserstein divergence is used instead of the common cross-entropy error to train the generative adversarial networks and produce high-quality artificial samples. The proposed network was tested for speech emotion recognition using EMODB for the training, testing, and evaluation sets, and the quality of the artificial data was evaluated using two classifiers, a Support Vector Machine (SVM) and a Deep Neural Network (DNN). Moreover, the results show that, by extracting and reproducing high-level features from acoustic features, speech emotion recognition separating five primary emotions can be performed with acceptable accuracy.


2020 ◽  
Author(s):  
Jiyanbo Cao ◽  
Jinan Fiaidhi ◽  
Maolin Qi

This paper reviews the deep learning techniques used in music generation. The research is based on Sageev Oore's proposed LSTM-based recurrent neural network (Performance RNN). We study the history of automatic music generation and apply state-of-the-art techniques to this task. We summarize the process of converting a MIDI file into a structure used as input to Performance RNN, as well as the structure of the network itself.


2021 ◽  
pp. 1-11
Author(s):  
Sunil Rao ◽  
Vivek Narayanaswamy ◽  
Michael Esposito ◽  
Jayaraman J. Thiagarajan ◽  
Andreas Spanias

Reliable and rapid non-invasive testing has become essential for COVID-19 diagnosis and tracking statistics. Recent studies motivate the use of modern machine learning (ML) and deep learning (DL) tools that utilize features of coughing sounds for COVID-19 diagnosis. In this paper, we describe system designs that we developed for COVID-19 cough detection with the long-term objective of embedding them in a testing device. More specifically, we use log-mel spectrogram features extracted from the coughing audio signal and design a series of customized deep learning algorithms to develop fast and automated diagnosis tools for COVID-19 detection. We first explore the use of a deep neural network with fully connected layers. Additionally, we investigate prospects of efficient implementation by examining the impact on detection performance of pruning the fully connected neural network based on the Lottery Ticket Hypothesis (LTH) optimization process. In general, pruned neural networks have been shown to provide performance similar to that of unpruned networks with reduced computational complexity in a variety of signal processing applications. Finally, we investigate the use of convolutional neural network architectures and in particular the VGG-13 architecture, which we tune specifically for this application. Our results show that a unique ensembling of the VGG-13 architecture trained using a combination of binary cross entropy and focal losses with data augmentation significantly outperforms the fully connected networks and other recently proposed baselines on the DiCOVA 2021 COVID-19 cough audio dataset. Our customized VGG-13 model achieves an average validation AUROC of 82.23% and a test AUROC of 78.3% at a sensitivity of 80.49%.
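
The combination of binary cross entropy and focal losses mentioned above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the abstract does not say how the two losses are weighted, so the `weight` parameter, the default `gamma=2.0`, and all function names here are assumptions. The focal term uses the standard form, which scales the cross entropy by `(1 - p_t) ** gamma` so that easy, confidently classified examples contribute less.

```python
import math


def bce_loss(p, y, eps=1e-7):
    """Binary cross entropy for one predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Focal loss for one example: cross entropy scaled by (1 - p_t) ** gamma."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def combined_loss(probs, labels, weight=0.5, gamma=2.0):
    """Mean of a weighted sum of BCE and focal loss (the weighting is an assumption)."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += weight * bce_loss(p, y) + (1.0 - weight) * focal_loss(p, y, gamma)
    return total / len(probs)


# Hypothetical predicted probabilities for two positive and two negative clips.
probs = [0.9, 0.6, 0.2, 0.4]
labels = [1, 1, 0, 0]
print(f"combined loss: {combined_loss(probs, labels):.4f}")
```

With `gamma=0` the focal term reduces to ordinary binary cross entropy, which makes the effect of the focusing parameter easy to check in isolation.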


Author(s):  
A. Courtial ◽  
G. Touya ◽  
X. Zhang

Abstract. This article presents how a generative adversarial network (GAN) can be employed to produce a generalised map that combines several cartographic themes in the dense context of urban areas. We use as input detailed buildings, roads, and rivers from topographic datasets produced by the French national mapping agency (IGN), and we expect as output of the GAN a legible map of these elements at a target scale of 1:50,000. This level of detail requires reducing the amount of information while preserving patterns; covering dense inner-city blocks with a single polygon is also necessary because these blocks cannot be represented with enlarged individual buildings. The target map has a style similar to the topographic map produced by IGN. This experiment succeeded in producing image tiles that look like legible maps. It also highlights the impact of data and representation choices on the quality of predicted images, and the challenge of learning geographic relationships.

