Known-unknown Data Augmentation Strategies for Detection of Logical Access, Physical Access and Speech Deepfake Attacks: ASVspoof 2021

2021 ◽  
Author(s):  
Rohan Kumar Das
2019 ◽  
Vol 9 (6) ◽  
pp. 1128 ◽  
Author(s):  
Yundong Li ◽  
Wei Hu ◽  
Han Dong ◽  
Xueyan Zhang

Aerial cameras, satellite remote sensing, or unmanned aerial vehicles (UAVs) equipped with cameras can facilitate search and rescue tasks after disasters. The traditional manual interpretation of huge volumes of aerial imagery is inefficient and can be replaced by machine learning-based methods combined with image processing techniques. With the development of machine learning, researchers have found that convolutional neural networks can effectively extract features from images. Some deep learning-based target detection methods, such as the single-shot multibox detector (SSD) algorithm, achieve better results than traditional methods. However, the impressive performance of machine learning-based methods depends on numerous labeled samples. Given the complexity of post-disaster scenarios, obtaining many samples in the aftermath of disasters is difficult. To address this issue, a damaged building assessment method using SSD with pretraining and data augmentation is proposed in the current study, highlighting the following aspects. (1) Objects are detected and classified into undamaged buildings, damaged buildings, and ruins. (2) A convolutional auto-encoder (CAE) based on VGG16 is constructed and trained on unlabeled post-disaster images. As a transfer learning strategy, the weights of the SSD model are initialized with the weights of its CAE counterpart. (3) Data augmentation strategies, such as image mirroring, rotation, Gaussian blur, and Gaussian noise processing, are used to enlarge the training data set. As a case study, aerial images from Hurricane Sandy in 2012 were used to validate the proposed method's effectiveness. Experiments show that the pretraining strategy improves overall accuracy by 10% compared with an SSD trained from scratch. They also demonstrate that the data augmentation strategies improve mAP and mF1 by 72% and 20%, respectively.
Finally, the method is further verified on another dataset, from Hurricane Irma, confirming that the proposed method is feasible.
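The geometric and noise-based augmentations listed in point (3) can be sketched with NumPy. This is an illustrative stand-in, not the authors' implementation; the patch size and noise level are arbitrary assumptions, and Gaussian blur is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image: np.ndarray) -> list:
    """Generate augmented copies of an aerial image patch:
    horizontal mirror, two rotations, and additive Gaussian noise."""
    variants = [
        np.fliplr(image),       # horizontal mirror
        np.rot90(image, k=1),   # 90-degree rotation
        np.rot90(image, k=2),   # 180-degree rotation
    ]
    # Additive Gaussian noise, clipped back to the valid pixel range.
    noisy = image + rng.normal(0.0, 10.0, size=image.shape)
    variants.append(np.clip(noisy, 0, 255))
    return variants

patch = rng.integers(0, 256, size=(64, 64, 3)).astype(float)
augmented = augment(patch)
```

Each call turns one labeled patch into four additional training samples, which is how such strategies multiply a small post-disaster dataset.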


1993 ◽  
Vol 88 (423) ◽  
pp. 926-938 ◽  
Author(s):  
Richard M. Heiberger ◽  
Dulal K. Bhaumik ◽  
Burt Holland

2021 ◽  
Author(s):  
Sayan Nag

Self-supervised learning and pre-training strategies have developed rapidly over the last few years, especially for Convolutional Neural Networks (CNNs). Recently, such methods have also been applied to Graph Neural Networks (GNNs). In this paper, we use a graph-based self-supervised learning strategy with different loss functions (Barlow Twins [?], HSIC [?], VICReg [?]) that have previously shown promising results with CNNs. We also propose a hybrid loss function combining the advantages of VICReg and HSIC, which we call VICRegHSIC. The performance of these methods is compared on two datasets, MUTAG and PROTEINS. Moreover, the impact of different batch sizes, projector dimensions, and data augmentation strategies is also explored. The results are preliminary, and we will continue exploring other datasets.
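As a rough illustration of the VICReg family of objectives mentioned above, the sketch below implements a simplified variance-invariance-covariance loss in NumPy. The term weights and epsilon are assumed defaults, and this does not reproduce the proposed VICRegHSIC hybrid or its HSIC component:

```python
import numpy as np

def vicreg_style_loss(za, zb, sim_w=25.0, var_w=25.0, cov_w=1.0):
    """Simplified VICReg-style objective on two batches of embeddings
    za, zb of shape (N, D) from two augmented views of the same graphs."""
    n, d = za.shape
    # Invariance: the two views should map to similar embeddings.
    invariance = np.mean((za - zb) ** 2)

    def var_term(z):
        # Hinge keeps each embedding dimension's std above 1 (anti-collapse).
        std = np.sqrt(z.var(axis=0) + 1e-4)
        return np.mean(np.maximum(0.0, 1.0 - std))

    def cov_term(z):
        # Penalize off-diagonal covariance so dimensions stay decorrelated.
        zc = z - z.mean(axis=0)
        cov = (zc.T @ zc) / (n - 1)
        off = cov - np.diag(np.diag(cov))
        return (off ** 2).sum() / d

    return (sim_w * invariance
            + var_w * (var_term(za) + var_term(zb))
            + cov_w * (cov_term(za) + cov_term(zb)))

za = np.random.default_rng(0).normal(size=(128, 16))
loss_identical_views = vicreg_style_loss(za, za)
```

With identical views the invariance term vanishes and only the variance and covariance regularizers remain, which is the collapse-prevention behavior these losses are designed for.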


2021 ◽  
Author(s):  
Radhika Malhotra ◽  
Jasleen Saini ◽  
Barjinder Singh Saini ◽  
Savita Gupta

In the past decade, there has been a remarkable evolution of convolutional neural networks (CNNs) for biomedical image processing. These improvements are incorporated into the basic deep learning-based models for computer-aided detection and prognosis of various ailments. However, the implementation of these CNN-based networks depends heavily on large datasets in supervised learning settings. This is needed to tackle overfitting, which is a major concern in supervised techniques. Overfitting refers to the phenomenon in which a network learns patterns specific to the input such that it fits the training data well but generalizes poorly to unseen data. The limited accessibility of large quantities of data constrains research in the medical domain. This paper focuses on the utility of data augmentation (DA) techniques, which are a well-recognized solution to the problem of limited data. The experiments were performed on the Brain Tumor Segmentation (BraTS) dataset, which is available online. The results show that different DA approaches improved the accuracy of segmenting brain tumor boundaries using a CNN-based model.
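One detail specific to segmentation DA, as in the BraTS setting above, is that geometric transforms must be applied identically to the image and its label mask so that voxel labels stay aligned. A minimal sketch with hypothetical inputs, not the paper's pipeline:

```python
import numpy as np

def augment_pair(image: np.ndarray, mask: np.ndarray, k: int, flip: bool):
    """Apply the same geometric transform to an image slice and its
    segmentation mask so that every pixel keeps its label."""
    img, msk = np.rot90(image, k), np.rot90(mask, k)
    if flip:
        img, msk = np.fliplr(img), np.fliplr(msk)
    return img, msk

# Hypothetical 4x4 slice and a binary "tumor" mask derived from it.
image = np.arange(16.0).reshape(4, 4)
mask = (image > 7).astype(int)
aug_img, aug_mask = augment_pair(image, mask, k=1, flip=True)
```

Because both arrays pass through the identical transform, any pixel-to-label correspondence in the original pair still holds in the augmented pair.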


2021 ◽  
Vol 11 (4) ◽  
pp. 1387 ◽  
Author(s):  
Daniel Yang ◽  
Kevin Ji ◽  
TJ Tsai

This article studies a composer style classification task based on raw sheet music images. While previous works on composer recognition have relied exclusively on supervised learning, we explore the use of self-supervised pretraining methods that have been recently developed for natural language processing. We first convert sheet music images to sequences of musical words, train a language model on a large set of unlabeled musical “sentences”, initialize a classifier with the pretrained language model weights, and then finetune the classifier on a small set of labeled data. We conduct extensive experiments on International Music Score Library Project (IMSLP) piano data using a range of modern language model architectures. We show that pretraining substantially improves classification performance and that Transformer-based architectures perform best. We also introduce two data augmentation strategies and present evidence that the model learns generalizable and semantically meaningful information.
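The pretrain-then-finetune pipeline described above can be sketched in miniature. Here a co-occurrence-plus-SVD embedding stands in for the actual language model, and the "musical word" vocabulary is invented for illustration; the real system uses far larger IMSLP data and Transformer architectures:

```python
import numpy as np

# Hypothetical "musical word" vocabulary and unlabeled "sentences"
# derived from sheet music images (invented for illustration).
vocab = ["note_C4", "note_E4", "note_G4", "rest_q"]
idx = {w: i for i, w in enumerate(vocab)}
sentences = [["note_C4", "note_E4", "note_G4"],
             ["note_C4", "rest_q", "note_C4"]]

# Stage 1: "pretrain" on unlabeled sentences. A symmetric co-occurrence
# matrix factorized by SVD stands in for language-model training.
cooc = np.zeros((len(vocab), len(vocab)))
for sent in sentences:
    for a, b in zip(sent, sent[1:]):
        cooc[idx[a], idx[b]] += 1
        cooc[idx[b], idx[a]] += 1
u, s, _ = np.linalg.svd(cooc)
embeddings = u[:, :2] * s[:2]  # pretrained weights, one 2-d vector per word

# Stage 2: initialize a downstream classifier with the pretrained
# weights; a labeled piece becomes the mean of its word vectors.
def piece_features(words):
    return np.mean([embeddings[idx[w]] for w in words], axis=0)

feat = piece_features(["note_C4", "note_E4", "note_G4"])
```

The point of the sketch is the weight transfer: stage 2 never sees the unlabeled sentences, only the representations learned from them, which is what lets a small labeled set suffice for finetuning.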

