InsulatorGAN: A Transmission Line Insulator Detection Model Using Multi-Granularity Conditional Generative Adversarial Nets for UAV Inspection

2021 ◽  
Vol 13 (19) ◽  
pp. 3971
Author(s):  
Wenxiang Chen ◽  
Yingna Li ◽  
Zhengang Zhao

Insulator detection is one of the most significant issues in high-voltage transmission line inspection using unmanned aerial vehicles (UAVs) and has attracted attention from researchers all over the world. State-of-the-art object detection models perform well in insulator detection, but their precision is limited by the scale of the available datasets and the number of model parameters. Recently, generative adversarial networks (GANs) have been shown to offer excellent image generation. Therefore, we propose a novel model called InsulatorGAN, based on conditional GANs, to detect insulators in transmission lines. However, due to the fixed categories in datasets such as ImageNet and Pascal VOC, the generated insulator images are of low resolution and are not sufficiently realistic. To solve these problems, we established an insulator dataset called InsuGenSet for model training. InsulatorGAN can generate high-resolution, realistic-looking insulator-detection images that can be used for data expansion. Moreover, InsulatorGAN can be easily adapted to other power equipment inspection tasks and scenarios using one generator and multiple discriminators. To give the generated images richer details, we also introduced a penalty mechanism based on a Monte Carlo search into InsulatorGAN. In addition, we proposed a multi-scale discriminator structure based on a multi-task learning mechanism to improve the quality of the generated images. Finally, experiments on the InsuGenSet and CPLID datasets demonstrated that our model outperforms existing state-of-the-art models, improving both the resolution and quality of the generated images and the accuracy of the detection-box positions.
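The Monte Carlo search penalty mentioned above scores an intermediate generation by averaging a reward over many random completions. The abstract gives no implementation details, so the following is a minimal, generic sketch of that idea with toy stand-ins (the `complete` and `reward` functions are hypothetical, not the paper's):

```python
import random

def mc_rollout_reward(partial, complete_fn, reward_fn, n_rollouts=16, seed=0):
    """Estimate the reward of a partially generated sample by averaging
    the scores of n_rollouts randomly completed versions of it."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        total += reward_fn(complete_fn(partial, rng))
    return total / n_rollouts

# Toy stand-ins: a "sample" is a list of bits, and the reward
# (playing the role of a discriminator score) is the fraction of 1s.
complete = lambda p, rng: p + [rng.randint(0, 1) for _ in range(4)]
reward = lambda s: sum(s) / len(s)

r = mc_rollout_reward([1, 1, 0, 1], complete, reward)
```

Averaging over rollouts gives a smoother training signal for intermediate states than scoring only finished samples.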

2019 ◽  
Vol 2019 ◽  
pp. 1-8
Author(s):  
Zishu Gao ◽  
Guodong Yang ◽  
En Li ◽  
Tianyu Shen ◽  
Zhe Wang ◽  
...  

There are a large number of insulators on transmission lines, and insulator damage has a major impact on power supply security. Image-based segmentation of the insulators in power transmission lines is a prerequisite for, and also a critical task in, power line inspection. In this paper, a modified conditional generative adversarial network for insulator pixel-level segmentation is proposed. The generator is reconstructed from encoder-decoder layers with asymmetric convolution kernels, which simplify the network and extract more kinds of feature information. The discriminator is a fully convolutional network based on PatchGAN and learns the loss used to train the generator. Experiments verify that the proposed method achieves better mIoU and computational efficiency than Pix2pix, SegNet, and other state-of-the-art networks.
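A PatchGAN discriminator judges local realism patch by patch rather than assigning one score to the whole image. As a minimal illustration of that idea (the tiling and the toy `score_fn` are illustrative, not the paper's network):

```python
def patch_scores(image, patch, score_fn):
    """Split a 2-D image into non-overlapping patch x patch tiles,
    score each tile with score_fn, and return the mean score --
    the PatchGAN idea of judging realism locally, per patch."""
    h, w = len(image), len(image[0])
    scores = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            tile = [row[j:j + patch] for row in image[i:i + patch]]
            scores.append(score_fn(tile))
    return sum(scores) / len(scores)

# Toy stand-in for a discriminator: score a patch by its mean intensity.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 0]]
mean_patch = lambda t: sum(sum(r) for r in t) / (len(t) * len(t[0]))
s = patch_scores(img, 2, mean_patch)
```

Because each patch is scored independently, the discriminator has far fewer parameters than a whole-image classifier and penalizes local artifacts directly.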


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Diqun Yan ◽  
Xiaowen Li ◽  
Li Dong ◽  
Rangding Wang

Adaptive multirate (AMR) compression of audio has been exploited as effective forensic evidence for verifying audio authenticity. Little consideration has been given, however, to antiforensic techniques capable of fooling AMR compression forensic algorithms. In this paper, we present an antiforensic method based on a generative adversarial network (GAN) to attack AMR compression detectors. The GAN framework is utilized to modify doubly AMR-compressed audio so that it exhibits the underlying statistics of singly compressed audio. Three state-of-the-art detectors of AMR compression are selected as the targets to be attacked. The experimental results demonstrate that the proposed method is capable of removing the forensically detectable artifacts of AMR compression under various ratios with an average successful attack rate of about 94.75%, which means the modified audio generated by our well-trained generator can deceive the forensic detectors effectively. Moreover, we show that the perceptual quality of the generated AMR audio is well preserved.
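The core idea above is to push the statistics of doubly compressed audio toward those of singly compressed audio. A crude, non-adversarial stand-in for what the trained generator learns is simple moment matching on a feature sequence (the feature values below are made up for illustration):

```python
import statistics

def match_moments(signal, target_mean, target_std):
    """Linearly rescale a feature sequence so that its mean and standard
    deviation match the target statistics -- a toy, non-GAN sketch of
    modifying one distribution to mimic another."""
    mu = statistics.fmean(signal)
    sd = statistics.pstdev(signal)
    if sd == 0:
        return [target_mean] * len(signal)
    return [(x - mu) / sd * target_std + target_mean for x in signal]

# Hypothetical per-frame features of doubly compressed audio.
double_compressed = [0.1, 0.4, 0.2, 0.9, 0.5]
out = match_moments(double_compressed, target_mean=0.0, target_std=1.0)
```

A real attack must match far richer statistics than two moments, which is why an adversarially trained generator is used, but the objective has this same shape.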


Author(s):  
Wenchao Du ◽  
Hu Chen ◽  
Hongyu Yang ◽  
Yi Zhang

Abstract Generative adversarial networks (GANs) have been applied to low-dose CT images to predict normal-dose CT images. However, undesired artifacts and spurious details introduce uncertainty into clinical diagnosis. In order to improve the visual quality while suppressing the noise, in this paper we mainly studied the two key components of deep learning based low-dose CT (LDCT) restoration models — network architecture and adversarial loss — and proposed a disentangled noise suppression method based on GAN (DNSGAN) for LDCT. Specifically, a generator network, which contains noise suppression and structure recovery modules, is proposed. Furthermore, a multi-scaled relativistic adversarial loss is introduced to preserve the finer structures of generated images. Experiments on simulated and real LDCT datasets show that the proposed method can effectively remove noise while recovering finer details and provides better visual perception than other state-of-the-art methods.


Optik ◽  
2021 ◽  
Vol 227 ◽  
pp. 166060
Author(s):  
Yangdi Hu ◽  
Zhengdong Cheng ◽  
Xiaochun Fan ◽  
Zhenyu Liang ◽  
Xiang Zhai

Proceedings ◽  
2021 ◽  
Vol 77 (1) ◽  
pp. 17
Author(s):  
Andrea Giussani

In the last decade, advances in statistical modeling and computer science have boosted the production of machine-generated content in different fields: from language to image generation, the quality of the generated outputs is remarkably high, sometimes better than what a human being produces. Modern technological advances such as OpenAI’s GPT-2 (and recently GPT-3) permit automated systems to dramatically alter reality with synthetic outputs, so that humans are not able to distinguish the real copy from its synthetic counterpart. An example is given by an article entirely written by GPT-2, but many other examples exist. In the field of computer vision, Nvidia’s generative adversarial network, commonly known as StyleGAN (Karras et al. 2018), has become the de facto reference point for the production of a huge amount of fake human face portraits; additionally, recent algorithms were developed to create both musical scores and mathematical formulas. This presentation aims to stimulate participants with the state-of-the-art results in this field: we will cover both GANs and language modeling with recent applications. The novelty here is that we apply a transformer-based machine learning technique, namely RoBERTa (Liu et al. 2019), to the detection of human-produced versus machine-produced text in the context of fake news detection. RoBERTa is a recent algorithm based on the well-known Bidirectional Encoder Representations from Transformers algorithm, known as BERT (Devlin et al. 2018); this is a bidirectional transformer for natural language processing developed by Google and pre-trained over a huge amount of unlabeled textual data to learn embeddings. We will then use these representations as the input of our classifier to detect real vs. machine-produced text. The application is demonstrated in the presentation.
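The pipeline described above has two stages: embed the text, then classify the embedding. The sketch below keeps that shape with toy stand-ins only — a hashed bag-of-words in place of RoBERTa's contextual embeddings, and a nearest-centroid rule in place of the trained classifier (every name and the training phrases are hypothetical):

```python
def embed(text, dim=8):
    """Toy stand-in for a transformer embedding: hash each token into a
    fixed-size bag-of-words vector. RoBERTa would instead produce dense
    contextual embeddings learned from unlabeled text."""
    v = [0.0] * dim
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    return v

def nearest_centroid(vec, centroids):
    """Classify an embedding by squared Euclidean distance to labeled
    class centroids (the stand-in for the trained classifier head)."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(vec, centroids[label]))

# Hypothetical "training" examples, one per class.
centroids = {"human": embed("deep learning is fun"),
             "machine": embed("lorem ipsum dolor sit")}
label = nearest_centroid(embed("deep learning is fun"), centroids)
```

The point of the sketch is the division of labor: the embedding stage does all the representational work, so even a simple classifier on top can separate the classes.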


Author(s):  
Khaled ELKarazle ◽  
Valliappan Raman ◽  
Patrick Then

Age estimation models can be employed in many applications, including soft biometrics, content access control, targeted advertising, and many more. However, as some facial images are taken in unconstrained conditions, their quality degrades, which results in the loss of several essential ageing features. This study investigates how introducing a new layer of data processing based on a super-resolution generative adversarial network (SRGAN) model can influence the accuracy of age estimation by enhancing the quality of both the training and testing samples. Additionally, we introduce a novel convolutional neural network (CNN) classifier to distinguish between several age classes. We train one of our classifiers on a reconstructed version of the original dataset and compare its performance with an identical classifier trained on the original version of the same dataset. Our findings reveal that the classifier trained on the reconstructed dataset produces better classification accuracy, opening the door for more research into building data-centric machine learning systems.
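The proposed pipeline inserts an upscaling stage before the classifier sees any image. As a minimal placeholder for that stage, here is nearest-neighbour upscaling on a 2-D intensity grid (a deliberately crude stand-in — the actual study uses an SRGAN, which hallucinates detail rather than merely repeating pixels):

```python
def upscale_nn(image, factor):
    """Nearest-neighbour upscaling of a 2-D grid of pixel intensities:
    each pixel is repeated factor times horizontally and vertically.
    A placeholder for the SRGAN super-resolution stage that enhances
    low-quality faces before they reach the age classifier."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

hr = upscale_nn([[1, 2],
                 [3, 4]], 2)
```

Swapping this placeholder for a learned super-resolution model is exactly the "extra data-processing layer" the study evaluates: the classifier itself is unchanged, only its inputs improve.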


2020 ◽  
Vol 36 (10) ◽  
pp. 3011-3017 ◽  
Author(s):  
Olga Mineeva ◽  
Mateo Rojas-Carulla ◽  
Ruth E Ley ◽  
Bernhard Schölkopf ◽  
Nicholas D Youngblut

Abstract Motivation Methodological advances in metagenome assembly are rapidly increasing the number of published metagenome assemblies. However, identifying misassemblies is challenging due to a lack of closely related reference genomes that can act as pseudo ground truth. Existing reference-free methods are no longer maintained, can make strong assumptions that may not hold across a diversity of research projects, and have not been validated on large-scale metagenome assemblies. Results We present DeepMAsED, a deep learning approach for identifying misassembled contigs without the need for reference genomes. Moreover, we provide an in silico pipeline for generating large-scale, realistic metagenome assemblies for comprehensive model training and testing. DeepMAsED accuracy substantially exceeds the state-of-the-art when applied to large and complex metagenome assemblies. Our model estimates a 1% contig misassembly rate in two recent large-scale metagenome assembly publications. Conclusions DeepMAsED accurately identifies misassemblies in metagenome-assembled contigs from a broad diversity of bacteria and archaea without the need for reference genomes or strong modeling assumptions. Running DeepMAsED is straightforward, as is model re-training with our dataset generation pipeline. Therefore, DeepMAsED is a flexible misassembly classifier that can be applied to a wide range of metagenome assembly projects. Availability and implementation DeepMAsED is available from GitHub at https://github.com/leylabmpi/DeepMAsED. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Han Xu ◽  
Pengwei Liang ◽  
Wei Yu ◽  
Junjun Jiang ◽  
Jiayi Ma

In this paper, we propose a new end-to-end model, called dual-discriminator conditional generative adversarial network (DDcGAN), for fusing infrared and visible images of different resolutions. Unlike pixel-level methods and existing deep learning-based methods, the fusion task is accomplished through an adversarial process between a generator and two discriminators, in addition to a specially designed content loss. The generator is trained to generate realistic fused images that fool the discriminators. The two discriminators are trained to estimate, respectively, the JS divergence between the probability distributions of downsampled fused images and infrared images, and the JS divergence between the probability distributions of the gradients of fused images and the gradients of visible images. Thus, the fused images can capture features that are not constrained by the content loss alone. Consequently, the prominence of thermal targets in the infrared image and the texture details in the visible image can be simultaneously preserved, or even enhanced, in the fused image. Moreover, by constraining and distinguishing between the downsampled fused image and the low-resolution infrared image, DDcGAN is well suited to fusing images of different resolutions. Qualitative and quantitative experiments on publicly available datasets demonstrate the superiority of our method over the state-of-the-art.
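Two details above can be made concrete as scalar sketches: the fused image is downsampled before the infrared discriminator sees it, and the generator minimizes a content term plus two adversarial terms, one per discriminator. The weighting `lam` and the tiny 2x2 image below are illustrative assumptions, not values from the paper:

```python
def downsample2(img):
    """2x average-pooling of a 2-D grid: the fused image is reduced to
    the infrared image's resolution before that discriminator scores it."""
    return [[(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4
             for j in range(0, len(img[0]), 2)]
            for i in range(0, len(img), 2)]

def generator_loss(content, adv_ir, adv_vis, lam=0.5):
    """DDcGAN-style total generator objective, reduced to scalars:
    a content term plus adversarial terms from the infrared
    discriminator (on the downsampled fused image) and the visible
    discriminator (on image gradients)."""
    return content + lam * (adv_ir + adv_vis)

fused = [[1, 3],
         [5, 7]]
low = downsample2(fused)
```

Keeping both adversarial terms in one objective is what lets the single generator satisfy two discriminators with different inputs and resolutions.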


2020 ◽  
Vol 34 (05) ◽  
pp. 8830-8837
Author(s):  
Xin Sheng ◽  
Linli Xu ◽  
Junliang Guo ◽  
Jingchang Liu ◽  
Ruoyu Zhao ◽  
...  

We propose a novel introspective model for variational neural machine translation (IntroVNMT) in this paper, inspired by the recent successful application of introspective variational autoencoder (IntroVAE) in high quality image synthesis. Different from the vanilla variational NMT model, IntroVNMT is capable of improving itself introspectively by evaluating the quality of the generated target sentences according to the high-level latent variables of the real and generated target sentences. As a consequence of introspective training, the proposed model is able to discriminate between the generated and real sentences of the target language via the latent variables generated by the encoder of the model. In this way, IntroVNMT is able to generate more realistic target sentences in practice. In the meantime, IntroVNMT inherits the advantages of the variational autoencoders (VAEs), and the model training process is more stable than the generative adversarial network (GAN) based models. Experimental results on different translation tasks demonstrate that the proposed model can achieve significant improvements over the vanilla variational NMT model.

