SRGAN Assisted Encoder-Decoder Deep Neural Network for Colorectal Polyp Semantic Segmentation

2021 ◽  
Vol 35 (5) ◽  
pp. 395-401
Author(s):  
Mohan Mahanty ◽  
Debnath Bhattacharyya ◽  
Divya Midhunchakkaravarthy

Colon cancer is regarded as the third most commonly diagnosed cancer, after breast and lung cancer. Most colon cancers are adenocarcinomas that develop from adenomatous polyps growing on the intima of the colon. The standard procedure for polyp detection is colonoscopy, whose success depends on the colonoscopist's experience and other environmental factors. Nonetheless, during colonoscopy procedures a considerable number (8-37%) of polyps are missed due to human error, and these missed polyps are a potential cause of colorectal cancer. In the last few years, many research groups have developed deep learning-based computer-aided (CAD) systems and proposed many techniques for automated polyp detection, localization, and segmentation. Still, accurate polyp detection and segmentation are required to minimize polyp miss rates. This paper proposes a Super-Resolution Generative Adversarial Network (SRGAN) assisted Encoder-Decoder network for fully automated colon polyp segmentation from colonoscopic images. The proposed deep learning model incorporates the SRGAN in the up-sampling process to achieve more accurate polyp segmentation. We evaluated our model on the publicly available benchmark datasets CVC-ColonDB and Warwick-QU. The model achieved a Dice score of 0.948 on the CVC-ColonDB dataset, surpassing recent state-of-the-art (SOTA) techniques. When evaluated on the Warwick-QU dataset, it attains a Dice score of 0.936 on Part A and 0.895 on Part B. Our model showed more accurate results for sessile and smaller-sized polyps.
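
The Dice score quoted in this abstract is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted mask A and a ground-truth mask B. The following is a minimal sketch of that formula for binary masks, not the authors' evaluation code; the function name `dice_score`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Assumption: both masks have the same shape. Returning 1.0 when both
    masks are empty is a convention chosen for this sketch.
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x2 masks: one shared foreground pixel, three foreground pixels in total.
example_pred = [[1, 1], [0, 0]]
example_target = [[1, 0], [0, 0]]
example_dice = dice_score(example_pred, example_target)
```

A score of 1 means the two masks coincide exactly and 0 means they share no foreground pixels, so the reported values of 0.948 and 0.936 indicate high overlap with the annotated polyp regions.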

Author(s):  
M. Cao ◽  
H. Ji ◽  
Z. Gao ◽  
T. Mei

Abstract. Vehicle detection in remote sensing images has attracted remarkable attention over the past years for its applications in traffic, security, military, and surveillance fields. Given the striking success of deep learning techniques in the object detection community, we consider utilizing CNNs for the vehicle detection task in remote sensing images. Specifically, we take advantage of a deep residual network, multi-scale feature fusion, hard example mining, and homography augmentation to realize vehicle detection, integrating nearly all the advanced techniques in the deep learning community. Furthermore, we simultaneously address the super-resolution (SR) and detection problems of low-resolution (LR) images in an end-to-end manner. Because paired low-/high-resolution data are generally time-consuming and cumbersome to collect, we leverage a generative adversarial network (GAN) for unsupervised SR. The detection loss is back-propagated to the SR generator to boost detection performance. We conduct experiments on representative benchmark datasets and demonstrate that our model yields significant improvements over state-of-the-art methods in the deep learning and remote sensing areas.


Electronics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1312
Author(s):  
Debapriya Hazra ◽  
Yung-Cheol Byun

Video super-resolution has become an emerging topic in the field of machine learning. The generative adversarial network is a framework that is widely used to develop solutions for low-resolution videos. Video surveillance using closed-circuit television (CCTV) is significant in every field, all over the world. A common problem with CCTV videos is sudden video loss or poor quality. In this paper, we propose a generative adversarial network that implements spatio-temporal generators and discriminators to enhance real-time low-resolution CCTV videos to high-resolution. The proposed model considers both foreground and background motion of a CCTV video and effectively models the spatial and temporal consistency from low-resolution video frames to generate high-resolution videos. Quantitative and qualitative experiments on benchmark datasets, including Kinetics-700, UCF101, HMDB51 and IITH_Helmet2, showed that our model outperforms the existing GAN models for video super-resolution.


2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Yirui Wu ◽  
Dabao Wei ◽  
Jun Feng

With the development of fifth-generation networks and artificial intelligence technologies, new threats and challenges have emerged for wireless communication systems, especially in cybersecurity. In this paper, we offer a review of attack detection methods that draw on the strengths of deep learning techniques. Specifically, we first summarize fundamental problems of network security and attack detection and introduce several successful related applications using deep learning structures. Based on a categorization of deep learning methods, we pay special attention to attack detection methods built on different kinds of architectures, such as autoencoders, generative adversarial networks, recurrent neural networks, and convolutional neural networks. Afterwards, we present some benchmark datasets with descriptions and compare the performance of representative approaches to show the current state of attack detection methods with deep learning structures. Finally, we summarize this paper and discuss some ways to improve the performance of attack detection using deep learning structures.


2018 ◽  
Vol 7 (10) ◽  
pp. 389 ◽  
Author(s):  
Wei He ◽  
Naoto Yokoya

In this paper, we present optical image simulation from synthetic aperture radar (SAR) data using deep learning based methods. Two models, i.e., optical image simulation directly from the SAR data and from multi-temporal SAR-optical data, are proposed to test the possibilities. The deep learning based methods that we chose to implement the models are a convolutional neural network (CNN) with a residual architecture and a conditional generative adversarial network (cGAN). We validate our models using the Sentinel-1 and -2 datasets. The experiments demonstrate that the model with multi-temporal SAR-optical data can successfully simulate the optical image, whereas the state-of-the-art model with simple SAR data as input failed. The optical image simulation results indicate the possibility of SAR-optical information blending for subsequent applications such as large-scale cloud removal and optical data temporal super-resolution. We also investigate the sensitivity of the proposed models to the training samples and reveal possible future directions.


Generative Adversarial Networks have gained prominence in a short span of time as they can synthesize images from latent noise by minimizing the adversarial cost function. New variants of GANs have been developed to perform specific tasks using state-of-the-art GAN models, such as image translation, single-image super-resolution, segmentation, classification, and style transfer. However, a combination of two GANs to perform two different applications in one model has been sparsely explored. Hence, this paper concatenates two GANs to perform image translation on bird images using the CycleGAN model and to improve their resolution using SRGAN. During the extensive survey, it was observed that most of the deep learning databases on Aves were built using New World species (i.e., species found in North America). Hence, to bridge this gap, a new Ave database, 'Common Birds of North - Western India' (CBNWI-50), is also proposed in this work.


2021 ◽  
Author(s):  
Jiaoyue Li ◽  
Weifeng Liu ◽  
Kai Zhang ◽  
Baodi Liu

Remote sensing image super-resolution (SR) plays an essential role in many remote sensing applications. Recently, remote sensing image super-resolution methods based on deep learning have shown remarkable performance. However, directly applying deep learning methods struggles to recover remote sensing images that contain a large number of complex objects or scenes. We therefore propose an edge-based dense connection generative adversarial network (SREDGAN), which minimizes the edge differences between the generated image and its corresponding ground truth. Experimental results on the NWPU-VHR-10 and UCAS-AOD datasets demonstrate that our method improves PSNR and SSIM by 1.92 and 0.045, respectively, compared with SRGAN.
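
PSNR, one of the two metrics compared in this abstract, is defined as 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and reconstructed images. The sketch below illustrates that definition only and is not the authors' evaluation code; the function name `psnr`, the nested-list image format, and the default 8-bit peak value of 255 are assumptions.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio between two images given as nested lists.

    Assumption: both images have the same shape and share the same intensity
    range, whose peak is max_value (255 for 8-bit images).
    """
    flat_ref = [r for row in reference for r in row]
    flat_est = [e for row in estimate for e in row]
    if len(flat_ref) != len(flat_est):
        raise ValueError("images must have the same number of pixels")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a fixed gain in PSNR corresponds to a fixed multiplicative reduction in mean squared error, regardless of the baseline.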


2019 ◽  
Vol 11 (11) ◽  
pp. 1262 ◽  
Author(s):  
Ksenia Bittner ◽  
Marco Körner ◽  
Friedrich Fraundorfer ◽  
Peter Reinartz

Various deep learning applications benefit from multi-task learning with multiple regression and classification objectives by taking advantage of the similarities between individual tasks. This can result in improved learning efficiency and prediction accuracy for the task-specific models compared to separately trained models. In this paper, we observe such influences for important remote sensing applications, namely elevation model generation and semantic segmentation from stereo half-meter resolution satellite digital surface models (DSMs). Mainly, we aim to generate good-quality DSMs with complete and accurate level of detail (LoD) 2-like building forms and to assign an object class label to each pixel in the DSMs. For the label assignment task, we select the roof type classification problem to distinguish between flat, non-flat, and background pixels. To realize those tasks, we train a conditional generative adversarial network (cGAN) with an objective function based on least squares residuals and an auxiliary term based on normal vectors for further roof surface refinement. In addition, we investigate recently published deep learning architectures for both tasks and develop the final end-to-end network, which combines the models that, when first used separately, provide the best results for their individual tasks.
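
The abstract does not spell out its objective function, so the following is only a sketch of the commonly used least-squares GAN losses, assuming target values of 1 for real samples and 0 for generated ones. The function names are hypothetical, and the conditioning input and the normal-vector auxiliary term mentioned above are deliberately omitted.

```python
def lsgan_discriminator_loss(real_scores, fake_scores):
    """Least-squares discriminator loss.

    Assumption: the discriminator should output 1 for real samples and
    0 for generated samples; scores are plain lists of floats.
    """
    real_term = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * (real_term + fake_term)


def lsgan_generator_loss(fake_scores):
    """Least-squares generator loss: push scores of generated samples toward 1."""
    return 0.5 * sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)
```

Replacing the usual log-likelihood terms with squared residuals penalizes generated samples in proportion to how far their scores lie from the target, which is the usual motivation for this family of objectives.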


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2978
Author(s):  
Hongtao Zhang ◽  
Yuki Shinomiya ◽  
Shinichi Yoshida

The diagnosis of brain pathologies usually involves imaging to analyze the condition of the brain. Magnetic resonance imaging (MRI) technology is widely used in brain disorder diagnosis. The image quality of MRI depends on the magnetostatic field strength and scanning time. Scanners with lower field strengths have the disadvantages of a low resolution and high imaging cost, and scanning takes a long time. The traditional super-resolution reconstruction method based on MRI generally states an optimization problem in terms of prior information and solves it using an iterative approach with a large time cost. Many methods based on deep learning have emerged to replace traditional methods. MRI super-resolution technology based on deep learning can effectively improve MRI resolution through a three-dimensional convolutional neural network; however, the training costs are relatively high. In this paper, we propose the use of two-dimensional super-resolution technology for the super-resolution reconstruction of MRI images. In the first reconstruction, we choose a scale factor of 2 and simulate half the volume of MRI slices as input. We utilize a receptive field block enhanced super-resolution generative adversarial network (RFB-ESRGAN), which is superior to other super-resolution technologies in terms of texture and frequency information. We then rebuild the super-resolution reconstructed slices in the MRI. In the second reconstruction, the image after the first reconstruction is composed of only half of the slices, and there are still missing values. In our previous work, we adopted the traditional interpolation method, and there was still a gap in the visual effect of the reconstructed images. Therefore, we propose a noise-based super-resolution network (nESRGAN). Adding noise to the network can provide additional texture restoration possibilities. We use nESRGAN to further restore MRI resolution and high-frequency information. Finally, we achieve the 3D reconstruction of brain MRI images through two super-resolution reconstructions. Our proposed method is superior to 3D super-resolution technology based on deep learning in terms of perception range and image quality evaluation standards.


AI ◽  
2021 ◽  
Vol 2 (4) ◽  
pp. 600-620
Author(s):  
Gabriele Accarino ◽  
Marco Chiarelli ◽  
Francesco Immorlano ◽  
Valeria Aloisi ◽  
Andrea Gatto ◽  
...  

One of the most important open challenges in climate science is downscaling, a procedure for making predictions at local scales starting from climatic field information available at large scale. Recent advances in deep learning provide new insights and modeling solutions to tackle downscaling-related tasks by automatically learning the coarse-to-fine grained resolution mapping. In particular, deep learning models designed for super-resolution problems in computer vision can be exploited because of the similarity between images and climatic field maps. For this reason, a new architecture tailored for statistical downscaling (SD), named MSG-GAN-SD, has been developed, allowing interpretability and good stability during training due to multi-scale gradient information. The proposed architecture, based on a Generative Adversarial Network (GAN), was applied to downscale ERA-Interim 2 m temperature fields from 83.25 to 13.87 km resolution, covering the EURO-CORDEX domain within the 1979–2018 period. The training process involves seasonal and monthly dataset arrangements, in addition to different training strategies, leading to several models. Furthermore, a model selection framework is introduced in order to mathematically select the best models during training. The selected models were then tested on the 2015–2018 period using several metrics to identify the best training strategy and dataset arrangement, which finally produced several evaluation maps. This work is the first attempt to use the MSG-GAN architecture for statistical downscaling. The achieved results demonstrate that the models trained on seasonal datasets performed better than those trained on monthly datasets. This study presents an accurate and cost-effective solution that is able to perform downscaling of 2 m temperature climatic maps.


2021 ◽  
Vol 2021 ◽  
pp. 1-25
Author(s):  
Young Ha Shin ◽  
Dong-Cheon Lee

An orthoimage, which is geometrically equivalent to a map, is one of the important geospatial products. Displacement and occlusion in optical images are caused by perspective projection, camera tilt, and object relief. A digital surface model (DSM) is essential for generating true orthoimages, both to correct displacement and to recover occluded areas. Light detection and ranging (LiDAR) data collected from an airborne laser scanner (ALS) system is a major source of DSMs. Traditional methods require sophisticated procedures to produce a true orthoimage. Most methods utilize 3D coordinates of the DSM and multiview images with overlapping areas for orthorectifying displacement and for detecting and recovering occluded areas. LiDAR point cloud data provides not only 3D coordinates but also intensity information reflected from object surfaces in the georeferenced orthoprojected space. This paper proposes true orthoimage generation based on generative adversarial network (GAN) deep learning (DL) with the Pix2Pix model, using the intensity and DSM of the LiDAR data. The major advantage of using LiDAR data is that, in terms of projection geometry, the data form an occlusion-free true orthoimage, except in the case of low image quality. Intensive experiments were performed using the benchmark datasets provided by the International Society for Photogrammetry and Remote Sensing (ISPRS). The results demonstrate that the proposed approach is capable of efficiently generating true orthoimages directly from LiDAR data. However, it is crucial to find appropriate preprocessing that improves the quality of the LiDAR intensity data in order to produce higher-quality true orthoimages.

