Image Deblurring Using Multi-Stream Bottom-Top-Bottom Attention Network and Global Information-Based Fusion and Reconstruction Network

Image deblurring is a challenging ill-posed problem in computer vision, and Gaussian blur is a common model for image and signal degradation. Deep learning-based deblurring methods have attracted much attention because of their advantages over traditional methods that rely on hand-designed features. However, existing deep learning-based techniques still struggle to restore fine details and reconstruct sharp edges. To address this issue, we design an effective end-to-end deep learning-based non-blind image deblurring algorithm. In the proposed method, a multi-stream bottom-top-bottom attention network (MBANet) with an encoder-decoder structure integrates low-level cues and high-level semantic information, which helps extract image features more effectively and improves the computational efficiency of the network. Moreover, MBANet adopts a coarse-to-fine multi-scale strategy to process the input images and improve deblurring performance. Furthermore, a global information-based fusion and reconstruction network is proposed to fuse the multi-scale output maps, improve the global spatial information, and recurrently refine the output deblurred image. Experiments were conducted on the public GoPro dataset and the realistic and dynamic scenes (REDS) dataset to evaluate the effectiveness and robustness of the proposed method. The results show that the proposed method generally outperforms some traditional deblurring methods and state-of-the-art deep learning-based deblurring methods, such as the scale-recurrent network (SRN) and the denoising prior driven deep neural network (DPDNN), both in quantitative indexes such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) and in visual quality.
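The abstract reports results in PSNR without spelling the metric out. As a minimal sketch, the standard definition PSNR = 10 log10(MAX^2 / MSE) can be computed as below. The function name `psnr`, the list-of-rows image layout, and the 8-bit peak value of 255 are assumptions made for illustration, not details taken from the paper.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as lists of rows of pixel intensities (an assumed
    layout); max_value is the assumed peak intensity (255 for 8-bit data).
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of rows")
    squared_error = 0.0
    count = 0
    for ref_row, out_row in zip(reference, restored):
        if len(ref_row) != len(out_row):
            raise ValueError("rows must have the same length")
        for ref_pixel, out_pixel in zip(ref_row, out_row):
            squared_error += (ref_pixel - out_pixel) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 images: a sharp reference and a slightly wrong restoration.
sharp = [[0, 255], [255, 0]]
deblurred = [[0, 250], [255, 5]]
score = psnr(sharp, deblurred)
```

A higher value means the restored image is closer to the reference. Published benchmarks typically average this score over all test images.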

A Lightweight Fusion Distillation Network for Image Deblurring and Deraining

Sensors ◽

10.3390/s21165312 ◽

2021 ◽

Vol 21 (16) ◽

pp. 5312

Author(s):

Yanni Zhang ◽

Yiming Liu ◽

Qiang Li ◽

Jianzhong Wang ◽

Miao Qi ◽

...

Keyword(s):

Deep Learning ◽

Image Deblurring ◽

Image Features ◽

Image Feature ◽

Model Complexity ◽

Small Scale ◽

Feature Maps ◽

Learning Framework ◽

Channel Information ◽

Scale Spaces

Recently, deep learning-based image deblurring and deraining have been well developed. However, most of these methods fail to distill the useful features. Moreover, exploiting detailed image features in a deep learning framework typically requires a large number of parameters, which inevitably imposes a high computational burden on the network. To solve these problems, we propose a lightweight fusion distillation network (LFDN) for image deblurring and deraining. The proposed LFDN is designed as an encoder–decoder architecture. In the encoding stage, the image feature is reduced to various small-scale spaces for multi-scale information extraction and fusion without much information loss. Then, a feature distillation normalization block is placed at the beginning of the decoding stage, which enables the network to continuously distill and screen valuable channel information from the feature maps. In addition, an attention mechanism carries out an information fusion strategy between the distillation modules and the feature channels. By fusing different information in this way, our network achieves state-of-the-art image deblurring and deraining results with fewer parameters and outperforms existing methods in model complexity.

Deep Learning Image Processing Enables 40% Faster Spinal MR Scans Which Match or Exceed Quality of Standard of Care

Clinical Neuroradiology ◽

10.1007/s00062-021-01121-2 ◽

2021 ◽

Author(s):

S. Bash ◽

B. Johnson ◽

W. Gibbs ◽

T. Zhang ◽

A. Shankaranarayanan ◽

...

Keyword(s):

Deep Learning ◽

Signal To Noise Ratio ◽

Similarity Index ◽

Standard Of Care ◽

Structural Similarity ◽

Image Features ◽

Scan Time ◽

Magnetic Resonance Imaging Mri ◽

Display Order ◽

Spine Mri

Objective: This prospective multicenter multireader study evaluated the performance of 40% scan-time reduced spinal magnetic resonance imaging (MRI) reconstructed with deep learning (DL). Methods: A total of 61 patients underwent standard of care (SOC) and accelerated (FAST) spine MRI. DL was used to enhance the accelerated set (FAST-DL). Three neuroradiologists were presented with paired side-by-side datasets (666 series). Datasets were blinded and randomized in sequence and left-right display order. Image features were preference rated. Structural similarity index (SSIM) and per-pixel L1 were assessed for the image sets before and after DL enhancement as a quantitative assessment of the impact on image integrity. Results: FAST-DL was qualitatively better than SOC for perceived signal-to-noise ratio (SNR) and artifacts and equivalent for other features. Quantitative SSIM was high, supporting the absence of image corruption by DL processing. Conclusion: DL enables a 40% reduction in spine MRI scan time while maintaining diagnostic integrity and image quality, with perceived benefits in SNR and artifact reduction, suggesting potential for clinical practice utility.
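The two quantitative measures named in the abstract can be illustrated with a short sketch. The code below computes a mean per-pixel L1 difference and a simplified SSIM over one global window. This is an assumption made to keep the example short: standard SSIM implementations average the index over local sliding windows, and the study's actual computation is not described in the abstract. The function names and the flattened-list image layout are likewise hypothetical.

```python
def mean_l1(image_a, image_b):
    """Mean absolute per-pixel difference between two flattened images."""
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)


def global_ssim(image_a, image_b, dynamic_range=255.0):
    """SSIM computed over a single global window (a simplification).

    Uses the usual stabilising constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2,
    where L is the assumed dynamic range of the pixel values.
    """
    n = len(image_a)
    mu_a = sum(image_a) / n
    mu_b = sum(image_b) / n
    var_a = sum((a - mu_a) ** 2 for a in image_a) / n
    var_b = sum((b - mu_b) ** 2 for b in image_b) / n
    cov = sum((a - mu_a) * (b - mu_b) for a, b in zip(image_a, image_b)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


# Hypothetical flattened images before and after enhancement.
before = [10, 50, 90, 130, 170, 210]
after = [12, 49, 93, 128, 171, 207]
l1_change = mean_l1(before, after)
similarity = global_ssim(before, after)
```

An SSIM close to 1 together with a small L1 value indicates that the processing changed the image very little, which is the sense in which the abstract uses these measures.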

Efficient, high-performance semantic segmentation using multi-scale feature extraction

PLoS ONE ◽

10.1371/journal.pone.0255397 ◽

2021 ◽

Vol 16 (8) ◽

pp. e0255397

Author(s):

Moritz Knolle ◽

Georgios Kaissis ◽

Friederike Jungmann ◽

Sebastian Ziegelmayer ◽

Daniel Sasse ◽

...

Keyword(s):

Deep Learning ◽

Graphics Processing Units ◽

Substantial Reduction ◽

Image Features ◽

Tumor Segmentation ◽

Processing Unit ◽

Central Processing ◽

Multi Scale ◽

Computational Performance ◽

Wide Range

The success of deep learning in recent years has arguably been driven by the availability of large datasets for training powerful predictive algorithms. In medical applications however, the sensitive nature of the data limits the collection and exchange of large-scale datasets. Privacy-preserving and collaborative learning systems can enable the successful application of machine learning in medicine. However, collaborative protocols such as federated learning require the frequent transfer of parameter updates over a network. To enable the deployment of such protocols to a wide range of systems with varying computational performance, efficient deep learning architectures for resource-constrained environments are required. Here we present MoNet, a small, highly optimized neural-network-based segmentation algorithm leveraging efficient multi-scale image features. MoNet is a shallow, U-Net-like architecture based on repeated, dilated convolutions with decreasing dilation rates. We apply and test our architecture on the challenging clinical tasks of pancreatic segmentation in computed tomography (CT) images as well as brain tumor segmentation in magnetic resonance imaging (MRI) data. We assess our model’s segmentation performance and demonstrate that it provides performance on par with compared architectures while providing superior out-of-sample generalization performance, outperforming larger architectures on an independent validation set, while utilizing significantly fewer parameters. We furthermore confirm the suitability of our architecture for federated learning applications by demonstrating a substantial reduction in serialized model storage requirement as a surrogate for network data transfer. Finally, we evaluate MoNet’s inference latency on the central processing unit (CPU) to determine its utility in environments without access to graphics processing units. Our implementation is publicly available as free and open-source software.
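To make "repeated, dilated convolutions with decreasing dilation rates" concrete, the sketch below implements a one-dimensional dilated convolution in plain Python and computes the receptive field of a stack of such layers. The dilation rates (4, 2, 1), the kernel size of 3, and the averaging kernel are hypothetical values chosen for illustration. The abstract does not state MoNet's actual configuration, and the real network operates on 2-D or 3-D images rather than 1-D signals.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation form).

    Kernel taps are spaced `dilation` samples apart, so a short kernel
    covers a wide span of the input without extra parameters.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field (in input samples) of stacked stride-1 dilated convs."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical stack with decreasing dilation rates.
rates = [4, 2, 1]
kernel = [1 / 3, 1 / 3, 1 / 3]  # assumed averaging kernel
features = [float(x) for x in range(16)]
for rate in rates:
    features = dilated_conv1d(features, kernel, rate)

field = receptive_field(len(kernel), rates)
```

With these assumed values, three layers of three weights each see 15 input samples, which shows how a shallow network can still aggregate multi-scale context with few parameters.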

MS-AFF: A Novel Semantic Segmentation Approach for Buried Object Based on Multi-scale Attentional Feature Fusion

10.21203/rs.3.rs-193757/v1 ◽

2021 ◽

Author(s):

Chao Lu ◽

Fansheng Chen ◽

Xiaofeng Su ◽

Dan Zeng

Keyword(s):

Deep Learning ◽

Spatial Information ◽

Feature Fusion ◽

Infrared Image ◽

Semantic Segmentation ◽

Target Object ◽

Infrared Images ◽

Feature Maps ◽

Multi Scale ◽

Visible Images

Infrared technology is widely used in precision guidance and mine detection because it can capture the heat radiated from the target object. We use infrared (IR) thermography to obtain infrared images of buried objects. Compared with visible images, infrared images have poor resolution, low contrast, and a fuzzy visual appearance, which makes it difficult to segment the target object, especially against complex backgrounds. Under these conditions, traditional segmentation methods perform poorly on infrared images because they are easily disturbed by noise and non-target objects. With the advance of deep convolutional neural networks (CNNs), deep learning-based methods have made significant improvements in semantic segmentation. However, few of them address infrared image semantic segmentation, which is a more challenging scenario than visible images. Moreover, the lack of an infrared image dataset is also a problem for current deep learning-based methods. To address these problems, we propose a multi-scale attentional feature fusion (MS-AFF) module for infrared image semantic segmentation. Specifically, we integrate a series of feature maps from different levels using an atrous spatial pyramid structure, which gives the model a rich representation ability on infrared images. In addition, a global spatial information attention module is employed to let the model focus on the target region and reduce disturbance from the background of infrared images. We also propose an infrared segmentation dataset built with an infrared thermal imaging system. Extensive experiments on this dataset show the superiority of our method.

A Fast Aircraft Detection Method for SAR Images Based on Efficient Bidirectional Path Aggregated Attention Network

Remote Sensing ◽

10.3390/rs13152940 ◽

2021 ◽

Vol 13 (15) ◽

pp. 2940

Author(s):

Ru Luo ◽

Lifu Chen ◽

Jin Xing ◽

Zhihui Yuan ◽

Siyu Tan ◽

...

Keyword(s):

Spatial Information ◽

Speckle Noise ◽

False Alarms ◽

Detection Accuracy ◽

Sar Images ◽

Attention Network ◽

Multi Scale ◽

Complex Background ◽

Size Heterogeneity ◽

Aircraft Detection

Aircraft detection from synthetic aperture radar (SAR) images faces several major challenges: the shattered features of the aircraft, size heterogeneity, and the interference of a complex background. To address these problems, an Efficient Bidirectional Path Aggregation Attention Network (EBPA2N) is proposed. In EBPA2N, YOLOv5s is used as the base network, and the Involution Enhanced Path Aggregation (IEPA) module and the Effective Residual Shuffle Attention (ERSA) module are proposed and systematically integrated to improve the detection accuracy of the aircraft. The IEPA module aims to effectively extract advanced semantic and spatial information to better capture the multi-scale scattering features of aircraft. The lightweight ERSA module then further enhances the extracted features to overcome the interference of complex background and speckle noise, so as to reduce false alarms. To verify the effectiveness of the proposed network, Gaofen-3 airport SAR data with 1 m resolution are used in the experiment. The detection rate and false alarm rate of the EBPA2N algorithm are 93.05% and 4.49%, respectively, which are superior to those of the recent EfficientDet-D0 and YOLOv5s networks; the algorithm also has an advantage in detection speed.
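The abstract does not define its two evaluation figures. The sketch below assumes definitions commonly used in SAR target detection: detection rate as correctly detected targets divided by ground-truth targets, and false alarm rate as false detections divided by all detections. Both the definitions and the counts in the example are assumptions for illustration, not values or formulas from the paper.

```python
def detection_rate(true_positives, ground_truth_total):
    """Fraction of ground-truth targets that were detected (assumed definition)."""
    if ground_truth_total <= 0:
        raise ValueError("ground_truth_total must be positive")
    return true_positives / ground_truth_total


def false_alarm_rate(false_positives, true_positives):
    """Fraction of all reported detections that are false (assumed definition)."""
    reported = true_positives + false_positives
    if reported <= 0:
        raise ValueError("at least one detection is required")
    return false_positives / reported


# Hypothetical counts: 100 aircraft in the scene, 90 found, 10 false alarms.
dr = detection_rate(true_positives=90, ground_truth_total=100)
far = false_alarm_rate(false_positives=10, true_positives=90)
```

Under these definitions the two rates are not complementary: one is normalised by the number of real targets and the other by the number of reported detections.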

Variational-EM-Based Deep Learning for Noise-Blind Image Deblurring

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) ◽

10.1109/cvpr42600.2020.00368 ◽

2020 ◽

Cited By ~ 1

Author(s):

Yuesong Nan ◽

Yuhui Quan ◽

Hui Ji

Keyword(s):

Deep Learning ◽

Image Deblurring ◽

Blind Image Deblurring ◽

Variational Em ◽

Blind Image

Learning Multi-Scale Shrinkage Fields for Blind Image Deblurring

Communications in Computer and Information Science - Internet Multimedia Computing and Service ◽

10.1007/978-981-10-8530-7_11 ◽

2018 ◽

pp. 108-115

Author(s):

Bingwang Zhang ◽

Risheng Liu ◽

Haojie Li ◽

Qi Yuan ◽

Xin Fan ◽

...

Keyword(s):

Image Deblurring ◽

Multi Scale ◽

Blind Image Deblurring ◽

Blind Image

Multi-Scale Inception Based Super-Resolution Using Deep Learning Approach

Electronics ◽

10.3390/electronics8080892 ◽

2019 ◽

Vol 8 (8) ◽

pp. 892 ◽

Cited By ~ 1

Author(s):

Wazir Muhammad ◽

Supavadee Aramvith

Keyword(s):

Deep Learning ◽

Network Architecture ◽

Computational Cost ◽

Similarity Index ◽

Super Resolution ◽

Structural Similarity ◽

Learning Approach ◽

Deep Convolutional Neural Networks ◽

Multi Scale ◽

Interpolation Techniques

Single image super-resolution (SISR) aims to reconstruct a high-resolution (HR) image from a low-resolution (LR) image. Deep convolutional neural networks (CNNs) have recently achieved remarkable progress on the SISR problem in terms of accuracy and efficiency. In this paper, an innovative technique, namely multi-scale inception-based super-resolution (SR) using a deep learning approach, or MSISRD, is proposed for fast and accurate SISR reconstruction. The proposed network employs a deconvolution layer to upsample the LR image to the desired HR image. This is in contrast to existing approaches that use interpolation techniques to upscale the LR image. Interpolation techniques are not designed for this purpose, which results in the creation of undesired noise in the model. Moreover, existing methods mainly focus on a shallow network or on stacking multiple layers with the aim of creating a deeper network architecture. This design leads to the vanishing gradient problem during training and increases the computational cost of the model. Our proposed method does not use any hand-designed pre-processing steps, such as bicubic interpolation. Furthermore, an asymmetric convolution block is employed to reduce the number of parameters, in addition to the inception block adopted from GoogLeNet, to reconstruct the multi-scale information. Experimental results demonstrate that the proposed model achieves better performance than twelve state-of-the-art methods in terms of the average peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), with a reduced number of parameters, for scale factors of 2×, 4×, and 8×.
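The difference between a deconvolution (transposed convolution) layer and fixed interpolation can be sketched in one dimension. In the code below each input sample scatters a scaled copy of the kernel into the output, so the upsampling weights are ordinary parameters that a network could learn. The function name, the kernels, and the stride of 2 are assumptions for illustration. The paper's network works on 2-D images and its layer settings are not given in the abstract.

```python
def transposed_conv1d(signal, kernel, stride):
    """1-D transposed convolution without padding.

    Each input sample adds `sample * kernel` into the output, starting at
    position `index * stride`. Output length is (n - 1) * stride + k.
    """
    output = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for index, sample in enumerate(signal):
        for tap, weight in enumerate(kernel):
            output[index * stride + tap] += sample * weight
    return output


low_res = [1.0, 2.0, 3.0]

# With this particular kernel the layer reproduces nearest-neighbour upsampling.
nearest = transposed_conv1d(low_res, [1.0, 1.0], stride=2)

# A different (here hand-picked, in practice learned) kernel blends neighbours.
blended = transposed_conv1d(low_res, [0.5, 1.0, 0.5], stride=2)
```

The first call shows that a fixed interpolation scheme is one special case of the kernel, while training is free to move the weights away from it.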

Blind Image Deblurring Based on Dual Attention Network and 2D Blur Kernel Estimation

10.1109/icip42928.2021.9506342 ◽

2021 ◽

Author(s):

Senmao Tian ◽

Shunli Zhang ◽

Beibei Lin

Keyword(s):

Kernel Estimation ◽

Image Deblurring ◽

Attention Network ◽

Blind Image Deblurring ◽

Blur Kernel Estimation ◽

Blur Kernel ◽

Blind Image

ANALISIS SENSITIVITAS VIDEO MPEG-4 BERDASARKAN STRUKTUR FRAME PADA TRANSMISI DVB-T (Sensitivity Analysis of MPEG-4 Video Based on Frame Structure in DVB-T Transmission)

Jurnal Ilmiah Informatika Komputer ◽

10.35760/ik.2020.v25i2.2691 ◽

2020 ◽

Vol 25 (2) ◽

pp. 86-97

Author(s):

Sandy Suryo Prayogo ◽

Tubagus Maulana Kusuma

Keyword(s):

Deep Learning ◽

Bit Error Rate ◽

Error Rate ◽

Signal To Noise Ratio ◽

Similarity Index ◽

Structural Similarity ◽

Signal To Noise ◽

Structural Similarity Index ◽

Noise Ratio

DVB is currently the most widely used digital television transmission standard. The most important element of a transmission process is the picture quality of the video received after transmission. Many factors can affect picture quality, one of which is the frame structure of the video. This paper tests the sensitivity of MPEG-4 video to frame structure in DVB-T transmission. The tests were carried out using MATLAB and Simulink simulations, and ffmpeg was used to prepare the format and settings of the video to be simulated. The video variables that were varied are the bitrate and the group of pictures (GOP), while the DVB-T transmission variable that was varied is the signal-to-noise ratio (SNR) on the AWGN channel between the transmitter (Tx) and the receiver (Rx). The result of the experiments is the average picture quality of the video, measured using the structural similarity index (SSIM). The bit error rate (BER) of the DVB-T bitstream was also measured. The experiments show how sensitive the bitrate and GOP of the video are in DVB-T transmission, with the conclusion that the larger the bitrate, the worse the picture quality, and the smaller the GOP, the better the quality. This research is expected to be developed further using deep learning to obtain the appropriate frame structure for particular conditions in digital television transmission.
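The bit error rate (BER) measured on the DVB-T bitstream is the fraction of transmitted bits that arrive flipped. A minimal sketch follows. The function name and the byte strings are hypothetical, and the study itself ran its measurements in MATLAB and Simulink rather than Python.

```python
def bit_error_rate(sent, received):
    """Fraction of differing bits between two equal-length byte strings."""
    if len(sent) != len(received):
        raise ValueError("bitstreams must have the same length")
    if not sent:
        raise ValueError("bitstreams must not be empty")
    # XOR leaves a 1 wherever the two bytes disagree; count those bits.
    flipped = sum(bin(a ^ b).count("1") for a, b in zip(sent, received))
    return flipped / (8 * len(sent))


# Hypothetical 2-byte payload with one corrupted bit in the second byte.
transmitted = bytes([0b10101010, 0b11110000])
received = bytes([0b10101010, 0b11110001])
ber = bit_error_rate(transmitted, received)
```

In the setting the abstract describes, this quantity would be recorded at each SNR setting of the AWGN channel alongside the SSIM of the decoded frames.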
