Single Image Dehazing Using Deep Learning

2021 ◽  
Vol 5 (1) ◽  
pp. 76
Author(s):  
Cahyo Adhi Hartanto ◽  
Laksmita Rahadianti

Many real-world situations, such as bad weather, may result in hazy environments. Images captured in these hazy conditions have low image quality due to microparticles in the air. The microparticles cause light to scatter and be absorbed, resulting in hazy images with various effects. In recent years, image dehazing has been researched in depth to handle images captured in these conditions. Various methods have been developed, from traditional methods to deep learning methods. Traditional methods focus more on the use of statistical priors, which have weaknesses in certain conditions. This paper proposes a novel architecture based on PDR-Net that uses pyramid dilated convolution together with pre-processing, processing, and post-processing modules and applies attention. The proposed network is trained to minimize L1 loss and perceptual loss on the O-Haze dataset. To evaluate our architecture's results, we used the structural similarity index measure (SSIM), peak signal-to-noise ratio (PSNR), and color difference as objective assessments and a psychovisual experiment as a subjective assessment. On the O-Haze dataset, our architecture obtained better results than the previous method, with an SSIM of 0.798 and a PSNR of 25.39, but did not improve on the color difference. The SSIM and PSNR results were supported by a subjective assessment with 65 respondents, most of whom chose the images restored by our architecture.
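As a rough illustration of the PSNR metric named in the abstract above, the following sketch computes it from the mean squared error between a reference image and a restored image. The function name `psnr`, the 8-bit peak value of 255, and the tiny sample images are assumptions made for illustration; they are not taken from the paper.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized grayscale images.

    Images are given as lists of rows. max_value=255 assumes 8-bit pixels.
    """
    flat_ref = [p for row in reference for p in row]
    flat_res = [p for row in restored for p in row]
    if len(flat_ref) != len(flat_res) or not flat_ref:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_res)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 2x2 haze-free reference and dehazed output.
clear = [[52, 55], [61, 59]]
dehazed = [[54, 55], [60, 57]]
print(round(psnr(clear, dehazed), 2))
```

Higher values indicate a restored image closer to the reference; identical images give an infinite PSNR.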

2020 ◽  
Vol 25 (2) ◽  
pp. 86-97
Author(s):  
Sandy Suryo Prayogo ◽  
Tubagus Maulana Kusuma

DVB is currently the most widely used digital television transmission standard. The most important element of a transmission process is the picture quality of the video received after transmission. Many factors can affect picture quality, one of which is the frame structure of the video. This paper tests the sensitivity of MPEG-4 video to frame structure in DVB-T transmission. The tests were carried out using MATLAB and Simulink simulations. ffmpeg was also used to prepare the format and settings of the video to be simulated. The video variables that were varied are the bitrate and the group of pictures (GOP), while the DVB-T transmission variable that was varied is the signal-to-noise ratio (SNR) on the AWGN channel between the transmitter (Tx) and the receiver (Rx). The experimental results are the average picture quality of the video, measured using the structural similarity index (SSIM). The bit error rate (BER) of the DVB-T bitstream was also measured. The experiments show how sensitive the bitrate and GOP of the video are in DVB-T transmission, with the conclusion that the larger the bitrate, the worse the picture quality, and the smaller the GOP value, the better the quality. It is hoped that this research can be extended using deep learning to obtain the appropriate frame structure for specific conditions in digital television transmission.
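The bit error rate (BER) measured in the study above is the fraction of transmitted bits that arrive flipped. A minimal sketch follows; the function name `bit_error_rate` and the sample bit sequences are hypothetical and unrelated to the MATLAB/Simulink setup the authors used.

```python
def bit_error_rate(sent_bits, received_bits):
    """Fraction of positions where the received bit differs from the sent bit."""
    if len(sent_bits) != len(received_bits) or not sent_bits:
        raise ValueError("bit sequences must be non-empty and equally long")
    errors = sum(1 for s, r in zip(sent_bits, received_bits) if s != r)
    return errors / len(sent_bits)

# Hypothetical transmitted (Tx) and received (Rx) bitstreams.
tx = [1, 0, 1, 1, 0, 0, 1, 0]
rx = [1, 0, 0, 1, 0, 1, 1, 0]
print(bit_error_rate(tx, rx))  # 2 flipped bits out of 8
```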


Author(s):  
S. Bash ◽  
B. Johnson ◽  
W. Gibbs ◽  
T. Zhang ◽  
A. Shankaranarayanan ◽  
...  

Objective: This prospective multicenter multireader study evaluated the performance of 40% scan-time reduced spinal magnetic resonance imaging (MRI) reconstructed with deep learning (DL). Methods: A total of 61 patients underwent standard of care (SOC) and accelerated (FAST) spine MRI. DL was used to enhance the accelerated set (FAST-DL). Three neuroradiologists were presented with paired side-by-side datasets (666 series). Datasets were blinded and randomized in sequence and left-right display order. Image features were preference rated. Structural similarity index (SSIM) and per-pixel L1 were assessed for the image sets pre and post DL enhancement as a quantitative assessment of the impact on image integrity. Results: FAST-DL was qualitatively better than SOC for perceived signal-to-noise ratio (SNR) and artifacts and equivalent for other features. Quantitative SSIM was high, supporting the absence of image corruption by DL processing. Conclusion: DL enables 40% spine MRI scan time reduction while maintaining diagnostic integrity and image quality, with perceived benefits in SNR and artifact reduction, suggesting potential for clinical practice utility.


2015 ◽  
pp. 1233-1245
Author(s):  
T. Chandrakanth ◽  
B. Sandhya

Advances in imaging and computing hardware have led to an explosion in the use of color images in image processing, graphics, and computer vision applications across various domains such as medical imaging, satellite imagery, document analysis, and biometrics, to name a few. However, these images are subjected to a wide variety of distortions during their acquisition, subsequent compression, transmission, processing, and reproduction, which degrade their visual quality. Hence, objective quality assessment of color images has emerged as one of the essential operations in image processing. During the last two decades, efforts have been made to design an image quality metric that can be calculated simply yet accurately reflects the subjective quality of human perception. In this paper, the authors evaluated the quality assessment of color images using the SSIM (structural similarity index) metric across various color spaces. They experimented to study the effect of color spaces in metric-based and distance-based quality assessment. The authors proposed a metric using the CIE Lab color space and SSIM, which has better correlation with the subjective assessment in a benchmark dataset.
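To make the SSIM metric discussed above more concrete, here is a simplified sketch that evaluates the SSIM formula once over a whole single-channel image. This is an assumption-laden simplification: SSIM is normally computed over local windows and averaged, and the per-color-space metric proposed by the authors is not reproduced here. The function name `global_ssim` is hypothetical; the constants 0.01 and 0.03 are the commonly used defaults.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM over two flattened, equally sized channels."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and equally long")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Population variances and covariance (some implementations use n - 1).
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants with the usual K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical flattened luminance channels.
original = [10, 20, 30, 40]
inverted = [40, 30, 20, 10]
print(global_ssim(original, original))
print(global_ssim(original, inverted))
```

An image compared with itself scores 1, while structurally dissimilar inputs score lower. To compare color spaces in the spirit of the abstract, one could apply such a function per channel after converting both images, though how the channels are combined is a design choice not specified here.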


2021 ◽  
Vol 11 (8) ◽  
pp. 3508
Author(s):  
Pedro Miguel Martinez-Girones ◽  
Javier Vera-Olmos ◽  
Mario Gil-Correa ◽  
Ana Ramos ◽  
Lina Garcia-Cañamaque ◽  
...  

Typically, pseudo-Computerized Tomography (CT) synthesis schemes proposed in the literature rely on complete atlases acquired with the same field of view (FOV) as the input volume. However, clinical CTs are usually acquired in a reduced FOV to decrease patient ionization. In this work, we present the Franken-CT approach, showing how the use of a non-parametric atlas composed of diverse anatomical overlapping Magnetic Resonance (MR)-CT scans and deep learning methods based on the U-net architecture enables synthesizing extended head and neck pseudo-CTs. Visual inspection of the results shows the high quality of the pseudo-CT and the robustness of the method, which is able to capture the details of the bone contours despite synthesizing the resulting image from knowledge obtained from images acquired with a completely different FOV. The experimental Zero-Normalized Cross-Correlation (ZNCC) reports 0.9367 ± 0.0138 (mean ± SD) and 95% confidence interval (0.9221, 0.9512); the experimental Mean Absolute Error (MAE) reports 73.9149 ± 9.2101 HU and 95% confidence interval (66.3383, 81.4915); the Structural Similarity Index Measure (SSIM) reports 0.9943 ± 0.0009 and 95% confidence interval (0.9935, 0.9951); and the experimental Dice coefficient for bone tissue reports 0.7051 ± 0.1126 and 95% confidence interval (0.6125, 0.7977). The voxel-by-voxel correlation plot shows an excellent correlation between pseudo-CT and ground-truth CT Hounsfield Units (m = 0.87; adjusted R² = 0.91; p < 0.001). The Bland–Altman plot shows that the average of the differences is low (−38.6471 ± 199.6100; 95% CI (−429.8827, 352.5884)). This work serves as a proof of concept demonstrating the great potential of deep learning methods for pseudo-CT synthesis, including on real clinical datasets.
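The ZNCC, MAE, and Dice figures reported above can be illustrated with the following sketch on flattened voxel lists. The function names, the sample Hounsfield values, and the bone threshold of 280 HU are hypothetical choices for illustration only; the abstract does not state how bone tissue was segmented.

```python
import math

def zncc(a, b):
    """Zero-normalized cross-correlation of two equally long voxel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        raise ValueError("ZNCC is undefined for constant inputs")
    return sum(p * q for p, q in zip(da, db)) / denom

def mae(a, b):
    """Mean absolute error, here in Hounsfield units."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def dice(mask_a, mask_b):
    """Dice coefficient of two boolean masks."""
    intersection = sum(1 for p, q in zip(mask_a, mask_b) if p and q)
    total = sum(1 for p in mask_a if p) + sum(1 for q in mask_b if q)
    return 2 * intersection / total if total else 1.0

# Hypothetical ground-truth and synthesized voxel intensities (HU).
ct = [-1000, 40, 300, 1200]
pseudo_ct = [-950, 60, 250, 1100]
BONE_THRESHOLD_HU = 280  # assumed threshold, not from the paper

bone_ct = [v > BONE_THRESHOLD_HU for v in ct]
bone_pseudo = [v > BONE_THRESHOLD_HU for v in pseudo_ct]

print(zncc(ct, pseudo_ct))
print(mae(ct, pseudo_ct))
print(dice(bone_ct, bone_pseudo))
```

ZNCC is insensitive to a global offset or positive scaling of intensities, which is why it is usually reported alongside an absolute measure such as MAE.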


2021 ◽  
Vol 11 (3) ◽  
pp. 1089
Author(s):  
Suhong Yoo ◽  
Jisang Lee ◽  
Junsu Bae ◽  
Hyoseon Jang ◽  
Hong-Gyoo Sohn

Aerial images are an outstanding option for observing terrain thanks to their high-resolution (HR) capability. However, the high operational cost of aerial imaging makes it difficult to acquire periodic observations of a region of interest. Satellite imagery is an alternative, but its low resolution is an obstacle. In this study, we proposed a context-based approach that takes the 10 m resolution Sentinel-2 imagery and produces 2.5 m and 5.0 m prediction images using aerial orthoimages acquired over the same period. The proposed model was compared with an enhanced deep super-resolution network (EDSR), which has excellent performance among existing super-resolution (SR) deep learning algorithms, using the peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and root-mean-squared error (RMSE). Our context-based ResU-Net outperformed the EDSR in all three metrics. Including the 60 m resolution Sentinel-2 imagery improved performance through fine-tuning: when 60 m images were included, RMSE decreased, and PSNR and SSIM increased. The results also confirmed that the denser the neural network, the higher the quality. Moreover, the accuracy was much higher when both denser feature dimensions and the 60 m images were used.


2021 ◽  
Vol 38 (5) ◽  
pp. 1361-1368
Author(s):  
Fatih M. Senalp ◽  
Murat Ceylan

Thermal camera systems can be used in any application that requires detecting heat changes, but thermal imaging systems are costly. In recent years, developments in deep learning have yielded higher-quality results than traditional methods. In this paper, thermal images of neonates (healthy and unhealthy) obtained from a high-resolution thermal camera were used, and these images were treated as high-resolution (ground truth) images. These thermal images were then downscaled at 1/2, 1/4, and 1/8 ratios, producing three different datasets of low-resolution images in different sizes. Super-resolution was then carried out with a deep network model based on generative adversarial networks (GAN) using the three datasets. The performance of the results was evaluated with PSNR (peak signal-to-noise ratio) and SSIM (structural similarity index measure). In addition, a healthy/unhealthy classification was carried out with a classifier network based on convolutional neural networks (CNN) to evaluate the super-resolution images obtained using the different datasets. The obtained results show the importance of combining medical thermal imaging with super-resolution methods.
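The construction of the three low-resolution datasets described above can be sketched as follows. The abstract does not say which resampling method was used, so block averaging is an assumption here, as are the function name `downscale` and the synthetic 8x8 image.

```python
def downscale(image, factor):
    """Downscale a grayscale image (list of rows) by averaging factor x factor blocks."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by the factor")
    result = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result

# Hypothetical 8x8 "high-resolution" image used as ground truth.
high_res = [[float(r * 8 + c) for c in range(8)] for r in range(8)]

# One low-resolution version per ratio: 1/2, 1/4, and 1/8.
datasets = {factor: downscale(high_res, factor) for factor in (2, 4, 8)}
for factor, low_res in datasets.items():
    print(f"1/{factor}: {len(low_res)}x{len(low_res[0])}")
```

Each low-resolution image would then be paired with its high-resolution original to train and evaluate a super-resolution model.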


Electronics ◽  
2021 ◽  
Vol 10 (22) ◽  
pp. 2855
Author(s):  
Rabia Naseem ◽  
Faouzi Alaya Cheikh ◽  
Azeddine Beghdadi ◽  
Khan Muhammad ◽  
Muhammad Sajjad

Cross-modal medical imaging techniques are predominantly being used in the clinical suite. Ensemble learning methods using cross-modal medical imaging add reliability to several medical image analysis tasks. Motivated by the performance of deep learning in several medical imaging tasks, this paper proposes a deep learning-based denoising method, the Cross-Modality Guided Denoising Network (CMGDNet), for removing Rician noise in T1-weighted (T1-w) Magnetic Resonance Images (MRI). CMGDNet uses a guidance image, which is a cross-modal (T2-w) image of better perceptual quality, to guide the model in denoising its noisy T1-w counterpart. This cross-modal combination allows the network to exploit complementary information existing in both images and therefore improves the learning capability of the model. The proposed framework consists of two components: a Paired Hierarchical Learning (PHL) module and a Cross-Modal Assisted Reconstruction (CMAR) module. The PHL module uses a Siamese network to extract hierarchical features from the dual images, which are then combined in a densely connected manner in the CMAR module to reconstruct the image. The impact of using registered guidance data is investigated in terms of removing noise as well as retaining structural similarity with the original image. Several experiments were conducted on two publicly available brain imaging datasets from the IXI database. The quantitative assessment using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Feature Similarity Index (FSIM) demonstrates that the proposed method exhibits 4.7% and 2.3% gain (average), respectively, in SSIM and FSIM values compared to other state-of-the-art denoising methods that do not integrate cross-modal image information in removing various levels of noise.


Author(s):  
Maximilian Gilles ◽  
Sascha Ibrahimpasic

Knowing the robot's pose is a crucial prerequisite for mobile robot tasks such as collision avoidance or autonomous navigation. Using powerful predictive models to estimate transformations for visual odometry via downward-facing cameras is an understudied area of research. This work proposes a novel approach based on deep learning for estimating ego motion with a downward-looking camera. The network can be trained completely unsupervised and is not restricted to a specific motion model. We propose two neural network architectures based on the Early Fusion and Slow Fusion design principles: “EarlyBird” and “SlowBird”. Both networks share a Spatial Transformer layer for image warping and are trained with a modified structural similarity index (SSIM) loss function. Experiments carried out in simulation and on a real-world differential drive robot show that our proposed deep learning based approaches achieve similar and partially better results than a state-of-the-art method based on the fast Fourier transform.


Author(s):  
Manish Balamurugan ◽  
Kathryn Chung ◽  
Venkat Kuppoor ◽  
Smruti Mahapatra ◽  
Aliaksei Pustavoitau ◽  
...  

In this study, we present USDL, a novel model that employs deep learning algorithms to reconstruct and enhance corrupted ultrasound images. We utilize an unsupervised neural network called an autoencoder, which works by compressing its input into a latent-space representation and then reconstructing the output from this representation. We trained our model on a dataset that comprises 15,700 in vivo images of the neck, wrist, elbow, and knee vasculature and compared the quality of the generated images using the structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR). In closely simulated conditions, the architecture exhibited an average reconstruction accuracy of 90% as indicated by our SSIM. Our study demonstrates that USDL outperforms state-of-the-art image enhancement and reconstruction techniques in both image quality and computational complexity, while maintaining architectural efficiency.


Informatics ◽  
2021 ◽  
Vol 8 (4) ◽  
pp. 84
Author(s):  
Noé Tits ◽  
Kevin El Haddad ◽  
Thierry Dutoit

In this paper, we study the controllability of an Expressive TTS system trained on a dataset for continuous control. The dataset is the Blizzard 2013 dataset, based on audiobooks read by a female speaker and containing great variability in styles and expressiveness. Controllability is evaluated with both an objective and a subjective experiment. The objective assessment is based on a measure of correlation between acoustic features and the dimensions of the latent space representing expressiveness. The subjective assessment is based on a perceptual experiment in which users are shown an interface for Controllable Expressive TTS and asked to retrieve a synthetic utterance whose expressiveness subjectively corresponds to that of a reference utterance.

