Unsupervised deep learning based ego motion estimation with a downward facing camera

Author(s):  
Maximilian Gilles ◽  
Sascha Ibrahimpasic

Knowing the robot's pose is a crucial prerequisite for mobile robot tasks such as collision avoidance or autonomous navigation. Using powerful predictive models to estimate transformations for visual odometry with downward facing cameras is an understudied area of research. This work proposes a novel deep learning approach for estimating ego motion with a downward facing camera. The networks can be trained completely unsupervised and are not restricted to a specific motion model. We propose two neural network architectures based on the Early Fusion and Slow Fusion design principles: “EarlyBird” and “SlowBird”. Both networks share a Spatial Transformer layer for image warping and are trained with a modified structural similarity index (SSIM) loss function. Experiments carried out in simulation and on a real-world differential drive robot show that the proposed deep learning approaches perform similarly to, and in some cases better than, a state-of-the-art method based on the fast Fourier transform.
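
The abstract does not say how the SSIM loss is modified, so the following is only a minimal sketch of the standard, unmodified idea: compute SSIM between a warped frame and the target frame, and minimize 1 - SSIM. It assumes a single global window over the whole image and intensities in [0, 1]; the names `global_ssim` and `ssim_loss` are hypothetical and are not taken from the paper.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Standard SSIM computed over one global window (no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Stabilizing constants from the usual SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    num = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)

def ssim_loss(warped, target):
    """Assumed loss form: 0 for identical images, larger for dissimilar ones."""
    return 1.0 - global_ssim(warped, target)
```

In a training setup, `warped` would be the previous frame after warping by the predicted transformation and `target` the current frame; a differentiable framework would be needed in place of NumPy for actual training.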

2020 ◽  
Vol 25 (2) ◽  
pp. 86-97
Author(s):  
Sandy Suryo Prayogo ◽  
Tubagus Maulana Kusuma

DVB is currently the most widely used digital television transmission standard. The most important element of a transmission process is the image quality of the video received after transmission. Many factors can affect image quality, one of which is the frame structure of the video. This paper tests the sensitivity of MPEG-4 video to frame structure in DVB-T transmission. The tests were performed in simulation using MATLAB and Simulink. ffmpeg was also used to prepare the format and settings of the video to be simulated. The video variables that were varied are the bitrate and the group of pictures (GOP), while the DVB-T transmission variable that was varied is the signal-to-noise ratio (SNR) on the AWGN channel between the transmitter (Tx) and the receiver (Rx). The results of the experiments are the average image quality of the video, measured using the structural similarity index (SSIM). The bit error rate (BER) of the DVB-T bitstream was also measured. The experiments show how sensitive the video's bitrate and GOP are in DVB-T transmission, with the conclusion that the larger the bitrate, the worse the image quality, and the smaller the GOP, the better the quality. It is hoped that this research can be extended using deep learning to find the right frame structure for particular conditions in the digital television transmission process.
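
As a rough illustration of how BER depends on SNR over an AWGN channel, the sketch below counts bit errors after a toy BPSK link. This is an assumption-laden stand-in: DVB-T itself uses coded OFDM, and the study used MATLAB and Simulink, so `bpsk_awgn` and `bit_error_rate` are hypothetical helpers, not the authors' simulation.

```python
import numpy as np

def bit_error_rate(sent, received):
    """Fraction of bit positions that differ between two bit sequences."""
    sent = np.asarray(sent)
    received = np.asarray(received)
    return float(np.mean(sent != received))

def bpsk_awgn(bits, snr_db, rng):
    """Send bits as BPSK symbols over a real AWGN channel and hard-decide.

    Assumption: unit symbol energy, with snr_db read as Es/N0 in dB.
    """
    bits = np.asarray(bits)
    symbols = 1.0 - 2.0 * bits              # bit 0 -> +1, bit 1 -> -1
    snr = 10.0 ** (snr_db / 10.0)
    sigma = np.sqrt(1.0 / (2.0 * snr))      # noise standard deviation
    noisy = symbols + sigma * rng.standard_normal(bits.shape)
    return (noisy < 0).astype(bits.dtype)   # negative sample -> bit 1
```

Sweeping `snr_db` and recording `bit_error_rate` for each value gives the kind of BER-versus-SNR curve the abstract refers to, under these simplified assumptions.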


Author(s):  
S. Bash ◽  
B. Johnson ◽  
W. Gibbs ◽  
T. Zhang ◽  
A. Shankaranarayanan ◽  
...  

Objective: This prospective multicenter multireader study evaluated the performance of 40% scan-time reduced spinal magnetic resonance imaging (MRI) reconstructed with deep learning (DL). Methods: A total of 61 patients underwent standard of care (SOC) and accelerated (FAST) spine MRI. DL was used to enhance the accelerated set (FAST-DL). Three neuroradiologists were presented with paired side-by-side datasets (666 series). Datasets were blinded and randomized in sequence and left-right display order. Image features were preference rated. Structural similarity index (SSIM) and per-pixel L1 were assessed for the image sets pre and post DL enhancement as a quantitative assessment of image integrity impact. Results: FAST-DL was qualitatively better than SOC for perceived signal-to-noise ratio (SNR) and artifacts and equivalent for other features. Quantitative SSIM was high, supporting the absence of image corruption by DL processing. Conclusion: DL enables 40% spine MRI scan time reduction while maintaining diagnostic integrity and image quality with perceived benefits in SNR and artifact reduction, suggesting potential for clinical practice utility.


2017 ◽  
Vol 26 (03) ◽  
pp. 1740006 ◽  
Author(s):  
Kiarash Ahi ◽  
Abdiel Rivera ◽  
Anas Mazadi ◽  
Mehdi Anwar

In this paper, a novel approach for marking integrated circuit packages with authentication nanosignatures is introduced. The signature patterns are fabricated using electron beam lithography, and their robustness against aging and humidity is investigated. A recipe comprising image processing techniques and measurement of similarity indices has been developed. These signatures are proposed to be fabricated at the manufacturer side of the supply chain and decoded at the consumer end. Robustness against the ambient environment and aging is therefore a requirement for these signatures to survive in the global supply chain. The calculated Mean Square Error and Structural SIMilarity Index confirmed that the reflected patterns of the signatures remain unchanged under aging and humidity.


2021 ◽  
Vol 11 (8) ◽  
pp. 3508
Author(s):  
Pedro Miguel Martinez-Girones ◽  
Javier Vera-Olmos ◽  
Mario Gil-Correa ◽  
Ana Ramos ◽  
Lina Garcia-Cañamaque ◽  
...  

Typically, pseudo-Computerized Tomography (CT) synthesis schemes proposed in the literature rely on complete atlases acquired with the same field of view (FOV) as the input volume. However, clinical CTs are usually acquired in a reduced FOV to decrease patient ionization. In this work, we present the Franken-CT approach, showing how the use of a non-parametric atlas composed of diverse anatomical overlapping Magnetic Resonance (MR)-CT scans and deep learning methods based on the U-net architecture enables synthesizing extended head and neck pseudo-CTs. Visual inspection of the results shows the high quality of the pseudo-CT and the robustness of the method, which is able to capture the details of the bone contours despite synthesizing the resulting image from knowledge obtained from images acquired with a completely different FOV. The experimental Zero-Normalized Cross-Correlation (ZNCC) is 0.9367 ± 0.0138 (mean ± SD) with a 95% confidence interval of (0.9221, 0.9512); the experimental Mean Absolute Error (MAE) is 73.9149 ± 9.2101 HU with a 95% confidence interval of (66.3383, 81.4915); the Structural Similarity Index Measure (SSIM) is 0.9943 ± 0.0009 with a 95% confidence interval of (0.9935, 0.9951); and the experimental Dice coefficient for bone tissue is 0.7051 ± 0.1126 with a 95% confidence interval of (0.6125, 0.7977). The voxel-by-voxel correlation plot shows an excellent correlation between pseudo-CT and ground-truth CT Hounsfield Units (m = 0.87; adjusted R² = 0.91; p < 0.001). The Bland–Altman plot shows that the average of the differences is low (−38.6471 ± 199.6100; 95% CI (−429.8827, 352.5884)). This work serves as a proof of concept demonstrating the great potential of deep learning methods for pseudo-CT synthesis, including on real clinical datasets.
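
The ZNCC, MAE, and Dice figures above follow standard definitions, which can be sketched as below. This is a generic illustration, not the authors' evaluation code: the function names are hypothetical, and how the bone masks are obtained (for example, by thresholding Hounsfield Units) is not stated in the abstract and is left to the caller.

```python
import numpy as np

def zncc(a, b):
    """Zero-normalized cross-correlation between two volumes, in [-1, 1]."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a0 = a - a.mean()
    b0 = b - b.mean()
    return float(np.dot(a0, b0) / (np.linalg.norm(a0) * np.linalg.norm(b0)))

def mae(a, b):
    """Mean absolute error, in the units of the inputs (HU for CT)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean(np.abs(a - b)))

def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks (e.g. bone segmentations)."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    total = mask_a.sum() + mask_b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return float(2.0 * np.logical_and(mask_a, mask_b).sum() / total)
```

ZNCC is insensitive to a global gain and offset between the two volumes, which is why it is reported alongside MAE, an absolute measure in HU.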


2021 ◽  
Vol 11 (3) ◽  
pp. 1089
Author(s):  
Suhong Yoo ◽  
Jisang Lee ◽  
Junsu Bae ◽  
Hyoseon Jang ◽  
Hong-Gyoo Sohn

Aerial images are an outstanding option for observing terrain thanks to their high-resolution (HR) capability. However, the high operational cost of aerial imaging makes it difficult to acquire periodic observations of a region of interest. Satellite imagery is an alternative, but its low resolution is an obstacle. In this study, we proposed a context-based approach to simulate the 10 m resolution of Sentinel-2 imagery to produce 2.5 and 5.0 m prediction images using the aerial orthoimage acquired over the same period. The proposed model was compared with an enhanced deep super-resolution network (EDSR), which performs very well among existing super-resolution (SR) deep learning algorithms, using the peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and root-mean-squared error (RMSE). Our context-based ResU-Net outperformed the EDSR in all three metrics. Including the 60 m resolution Sentinel-2 imagery further improved performance through fine-tuning: RMSE decreased, and PSNR and SSIM increased. The results also indicated that the denser the neural network, the higher the quality. Moreover, the accuracy was much higher when both denser feature dimensions and the 60 m images were used.
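
RMSE and PSNR are closely related, which explains why one decreases as the other increases in the comparison above. A minimal sketch under the usual definitions follows; the 8-bit `data_range` default is an assumption, since the abstract does not state the bit depth used for Sentinel-2 evaluation.

```python
import numpy as np

def rmse(pred, ref):
    """Root-mean-squared error between a predicted and a reference image."""
    pred = np.asarray(pred, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def psnr(pred, ref, data_range=255.0):
    """Peak signal-to-noise ratio in dB: 20 * log10(data_range / RMSE)."""
    err = rmse(pred, ref)
    if err == 0.0:
        return float("inf")  # identical images
    return float(20.0 * np.log10(data_range / err))
```

Because PSNR is a monotonically decreasing function of RMSE for a fixed `data_range`, the two metrics always rank methods identically; SSIM adds independent information about structure.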


2021 ◽  
Vol 38 (5) ◽  
pp. 1361-1368
Author(s):  
Fatih M. Senalp ◽  
Murat Ceylan

Thermal camera systems can be used in any application that requires detecting heat changes, but thermal imaging systems are very costly. In recent years, developments in deep learning have yielded higher-quality results than traditional methods. In this paper, thermal images of neonates (healthy and unhealthy) obtained from a high-resolution thermal camera were used as high-resolution (ground truth) images. These thermal images were then downscaled at ratios of 1/2, 1/4, and 1/8, yielding three datasets of low-resolution images of different sizes. Super-resolution was then performed on the three datasets with a deep network model based on generative adversarial networks (GAN). The results were evaluated with peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM). In addition, a classifier network based on convolutional neural networks (CNN) performed healthy versus unhealthy classification to evaluate the super-resolution images obtained from the different datasets. The results show the importance of combining medical thermal imaging with super-resolution methods.
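
The abstract does not say which resampling method produced the 1/2, 1/4, and 1/8 datasets. As one plausible option, the sketch below downscales by block averaging; `downscale` is a hypothetical helper, and bicubic or other interpolation may well have been used instead.

```python
import numpy as np

def downscale(image, factor):
    """Downscale a 2-D image by an integer factor using block averaging.

    Rows and columns that do not fill a complete block are cropped.
    """
    image = np.asarray(image, dtype=np.float64)
    h = (image.shape[0] // factor) * factor
    w = (image.shape[1] // factor) * factor
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 2 + 1))

# Build the three low-resolution versions of one (synthetic) thermal image.
high_res = np.arange(64, dtype=np.float64).reshape(8, 8)
low_res_sets = {f: downscale(high_res, f) for f in (2, 4, 8)}
```

Each entry of `low_res_sets` plays the role of one low-resolution dataset, with the original image kept as the ground truth for PSNR and SSIM evaluation.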


Electronics ◽  
2021 ◽  
Vol 10 (22) ◽  
pp. 2855
Author(s):  
Rabia Naseem ◽  
Faouzi Alaya Cheikh ◽  
Azeddine Beghdadi ◽  
Khan Muhammad ◽  
Muhammad Sajjad

Cross-modal medical imaging techniques are predominantly being used in the clinical suite. Ensemble learning methods using cross-modal medical imaging add reliability to several medical image analysis tasks. Motivated by the performance of deep learning in several medical imaging tasks, this paper proposes a deep learning-based denoising method, the Cross-Modality Guided Denoising Network (CMGDNet), for removing Rician noise in T1-weighted (T1-w) Magnetic Resonance Images (MRI). CMGDNet uses a guidance image, a cross-modal (T2-w) image of better perceptual quality, to guide the model in denoising its noisy T1-w counterpart. This cross-modal combination allows the network to exploit complementary information existing in both images and therefore improves the learning capability of the model. The proposed framework consists of two components: a Paired Hierarchical Learning (PHL) module and a Cross-Modal Assisted Reconstruction (CMAR) module. The PHL module uses a Siamese network to extract hierarchical features from the two images, which are then combined in a densely connected manner in the CMAR module to reconstruct the image. The impact of using registered guidance data is investigated in terms of removing noise as well as retaining structural similarity with the original image. Several experiments were conducted on two publicly available brain imaging datasets from the IXI database. The quantitative assessment using peak signal-to-noise ratio (PSNR), Structural Similarity Index (SSIM), and Feature Similarity Index (FSIM) demonstrates that the proposed method exhibits average gains of 4.7% and 2.3% in SSIM and FSIM values, respectively, compared to other state-of-the-art denoising methods that do not integrate cross-modal image information, when removing various levels of noise.


Author(s):  
Manish Balamurugan ◽  
Kathryn Chung ◽  
Venkat Kuppoor ◽  
Smruti Mahapatra ◽  
Aliaksei Pustavoitau ◽  
...  

In this study, we present USDL, a novel model that employs deep learning algorithms to reconstruct and enhance corrupted ultrasound images. We utilize an unsupervised neural network called an autoencoder, which works by compressing its input into a latent-space representation and then reconstructing the output from this representation. We trained our model on a dataset comprising 15,700 in vivo images of the neck, wrist, elbow, and knee vasculature and compared the quality of the generated images using the structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR). In closely simulated conditions, the architecture exhibited an average reconstruction accuracy of 90% as indicated by our SSIM. Our study demonstrates that USDL outperforms state-of-the-art image enhancement and reconstruction techniques in both image quality and computational complexity, while maintaining the architecture efficiency.


2021 ◽  
Vol 5 (1) ◽  
pp. 76
Author(s):  
Cahyo Adhi Hartanto ◽  
Laksmita Rahadianti

Many real-world situations, such as bad weather, may result in hazy environments. Images captured in these hazy conditions have low image quality due to microparticles in the air. The microparticles scatter and absorb light, resulting in hazy images with various effects. In recent years, image dehazing has been researched in depth to handle images captured in these conditions. Various methods have been developed, from traditional methods to deep learning methods. Traditional methods focus more on the use of statistical priors, which have weaknesses in certain conditions. This paper proposes a novel architecture based on PDR-Net that uses pyramid dilated convolution, pre-processing, processing, and post-processing modules, and attention. The proposed network is trained to minimize L1 loss and perceptual loss on the O-Haze dataset. To evaluate our architecture's results, we used structural similarity index measure (SSIM), peak signal-to-noise ratio (PSNR), and color difference as objective assessments and a psychovisual experiment as a subjective assessment. On the O-Haze dataset, our architecture obtained better results than the previous method, with an SSIM of 0.798 and a PSNR of 25.39, but not on the color difference. The SSIM and PSNR results were supported by a subjective assessment with 65 respondents, most of whom chose the restored images produced by our architecture.


Author(s):  
Song Xue ◽  
Rui Guo ◽  
Karl Peter Bohn ◽  
Jared Matzke ◽  
Marco Viscione ◽  
...  

Purpose: A critical bottleneck for the credibility of artificial intelligence (AI) is replicating the results in the diversity of clinical practice. We aimed to develop an AI that can be independently applied to recover high-quality imaging from low-dose scans on different scanners and tracers. Methods: Brain [18F]FDG PET imaging of 237 patients scanned with one scanner was used for the development of AI technology. The developed algorithm was then tested on [18F]FDG PET images of 45 patients scanned with three different scanners, [18F]FET PET images of 18 patients scanned with two different scanners, as well as [18F]Florbetapir images of 10 patients. A conditional generative adversarial network (GAN) was customized for cross-scanner and cross-tracer optimization. Three nuclear medicine physicians independently assessed the utility of the results in a clinical setting. Results: The improvement achieved by AI recovery significantly correlated with the baseline image quality indicated by structural similarity index measurement (SSIM) (r = −0.71, p < 0.05) and normalized dose acquisition (r = −0.60, p < 0.05). Our cross-scanner and cross-tracer AI methodology showed utility based on both physical and clinical image assessment (p < 0.05). Conclusion: The deep learning development for extensible application on unknown scanners and tracers may improve the trustworthiness and clinical acceptability of AI-based dose reduction.

