Comparing optimization methods for deep learning in image processing applications

2021, Vol. 0 (0)
Author(s): Alexander Geng, Ali Moghiseh, Claudia Redenbach, Katja Schladitz

Abstract Training a deep learning network requires choosing its weights such that the output minimizes a given loss function. In practice, stochastic gradient descent is frequently used to solve this optimization problem, and several variants of the approach have been suggested in the literature. We study the impact of the choice of optimization method on the outcome of the learning process, using two image processing applications from quite different fields as examples. The first is artistic style transfer, where the content of one image is combined with the style of another. The second is a real-world classification task from industry, namely detecting defects in images of air filters. In both cases, clear differences between the results of the individual optimization methods are observed.
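
Purely as a hedged illustration of the kind of comparison described above (not the authors' setup), the sketch below trains copies of one stand-in model with several common stochastic gradient descent variants in PyTorch; the model, data, and hyperparameters are placeholders.

```python
# Minimal sketch, assuming a toy model and synthetic data: compare SGD variants
# by training identical model copies and inspecting the final training loss.
import copy
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

base_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # stand-in network
data = TensorDataset(torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,)))
loader = DataLoader(data, batch_size=32, shuffle=True)
criterion = nn.CrossEntropyLoss()

optimizers = {                                   # common SGD variants to compare
    "SGD":     (torch.optim.SGD,     {"lr": 1e-2, "momentum": 0.9}),
    "Adam":    (torch.optim.Adam,    {"lr": 1e-3}),
    "RMSprop": (torch.optim.RMSprop, {"lr": 1e-3}),
}

for name, (opt_cls, kwargs) in optimizers.items():
    model = copy.deepcopy(base_model)            # identical initialization for fairness
    opt = opt_cls(model.parameters(), **kwargs)
    for epoch in range(5):
        for x, y in loader:
            opt.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            opt.step()
    print(f"{name}: final batch loss {loss.item():.4f}")
```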

2021
Author(s): Shidong Li, Jianwei Liu, Zhanjie Song

Abstract Since magnetic resonance imaging (MRI) offers superior soft tissue contrast, accurately contouring (brain) tumors from MRI images is essential in medical image processing. Accurate tumor segmentation is immensely challenging, since tumor and normal tissues are often inextricably intertwined in the brain, and manual segmentation is extremely time consuming. Recent deep learning techniques have started to show reasonable success in automatic brain tumor segmentation. The purpose of this study is to develop a new region-of-interest-aided (ROI-aided) deep learning technique for automatic brain tumor MRI segmentation. The method consists of two major steps. Step one uses a 2D network with U-Net architecture to localize the tumor ROI, which reduces the disturbance from normal tissue. In step two, a 3D U-Net performs tumor segmentation within the identified ROI. The proposed method is validated on the MICCAI BraTS 2015 Challenge data with 220 high-grade glioma (HGG) and 54 low-grade glioma (LGG) patients. The Dice similarity coefficient and the Hausdorff distance between the manual tumor contour and that segmented by the proposed method are 0.876 ± 0.068 and 3.594 ± 1.347 mm, respectively. These numbers indicate that the proposed method is an effective ROI-aided deep learning strategy for brain MRI tumor segmentation and a valid and useful tool in medical image processing.
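
As a small, hedged illustration of the evaluation metric reported above (a generic reimplementation, not the authors' code), the Dice similarity coefficient between a predicted binary mask and a manual contour can be computed as follows:

```python
# Minimal sketch: Dice similarity coefficient between two binary segmentation masks.
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """2*|A ∩ B| / (|A| + |B|) for binary masks; eps avoids division by zero."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))
```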


Author(s): Mohammed Abdulla Salim Al Husaini, Mohamed Hadi Habaebi, Teddy Surya Gunawan, Md Rafiqul Islam, Elfatih A. A. Elsheikh, ...

Abstract Breast cancer is one of the most significant causes of death for women around the world. Breast thermography supported by deep convolutional neural networks is expected to contribute significantly to early detection and facilitate treatment at an early stage. The goal of this study is to investigate the behavior of different recent deep learning methods for identifying breast disorders. To evaluate our proposal, we built classifiers based on deep convolutional neural networks modelling Inception V3, Inception V4, and a modified version of the latter called Inception MV4. MV4 was introduced to maintain the computational cost across all layers by making the resultant number of features and the number of pixel positions equal. The DMR database was used with these deep learning models to classify thermal images of healthy and sick patients. Epochs ranging from 3 to 30 were used in conjunction with learning rates of 1 × 10⁻³, 1 × 10⁻⁴ and 1 × 10⁻⁵, a minibatch size of 10, and different optimization methods. The training results showed that Inception V4 and MV4 with color images, a learning rate of 1 × 10⁻⁴, and the SGDM optimization method reached very high accuracy, verified through several experimental repetitions. With grayscale images, Inception V3 outperforms V4 and MV4 by a considerable accuracy margin for any optimization method. In fact, the Inception V3 (grayscale) performance is almost comparable to the Inception V4 and MV4 (color) performance, but only after 20–30 epochs. Inception MV4 achieved a 7% faster classification response time than V4. The MV4 model is found to reduce the energy consumed by, and improve the fluidity of, arithmetic operations on the graphics processor. The results also indicate that increasing the number of layers may not necessarily be useful in improving performance.
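
For concreteness only, the sketch below mirrors the reported training configuration (minibatch of 10, SGD with momentum, learning rate 1 × 10⁻⁴, up to 30 epochs) in PyTorch; the tiny CNN and random tensors are placeholders for the Inception models and the DMR thermal images, which are not reproduced here.

```python
# Minimal sketch of the stated hyperparameters; model and data are stand-ins.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

images = torch.randn(100, 3, 64, 64)          # placeholder for DMR thermal images
labels = torch.randint(0, 2, (100,))          # healthy vs. sick
loader = DataLoader(TensorDataset(images, labels), batch_size=10, shuffle=True)

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(8, 2))   # stand-in for Inception V3/V4/MV4

# "SGDM" = SGD with momentum, here at the best-performing learning rate of 1e-4.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for epoch in range(30):                       # the study sweeps 3-30 epochs
    for x, y in loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()
```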


Author(s): Xiaohui Wang, Yiran Lyu, Junfeng Huang, Ziying Wang, Jingyan Qin

Abstract Artistic style transfer renders an image in the style of another image, which is a challenging problem in both image processing and the arts. Deep neural networks have been adopted for artistic style transfer and achieve remarkable success, for example AdaIN (adaptive instance normalization), WCT (whitening and coloring transforms), MST (multimodal style transfer), and SEMST (structure-emphasized multimodal style transfer). These algorithms modify the content image as a whole using only one style and one algorithm, which easily causes the foreground and background to blur together. In this paper, an iterative artistic multi-style transfer system is built to edit an image with multiple styles through flexible user interaction. First, a subjective evaluation experiment with art professionals is conducted to build an open evaluation framework for style transfer, including universal evaluation questions and personalized answers for ten typical artistic styles. Then, we propose the interactive artistic multi-style transfer system, in which an interactive image crop tool is designed to cut a content image into several parts. For each part, users select a style image and an algorithm from AdaIN, WCT, MST, and SEMST by referring to the characteristics of styles and algorithms summarized by the evaluation experiments. To obtain richer results, the system provides a semantic-based parameter adjustment mode and the option of preserving the colors of the content image. Finally, case studies show the effectiveness and flexibility of the system.
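
As a hedged illustration of one of the four algorithms the system offers, the snippet below is a generic reimplementation of the AdaIN operation (align the channel-wise mean and standard deviation of the content features to those of the style features); it is not taken from the described system.

```python
# Minimal sketch of adaptive instance normalization (AdaIN) on feature maps.
import torch

def adain(content_feat: torch.Tensor, style_feat: torch.Tensor, eps: float = 1e-5):
    """Shift/scale content features (N, C, H, W) to match style statistics."""
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content_feat - c_mean) / c_std + s_mean
```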


2021, Vol. 13 (9), pp. 1689
Author(s): Chuang Lin, Shanxin Guo, Jinsong Chen, Luyi Sun, Xiaorou Zheng, ...

Deep-learning-network performance depends on the accuracy of the training samples. Training samples are commonly labeled by human visual investigation or inherited from historical land-cover or land-use maps, and therefore usually contain label noise, depending on subjective knowledge and the age of the historical map. Helping the network to distinguish noisy labels during the training process is a prerequisite for applying the model across time and locations. This study proposes an antinoise framework, the Weight Loss Network (WLN), to achieve this goal. The WLN contains three main parts: (1) the segmentation subnetwork, which can be replaced by any state-of-the-art segmentation network; (2) the attention subnetwork (λ); and (3) the class-balance coefficient (α). Four types of label noise (insufficient, redundant, missing and incorrect labels) were simulated by dilation and erosion processing to test the network's antinoise ability (a sketch of this step follows below). The segmentation task was to extract buildings from the Inria Aerial Image Labeling Dataset, which covers Austin, Chicago, Kitsap County, Western Tyrol and Vienna. The network's performance was evaluated against the original U-Net model after adding noisy training samples with different noise rates and noise levels. The results show that the proposed antinoise framework (WLN) maintains high accuracy, while the accuracy of the U-Net model drops. Specifically, after adding 50% dilated-label samples at noise level 3, the U-Net model's accuracy dropped by 12.7% for OA, 20.7% for the Mean Intersection over Union (MIOU) and 13.8% for Kappa scores. By contrast, the accuracy of the WLN dropped by 0.2% for OA, 0.3% for the MIOU and 0.8% for Kappa scores. For eroded-label samples at the same level, the accuracy of the U-Net model dropped by 8.4% for OA, 24.2% for the MIOU and 43.3% for Kappa scores, while the accuracy of the WLN dropped by 4.5% for OA, 4.7% for the MIOU and 0.5% for Kappa scores. This result shows that the antinoise framework proposed in this paper can help current segmentation models avoid the impact of noisy training labels and has the potential to be trained on a larger remote sensing image set regardless of inner label errors.
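
As a hedged sketch of the noise-simulation step (an assumed implementation, not the authors' code), dilated- and eroded-label noise can be generated from a binary building mask with morphological operations; the kernel size here stands in for the paper's noise level.

```python
# Minimal sketch: simulate redundant (dilated) or insufficient (eroded) labels.
import numpy as np
from scipy import ndimage

def corrupt_label(mask: np.ndarray, mode: str = "dilate", level: int = 3) -> np.ndarray:
    """Return a noisy binary mask; a larger 'level' means a larger structuring element."""
    structure = np.ones((2 * level + 1, 2 * level + 1), dtype=bool)
    if mode == "dilate":      # building footprint grows (redundant label)
        noisy = ndimage.binary_dilation(mask.astype(bool), structure=structure)
    elif mode == "erode":     # building footprint shrinks (insufficient label)
        noisy = ndimage.binary_erosion(mask.astype(bool), structure=structure)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return noisy.astype(mask.dtype)
```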


2021, Vol. 2021, pp. 1-10
Author(s): Qingliang Meng, Meiyu Huang, Yao Xu, Naijin Liu, Xueshuang Xiang

For space-based remote sensing systems, onboard intelligent processing based on deep learning has become an inevitable trend. To adapt to dynamic changes in the observation scenes, there is an urgent need to perform distributed deep learning onboard so as to fully utilize the plentiful real-time sensing data of multiple satellites in a smart constellation. However, the network bandwidth of the smart constellation is very limited, so it is of great significance to study distributed training in a low-bandwidth environment. This paper proposes a Randomized Decentralized Parallel Stochastic Gradient Descent (RD-PSGD) method for distributed training in a low-bandwidth network. To reduce the communication cost, each node in RD-PSGD randomly transfers only part of the information of the local intelligent model to its neighbors. We further speed up the algorithm by optimizing the programming of random index generation and parameter extraction. For the first time, we theoretically analyze the convergence property of the proposed RD-PSGD and validate its advantage by simulation experiments on various distributed training tasks for image classification on different benchmark datasets and deep learning network architectures. The results show that RD-PSGD can effectively save the time and bandwidth cost of distributed training and reduce the complexity of parameter selection compared with the TopK-based method. The method proposed in this paper provides a new perspective for the study of onboard intelligent processing, especially for online learning on a smart satellite constellation.
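
The core communication-saving idea can be illustrated with a short, hedged sketch (our own simplification, not the paper's implementation): each node sends only a random fraction of its flattened parameters to a neighbor, which averages them into its local model.

```python
# Minimal sketch of randomized partial parameter exchange between two nodes.
import numpy as np

def random_partial_message(params: np.ndarray, fraction: float = 0.1, rng=None):
    """Pick a random subset of parameter indices and return (indices, values)."""
    rng = rng or np.random.default_rng()
    k = max(1, int(fraction * params.size))
    idx = rng.choice(params.size, size=k, replace=False)
    return idx, params[idx]

def merge_partial_message(local: np.ndarray, idx: np.ndarray, values: np.ndarray):
    """Gossip-style update: average the received slice into the local parameters."""
    local[idx] = 0.5 * (local[idx] + values)
    return local

# Hypothetical usage between two nodes with 1e6 parameters, sending only 10%.
node_a = np.random.randn(1_000_000)
node_b = np.random.randn(1_000_000)
idx, values = random_partial_message(node_a, fraction=0.1)
node_b = merge_partial_message(node_b, idx, values)
```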


2022, Vol. 2022, pp. 1-10
Author(s): Xuhui Fu

With the continuous development and popularization of artificial intelligence technology in recent years, the field of deep learning has also developed rapidly. Deep learning techniques have attracted attention in image detection, image recognition, image recoloring, and image artistic style transfer, and several image artistic style transfer techniques with deep learning at their core are now widely used. This article develops an image artistic style transfer algorithm based on a generative adversarial network (GAN) to quickly realize image artistic style transfer. The approach replaces the traditional deconvolution operation by first resizing the image and then convolving, and uses a content encoder and a style encoder to encode the content and style of the selected images and extract the content and style features. To enhance the effect of image artistic style transfer, the generated image is assessed by a multi-scale discriminator. The experimental results show that the algorithm is effective and has great application and promotion value.
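
As a hedged, generic illustration of the "resize then convolve" upsampling mentioned above (a standard alternative to transposed convolution, not the article's code), a PyTorch module might look like this:

```python
# Minimal sketch: upsample by interpolation, then apply an ordinary convolution,
# instead of using a traditional transposed ("deconvolution") layer.
import torch.nn as nn

class ResizeConvUpsample(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, scale: int = 2):
        super().__init__()
        self.up = nn.Upsample(scale_factor=scale, mode="nearest")
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(self.up(x))
```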


Author(s): Zhihong He, Wenjie Jia, Erhua Sun, Huilong Sun

Existing optimization methods suffer from image edge blur, which leads to a high degree of shadow residue. To address this problem and reduce the shadow residue, this paper designs a 3D video image processing effect optimization method supported by virtual reality technology. Coding is used to eliminate redundant data in the video, and median filtering is used to remove image noise. The virtual reality technology detects the image edges and determines the motion offset between image frames. Based on the motion parameters of the camera carrier obtained from motion estimation, a feature point matching algorithm constructs the video image motion model, and camera calibration technology is used to set the processing effect optimization mode, which is then optimized by perspective projection transformation. Experimental results show that the average shadow residue of the proposed method and the two existing optimization methods is 3.108%, 6.167% and 6.396%, respectively, which demonstrates that the optimization method combined with virtual reality technology has higher practical application value.
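
As a hedged sketch of the preprocessing steps named above (generic OpenCV calls, not the paper's pipeline), a video frame can be median filtered to remove noise and then edge detected before motion estimation; the random frame below is a placeholder for real 3D video data.

```python
# Minimal sketch: denoise a frame with a median filter, then extract edges.
import numpy as np
import cv2

frame = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)   # placeholder frame
denoised = cv2.medianBlur(frame, 5)                  # 5x5 median filter removes impulse noise
gray = cv2.cvtColor(denoised, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)                    # edge map used before motion estimation
```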

