Image editing-based data augmentation for illumination-insensitive background subtraction

2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Dimitrios Sakkos ◽  
Edmond S. L. Ho ◽  
Hubert P. H. Shum ◽  
Garry Elvin

Purpose: A core challenge in background subtraction (BGS) is handling videos with sudden illumination changes in consecutive frames. In our pilot study published in SKIMA 2019 (Sakkos et al.), we tackle the problem from a data point of view using data augmentation. Our method performs data augmentation that not only creates endless data on the fly but also features semantic transformations of illumination which enhance the generalisation of the model. Design/methodology/approach: In our pilot study published in SKIMA 2019, the proposed framework successfully simulates flashes and shadows by applying the Euclidean distance transform over a randomly generated binary mask. In this paper, we further enhance the data augmentation framework by proposing new variations in image appearance, both locally and globally. Findings: Experimental results demonstrate the contribution of the synthetics to the ability of the models to perform BGS even when significant illumination changes take place. Originality/value: Such data augmentation allows us to effectively train an illumination-invariant deep learning model for BGS. We further propose a post-processing method that removes noise from the output binary segmentation map, resulting in a cleaner, more accurate segmentation map that can generalise to multiple scenes of different conditions. We show that it is possible to train deep learning models even with very limited training samples. The source code of the project is made publicly available at https://github.com/dksakkos/illumination_augmentation
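For illustration, the flash/shadow simulation described above can be sketched roughly as follows. This is a minimal Python/NumPy/SciPy sketch, not the authors' released implementation (which is at the URL above): the rectangular seed region, the linear falloff and the strength value are illustrative choices, and a colour H x W x 3 image is assumed.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def simulate_local_illumination(image, strength=0.6, flash=True, rng=None):
    """Simulate a local flash (or shadow) by applying the Euclidean distance
    transform to a randomly placed binary mask. `image` is an H x W x 3 uint8
    array; `strength` and the seed-region size are illustrative assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]

    # Random binary mask: a small rectangle marks the light/shadow source.
    mask = np.ones((h, w), dtype=bool)
    y, x = rng.integers(0, h), rng.integers(0, w)
    mask[max(0, y - 10):y + 10, max(0, x - 10):x + 10] = False

    # Distance transform gives a smooth falloff away from the source region.
    dist = distance_transform_edt(mask)
    falloff = 1.0 - dist / dist.max()          # 1 at the source, 0 far away

    gain = 1.0 + strength * falloff if flash else 1.0 - strength * falloff
    out = image.astype(np.float32) * gain[..., None]
    return np.clip(out, 0, 255).astype(np.uint8)
```

Applied on the fly during training, each call produces a new illumination variant of the same frame, which is what allows the augmentation to generate effectively endless data.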

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
BinBin Zhang ◽  
Fumin Zhang ◽  
Xinghua Qu

Purpose: Laser-based measurement techniques offer various advantages over conventional measurement techniques, such as being non-destructive and non-contact, and offering fast measurement over a long measuring distance. In cooperative laser ranging systems, it is crucial to extract the center coordinates of retroreflectors to accomplish automatic measurement. To solve this problem, this paper aims to propose a novel method. Design/methodology/approach: We propose a method using Mask RCNN (Region Convolutional Neural Network), with ResNet101 (Residual Network 101) and FPN (Feature Pyramid Network) as the backbone, to localize retroreflectors, realizing automatic recognition against different backgrounds. Compared with two other deep learning algorithms, experiments show that the recognition rate of Mask RCNN is better, especially for small-scale targets. Based on this, an ellipse detection algorithm is introduced to obtain the ellipses of retroreflectors from the recognized target areas. The center coordinates of the retroreflectors in the camera coordinate system are then obtained using a mathematical method. Findings: To verify the accuracy of this method, an experiment was carried out: the distance between two retroreflectors with a known separation of 1,000.109 mm was measured, with a root-mean-square error of 2.596 mm, meeting the requirements of the coarse location of retroreflectors. Research limitations/implications: (i) As the data set has only 200 pictures, although we have used some data augmentation methods such as rotating, mirroring and cropping, there is still room for improvement in the generalization ability of detection. (ii) The ellipse detection algorithm needs to work in relatively dark conditions, as the retroreflector is made of stainless steel, which easily reflects light. Originality/value: The value of the article lies in being able to obtain the center coordinates of multiple retroreflectors automatically even against a cluttered background; being able to recognize retroreflectors of different sizes, especially small targets; meeting the recognition requirement of multiple targets in a large field of view; and obtaining the 3D centers of targets by monocular model-based vision.
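The recognise-then-fit pipeline can be sketched roughly as below. torchvision's ResNet-50 FPN Mask R-CNN is used here as a stand-in for the paper's ResNet101 + FPN detector and must first be fine-tuned on retroreflector images; OpenCV's ellipse fitting then recovers only image-plane centers (the projection to 3D camera coordinates described in the abstract is not shown). The score threshold is an illustrative value.

```python
import cv2
import numpy as np
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Stand-in detector (2 classes: background + retroreflector); fine-tune on
# retroreflector images before use. The paper's backbone is ResNet101 + FPN.
model = maskrcnn_resnet50_fpn(num_classes=2).eval()

def detect_ellipse_centers(image_bgr, score_thr=0.7):
    """Localise candidate targets with Mask R-CNN, then fit an ellipse
    to each predicted mask and return its image-plane center."""
    rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        pred = model([tensor])[0]

    centers = []
    for mask, score in zip(pred["masks"], pred["scores"]):
        if score < score_thr:
            continue
        binary = (mask[0].numpy() > 0.5).astype(np.uint8)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
        for c in contours:
            if len(c) >= 5:                 # cv2.fitEllipse needs >= 5 points
                (cx, cy), _, _ = cv2.fitEllipse(c)
                centers.append((cx, cy))
    return centers
```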


2020 ◽  
Vol 13 (4) ◽  
pp. 389-406
Author(s):  
Jiten Chaudhary ◽  
Rajneesh Rani ◽  
Aman Kamboj

Purpose: Brain tumor is one of the most dangerous and life-threatening diseases. In order to determine the type of tumor, devise a treatment plan and estimate the overall survival time of the patient, accurate segmentation of the tumor region from images is extremely important. The process of manual segmentation is very time-consuming and prone to errors; therefore, this paper aims to provide a deep learning-based method that automatically segments the tumor region from MR images. Design/methodology/approach: In this paper, the authors propose a deep neural network for automatic brain tumor (glioma) segmentation. Intensity normalization and data augmentation have been incorporated as pre-processing steps for the images. The proposed model is trained on multichannel magnetic resonance imaging (MRI) images and outputs high-resolution segmentations of brain tumor regions in the input images. Findings: The proposed model is evaluated on the benchmark BRATS 2013 dataset. To evaluate the performance, the authors have used the Dice score, sensitivity and positive predictive value (PPV). The superior performance of the proposed model is validated by training the widely used UNet model under similar conditions. The results indicate that the proposed model has obtained promising results and is effective for segmentation of glioma regions in MRI at a clinical level. Practical implications: The model can be used by doctors to identify the exact location of the tumorous region. Originality/value: The proposed model is an improvement on the UNet model. It has fewer layers and a smaller number of parameters than UNet, which helps the network to train on databases with fewer images and gives superior results. Moreover, the bottleneck feature learned by the network has been fused with the skip-connection path to enrich the feature map.
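A minimal sketch of the kind of compact encoder-decoder the abstract describes, with the bottleneck feature upsampled and fused into the skip-connection path. The depth, channel widths and the assumption of four MRI input modalities are illustrative guesses, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class CompactTumorNet(nn.Module):
    """Illustrative lightweight UNet-style network: two encoder stages instead
    of UNet's four, with the bottleneck feature fused into the skip path."""
    def __init__(self, in_ch=4, n_classes=2):        # 4 MRI modalities assumed
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.enc2 = conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.dec2 = conv_block(64 + 128, 64)          # skip + upsampled bottleneck
        self.dec1 = conv_block(32 + 64, 32)
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(F.max_pool2d(e1, 2))
        b = self.bottleneck(F.max_pool2d(e2, 2))
        # Fuse the bottleneck feature with the skip-connection path.
        up_b = F.interpolate(b, size=e2.shape[2:], mode="bilinear", align_corners=False)
        d2 = self.dec2(torch.cat([e2, up_b], dim=1))
        d2_up = F.interpolate(d2, size=e1.shape[2:], mode="bilinear", align_corners=False)
        d1 = self.dec1(torch.cat([e1, d2_up], dim=1))
        return self.head(d1)

# Example: CompactTumorNet()(torch.randn(1, 4, 128, 128)) -> (1, 2, 128, 128)
```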


2021 ◽  
Vol 39 (3) ◽  
pp. 408-418 ◽  
Author(s):  
Changro Lee

Purpose: Prior studies on the application of deep-learning techniques have focused on enhancing computation algorithms. However, the amount of data is also a key element when attempting to achieve a goal using a quantitative approach, and this is often underestimated in practice. The problem of sparse sales data is well known in the valuation of commercial properties. This study aims to expand the limited data available so as to exploit the capability inherent in deep learning techniques. Design/methodology/approach: A deep learning approach is used. First, Seoul, the capital of South Korea, is selected as the case study area. Second, data augmentation is performed for properties with low trade volume in the market using a variational autoencoder (VAE), which is a generative deep learning technique. Third, the generated samples are added to the original dataset of commercial properties to alleviate the data insufficiency. Finally, the accuracy of the price estimation is analyzed for the original and augmented datasets to assess the model performance. Findings: The results, using the sales datasets of commercial properties in Seoul, South Korea as a case study, show that the dataset augmented by the VAE consistently yields higher price-estimation accuracy across all 30 trials, and that the capabilities inherent in deep learning techniques can be fully exploited, promoting the rapid adoption of artificial intelligence skills in the real estate industry. Originality/value: Although deep learning-based algorithms are gaining popularity, they are likely to show limited performance when data are insufficient. This study suggests an alternative approach to overcoming the lack-of-data problem in property valuation.
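A minimal PyTorch sketch of the VAE-based augmentation step described above: train a small VAE on the sparse sales records, then decode samples drawn from the prior and append them to the original dataset. The layer sizes and latent dimension are illustrative assumptions, not the configuration used in the study.

```python
import torch
import torch.nn as nn

class PropertyVAE(nn.Module):
    """Minimal VAE for tabular property features (illustrative sizes)."""
    def __init__(self, n_features, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
        self.mu = nn.Linear(64, latent_dim)
        self.logvar = nn.Linear(64, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, n_features))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterisation
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior.
    recon_loss = nn.functional.mse_loss(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kld

# After training on the sparse sales records, synthetic samples are drawn
# from the prior and appended to the original dataset, e.g.:
#   synthetic = vae.decoder(torch.randn(n_new, 8)).detach()
```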


Electronics ◽  
2019 ◽  
Vol 8 (10) ◽  
pp. 1088 ◽  
Author(s):  
Zhao Pei ◽  
Hang Xu ◽  
Yanning Zhang ◽  
Min Guo ◽  
Yee-Hong Yang

Class attendance is an important means of managing university students, and face recognition is one of the most effective techniques for taking daily class attendance. Recently, many face recognition algorithms based on deep learning have achieved promising results with large-scale labeled samples. However, due to the difficulty of collecting samples, face recognition using convolutional neural networks (CNNs) for daily attendance taking remains a challenging problem. Data augmentation can enlarge the sample set and has been applied to small-sample learning. In this paper, we address this problem using data augmentation through geometric transformations, image brightness changes, and the application of different filter operations. In addition, we determine the best data augmentation method based on orthogonal experiments. Finally, the performance of our attendance method is demonstrated in a real class. Compared with the PCA and LBPH methods with data augmentation and with the VGG-16 network, our proposed method achieves an accuracy of 86.3%. Additionally, after a period of collecting more data, the accuracy improves to 98.1%.
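The three augmentation families mentioned above (geometric transformation, brightness change, filter operation) can be sketched with OpenCV roughly as follows. The rotation angle, brightness shift and kernel size are illustrative values, and the orthogonal-experiment selection of the best combination is not shown.

```python
import cv2

def augment_face(image, angle=10, brightness=30, blur_ksize=3):
    """Generate augmented variants of one face image; parameters are
    illustrative, not the settings chosen by the orthogonal experiments."""
    h, w = image.shape[:2]

    # Geometric transformation: rotation about the image centre.
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h))

    # Brightness change: uniform additive shift.
    brighter = cv2.convertScaleAbs(image, alpha=1.0, beta=brightness)

    # Filter operation: Gaussian blur.
    blurred = cv2.GaussianBlur(image, (blur_ksize, blur_ksize), 0)

    return [rotated, brighter, blurred]
```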


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7018
Author(s):  
Justin Lo ◽  
Jillian Cardinell ◽  
Alejo Costanzo ◽  
Dafna Sussman

Deep learning (DL) algorithms have become an increasingly popular choice for image classification and segmentation tasks; however, their range of applications can be limited because they require ample data to achieve high performance and adequate generalizability. In the case of clinical imaging, data are not always available in large quantities. This issue can be alleviated by using data augmentation (DA) techniques. The choice of DA is important because poor selection can hinder the performance of a DL algorithm. We propose a DA policy search algorithm that offers an extended set of transformations accommodating the variations in biomedical imaging datasets. The algorithm makes use of the efficient, high-dimensional optimizer Bi-Population Covariance Matrix Adaptation Evolution Strategy (BIPOP-CMA-ES) and returns an optimal DA policy for any input imaging dataset and DL algorithm. Our proposed algorithm, Medical Augmentation (Med-Aug), can be adopted by other researchers in related medical DL applications to improve their models' performance. Furthermore, we present the optimal DA policies found for a variety of medical datasets and popular segmentation networks, for other researchers to use in related tasks.
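A hypothetical outline of such a policy search using the pycma implementation of BIPOP-CMA-ES. Here `evaluate_policy` (train a model with the candidate DA policy and return, e.g., the negative validation Dice score) is assumed to be supplied by the caller, and the parameter bounds, evaluation budget and restart count are illustrative, not the Med-Aug settings.

```python
import cma
import numpy as np

def search_da_policy(evaluate_policy, n_params=4, sigma0=0.3):
    """Search for a DA policy encoded as a vector of transformation
    magnitudes/probabilities in [0, 1]. `evaluate_policy(x)` must return a
    scalar to minimise (e.g., negative validation Dice after training)."""
    x0 = 0.5 * np.ones(n_params)            # initial policy guess
    # BIPOP restarts alternate large and small populations (BIPOP-CMA-ES).
    res = cma.fmin(evaluate_policy, x0, sigma0,
                   options={"bounds": [0, 1], "maxfevals": 200},
                   restarts=3, bipop=True)
    return res[0]                            # best policy vector found
```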

