Do Game Data Generalize Well for Remote Sensing Image Segmentation?

2020 ◽  
Vol 12 (2) ◽  
pp. 275 ◽  
Author(s):  
Zhengxia Zou ◽  
Tianyang Shi ◽  
Wenyuan Li ◽  
Zhou Zhang ◽  
Zhenwei Shi

Despite recent progress in deep learning and remote sensing image interpretation, adapting a deep learning model between different sources of remote sensing data remains a challenge. This paper investigates an interesting question: do synthetic data generalize well for remote sensing image applications? To answer it, we take building segmentation as an example, training a deep learning model on the city map of the well-known video game “Grand Theft Auto V” and then adapting the model to real-world remote sensing images. We propose a segmentation framework based on generative adversarial training to improve the adaptability of the segmentation model. Our model consists of a CycleGAN model and a ResNet-based segmentation network: the former is a well-known image-to-image translation framework that learns a mapping from the game domain to the remote sensing domain; the latter learns to predict pixel-wise building masks from the translated data. All models in our method can be trained in an end-to-end fashion, and the segmentation model can be trained without any additional ground truth reference for the real-world images. Experimental results on a public building segmentation dataset suggest the effectiveness of our adaptation method, which outperforms other state-of-the-art semantic segmentation methods such as DeepLab-v3 and UNet. A further advantage of our method is that introducing semantic information into the image-to-image translation framework also improves the image style conversion.
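The two-stage design (translate the synthetic image into the target domain, then segment it) can be sketched schematically. The sketch below is a minimal NumPy stand-in, not the authors' implementation: `translate_game_to_rs` replaces the CycleGAN generator with a fixed affine color mapping, and `segment_buildings` replaces the ResNet head with a simple threshold. All names and parameters here are hypothetical.

```python
import numpy as np

def translate_game_to_rs(img, gain=0.9, bias=0.05):
    """Stand-in for the CycleGAN generator: a fixed affine color
    mapping from the game domain toward the remote-sensing domain."""
    return np.clip(gain * img + bias, 0.0, 1.0)

def segment_buildings(img, threshold=0.5):
    """Stand-in for the ResNet-based segmentation head: predicts a
    pixel-wise binary building mask from the translated image."""
    gray = img.mean(axis=-1)              # collapse RGB channels
    return (gray > threshold).astype(np.uint8)

def predict(img):
    """End-to-end inference: translate first, then segment."""
    return segment_buildings(translate_game_to_rs(img))

rng = np.random.default_rng(0)
game_img = rng.random((64, 64, 3))        # synthetic "game" RGB tile
mask = predict(game_img)
print(mask.shape)                         # (64, 64)
```

In the actual method both stages are trainable networks optimized jointly; the point of the sketch is only the composition of a domain-translation function with a segmentation function.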

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Rajat Garg ◽  
Anil Kumar ◽  
Nikunj Bansal ◽  
Manish Prateek ◽  
Shashi Kumar

Abstract Urban area mapping is an important application of remote sensing which aims at both estimating land cover and detecting its change within urban areas. A major challenge in analyzing Synthetic Aperture Radar (SAR) remote sensing data is that highly vegetated urban areas and oriented urban targets closely resemble actual vegetation, which leads to misclassification of urban areas as forest cover. The present work is a precursor study for the dual-frequency L- and S-band NASA-ISRO Synthetic Aperture Radar (NISAR) mission and aims at minimizing the misclassification of such highly vegetated and oriented urban targets into the vegetation class with the help of deep learning. In this study, three machine learning algorithms, Random Forest (RF), K-Nearest Neighbour (KNN), and Support Vector Machine (SVM), have been implemented along with the deep learning model DeepLabv3+ for semantic segmentation of Polarimetric SAR (PolSAR) data. It is generally assumed that a large dataset is required for the successful implementation of any deep learning model, but in SAR-based remote sensing a major issue is the unavailability of a large benchmark labeled dataset for training deep learning algorithms from scratch. The current work shows that a pre-trained DeepLabv3+ model outperforms the machine learning algorithms on the land use and land cover (LULC) classification task even with a small dataset, using transfer learning. The highest pixel accuracy of 87.78% and overall pixel accuracy of 85.65% were achieved with DeepLabv3+; Random Forest performs best among the machine learning algorithms with an overall pixel accuracy of 77.91%, while SVM and KNN trail with overall accuracies of 77.01% and 76.47%, respectively.
The highest precision of 0.9228 is recorded for the urban class in the semantic segmentation task with DeepLabv3+, while the machine learning algorithms SVM and RF gave comparable results with precisions of 0.8977 and 0.8958, respectively.
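The metrics reported above (overall pixel accuracy and per-class precision) follow directly from a pixel-level confusion matrix. The following minimal sketch computes both on a toy label map; it illustrates the definitions only, not the paper's evaluation code.

```python
import numpy as np

def confusion(y_true, y_pred, n_classes):
    """Pixel-level confusion matrix; rows = true class, cols = predicted."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true.ravel(), y_pred.ravel()):
        cm[t, p] += 1
    return cm

def overall_pixel_accuracy(cm):
    """Correctly classified pixels over all pixels."""
    return np.trace(cm) / cm.sum()

def precision_per_class(cm):
    """Precision for class c = TP_c / (TP_c + FP_c): column-normalized diagonal."""
    col = cm.sum(axis=0)
    return np.where(col > 0, np.diag(cm) / np.maximum(col, 1), 0.0)

# Toy example: classes 0=urban, 1=vegetation, 2=water
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
cm = confusion(y_true, y_pred, 3)
print(overall_pixel_accuracy(cm))   # 4 of 6 pixels correct
print(precision_per_class(cm))
```

The urban-class precision of 0.9228 quoted in the abstract is exactly this column-normalized quantity for the urban class.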


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii359-iii359
Author(s):  
Lydia Tam ◽  
Edward Lee ◽  
Michelle Han ◽  
Jason Wright ◽  
Leo Chen ◽  
...  

Abstract BACKGROUND Brain tumors are the most common solid malignancies in childhood, many of which develop in the posterior fossa (PF). Manual tumor measurements are frequently required to optimize registration into surgical navigation systems or for surveillance of nonresectable tumors after therapy. With recent advances in artificial intelligence (AI), automated MRI-based tumor segmentation is now feasible without requiring manual measurements. Our goal was to create a deep learning model for automated PF tumor segmentation that can register into navigation systems and provide volume output. METHODS 720 pre-surgical MRI scans from five pediatric centers were divided into training, validation, and testing datasets. The study cohort comprised four PF tumor types: medulloblastoma, diffuse midline glioma, ependymoma, and brainstem or cerebellar pilocytic astrocytoma. Manual segmentation of the tumors by an attending neuroradiologist served as “ground truth” labels for model training and evaluation. We used a 2D U-Net, an encoder-decoder convolutional neural network architecture, with a pre-trained ResNet50 encoder. We assessed tumor segmentation accuracy on a held-out test set using the Dice similarity coefficient (0–1) and compared tumor volume calculations between manual and model-derived segmentations using linear regression. RESULTS Compared to the ground truth expert human segmentation, the overall Dice score for model performance accuracy was 0.83 for automatic delineation of the four tumor types. CONCLUSIONS In this multi-institutional study, we present a deep learning algorithm that automatically delineates PF tumors and outputs volumetric information. Our results demonstrate applied AI that is clinically applicable, potentially augmenting radiologists, neuro-oncologists, and neurosurgeons for tumor evaluation, surveillance, and surgical planning.
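The Dice similarity coefficient used to evaluate the segmentations is twice the overlap of the two masks divided by the sum of their sizes. A minimal NumPy sketch of the standard definition (not the study's evaluation pipeline):

```python
import numpy as np

def dice(pred, truth, eps=1e-7):
    """Dice similarity coefficient between two binary masks, in [0, 1].
    eps guards against division by zero when both masks are empty."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return (2.0 * inter + eps) / (pred.sum() + truth.sum() + eps)

# Two 4x4 squares offset by one pixel: 9 overlapping pixels of 16 each.
a = np.zeros((8, 8), dtype=np.uint8); a[2:6, 2:6] = 1
b = np.zeros((8, 8), dtype=np.uint8); b[3:7, 3:7] = 1
print(round(dice(a, b), 4))   # 2*9 / (16+16) = 0.5625
```

A Dice score of 0.83, as reported for the model, therefore means the automated and expert masks share most of their area.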


Stroke ◽  
2021 ◽  
Vol 52 (Suppl_1) ◽  
Author(s):  
Yannan Yu ◽  
Soren Christensen ◽  
Yuan Xie ◽  
Enhao Gong ◽  
Maarten G Lansberg ◽  
...  

Objective: Ischemic core prediction from CT perfusion (CTP) remains inaccurate compared with the gold standard, diffusion-weighted imaging (DWI). We evaluated whether a deep learning model trained to predict the DWI lesion from MR perfusion (MRP) could facilitate ischemic core prediction on CTP. Method: Using the multi-center CRISP cohort of acute ischemic stroke patients with CTP before thrombectomy, we included patients with major reperfusion (TICI score ≥2b), adequate image quality, and follow-up MRI at 3-7 days. Perfusion parameters including Tmax, mean transit time, cerebral blood flow (CBF), and cerebral blood volume were reconstructed with RAPID software. Core lab experts outlined the stroke lesion on the follow-up MRI. A model previously trained on a separate group of patients, using MRP parameters as input and the RAPID ischemic core on DWI as ground truth, served as a starting point. We fine-tuned this model using CTP parameters as input and follow-up MRI as ground truth. Another model was trained from scratch with only CTP data. Five-fold cross-validation was used. The models were compared with the ischemic core (rCBF ≤30%) from RAPID software for identifying the presence of a large infarct (volume >70 or >100 ml). Results: 94 patients in the CRISP trial met the inclusion criteria (mean age 67±15 years, 52% male, median baseline NIHSS 18, median 90-day mRS 2). Without fine-tuning, the MRI model had an agreement of 73% for infarcts >70 ml and 69% for >100 ml; the MRI model fine-tuned on CTP improved the agreement to 77% and 73%; the CT model trained from scratch had agreements of 73% and 71%. All of the deep learning models outperformed the rCBF segmentation from RAPID, which had agreements of 51% and 64%. See Table and figure. Conclusions: It is feasible to apply an MRP-based deep learning model to CTP. Fine-tuning with CTP data further improves the predictions.
All deep learning models predict the stroke lesion after major recanalization better than thresholding approaches based on rCBF.
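The "agreement" figures above count, per patient, whether the model and the follow-up MRI fall on the same side of a volume cutoff. A plain-Python sketch of that comparison; the patient volumes below are made up for illustration, not study data:

```python
def large_infarct_agreement(pred_volumes, true_volumes, cutoff):
    """Fraction of patients where prediction and follow-up MRI agree on
    whether the infarct volume exceeds the cutoff (in ml)."""
    hits = sum((p > cutoff) == (t > cutoff)
               for p, t in zip(pred_volumes, true_volumes))
    return hits / len(true_volumes)

pred = [45.0, 120.0, 80.0, 30.0]   # hypothetical model-predicted core volumes, ml
true = [50.0, 110.0, 60.0, 20.0]   # hypothetical volumes on follow-up MRI, ml
print(large_infarct_agreement(pred, true, cutoff=70))    # agree on 3 of 4
print(large_infarct_agreement(pred, true, cutoff=100))   # agree on 4 of 4
```

Note this is a binary agreement on a clinical decision threshold, deliberately coarser than a voxel-wise overlap metric.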


2020 ◽  
Vol 39 (10) ◽  
pp. 734-741
Author(s):  
Sébastien Guillon ◽  
Frédéric Joncour ◽  
Pierre-Emmanuel Barrallon ◽  
Laurent Castanié

We propose new metrics to measure the performance of a deep learning model applied to seismic interpretation tasks such as fault and horizon extraction. Faults and horizons are thin geologic boundaries (1 pixel thick on the image) for which a small prediction error could lead to inappropriately large variations in common metrics (precision, recall, and intersection over union). Through two examples, we show how classical metrics could fail to indicate the true quality of fault or horizon extraction. Measuring the accuracy of reconstruction of thin objects or boundaries requires introducing a tolerance distance between ground truth and prediction images to manage the uncertainties inherent in their delineation. We therefore adapt our metrics by introducing a tolerance function and illustrate their ability to manage uncertainties in seismic interpretation. We compare classical and new metrics through different examples and demonstrate the robustness of our metrics. Finally, we show on a 3D West African data set how our metrics are used to tune an optimal deep learning model.
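The core idea of a tolerance distance can be sketched concretely: a predicted boundary pixel counts as correct if it lies within t pixels of the ground truth, and vice versa. The NumPy sketch below implements this with a simple square dilation (using `np.roll`, so wrap-around at image edges is ignored in this toy version); it illustrates the principle, not the authors' exact tolerance function.

```python
import numpy as np

def dilate(mask, t):
    """Binary dilation by a (2t+1)x(2t+1) square via shifted copies.
    np.roll wraps at the borders; acceptable for this small sketch."""
    out = np.zeros_like(mask, dtype=bool)
    for dy in range(-t, t + 1):
        for dx in range(-t, t + 1):
            out |= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    return out

def tolerant_pr(pred, truth, t=1):
    """Precision/recall where a pixel is correct if it lies within
    t pixels of the other mask."""
    pred = pred.astype(bool); truth = truth.astype(bool)
    tp_p = np.logical_and(pred, dilate(truth, t)).sum()  # pred pixels near truth
    tp_r = np.logical_and(truth, dilate(pred, t)).sum()  # truth pixels near pred
    return tp_p / max(pred.sum(), 1), tp_r / max(truth.sum(), 1)

# A horizontal "horizon" predicted exactly one pixel below the ground truth:
truth = np.zeros((16, 16), dtype=bool); truth[8, :] = True
pred = np.zeros((16, 16), dtype=bool); pred[9, :] = True
print(tolerant_pr(pred, truth, t=0))   # strict metrics: (0.0, 0.0)
print(tolerant_pr(pred, truth, t=1))   # with 1-pixel tolerance: (1.0, 1.0)
```

The toy example reproduces the article's motivating failure mode: a one-pixel offset on a one-pixel-thick horizon drives classical precision and recall to zero, while the tolerant versions correctly report a near-perfect extraction.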


2021 ◽  
Author(s):  
Federico Figari Tomenotti

Change detection is a well-known topic in remote sensing. The goal is to track and monitor the evolution of changes affecting the Earth's surface over time. The recently increased availability of remote sensing data for Earth observation and of computational power has raised interest in this field of research. In particular, the keywords “multitemporal” and “heterogeneous” play prominent roles. The former refers to the availability and comparison of two or more satellite images of the same place on the ground, in order to find changes and track the evolution of the observed surface, possibly with different time sensitivities. The latter refers to the capability of performing change detection with images coming from different sources, corresponding to different sensors, wavelengths, polarizations, acquisition geometries, etc. This thesis addresses the challenging topic of multitemporal change detection with heterogeneous remote sensing images. It proposes a novel approach, taking inspiration from recent developments in the literature. The proposed method is based on deep learning, involving convolutional autoencoders, and represents an example of unsupervised change detection. A major novelty of the work consists in including a prior information model, used to make the method unsupervised, within a well-established algorithm such as canonical correlation analysis, and in combining these with a deep learning framework to give rise to an image translation method able to compare heterogeneous images regardless of their highly different domains. The theoretical analysis is supported by experimental results comparing the proposed methodology to the state of the art of this discipline. Two different datasets were used for the experiments, and the results obtained on both show the effectiveness of the proposed method.
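Canonical correlation analysis, the classical building block named above, finds linear projections of two data sets that are maximally correlated; that is what makes it a natural bridge between heterogeneous image domains. A minimal NumPy sketch of the textbook computation on toy data (via the SVD of the whitened cross-covariance), unrelated to the thesis code:

```python
import numpy as np

def first_canonical_corr(X, Y, reg=1e-8):
    """First canonical correlation between data matrices X (n x p) and
    Y (n x q). reg is a small ridge term for numerical stability."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n

    def inv_sqrt(S):
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    # Singular values of the whitened cross-covariance are the
    # canonical correlations, sorted in decreasing order.
    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(M, compute_uv=False)[0]

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
Y = X @ rng.standard_normal((3, 2))   # Y is a linear image of X
rho = first_canonical_corr(X, Y)
print(round(rho, 4))                  # close to 1.0: perfectly related domains
```

When the two "domains" are exact linear transforms of one another, the first canonical correlation is 1; heterogeneous sensor pairs yield lower values, which is the gap the learned image translation aims to close.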


Author(s):  
Antonios Alexos ◽  
Sotirios Chatzis

In this paper we address the problem of understanding why a deep learning model decides that an individual is or is not eligible for a loan. We propose a novel approach for inferring which attributes matter most when making a decision in each specific individual case. Specifically, we leverage concepts from neural attention to devise a novel feature-wise attention mechanism. As we show on real-world datasets, our approach offers unique insights into the importance of various features by producing a decision explanation for each specific loan case. At the same time, we observe that our novel mechanism generates decisions that are much closer to those made by human experts than the decisions of existing competitors.
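A feature-wise attention mechanism of this general kind scores each input attribute, normalizes the scores with a softmax, and reweights the attributes so the weights double as a per-case explanation. The NumPy sketch below shows one common form of this pattern; the weight matrix `W`, the tanh scoring function, and the attribute values are all hypothetical stand-ins, not the paper's architecture.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def feature_attention(x, W):
    """Feature-wise attention: one score per attribute, softmax-normalized
    into weights that both reweight the input and explain the decision.
    W is a hypothetical stand-in for learned parameters."""
    scores = np.tanh(W @ x)       # one scalar score per feature
    alpha = softmax(scores)       # attention weights, sum to 1
    return alpha * x, alpha

rng = np.random.default_rng(1)
x = np.array([0.7, 0.1, 0.9, 0.3])   # one applicant's scaled attributes
W = rng.standard_normal((4, 4))
weighted, alpha = feature_attention(x, W)
print(alpha)                          # per-attribute importance for this case
```

Because `alpha` is computed per input, each loan applicant gets their own importance profile, which is what distinguishes this style of explanation from a single global feature ranking.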

