Apache Spark Accelerated Deep Learning Inference for Large Scale Satellite Image Analytics

Self-Supervised Pre-Training of Transformers for Satellite Image Time Series Classification

10.36227/techrxiv.13025039.v1 ◽

2020 ◽

Author(s):

Yuan Yuan ◽

Lei Lin

Keyword(s):

Time Series ◽

Deep Learning ◽

Large Scale ◽

Temporal Structure ◽

Satellite Image ◽

Fine Tuning ◽

Small Scale ◽

Model Parameters ◽

Learning Approaches ◽

Wide Range

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, leading to a classification accuracy increment up to 1.91% to 6.69%. <div><b>This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.</b></div>

Download Full-text

Self-Supervised Pre-Training of Transformers for Satellite Image Time Series Classification

10.36227/techrxiv.13025039.v3 ◽

2020 ◽

Author(s):

Yuan Yuan ◽

Lei Lin

Keyword(s):

Time Series ◽

Deep Learning ◽

Large Scale ◽

Temporal Structure ◽

Satellite Image ◽

Fine Tuning ◽

Small Scale ◽

Model Parameters ◽

Learning Approaches ◽

Wide Range

<div>Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, leading to a classification accuracy increment up to 2.38% to 5.27%. The code and the pre-trained model will be available at https://github.com/linlei1214/SITS-BERT upon publication.</div><div><b>This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.</b></div>

Download Full-text

Vegetation change detection based on time series analysis by Apache Spark and RasterFrame

Journal of Mining and Earth Sciences ◽

10.46326/jmes.2021.62(1).06 ◽

2021 ◽

Vol 62 (1) ◽

pp. 42-52

Author(s):

Dung Mai Thi Nguyen ◽

Thu Hoai Thi Vu ◽

Keyword(s):

Time Series ◽

Large Scale ◽

Data Science ◽

Vegetation Change ◽

Environmental Changes ◽

Satellite Image ◽

Image Data ◽

Apache Spark ◽

Raster Data ◽

Average Value

Spatial big data has a large scale and complex, therefore, it cannot be collected, managed, and analyzed by traditional data analytic software shortly. These platforms in many situations are restricted to vectors data. However, the raster data generated by the sensors on the enormous number of satellites now needs to be processed in parallel on the cluster environment. The article introduces the satellite image data analyzing method using the RasterFrames library on the Apache Spark platform. The RasterFrames library examines raster data for Python, Scala, and SQL, bringing the power of Spark DataFrames to access to Earth Observation, cloud computing, and data science. In the experimental part, the NDVI and the change in the average value of NDVI in the time series are calculated to demonstrate the vegetation mantle changes in Phu Tho province. These results are the reference data source in the assessment of weather, climate, and environmental changes in the study area during that time.

Download Full-text

A Large-Scale Dataset and Deep Learning Model for Detecting and Counting Olive Trees in Satellite Imagery

Computational Intelligence and Neuroscience ◽

10.1155/2022/1549842 ◽

2022 ◽

Vol 2022 ◽

pp. 1-8

Author(s):

Amr Abozeid ◽

Rayan Alanazi ◽

Ahmed Elhadad ◽

Ahmed I. Taloba ◽

Rasha M. Abd El-Aziz

Keyword(s):

Deep Learning ◽

Satellite Imagery ◽

Large Scale ◽

Estimation Error ◽

Satellite Image ◽

Olive Tree ◽

Learning Model ◽

Large Area ◽

Olive Trees ◽

Deep Learning Model

Since the Pre-Roman era, olive trees have a significant economic and cultural value. In 2019, the Al-Jouf region, in the north of the Kingdom of Saudi Arabia, gained a global presence by entering the Guinness World Records, with the largest number of olive trees in the world. Olive tree detecting and counting from a given satellite image are a significant and difficult computer vision problem. Because olive farms are spread out over a large area, manually counting the trees is impossible. Moreover, accurate automatic detection and counting of olive trees in satellite images have many challenges such as scale variations, weather changes, perspective distortions, and orientation changes. Another problem is the lack of a standard database of olive trees available for deep learning applications. To address these problems, we first build a large-scale olive dataset dedicated to deep learning research and applications. The dataset consists of 230 RGB images collected over the territory of Al-Jouf, KSA. We then propose an efficient deep learning model (SwinTUnet) for detecting and counting olive trees from satellite imagery. The proposed SwinTUnet is a Unet-like network which consists of an encoder, a decoder, and skip connections. Swin Transformer block is the fundamental unit of SwinTUnet to learn local and global semantic information. The results of an experimental study on the proposed dataset show that the SwinTUnet model outperforms the related studies in terms of overall detection with a 0.94% estimation error.

Download Full-text

Self-Supervised Pre-Training of Transformers for Satellite Image Time Series Classification

10.36227/techrxiv.13025039.v2 ◽

2020 ◽

Author(s):

Yuan Yuan ◽

Lei Lin

Keyword(s):

Time Series ◽

Deep Learning ◽

Large Scale ◽

Temporal Structure ◽

Satellite Image ◽

Fine Tuning ◽

Small Scale ◽

Model Parameters ◽

Learning Approaches ◽

Wide Range

<div>Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, leading to a classification accuracy increment up to 2.38% to 5.27%. The code and the pre-trained model will be available at https://github.com/linlei1214/SITS-BERT upon publication.</div><div><b>This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.</b></div>

Download Full-text

Self-Supervised Pre-Training of Transformers for Satellite Image Time Series Classification

10.36227/techrxiv.13025039 ◽

2020 ◽

Author(s):

Yuan Yuan ◽

Lei Lin

Keyword(s):

Time Series ◽

Deep Learning ◽

Large Scale ◽

Temporal Structure ◽

Satellite Image ◽

Fine Tuning ◽

Small Scale ◽

Model Parameters ◽

Learning Approaches ◽

Wide Range

<div>Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, leading to a classification accuracy increment up to 2.38% to 5.27%. The code and the pre-trained model will be available at https://github.com/linlei1214/SITS-BERT upon publication.</div><div><b>This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.</b></div>

Download Full-text

Multi Disease-Prediction Framework Using Hybrid Deep Learning: An Optimal Prediction Model (Preprint)

10.2196/preprints.22865 ◽

2020 ◽

Author(s):

Anusha Ampavathi ◽

Vijaya Saradhi T

Keyword(s):

Feature Extraction ◽

Big Data ◽

Deep Learning ◽

Weight Function ◽

Optimization Algorithm ◽

Large Scale ◽

Heuristic Algorithms ◽

Disease Prediction ◽

Health Care Decisions ◽

Proposed Model

UNSTRUCTURED Big data and its approaches are generally helpful for healthcare and biomedical sectors for predicting the disease. For trivial symptoms, the difficulty is to meet the doctors at any time in the hospital. Thus, big data provides essential data regarding the diseases on the basis of the patient’s symptoms. For several medical organizations, disease prediction is important for making the best feasible health care decisions. Conversely, the conventional medical care model offers input as structured that requires more accurate and consistent prediction. This paper is planned to develop the multi-disease prediction using the improvised deep learning concept. Here, the different datasets pertain to “Diabetes, Hepatitis, lung cancer, liver tumor, heart disease, Parkinson’s disease, and Alzheimer’s disease”, from the benchmark UCI repository is gathered for conducting the experiment. The proposed model involves three phases (a) Data normalization (b) Weighted normalized feature extraction, and (c) prediction. Initially, the dataset is normalized in order to make the attribute's range at a certain level. Further, weighted feature extraction is performed, in which a weight function is multiplied with each attribute value for making large scale deviation. Here, the weight function is optimized using the combination of two meta-heuristic algorithms termed as Jaya Algorithm-based Multi-Verse Optimization algorithm (JA-MVO). The optimally extracted features are subjected to the hybrid deep learning algorithms like “Deep Belief Network (DBN) and Recurrent Neural Network (RNN)”. As a modification to hybrid deep learning architecture, the weight of both DBN and RNN is optimized using the same hybrid optimization algorithm. Further, the comparative evaluation of the proposed prediction over the existing models certifies its effectiveness through various performance measures.

Download Full-text

Deep Learning-Based Large-Scale Automatic Satellite Crosswalk Classification

IEEE Geoscience and Remote Sensing Letters ◽

10.1109/lgrs.2017.2719863 ◽

2017 ◽

Vol 14 (9) ◽

pp. 1513-1517 ◽

Cited By ~ 19

Author(s):

Rodrigo F. Berriel ◽

Andre Teixeira Lopes ◽

Alberto F. de Souza ◽

Thiago Oliveira-Santos

Keyword(s):

Deep Learning ◽

Large Scale

Download Full-text

Deep Learning-Based Classification of Large-Scale Airborne LiDAR Point Cloud

Canadian Journal of Remote Sensing ◽

10.1080/07038992.2021.1927687 ◽

2021 ◽

pp. 1-15

Author(s):

Mathieu Turgeon-Pelchat ◽

Samuel Foucher ◽

Yacine Bouroubi

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Large Scale ◽

Airborne Lidar

Download Full-text

Development and validation of a deep learning system to screen vision-threatening conditions in high myopia using optical coherence tomography images

British Journal of Ophthalmology ◽

10.1136/bjophthalmol-2020-317825 ◽

2020 ◽

pp. bjophthalmol-2020-317825

Author(s):

Yonghao Li ◽

Weibo Feng ◽

Xiujuan Zhao ◽

Bingqian Liu ◽

Yan Zhang ◽

...

Keyword(s):

Optical Coherence Tomography ◽

Deep Learning ◽

High Myopia ◽

Large Scale ◽

Learning System ◽

Youden Index ◽

Optical Coherence ◽

Test Dataset ◽

Independent Test ◽

Independent Test Dataset

Background/aimsTo apply deep learning technology to develop an artificial intelligence (AI) system that can identify vision-threatening conditions in high myopia patients based on optical coherence tomography (OCT) macular images.MethodsIn this cross-sectional, prospective study, a total of 5505 qualified OCT macular images obtained from 1048 high myopia patients admitted to Zhongshan Ophthalmic Centre (ZOC) from 2012 to 2017 were selected for the development of the AI system. The independent test dataset included 412 images obtained from 91 high myopia patients recruited at ZOC from January 2019 to May 2019. We adopted the InceptionResnetV2 architecture to train four independent convolutional neural network (CNN) models to identify the following four vision-threatening conditions in high myopia: retinoschisis, macular hole, retinal detachment and pathological myopic choroidal neovascularisation. Focal Loss was used to address class imbalance, and optimal operating thresholds were determined according to the Youden Index.ResultsIn the independent test dataset, the areas under the receiver operating characteristic curves were high for all conditions (0.961 to 0.999). Our AI system achieved sensitivities equal to or even better than those of retina specialists as well as high specificities (greater than 90%). Moreover, our AI system provided a transparent and interpretable diagnosis with heatmaps.ConclusionsWe used OCT macular images for the development of CNN models to identify vision-threatening conditions in high myopia patients. Our models achieved reliable sensitivities and high specificities, comparable to those of retina specialists and may be applied for large-scale high myopia screening and patient follow-up.

Download Full-text