Automated Mapping of Antarctic Supraglacial Lakes Using a Machine Learning Approach

2020 ◽  
Vol 12 (7) ◽  
pp. 1203 ◽  
Author(s):  
Mariel Dirscherl ◽  
Andreas J. Dietz ◽  
Christof Kneisel ◽  
Claudia Kuenzer

Supraglacial lakes can have considerable impact on ice sheet mass balance and global sea-level rise through ice shelf fracturing and subsequent glacier speedup. In Antarctica, the distribution and temporal development of supraglacial lakes, as well as their potential contribution to increased ice mass loss, remain largely unknown, requiring a detailed mapping of the Antarctic surface hydrological network. In this study, we employ a Machine Learning algorithm trained on Sentinel-2 and auxiliary TanDEM-X topographic data for automated mapping of Antarctic supraglacial lakes. To ensure the spatio-temporal transferability of our method, a Random Forest was trained on 14 training regions and applied over eight spatially independent test regions distributed across the whole Antarctic continent. In addition, we employed our workflow for large-scale application over Amery Ice Shelf, where we calculated interannual supraglacial lake dynamics between 2017 and 2020 at full ice shelf coverage. To validate our supraglacial lake detection algorithm, we randomly created point samples over our classification results and compared them to Sentinel-2 imagery. The point comparisons were evaluated using a confusion matrix for calculation of selected accuracy metrics. Our analysis revealed widespread supraglacial lake occurrence in all three Antarctic regions. For the first time, we identified supraglacial meltwater features on Abbott, Hull and Cosgrove Ice Shelves in West Antarctica, as well as for the entire Amery Ice Shelf, for the years 2017–2020. Over Amery Ice Shelf, maximum lake extent varied strongly between the years, with the 2019 melt season characterized by the largest areal coverage of supraglacial lakes (~763 km²). The accuracy assessment over the test regions revealed an average Kappa coefficient of 0.86, with the largest Kappa value of 0.98 reached over George VI Ice Shelf. Future work will involve the generation of circum-Antarctic supraglacial lake mapping products as well as further methodological development using Sentinel-1 SAR data in order to characterize intra-annual supraglacial meltwater dynamics, even during polar night and independent of meteorological conditions. In summary, the implementation of the Random Forest classifier enabled the development of the first automated mapping method applied to Sentinel-2 data distributed across all three Antarctic regions.
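A minimal sketch of this kind of pipeline, a Random Forest trained on per-pixel Sentinel-2 and TanDEM-X features and validated via a confusion matrix and Kappa, might look as follows in Python with scikit-learn. The array contents, number of features, class coding, and forest size below are placeholder assumptions, not the authors' configuration:

```python
# Sketch of the classification setup, assuming Sentinel-2 bands and
# TanDEM-X-derived topography are already stacked into per-pixel features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, cohen_kappa_score

rng = np.random.default_rng(0)
X_train = rng.random((5000, 11))       # placeholder feature stack (hypothetical)
y_train = rng.integers(0, 2, 5000)     # placeholder labels: 0 = ice/snow, 1 = lake

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Validation against randomly sampled reference points, as in the paper:
# compare predictions with reference labels and derive Kappa.
X_test = rng.random((1000, 11))
y_test = rng.integers(0, 2, 1000)
y_pred = clf.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print("Kappa:", cohen_kappa_score(y_test, y_pred))
```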

2020 ◽  
Author(s):  
Mariel Dirscherl ◽  
Andreas Dietz ◽  
Celia Baumhoer ◽  
Christof Kneisel ◽  
Claudia Kuenzer

Antarctica stores ~91% of the global ice mass, making it the biggest potential contributor to global sea-level rise. With increased surface air temperatures during austral summer and as a consequence of global climate change, the ice sheet is subject to surface melting, resulting in the formation of supraglacial lakes in local surface depressions. Supraglacial meltwater features may impact Antarctic ice dynamics and mass balance through three main processes. First, meltwater may cause enhanced ice thinning and thus a potentially negative Antarctic Surface Mass Balance (SMB). Second, the temporary injection of meltwater to the glacier bed may cause transient ice speed accelerations and increased ice discharge. The last mechanism involves a process called hydrofracturing, i.e., meltwater-induced ice shelf collapse caused by the downward propagation of surface meltwater into crevasses or fractures, as observed along large coastal sections of the northern Antarctic Peninsula. Despite the known impact of supraglacial meltwater features on ice dynamics and mass balance, the Antarctic surface hydrological network remains largely understudied, and an automated method for supraglacial lake and stream detection is still missing. Spaceborne remote sensing, and data of the Sentinel missions in particular, provides an excellent basis for monitoring the Antarctic surface hydrological network at unprecedented spatial and temporal coverage.

In this study, we employ state-of-the-art machine learning for automated supraglacial lake and stream mapping on the basis of optical Sentinel-2 satellite data. In more detail, we use a total of 72 Sentinel-2 acquisitions distributed across the Antarctic Ice Sheet, together with topographic information, to train and test the selected machine learning algorithm. Our machine learning workflow is designed to discriminate between surface water, ice/snow, rock and shadow, and is further supported by several automated post-processing steps. In order to ensure the algorithm's transferability in space and time, the acquisitions used for training the machine learning model are chosen to cover the full cycle of the 2019 melt season, and the data selected for testing the algorithm span the 2017 and 2018 melt seasons. Supraglacial lake predictions are presented for several regions of interest on the East and West Antarctic Ice Sheet as well as along the Antarctic Peninsula, and are validated against randomly sampled points in the underlying Sentinel-2 RGB images. To highlight the performance of our model, we specifically focus on the example of the Amery Ice Shelf in East Antarctica, where we applied our algorithm to Sentinel-2 data in order to present the temporal evolution of maximum lake extent during three consecutive melt seasons (2017, 2018 and 2019).
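As an illustration of the four-class discrimination described above, the sketch below trains a Random Forest on a feature stack that includes a water index. Using NDWI from the Sentinel-2 green (B3) and NIR (B8) bands is our assumption for illustration; band arrays, labels, and hyperparameters are placeholders:

```python
# Sketch of the four-class setup (surface water, ice/snow, rock, shadow)
# with a normalized difference water index as one input feature.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def ndwi(green, nir):
    """Normalized Difference Water Index: (green - nir) / (green + nir)."""
    return (green - nir) / np.clip(green + nir, 1e-6, None)

rng = np.random.default_rng(1)
green, nir = rng.random(10000), rng.random(10000)        # placeholder reflectances
X = np.column_stack([green, nir, ndwi(green, nir)])      # placeholder feature stack
y = rng.integers(0, 4, 10000)  # 0=water, 1=ice/snow, 2=rock, 3=shadow (placeholder)

# In the real workflow, training scenes span the 2019 melt season and testing
# scenes span 2017/2018; here a single fit stands in for that temporal split.
clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)
```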


2021 ◽  
Vol 13 (16) ◽  
pp. 3176
Author(s):  
Beata Hejmanowska ◽  
Piotr Kramarczyk ◽  
Ewa Głowienka ◽  
Sławomir Mikrut

The study presents an analysis of the possible use of a limited number of Sentinel-2 and Sentinel-1 images to check whether the crop declarations that EU farmers submit to receive subsidies are true. The declarations used in the research were randomly divided into two independent sets (training and test). Based on the training set, supervised classification of both single images and their combinations was performed using the random forest algorithm in SNAP (ESA) and our own Python scripts. A comparative accuracy analysis was performed on the basis of two forms of confusion matrix (the full confusion matrix commonly used in remote sensing and the binary confusion matrix used in machine learning) and various accuracy metrics (overall accuracy, accuracy, specificity, sensitivity, etc.). The highest overall accuracy (81%) was obtained in the simultaneous classification of multitemporal images (three Sentinel-2 and one Sentinel-1). An unexpectedly high accuracy (79%) was achieved in the classification of a single Sentinel-2 image from the end of May 2018. Noteworthy is the fact that the accuracy of the random forest method trained on the entire training set equals 80%, while with the sampling method it is ca. 50%. Based on the analysis of various accuracy metrics, it can be concluded that the metrics used in machine learning, for example specificity and accuracy, are always higher than the overall accuracy. These metrics should be used with caution because, unlike the overall accuracy, they count not only true positives but also true negatives as correct results, giving the impression of higher accuracy. Correct calculation of overall accuracy values is essential for comparative analyses. Reporting the mean accuracy value for the classes as overall accuracy gives a false impression of high accuracy. In our case, the difference was 10–16% for the validation data and 25–45% for the test data.
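The metric effect described above can be reproduced in a few lines: the binary one-vs-rest confusion matrix counts true negatives as correct results, so the per-class binary "accuracy" routinely exceeds the overall accuracy of the full multi-class matrix. The synthetic labels below are only for illustration:

```python
# Worked illustration: overall accuracy from the full multi-class confusion
# matrix versus per-class binary accuracy, which also counts true negatives.
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(2)
classes = np.arange(5)                    # e.g., five crop types (placeholder)
y_true = rng.choice(classes, 2000)
# predictions correct ~80% of the time, random class otherwise
y_pred = np.where(rng.random(2000) < 0.8, y_true, rng.choice(classes, 2000))

print("Overall accuracy:", accuracy_score(y_true, y_pred))
for c in classes:
    # one-vs-rest binary confusion matrix for class c
    tn, fp, fn, tp = confusion_matrix(y_true == c, y_pred == c).ravel()
    binary_acc = (tp + tn) / (tp + tn + fp + fn)
    print(f"class {c}: binary accuracy = {binary_acc:.3f}")   # higher than OA
```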


Drones ◽  
2020 ◽  
Vol 4 (2) ◽  
pp. 21 ◽  
Author(s):  
Francisco Rodríguez-Puerta ◽  
Rafael Alonso Ponce ◽  
Fernando Pérez-Rodríguez ◽  
Beatriz Águeda ◽  
Saray Martín-García ◽  
...  

Controlling vegetation fuels around human settlements is a crucial strategy for reducing fire severity in forests, buildings and infrastructure, as well as protecting human lives. Each country has its own regulations in this respect, but they all have in common that by reducing fuel load we in turn reduce the intensity and severity of the fire. Data acquired by Unmanned Aerial Vehicles (UAVs), combined with other passive and active remote sensing data and processed with machine learning algorithms, offer the best performance for planning Wildland-Urban Interface (WUI) fuelbreaks. Nine remote sensing data sources (active and passive) and four supervised classification algorithms (Random Forest, Linear and Radial Support Vector Machine, and Artificial Neural Networks) were tested to classify five fuel-area types. We used very high-density Light Detection and Ranging (LiDAR) data acquired by UAV (154 returns·m⁻² and an ortho-mosaic with 5 cm pixels), multispectral data from the satellites Pleiades-1B and Sentinel-2, and low-density LiDAR data acquired by Airborne Laser Scanning (ALS) (0.5 returns·m⁻², ortho-mosaic with 25 cm pixels). Through the Variable Selection Using Random Forest (VSURF) procedure, a pre-selection of final variables was carried out to train the model. The four algorithms were compared, and it was concluded that the differences among them in overall accuracy (OA) on the training datasets were negligible. Although the highest training accuracy was obtained with the linear SVM (OA = 94.46%) and the highest testing accuracy with the ANN (OA = 91.91%), Random Forest was considered the most reliable algorithm, since it produced more consistent predictions, with smaller differences between training and testing performance. Using a combination of Sentinel-2 and the two LiDAR datasets (UAV and ALS), Random Forest obtained an OA of 90.66% on the training dataset and 91.80% on the testing dataset. The differences in accuracy between the data sources used are much greater than between the algorithms. LiDAR growth metrics calculated from point clouds acquired on different dates, together with multispectral information from different seasons of the year, are the most important variables in the classification. Our results support the essential role of UAVs in fuelbreak planning and management and thus in the prevention of forest fires.
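A rough Python sketch of the four-algorithm comparison is given below, assuming the VSURF-selected predictors are already available as a feature matrix (the original VSURF step is an R procedure, and all data and hyperparameters here are placeholders):

```python
# Sketch: cross-validated comparison of RF, linear SVM, radial SVM, and ANN
# on a hypothetical matrix of VSURF-selected predictors.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
X = rng.random((600, 12))                 # placeholder selected predictors
y = rng.integers(0, 5, 600)               # five fuel-area types (placeholder)

models = {
    "RF":   RandomForestClassifier(n_estimators=500, random_state=3),
    "SVML": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "SVMR": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "ANN":  make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000)),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)   # OA estimated by 5-fold CV
    print(f"{name}: OA = {scores.mean():.3f}")
```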


2020 ◽  
Vol 12 (15) ◽  
pp. 5972
Author(s):  
Nicholas Fiorentini ◽  
Massimo Losa

Screening procedures in road blackspot detection are essential tools for road authorities for quickly gathering insights on the safety level of each road site they manage. This paper suggests a road blackspot screening procedure for two-lane rural roads, relying on five different machine learning algorithms (MLAs) and real long-term traffic data. The network analyzed is the one managed by the Tuscany Region Road Administration, mainly composed of two-lane rural roads. A total of 995 road sites, where at least one accident occurred in 2012–2016, have been labeled as “Accident Case”. Accordingly, an equal number of sites where no accident occurred in the same period have been randomly selected and labeled as “Non-Accident Case”. Five different MLAs, namely Logistic Regression, Classification and Regression Tree, Random Forest, K-Nearest Neighbor, and Naïve Bayes, have been trained and validated. The output response of the MLAs, i.e., crash occurrence susceptibility, is a binary categorical variable. Therefore, such algorithms aim to classify a road site as likely safe (“Non-Accident Case”) or potentially susceptible to an accident occurrence (“Accident Case”) over five years. Finally, the algorithms have been compared using a set of performance metrics, including precision, recall, F1-score, overall accuracy, confusion matrix, and the Area Under the Receiver Operating Characteristic curve. Outcomes show that the Random Forest outperforms the other MLAs with an overall accuracy of 73.53%. Furthermore, none of the MLAs shows overfitting issues. Road authorities could consider MLAs to draw up a priority list of on-site inspections and maintenance interventions.
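The five-algorithm screening comparison could be prototyped along the following lines; the synthetic balanced dataset and default hyperparameters are assumptions, not the study's setup:

```python
# Sketch: five MLAs compared on a balanced binary "Accident"/"Non-Accident"
# dataset using overall accuracy and ROC AUC.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, accuracy_score

# 1,990 sites mirroring the 995 + 995 balanced design (synthetic features)
X, y = make_classification(n_samples=1990, weights=[0.5, 0.5], random_state=4)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=4)

models = {
    "LR":   LogisticRegression(max_iter=1000),
    "CART": DecisionTreeClassifier(random_state=4),
    "RF":   RandomForestClassifier(random_state=4),
    "KNN":  KNeighborsClassifier(),
    "NB":   GaussianNB(),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, m.predict_proba(X_te)[:, 1])
    print(f"{name}: OA = {accuracy_score(y_te, m.predict(X_te)):.3f}, AUC = {auc:.3f}")
```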


2019 ◽  
Vol 3 (s1) ◽  
pp. 2-2
Author(s):  
Megan C Hollister ◽  
Jeffrey D. Blume

OBJECTIVES/SPECIFIC AIMS: To examine and compare the claims in Bzdok, Altman, and Krzywinski under a broader set of conditions by using unbiased methods of comparison, and to explore how to accurately use various machine learning and traditional statistical methods in large-scale translational research by estimating their accuracy statistics. We then identify the methods with the best performance characteristics. METHODS/STUDY POPULATION: We conducted a simulation study with a microarray of gene expression data. We maintained the original structure proposed by Bzdok, Altman, and Krzywinski. The structure for gene expression data includes a total of 40 genes from 20 people, in which 10 people are phenotype positive and 10 are phenotype negative. In order to create a detectable statistical difference, 25% of the genes were set to be dysregulated across phenotype. This dysregulation forced the positive and negative phenotypes to have different mean population expressions. Additional variance was included to simulate genetic variation across the population. We also allowed for within-person correlation across genes, which was not done in the original simulations. The following methods were used to determine the number of dysregulated genes in each simulated data set: unadjusted p-values, Benjamini-Hochberg adjusted p-values, Bonferroni adjusted p-values, random forest importance levels, neural net prediction weights, and second-generation p-values. RESULTS/ANTICIPATED RESULTS: Results vary depending on whether a pre-specified significance level is used or the top 10 ranked values are taken. When all methods are given the same prior information of 10 dysregulated genes, the Benjamini-Hochberg adjusted p-values and the second-generation p-values generally outperform all other methods. We were not able to reproduce or validate the finding that random forest importance levels via a machine learning algorithm outperform classical methods. Almost uniformly, the machine learning methods did not yield improved accuracy statistics, and they depend heavily on the a priori chosen number of dysregulated genes. DISCUSSION/SIGNIFICANCE OF IMPACT: In this context, machine learning methods do not outperform standard methods. Because of this and their additional complexity, machine learning approaches would not be preferable. Of all the approaches, the second-generation p-value appears to offer significant benefit for the cost of a priori defining a region of trivially null effect sizes. The choice of an analysis method for large-scale translational data is critical to the success of any statistical investigation, and our simulations clearly highlight the various tradeoffs among the available methods.
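For the p-value-based methods in this comparison, the adjustment step looks roughly like the sketch below, which follows the stated design (40 genes, 10 subjects per phenotype, 25% dysregulated) but simulates its own data; the effect size and alpha level are assumptions:

```python
# Sketch: per-gene two-sample t-tests, then Bonferroni and Benjamini-Hochberg
# adjustment, on a simulated 40-gene, 10-vs-10 expression matrix.
import numpy as np
from scipy.stats import ttest_ind
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(5)
n_genes, n_per_group = 40, 10
dysregulated = rng.choice(n_genes, size=n_genes // 4, replace=False)  # 25%

pos = rng.normal(0, 1, (n_per_group, n_genes))   # phenotype-positive subjects
neg = rng.normal(0, 1, (n_per_group, n_genes))   # phenotype-negative subjects
pos[:, dysregulated] += 1.5                      # assumed mean expression shift

pvals = ttest_ind(pos, neg, axis=0).pvalue       # one p-value per gene
bonf = multipletests(pvals, alpha=0.05, method="bonferroni")[0]
bh = multipletests(pvals, alpha=0.05, method="fdr_bh")[0]
print("Bonferroni discoveries:", bonf.sum(), "| BH discoveries:", bh.sum())
```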


Polar Record ◽  
1989 ◽  
Vol 25 (153) ◽  
pp. 99-106 ◽  
Author(s):  
Michael J. Hambrey ◽  
Birger Larsen ◽  
Werner U. Ehrmann

Abstract: During Leg 119 of the Ocean Drilling Program, between December 1987 and February 1988, six holes were drilled in the Kerguelen Plateau, southern Indian Ocean, and five in Prydz Bay at the mouth of the Amery Ice Shelf, on the East Antarctic continental shelf. The Prydz Bay holes, reported here, form a transect from the inner shelf to the continental slope, recording a prograding sequence of possible Late Paleozoic to Eocene continental sediments of fluvial aspect, followed by several hundred metres of Early Oligocene (possibly Middle Eocene) to Quaternary glacially dominated sediments. This extends the known onset of large-scale glaciation of Antarctica back to about 36–40 million years ago, the sedimentary record suggesting that a fully developed East Antarctic Ice Sheet reached the coast at Prydz Bay at this time, and was more extensive than the present sheet. Subsequent glacial history is complex, with the bulk of sedimentation in the outer shelf taking place close to the grounding line of an extended Amery Ice Shelf. However, breaks in the record and intervals of no recovery may hide evidence of periods of glacial retreat.


2021 ◽  
Vol 8 ◽  
Author(s):  
Xue Liu ◽  
Temilola E. Fatoyinbo ◽  
Nathan M. Thomas ◽  
Weihe Wendy Guan ◽  
Yanni Zhan ◽  
...  

Coastal mangrove forests provide important ecosystem goods and services, including carbon sequestration, biodiversity conservation, and hazard mitigation. However, they are being destroyed at an alarming rate by human activities. To characterize mangrove forest changes, evaluate their impacts, and support relevant protection and restoration decision making, accurate and up-to-date mangrove extent mapping at large spatial scales is essential. Available large-scale mangrove extent data products use a single machine learning method, commonly with 30 m Landsat imagery, and significant inconsistencies remain among these data products. With huge amounts of satellite data involved and the heterogeneity of land surface characteristics across large geographic areas, finding the most suitable method for large-scale high-resolution mangrove mapping is a challenge. The objective of this study is to evaluate the performance of a machine learning ensemble for mangrove forest mapping at 20 m spatial resolution across West Africa using Sentinel-2 (optical) and Sentinel-1 (radar) imagery. The machine learning ensemble integrates three methods commonly used in land cover and land use mapping: Random Forest (RF), Gradient Boosting Machine (GBM), and Neural Network (NN). The cloud-based big geospatial data processing platform Google Earth Engine (GEE) was used for pre-processing Sentinel-2 and Sentinel-1 data. Extensive validation has demonstrated that the machine learning ensemble can generate mangrove extent maps at high accuracies for all study regions in West Africa (92%–99% Producer’s Accuracy, 98%–100% User’s Accuracy, 95%–99% Overall Accuracy). This is the first time that mangrove extent has been mapped at a 20 m spatial resolution across West Africa. The machine learning ensemble has the potential to be applied to other regions of the world and is therefore capable of producing high-resolution mangrove extent maps at global scales periodically.
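One plausible reading of such an ensemble, three classifiers combined by majority vote, is sketched below with scikit-learn standing in for the Google Earth Engine implementation; the features, labels, and hyperparameters are placeholders:

```python
# Sketch: RF + GBM + NN combined by hard (majority) voting for a binary
# mangrove / non-mangrove pixel classification.
import numpy as np
from sklearn.ensemble import (RandomForestClassifier, GradientBoostingClassifier,
                              VotingClassifier)
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
X = rng.random((3000, 12))     # placeholder Sentinel-2 + Sentinel-1 features
y = rng.integers(0, 2, 3000)   # 1 = mangrove, 0 = other (placeholder labels)

ensemble = VotingClassifier(
    estimators=[
        ("rf",  RandomForestClassifier(n_estimators=200, random_state=6)),
        ("gbm", GradientBoostingClassifier(random_state=6)),
        ("nn",  make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000))),
    ],
    voting="hard",             # each model casts one vote per pixel
)
ensemble.fit(X, y)
```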


SLEEP ◽  
2020 ◽  
Author(s):  
Sowmya M Ramaswamy ◽  
Maud A S Weerink ◽  
Michel M R F Struys ◽  
Sunil B Nagaraj

Abstract. Study Objectives: Dexmedetomidine-induced electroencephalogram (EEG) patterns during deep sedation are comparable with natural sleep patterns. Using large-scale EEG recordings and machine learning techniques, we investigated whether dexmedetomidine-induced deep sedation indeed mimics natural sleep patterns. Methods: We used EEG recordings from two sources in this study: 8,707 overnight sleep EEG recordings and 30 dexmedetomidine clinical trial EEG recordings. Dexmedetomidine-induced sedation levels were assessed using the Modified Observer’s Assessment of Alertness/Sedation (MOAA/S) score. We extracted 22 spectral features from each EEG recording using a multitaper spectral estimation method. The elastic-net regularization method was used for feature selection. We compared the performance of several machine learning algorithms (logistic regression, support vector machine, and random forest), trained on individual sleep stages, to predict different levels of the MOAA/S sedation state. Results: The random forest algorithm trained on non-rapid eye movement stage 3 (N3) predicted dexmedetomidine-induced deep sedation (MOAA/S = 0) with an area under the receiver operator characteristics curve >0.8, outperforming the other machine learning models. Power in the delta band (0–4 Hz) was selected as an important feature for prediction, in addition to power in the theta (4–8 Hz) and beta (16–30 Hz) bands. Conclusions: Using a large-scale EEG data-driven approach and machine learning framework, we show that the dexmedetomidine-induced deep sedation state mimics N3 sleep EEG patterns. Clinical Trials: Name—Pharmacodynamic Interaction of REMI and DMED (PIRAD), URL—https://clinicaltrials.gov/ct2/show/NCT03143972, and registration—NCT03143972.
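A compact sketch of the described pipeline, elastic-net feature selection followed by a random forest classifier, might look as follows; the feature matrix stands in for the 22 multitaper spectral features, and the regularization settings are assumptions:

```python
# Sketch: elastic-net-regularized logistic regression for feature selection,
# then a random forest predicting deep sedation (MOAA/S = 0) epochs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
X = rng.random((2000, 22))     # placeholder: 22 spectral features per EEG epoch
y = rng.integers(0, 2, 2000)   # placeholder: 1 = MOAA/S 0 epoch, 0 = otherwise

Xs = StandardScaler().fit_transform(X)
enet = LogisticRegression(penalty="elasticnet", solver="saga",
                          l1_ratio=0.5, C=1.0, max_iter=5000).fit(Xs, y)
selected = np.flatnonzero(enet.coef_[0] != 0)   # features kept by elastic net
if selected.size == 0:
    selected = np.arange(X.shape[1])            # fall back if all were shrunk away

# In the real study the classifier is trained on N3 sleep epochs and applied
# to the dexmedetomidine recordings; here a single fit stands in for that.
clf = RandomForestClassifier(n_estimators=300, random_state=7)
clf.fit(Xs[:, selected], y)
```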


2020 ◽  
Vol 12 (7) ◽  
pp. 1176 ◽  
Author(s):  
Yukun Lin ◽  
Zhe Zhu ◽  
Wenxuan Guo ◽  
Yazhou Sun ◽  
Xiaoyuan Yang ◽  
...  

Monitoring cotton status during the growing season is critical for increasing production efficiency. The water status in cotton is a key factor for yield and cotton quality. Stem water potential (SWP) is a precise indicator for assessing cotton water status. Satellite remote sensing is an effective approach for monitoring cotton growth at a large scale. The aim of this study is to estimate cotton water stress at a high temporal frequency and at a large scale. In this study, we measured midday SWP samples according to the acquisition dates of Sentinel-2 images and used them to build linear-regression-based and machine-learning-based models to estimate cotton water stress during the growing season (June to August, 2018). For the linear-regression-based method, we estimated SWP based on different Sentinel-2 spectral bands and vegetation indices, where the normalized difference index 45 (NDI45) achieved the best performance (R² = 0.6269; RMSE = 3.6802, in units of −1 × SWP (bars)). For the machine-learning-based method, we used random forest regression to estimate SWP and obtained even better results (R² = 0.6709; RMSE = 3.3742, in the same units). To find the best selection of input variables for the machine-learning-based approach, we tried three different input datasets: (1) 9 original spectral bands (e.g., blue, green, red, red edge, near infrared (NIR), and shortwave infrared (SWIR)), (2) 21 vegetation indices, and (3) a combination of the original Sentinel-2 spectral bands and vegetation indices. The highest accuracy was achieved when only the original spectral bands were used. We also found the SWIR and red edge bands were the most important spectral bands, and the vegetation indices based on red edge and NIR bands were particularly helpful. Finally, we applied the best linear-regression-based and machine-learning-based approaches to generate cotton water potential maps at a large scale and high temporal frequency. Results suggest that the methods developed here have the potential for continuous monitoring of SWP at large scales, and the machine-learning-based method is preferred.
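The two modeling routes compared above can be sketched as follows; NDI45 is taken here as the normalized difference of Sentinel-2 red-edge (B5) and red (B4) reflectance (our assumption for illustration), and all data arrays are placeholders:

```python
# Sketch: linear regression on NDI45 versus random forest regression on the
# raw spectral bands for estimating -1 * SWP (bars).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(8)
b4, b5 = rng.random(300), rng.random(300)   # placeholder red / red-edge reflectance
bands = rng.random((300, 9))                # placeholder nine-band feature matrix
swp = rng.normal(15, 4, 300)                # placeholder -1 * SWP samples (bars)

ndi45 = (b5 - b4) / (b5 + b4 + 1e-6)        # assumed NDI45 formulation
lin = LinearRegression().fit(ndi45.reshape(-1, 1), swp)
rf = RandomForestRegressor(n_estimators=300, random_state=8).fit(bands, swp)

for name, pred in [("NDI45 linear", lin.predict(ndi45.reshape(-1, 1))),
                   ("RF on bands", rf.predict(bands))]:
    rmse = mean_squared_error(swp, pred) ** 0.5
    print(f"{name}: R2 = {r2_score(swp, pred):.4f}, RMSE = {rmse:.4f}")
```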


2015 ◽  
Vol 61 (226) ◽  
pp. 243-252 ◽  
Author(s):  
Catherine C. Walker ◽  
Jeremy N. Bassis ◽  
Helen A. Fricker ◽  
Robin J. Czerwinski

AbstractIceberg calving and basal melting are the two primary mass loss processes from the Antarctic ice sheet, accounting for approximately equal amounts of mass loss. Basal melting under ice shelves has been increasingly well constrained in recent work, but changes in iceberg calving rates remain poorly quantified. Here we examine the processes that precede iceberg calving, and focus on initiation and propagation of ice-shelf rifts. Using satellite imagery from the Moderate Resolution Imaging Spectroradiometer (MODIS) and the Multi-angle Imaging Spectroradiometer (MISR), we monitored five active rifts on the Amery Ice Shelf, Antarctica, from 2002 to 2014. We found a strong seasonal component: propagation rates were highest during (austral) summer and nearly zero during winter. We found substantial variability in summer propagation rates, but found no evidence that the variability was correlated with large-scale environmental drivers, such as atmospheric temperature, winds or sea-ice concentration. We did find a positive correlation between large propagation events and the arrival of tsunamis in the region. The variability appears to be related to visible structural boundaries within the ice shelf, e.g. suture zones or crevasse fields. This suggests that a complete understanding of rift propagation and iceberg calving needs to consider local heterogeneities within an ice shelf.
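The driver-correlation check reported above amounts to testing seasonal rift propagation rates against candidate environmental series; a minimal version with synthetic series is sketched below (both time series and their units are placeholders):

```python
# Sketch: Pearson correlation between summer rift propagation rates and a
# candidate environmental driver across monitored seasons.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(9)
propagation_rate = rng.gamma(2.0, 1.0, 12)   # placeholder seasonal rates
air_temperature = rng.normal(-5, 2, 12)      # placeholder seasonal driver

r, p = pearsonr(propagation_rate, air_temperature)
print(f"r = {r:.2f}, p = {p:.3f}")           # the study found no such correlation
```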

