Automated Training Data Generation from Spectral Indexes for Mapping Surface Water Extent with Sentinel-2 Satellite Imagery at 10 m and 20 m Resolutions

2021 ◽  
Vol 13 (22) ◽  
pp. 4531
Author(s):  
Kristofer Lasko ◽  
Megan C. Maloney ◽  
Sarah J. Becker ◽  
Andrew W. H. Griffin ◽  
Susan L. Lyon ◽  
...  

This study presents an automated methodology to generate training data for surface water mapping from a single Sentinel-2 granule at 10 m (4 band, VIS/NIR) or 20 m (9 band, VIS/NIR/SWIR) resolution without the need for ancillary training data layers. The 20 m method incorporates an ensemble of three spectral indexes with optimal band thresholds, whereas the 10 m method achieves similar results using fewer bands and a single spectral index. A spectrally balanced and randomly generated set of training data based on the index values and optimal thresholds is used to fit machine learning classifiers. Statistical validation compares the 20 m ensemble-only method to the 20 m ensemble method with a random forest classifier. Results show the 20 m ensemble-only method had an overall accuracy of 89.5% (±1.7%), whereas the ensemble method combined with the random forest classifier performed better, with a ~4.8% higher overall accuracy: 20 m method (94.3% (±1.3%)) with optimal spectral index and SWIR thresholds of −0.03 and 800, respectively, and 10 m method (93.4% (±1.5%)) with optimal spectral index and NIR thresholds of −0.01 and 800, respectively. Comparison of other supervised classifiers trained automatically with the framework typically resulted in less than 1% accuracy improvement compared with the random forest, suggesting that training data quality is more important than classifier type. This straightforward framework enables accurate surface water classification across diverse geographies, making it ideal for development into a decision support tool for water resource managers.
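The thresholding and sampling idea in the 10 m method can be illustrated with a minimal sketch. The abstract gives only the threshold values (−0.01 for the index, 800 for NIR), so the index formula used here (an NDWI-style green/NIR ratio), the direction of each comparison, and the function names are assumptions for illustration. The sampling is balanced by class only, which is simpler than the spectrally balanced sampling the study describes.

```python
import random


def is_water(green, nir, index_threshold=-0.01, nir_threshold=800):
    """Flag one pixel as candidate water.

    Assumptions (not stated in the abstract): the index is
    (green - nir) / (green + nir), and water means the index is above
    its threshold while NIR reflectance is below its threshold.
    """
    total = green + nir
    if total <= 0:
        # Treat empty (no-data) pixels as non-water.
        return False
    index = (green - nir) / total
    return index > index_threshold and nir < nir_threshold


def balanced_training_sample(pixels, n_per_class, seed=0):
    """Draw equally many random water and non-water pixel positions.

    pixels is a list of (green, nir) reflectance pairs.
    """
    rng = random.Random(seed)
    water = [i for i, (g, n) in enumerate(pixels) if is_water(g, n)]
    land = [i for i, (g, n) in enumerate(pixels) if not is_water(g, n)]
    n = min(n_per_class, len(water), len(land))
    return rng.sample(water, n), rng.sample(land, n)
```

The sampled positions would then be used to pull band values and labels for fitting a classifier such as a random forest.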

2018 ◽  
Vol 10 (10) ◽  
pp. 1642 ◽  
Author(s):  
Kristof Van Tricht ◽  
Anne Gobin ◽  
Sven Gilliams ◽  
Isabelle Piccard

A timely inventory of agricultural areas and crop types is an essential requirement for ensuring global food security and enabling early crop monitoring. Satellite remote sensing has proven to be an increasingly reliable tool to identify crop types. With the Copernicus program and its Sentinel satellites, a growing source of satellite remote sensing data is publicly available at no charge. Here, we used joint Sentinel-1 radar and Sentinel-2 optical imagery to create a crop map for Belgium. To ensure homogeneous radar and optical inputs across the country, Sentinel-1 12-day backscatter mosaics were created after incidence angle normalization, and Sentinel-2 normalized difference vegetation index (NDVI) images were smoothed to yield 10-daily cloud-free mosaics. An optimized random forest classifier predicted the eight crop types with a maximum accuracy of 82% and a kappa coefficient of 0.77. We found that a combination of radar and optical imagery always outperformed a classification based on single-sensor inputs, and that classification performance increased throughout the season until July, when differences between crop types were largest. Furthermore, we showed that the concept of classification confidence derived from the random forest classifier provided insight into the reliability of the predicted class for each pixel, clearly showing that parcel borders have a lower classification confidence. We concluded that the synergistic use of radar and optical data for crop classification provided richer information and higher classification accuracies than optical-only classification. Further work should focus on object-level classification and crop monitoring to exploit the rich potential of combined radar and optical observations.
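One common way to derive a per-pixel classification confidence from a random forest is the share of trees that vote for the winning class. The abstract does not say how the authors computed confidence, so the sketch below is an assumption for illustration, and the function name is hypothetical.

```python
from collections import Counter


def predict_with_confidence(tree_votes):
    """Return the majority class and the fraction of trees that chose it.

    tree_votes holds one predicted class label per tree for a single pixel.
    Assumption: confidence is the winning vote share, which may differ from
    the definition used in the paper.
    """
    counts = Counter(tree_votes)
    label, n_votes = counts.most_common(1)[0]
    return label, n_votes / len(tree_votes)
```

Under this definition, a pixel on a parcel border with mixed spectral signal would tend to split the votes and so receive a lower confidence.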


Author(s):  
S. Kuny ◽  
H. Hammer ◽  
K. Schulz

Abstract. Urban areas struck by disasters such as earthquakes are in need of a fast damage detection assessment. A post-event SAR image often is the first available image, most likely with no matching pre-event image to perform change detection. In previous work we have introduced a debris detection algorithm for this scenario that is trained exclusively with synthetically generated training data. A classification step is employed to separate debris from similar textures such as vegetation. In order to verify the use of a random forest classifier for this context, we conduct a performance comparison with two alternative popular classifiers, a support vector machine and a convolutional neural network. With the direct comparison revealing the random forest classifier to be best suited, the effective performance on the prospect of debris detection is investigated for the post-earthquake Christchurch scene. Results show a good separation of debris from vegetation and gravel, thus reducing the false alarm rate in the damage detection operation considerably.


Author(s):  
Aqilah Aini Zahra ◽  
Widyawan Widyawan ◽  
Silmi Fauziati

A Twitter bot is a Twitter account programmed to perform social activities automatically, sending tweets through a scheduling program. Some bots disseminate useful information, such as earthquake and weather information. Many others, however, have a negative influence: they broadcast false news or spam, or follow accounts to increase their popularity. Such bots can change public sentiment about an issue, decrease user confidence, or even change the social order. An application is therefore needed to distinguish bot accounts from non-bot accounts. This paper develops a bot detection system using machine learning for multiclass classification, with four classes: human, informative, spammer, and fake follower. The model was trained with supervised methods on labeled training data. First, a dataset of 2,333 accounts was pre-processed to obtain 28 features for classification. These features came from analysis of user profiles, temporal analysis, and analysis of tweets with numeric values. The data was then partitioned and normalized with scaling, and a random forest classifier was fitted to it. Next, the features were reduced to the 17 that gave the highest model accuracy. In the evaluation stage, the bot detection model achieved an accuracy of 96.79%, 97% precision, 96% recall, and an F1 score of 96%. The detection model was therefore judged to have high accuracy. The completed model was then implemented in a website and deployed to the cloud, so that the public can access this machine learning-based web application to detect Twitter bots.
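The evaluation metrics reported above can be computed per class from true and predicted labels. The sketch below shows the standard definitions. The abstract does not state how the per-class values were averaged into single figures, so the helper only returns per-class scores, and the class names in the usage note are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of accounts whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_scores(y_true, y_pred):
    """Return {class: (precision, recall, f1)} for a multiclass problem."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[c] = (precision, recall, f1)
    return scores
```

For example, with true labels `["human", "human", "spammer", "fake_follower"]` and predictions `["human", "spammer", "spammer", "fake_follower"]`, the accuracy is 0.75 and the "human" class has precision 1.0 and recall 0.5.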


Forests ◽  
2020 ◽  
Vol 11 (9) ◽  
pp. 941
Author(s):  
Adam Waśniewski ◽  
Agata Hościło ◽  
Bogdan Zagajewski ◽  
Dieudonné Moukétou-Tarazewicz

This study assesses the potential of Sentinel-2 satellite images and the Random Forest classifier for mapping forest cover and forest types in northwest Gabon. The main goal was to investigate the impact of various spectral bands collected by the Sentinel-2 satellite, normalized difference vegetation index (NDVI) and digital elevation model (DEM), and their combination on the accuracy of the classification of forest cover and forest type. Within the study area, five classes of forest type were delineated: semi-evergreen moist forest, lowland forest, freshwater swamp forest, mangroves, and disturbed natural forest. The classification was performed using the Random Forest (RF) classifier. The overall accuracy for the forest cover ranged between 92.6% and 98.5%, whereas for forest type, the accuracy ranged from 83.4% to 97.4%. The highest accuracies for the forest cover and forest type classifications were obtained using a combination of spectral bands at spatial resolutions of 10 m and 20 m and DEM. In both cases, the use of the NDVI did not increase the classification accuracy. The DEM was shown to be the most important variable in distinguishing the forest type. Among the Sentinel-2 spectral bands, the red-edge followed by the SWIR contributed the most to the accuracy of the forest type classification. Additionally, the Random Forest model for forest cover classification was successfully transferred from one master image to other images. In contrast, the transferability of the forest type model was more complex, because of the heterogeneity of the forest type and environmental conditions across the study area.


Land ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 420
Author(s):  
Jiří Šandera ◽  
Přemysl Štych

Permanent grassland is one of the monitored categories of land use, land use change, and forestry (LULUCF) within the climate concept and greenhouse gas reduction policy (Regulation (EU) 2018/841). Mapping the conditions and changes of permanent grasslands is thus very important. The area of permanent grassland is strongly influenced by agricultural subsidy policies. Over the course of history, it is possible to trace different shares of permanent grassland within agricultural land and areas with significant changes from grassland to arable land. The need for monitoring permanent grassland and arable land has been growing in recent years. New subsidy policies determining farm management are beginning to affect land use, especially in countries that have joined the EU in recent waves. The large amount of freely available satellite data enables this monitoring to take place, mainly owing to data products of the Copernicus program. There are a large number of parameters (predictors) that can be calculated from satellite data, but finding the right combination is very difficult. This study presents a methodical, systematic procedure using the random forest classifier and its internal metric of mean decrease in accuracy (MDA) to select the most suitable predictors to detect changes from permanent grassland to arable land. The relevance of suitable predictors takes into account the date of the satellite image, the overall accuracy of change detection, and the time required for calculations. Biological predictors, such as leaf area index (LAI), fraction of absorbed photosynthetically active radiation (FAPAR), normalized difference vegetation index (NDVI), etc. were tested in the form of a time series from the Sentinel-2 satellite, and the most suitable ones were selected. FAPAR, canopy water content (CWC), and LAI seemed to be the most suitable. The proposed change detection procedure achieved a very high accuracy of more than 95% within the study site with an area of 8766 km².
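The idea behind MDA can be sketched as permutation importance: shuffle one predictor at a time and record how much the accuracy drops. A random forest computes this internally on out-of-bag samples for each tree. The version below is a simplified, model-agnostic stand-in that works with any `predict` callable, and its function names are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows for which predict(row) equals the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def mean_decrease_accuracy(predict, rows, labels, n_repeats=10, seed=0):
    """Average accuracy drop after shuffling each predictor column.

    Simplification: uses one fixed evaluation set instead of the
    per-tree out-of-bag samples a random forest would use.
    """
    rng = random.Random(seed)
    rows = [list(r) for r in rows]
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A predictor that the model never uses gets an importance of exactly zero, because shuffling it cannot change any prediction. Predictors can then be ranked by importance to choose a subset.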


2020 ◽  
Vol 12 (20) ◽  
pp. 3428
Author(s):  
Cidália C. Fonte ◽  
Joaquim Patriarca ◽  
Ismael Jesus ◽  
Diogo Duarte

This paper tests an automated methodology for generating training data from OpenStreetMap (OSM) to classify Sentinel-2 imagery into Land Use/Land Cover (LULC) classes. Different sets of training data were generated and used as inputs for the image classification. Firstly, OSM data was converted into LULC maps using the OSM2LULC_4T software package. The Random Forest classifier was then trained to classify a time-series of Sentinel-2 imagery into 8 LULC classes with samples extracted from: (1) the TD0 dataset, consisting of the LULC maps produced by OSM2LULC_4T; (2) the TD1 dataset, obtained after removing mixed pixels from TD0; (3) the TD2 dataset, obtained by filtering TD1 using radiometric indices. The classification results were generalized using a majority filter and hybrid maps were created by merging the classification results with the OSM2LULC outputs. The accuracy of all generated maps was assessed using the 2018 official “Carta de Ocupação do Solo” (COS). The methodology was applied to two study areas with different characteristics. The results show that in some cases the filtering procedures improve the training data and the classification results. This automated methodology allowed the production of maps with overall accuracy between 55% and 78% greater than that of COS, even though the nomenclature used includes classes that can be easily confused by the classifiers.
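The generalization step can be illustrated with a simple majority filter, which replaces each pixel's class with the most frequent class in its neighborhood. The abstract does not give the window size or tie handling, so the 3×3 window and the rule of keeping the original class on ties are assumptions here.

```python
from collections import Counter


def majority_filter(grid, radius=1):
    """Smooth a classified raster given as a list of lists of class labels.

    Assumptions: a square window of (2 * radius + 1) pixels per side,
    clipped at the raster edges. A pixel keeps its class unless another
    class is strictly more frequent in the window.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(n_rows):
        for c in range(n_cols):
            window = [
                grid[i][j]
                for i in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for j in range(max(0, c - radius), min(n_cols, c + radius + 1))
            ]
            counts = Counter(window)
            top_class, top_count = counts.most_common(1)[0]
            if counts[grid[r][c]] < top_count:
                out[r][c] = top_class
    return out
```

Applied to a 3×3 raster of class 1 with a single class 2 pixel in the center, the filter relabels the center pixel as class 1.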


Author(s):  
E. Roteta ◽  
P. Oliva

Abstract. Due to the high variability of biomes throughout Chile, the classification of burned areas is a challenge. We calibrated a random forest classifier to account for all this variability and ensure an accurate classification of burned areas. The classifier was optimized in three steps, generating a version of the burned area (BA) product in each step. According to the visual assessment, the final version of the BA product is more accurate than the perimeters created by the Chilean National Forest Corporation, which overestimate large burned areas because they do not account for the inner unburned areas, and which omit some small burned areas. The total burned surface from January to March 2017 was 5,000 km² in Chile, with 20% of it belonging to a single burned area in the Maule Region and 91% of it distributed across 6 adjacent regions of Central Chile.

