EXPLORING MACHINE LEARNING CLASSIFICATION ALGORITHMS FOR CROP CLASSIFICATION USING SENTINEL 2 DATA

Author(s):  
◽  
S. S. Ray

Abstract. Crop classification and recognition is an important application of remote sensing. In the last few years, machine learning classification techniques have emerged for crop classification. Google Earth Engine (GEE) is a platform for exploring multiple satellite data sets with advanced classification techniques without downloading the data. The main objective of this study is to explore the ability of different machine learning classification techniques, namely Random Forest (RF), Classification And Regression Trees (CART) and Support Vector Machine (SVM), for crop classification. High-resolution optical data from the Sentinel-2 MSI (10 m) were used for crop classification on the Indian Agricultural Research Institute (IARI) farm for the major crops of the Rabi season 2016. Around 100 crop fields (~400 hectares) in IARI were analysed. Smartphone-based ground truth data were collected. The best cloud-free Sentinel-2 MSI image (5 Feb 2016), selected in GEE by automatic filtering on the percentage cloud cover property, was used for classification. Polygons in feature space, based on the ground truth data, were used as training data sets for crop classification with the machine learning techniques. After classification, accuracy was assessed through the confusion matrix (producer and user accuracy), kappa coefficient and F value. The study found that, with GEE as a cloud platform, satellite data access, filtering and pre-processing could be done very efficiently. In terms of overall classification accuracy and kappa coefficient, Random Forest (93.3%, 0.9178) and CART (73.4%, 0.6755) classifiers performed better than SVM (74.3%, 0.6867) classifier. For validation, data from the Field Operation Service Unit (FOSU) division of IARI were used, and encouraging results were obtained.
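The accuracy measures named in this abstract (overall accuracy, producer and user accuracy, kappa coefficient) can all be derived from one confusion matrix. The following is a minimal sketch, not the authors' code: the function name `accuracy_report`, the row/column convention (rows are reference classes, columns are classified classes) and the example counts are assumptions made for illustration.

```python
def accuracy_report(cm):
    """Summarise a square confusion matrix.

    Assumed convention: cm[i][j] is the number of samples whose reference
    (ground truth) class is i and whose classified (map) class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_totals = [sum(row) for row in cm]                            # reference totals
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # classified totals
    correct = sum(cm[k][k] for k in range(n))

    overall = correct / total
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    kappa = (overall - expected) / (1.0 - expected) if expected < 1.0 else float("nan")

    # Producer accuracy: correct / reference total (complement of omission error).
    producer = [cm[k][k] / row_totals[k] if row_totals[k] else float("nan") for k in range(n)]
    # User accuracy: correct / classified total (complement of commission error).
    user = [cm[k][k] / col_totals[k] if col_totals[k] else float("nan") for k in range(n)]

    return {
        "overall_accuracy": overall,
        "kappa": kappa,
        "producer_accuracy": producer,
        "user_accuracy": user,
    }


if __name__ == "__main__":
    # Made-up two-class counts, purely for illustration.
    example = [[45, 5],
               [10, 40]]
    print(accuracy_report(example))
```

Because kappa subtracts the chance agreement implied by the class totals, it is usually lower than overall accuracy, as in the accuracy and kappa pairs reported above.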

2020 ◽  
Vol 12 (3) ◽  
pp. 355 ◽  
Author(s):  
Nam Thang Ha ◽  
Merilyn Manley-Harris ◽  
Tien Dat Pham ◽  
Ian Hawes

Seagrass has been acknowledged as a productive blue carbon ecosystem that is in significant decline across much of the world. A first step toward conservation is the mapping and monitoring of extant seagrass meadows. Several methods are currently in use, but mapping the resource from satellite images using machine learning is not widely applied, despite its successful use in various comparable applications. This research aimed to develop a novel approach for seagrass monitoring using state-of-the-art machine learning with data from Sentinel-2 imagery. We used Tauranga Harbor, New Zealand, as a validation site for which extensive ground truth data are available, to compare ensemble machine learning methods involving random forests (RF), rotation forests (RoF), and canonical correlation forests (CCF) with the more traditional maximum likelihood classifier (MLC) technique. Using a group of validation metrics including F1, precision, recall, accuracy, and the McNemar test, our results indicated that machine learning techniques outperformed the MLC, with RoF as the best performer (F1 scores ranging from 0.75 to 0.91 for sparse and dense seagrass meadows, respectively). Our study is the first comparison of various ensemble-based methods for seagrass mapping of which we are aware, and promises to be an effective approach to enhance the accuracy of seagrass monitoring.
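The validation metrics listed in this abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names, the optional continuity correction and the example counts are assumptions. The McNemar test here compares two classifiers on the same validation samples using only the discordant counts, `b` (first classifier correct, second wrong) and `c` (first wrong, second correct).

```python
import math


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 for one class from its TP, FP and FN counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mcnemar(b, c, continuity=True):
    """McNemar chi-square statistic (1 degree of freedom) and its p-value.

    b, c: discordant counts between two classifiers on the same samples.
    continuity: apply the continuity correction (an assumption of this sketch).
    """
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree
    diff = abs(b - c) - (1 if continuity else 0)
    diff = max(diff, 0)
    stat = diff * diff / (b + c)
    # Survival function of chi-square with 1 d.o.f.: P(Z^2 > stat), Z standard normal.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
    print(mcnemar(b=10, c=5))
```

A small p-value suggests that the two classifiers' error rates differ on the shared validation set, which is the kind of evidence a McNemar test adds on top of a raw difference in F1.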


2021 ◽  
Author(s):  
Michael Tarasiou

This paper presents DeepSatData, a pipeline for automatically generating satellite imagery datasets for training machine learning models. We also discuss design considerations with emphasis on dense classification tasks, e.g. semantic segmentation. The implementation presented makes use of freely available Sentinel-2 data, which allows the generation of the large-scale datasets required for training deep neural networks (DNN). We discuss issues faced from the point of view of DNN training and evaluation, such as checking the quality of ground truth data, and comment on the scalability of the approach.


Author(s):  
Gordana Kaplan ◽  
Ugur Avdan

The benefits of wetlands include, but are not limited to, their ability to store floodwaters and improve water quality, provide habitats for wildlife and support biodiversity, as well as their aesthetic value. Over the past few decades, remote sensing and geographical information technologies have proven to be useful and frequently applied tools for monitoring and mapping wetlands. Combining optical and microwave satellite data can give significant information about the biophysical characteristics of wetlands and wetland vegetation. Also, fusing data from different sensors, such as radar and optical remote sensing data, can increase wetland classification accuracy. In this paper we investigate the ability of fusing two fine spatial resolution satellite data sets, Sentinel-2 and the Synthetic Aperture Radar satellite Sentinel-1, for mapping wetlands. The Balikdami wetland, located in the Anatolian part of Turkey, was selected as the study area. Both Sentinel-1 and Sentinel-2 images require pre-processing before use. After pre-processing, several vegetation indices calculated from the Sentinel-2 bands were included in the data set, and an object-based classification was performed. For the accuracy assessment, a number of random points were added over the study area. In addition, the results were compared with Unmanned Aerial Vehicle (UAV) data collected on the same date as the Sentinel-2 overpass and three days before the Sentinel-1 overpass. The accuracy assessment showed that the results were significant and satisfying for wetland classification using both multispectral and microwave data. The fusion of the optical and radar data gave high wetland mapping accuracy, with an overall classification accuracy of approximately 90% in the object-based classification. Compared with the high-resolution UAV data, the classification results are promising for mapping and monitoring not just wetlands, but also the sub-classes of the study area. For future research, multi-temporal image use and terrain data collection are recommended.
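The abstract above does not say which vegetation indices were calculated. As an assumed example, the sketch below computes NDVI, a widely used index, from the Sentinel-2 near-infrared band (B8) and red band (B4) as (NIR - red) / (NIR + red). The function names and reflectance values are hypothetical, and the bands are plain nested lists rather than real raster data.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index for one pixel.

    nir: near-infrared reflectance (Sentinel-2 band B8)
    red: red reflectance (Sentinel-2 band B4)
    Returns NaN where the denominator is zero (e.g. no-data pixels).
    """
    denom = nir + red
    return (nir - red) / denom if denom != 0 else float("nan")


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two equally sized 2-D grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


if __name__ == "__main__":
    # Made-up reflectance values, purely for illustration.
    nir_band = [[0.50, 0.40], [0.30, 0.00]]
    red_band = [[0.10, 0.10], [0.30, 0.00]]
    for row in ndvi_grid(nir_band, red_band):
        print(row)
```

An index layer like this can be stacked with the original bands as an extra input feature for the classifier.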


2021 ◽  
Vol 14 (6) ◽  
pp. 997-1005
Author(s):  
Sandeep Tata ◽  
Navneet Potti ◽  
James B. Wendt ◽  
Lauro Beltrão Costa ◽  
Marc Najork ◽  
...  

Extracting structured information from templatic documents is an important problem with the potential to automate many real-world business workflows such as payment, procurement, and payroll. The core challenge is that such documents can be laid out in virtually infinitely different ways. A good solution to this problem is one that generalizes well not only to known templates such as invoices from a known vendor, but also to unseen ones. We developed a system called Glean to tackle this problem. Given a target schema for a document type and some labeled documents of that type, Glean uses machine learning to automatically extract structured information from other documents of that type. In this paper, we describe the overall architecture of Glean, and discuss three key data management challenges: 1) managing the quality of ground truth data, 2) generating training data for the machine learning model using labeled documents, and 3) building tools that help a developer rapidly build and improve a model for a given document type. Through empirical studies on a real-world dataset, we show that these data management techniques allow us to train a model that is over 5 F1 points better than the exact same model architecture without the techniques we describe. We argue that for such information-extraction problems, designing abstractions that carefully manage the training data is at least as important as choosing a good model architecture.


1993 ◽  
Author(s):  
Usama M. Fayyad ◽  
Richard J. Doyle ◽  
W. Nick Weir ◽  
Stanislav Djorgovski

2020 ◽  
Vol 12 (23) ◽  
pp. 3958
Author(s):  
Parwati Sofan ◽  
David Bruce ◽  
Eriita Jones ◽  
M. Rokhis Khomarudin ◽  
Orbita Roswintiarti

This study establishes a new technique for peatland fire detection in tropical environments using Landsat-8 and Sentinel-2. The Tropical Peatland Combustion Algorithm (ToPeCAl) without longwave thermal infrared (TIR) (henceforth known as ToPeCAl-2) was tested on Landsat-8 Operational Land Imager (OLI) data and then applied to Sentinel-2 Multi Spectral Instrument (MSI) data. The research is aimed at establishing peatland fire information at higher spatial resolution and more frequent observation than from Landsat-8 data over Indonesia’s peatlands. ToPeCAl-2 applied to Sentinel-2 was assessed by comparing fires detected from the original ToPeCAl applied to Landsat-8 OLI/Thermal Infrared Sensor (TIRS) verified through comparison with ground truth data. An adjustment of ToPeCAl-2 was applied to minimise false positive errors by implementing pre-process masking for water and permanent bright objects and filtering ToPeCAl-2’s resultant detected fires by implementing contextual testing and cloud masking. Both ToPeCAl-2 with contextual test and ToPeCAl with cloud mask applied to Sentinel-2 provided high detection of unambiguous fire pixels (>95%) at 20 m spatial resolution. Smouldering pixels were less likely to be detected by ToPeCAl-2. The detected smouldering pixels from ToPeCAl-2 applied to Sentinel-2 with contextual testing and with cloud masking were only 35% and 56% correct, respectively; this needs further investigation and validation. These results demonstrate that even in the absence of TIR data, an adjusted ToPeCAl algorithm (ToPeCAl-2) can be applied to detect peatland fires at 20 m resolution with high accuracy especially for flaming. Overall, the implementation of ToPeCAl applied to cost-free and available Landsat-8 and Sentinel-2 data enables regular peatland fire monitoring in tropical environments at higher spatial resolution than other satellite-derived fire products.

