Multi-temporal Crop Type and Field Boundary Classification with Google Earth Engine

Author(s):  
Michael Marszalek ◽  
Maximilian Lösch ◽  
Marco Körner ◽  
Urs Schmidhalter

Crop type and field boundary mapping enable cost-efficient crop management at the field scale and serve as the basis for yield forecasts. Our study uses a data set with crop types and corresponding field borders from the federal state of Bavaria, Germany, as documented by farmers from 2016 to 2018. The study classified corn, winter wheat, barley, sugar beet, potato, and rapeseed as the main crops grown in Upper Bavaria. Corresponding Sentinel-2 data sets include the normalised difference vegetation index (NDVI) and raw band data from 2016 to 2018 for each selected field. The influences of clouds, raw bands, and NDVI on crop type classification are analysed, and the classification algorithms, i.e., support vector machine (SVM) and random forest (RF), are compared. Field boundaries are detected and extracted using non-iterative clustering and a newly developed procedure based on Canny edge detection. The results emphasise the application of Sentinel's raw bands (B1–B12) and RF, which outperforms SVM with an accuracy of up to 94%. Furthermore, we classify crops for a year not included in training, which slightly reduces the classification accuracy. The results demonstrate the usefulness of the proof of concept and its readiness for use in real applications.
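As an illustration of the band-based comparison, a minimal sketch (not the authors' pipeline) is given below, assuming per-field Sentinel-2 band means have already been exported from Google Earth Engine to a hypothetical CSV:

```python
# Sketch: random forest vs. SVM on per-field Sentinel-2 band features.
# Assumes a hypothetical CSV "fields_bavaria.csv" with columns B1..B12 and "crop".
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("fields_bavaria.csv")          # hypothetical export from GEE
bands = [f"B{i}" for i in range(1, 13)]         # raw bands B1-B12 as predictors
X_train, X_test, y_train, y_test = train_test_split(
    df[bands], df["crop"], test_size=0.3, random_state=0, stratify=df["crop"])

for name, clf in [("RF", RandomForestClassifier(n_estimators=500, random_state=0)),
                  ("SVM", SVC(kernel="rbf", C=10, gamma="scale"))]:
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))
```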

2020 ◽  
Vol 12 (18) ◽  
pp. 3038
Author(s):  
Dhahi Al-Shammari ◽  
Ignacio Fuentes ◽  
Brett M. Whelan ◽  
Patrick Filippi ◽  
Thomas F. A. Bishop

A phenology-based crop type mapping approach was carried out to map cotton fields throughout the cotton-growing areas of eastern Australia. The workflow was implemented in the Google Earth Engine (GEE) platform, as it is time efficient and does not require processing in multiple platforms to complete the classification steps. A time series of Normalised Difference Vegetation Index (NDVI) imagery was generated from Landsat 8 Surface Reflectance Tier 1 (L8SR) and processed using Fourier transformation. This was used to produce the harmonised NDVI (H-NDVI) from the original NDVI, and then phase and amplitude values were generated from the H-NDVI to visualise active cotton in the targeted fields. Random Forest (RF) models were built to classify cotton at early, mid and late growth stages to assess the ability of the model to classify cotton as the season progresses, with phase, amplitude and other individual bands as predictors. Results obtained from leave-one-season-out cross-validation (LOSOCV) indicated that Overall Accuracy (OA), Kappa, Producer's Accuracy (PA) and User's Accuracy (UA) increased significantly when amplitude and phase were added as predictor variables to the model, compared with predictions using H-NDVI or raw bands only. Commission and omission errors were reduced significantly as the season progressed and more in-season imagery became available. The methodology proposed in this study can map cotton crops accurately based on the reconstruction of the unique cotton reflectance trajectory through time. This study confirms the importance of phenological metrics in improving in-season cotton field mapping across eastern Australia. This model can be used in conjunction with other datasets to forecast yield based on the mapped crop type, for improved decision making related to supply chain logistics and seasonal outlooks for production.
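A minimal sketch of the harmonic-regression step with the Earth Engine Python API is shown below; the collection and band names follow the public Landsat 8 Collection 2 surface-reflectance catalogue, while the area of interest, dates, and the omission of the reflectance scaling factors are simplifications:

```python
# Sketch: fit constant + cos + sin terms to an NDVI time series, then derive
# phase and amplitude per pixel. AOI and dates are placeholders.
import math
import ee
ee.Initialize()

aoi = ee.Geometry.Point([148.6, -29.5])          # hypothetical cotton-area point
col = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
       .filterBounds(aoi).filterDate("2020-10-01", "2021-05-01"))

def add_terms(img):
    ndvi = img.normalizedDifference(["SR_B5", "SR_B4"]).rename("NDVI")
    t = img.date().difference(ee.Date("2020-01-01"), "year")
    w = ee.Number(t).multiply(2 * math.pi)
    return (img.addBands(ndvi)
               .addBands(ee.Image.constant(1).rename("constant"))
               .addBands(ee.Image.constant(w.cos()).float().rename("cos"))
               .addBands(ee.Image.constant(w.sin()).float().rename("sin")))

fit = (col.map(add_terms)
          .select(["constant", "cos", "sin", "NDVI"])
          .reduce(ee.Reducer.linearRegression(numX=3, numY=1)))
coeffs = (fit.select("coefficients").arrayProject([0])
             .arrayFlatten([["constant", "cos", "sin"]]))
amplitude = coeffs.select("cos").hypot(coeffs.select("sin"))
phase = coeffs.select("sin").atan2(coeffs.select("cos"))
```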


2021 ◽  
Author(s):  
Simon Ramsey ◽  
Suzanne Mavoa

Google Earth Engine provides researchers with a platform for conducting planetary-scale analysis of environmental processes and land-cover change, both by providing the necessary tools and by handling the large quantities of data these analyses require. The most widely used moderate-resolution sensors, onboard the Landsat satellite platforms, often require pre-processing to prepare the data for analysis. This data set consists of Australia-wide Landsat-derived Normalised Difference Vegetation Index (NDVI) values for the years 2001-2019. The median annual NDVI value for each Statistical Area 1 (SA1) and Statistical Area 2 (SA2) was calculated, and statistics for this imagery are provided in a tabular format. The accompanying Google Earth Engine script applies the pre-processing steps required to account for sensor, solar and atmospheric effects, improving continuity between imagery across space and time, and will therefore enable researchers beyond the remote sensing community to access analysis-ready imagery from the moderate-resolution Landsat and Sentinel-2 satellite platforms.
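A sketch of the general pattern, assuming a hypothetical SA2 polygon asset and leaving out the full sensor-harmonisation steps described above:

```python
# Sketch: mask clouds/shadows with QA_PIXEL, build an annual median NDVI
# composite, and summarise it over placeholder SA2 polygons.
import ee
ee.Initialize()

def mask_and_ndvi(img):
    qa = img.select("QA_PIXEL")
    clear = (qa.bitwiseAnd(1 << 3).eq(0)        # bit 3: cloud
               .And(qa.bitwiseAnd(1 << 4).eq(0)))  # bit 4: cloud shadow
    return img.updateMask(clear).normalizedDifference(["SR_B5", "SR_B4"]).rename("NDVI")

sa2 = ee.FeatureCollection("users/example/sa2")   # placeholder asset ID
annual = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
          .filterDate("2019-01-01", "2020-01-01")
          .map(mask_and_ndvi)
          .median())

stats = annual.reduceRegions(collection=sa2, reducer=ee.Reducer.median(), scale=30)
```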


2021 ◽  
Vol 87 (6) ◽  
pp. 445-455
Author(s):  
Yi Ma ◽  
Zezhong Zheng ◽  
Yutang Ma ◽  
Mingcang Zhu ◽  
Ran Huang ◽  
...  

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix of size N×N, where N is the number of data points. Thus, the memory complexity of the analysis is no less than O(N²). We present in this article an incremental manifold learning approach to handle large hyperspectral data sets for land use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained with the training data set. A local curvature variation algorithm is utilized to sample a subset of data points as landmarks. Then a manifold skeleton is identified based on the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k-nearest-neighbor classifier and achieving the second-best performance with a support vector machine.
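The incremental, curvature-based landmark sampling is specific to this work; the following generic sketch only illustrates the landmark idea, substituting random landmark selection and Isomap for the authors' method:

```python
# Generic landmark sketch (NOT the authors' curvature-based method): embed a
# subset of landmark pixels with Isomap, then classify with k-NN in that space.
import numpy as np
from sklearn.manifold import Isomap
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 200))      # placeholder for N x D hyperspectral pixels
y = rng.integers(0, 5, size=5000)     # placeholder class labels

landmarks = rng.choice(len(X), size=500, replace=False)   # stand-in for curvature sampling
embed = Isomap(n_neighbors=10, n_components=10).fit(X[landmarks])

Z = embed.transform(X)                # project all pixels onto the landmark embedding
clf = KNeighborsClassifier(n_neighbors=5).fit(Z[landmarks], y[landmarks])
pred = clf.predict(Z)
```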


2020 ◽  
Vol 12 (19) ◽  
pp. 3170
Author(s):  
Zemeng Fan ◽  
Saibo Li ◽  
Haiyan Fang

Explicitly identifying desertification changes and their causes has been a hot issue for eco-environmental sustainable development in the China–Mongolia–Russia Economic Corridor (CMREC) area. In this paper, the desertification change patterns between 2000 and 2015 were identified by applying the classification and regression tree (CART) method to multisource remote sensing datasets on Google Earth Engine (GEE), which achieved a higher overall accuracy (85%) than three other methods, namely support vector machine (SVM), random forest (RF) and Albedo-normalised difference vegetation index (NDVI) models. A contribution index of climate change and human activities on desertification was introduced to quantitatively explicate the driving mechanisms of desertification change based on the temporal datasets and net primary productivity (NPP). The results show that the area of slight desertification land increased from 719,700 km2 to 948,000 km2 between 2000 and 2015. The area of severe desertification land decreased from 82,400 km2 to 71,200 km2. The area of desertification increased by 9.68%, of which 69.68% was mainly caused by human activities. Climate change and human activities accounted for 68.8% and 27.36%, respectively, of the area of desertification restoration. In general, the degree of desertification showed a decreasing trend, and climate change was the major driving factor in the CMREC area between 2000 and 2015.
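For the CART step alone, a scikit-learn stand-in for the GEE classifier might look like the following sketch, with a hypothetical training table and assumed predictor names:

```python
# Sketch of the CART classification step only (not the published GEE workflow);
# the training table "samples.csv" and its feature columns are hypothetical.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

samples = pd.read_csv("samples.csv")                 # assumed columns: predictors + "class"
features = ["ndvi", "albedo", "tgsi", "lst"]         # assumed predictor names
cart = DecisionTreeClassifier(max_depth=10, min_samples_leaf=20, random_state=0)
print(cross_val_score(cart, samples[features], samples["class"], cv=5).mean())
```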


2019 ◽  
Vol 47 (3) ◽  
pp. 154-170
Author(s):  
Janani Balakumar ◽  
S. Vijayarani Mohan

Purpose Owing to the huge volume of documents available on the internet, text classification becomes a necessary task to handle these documents. To achieve optimal text classification results, feature selection, an important stage, is used to curtail the dimensionality of text documents by choosing suitable features. The main purpose of this research work is to classify personal computer documents based on their content. Design/methodology/approach This paper proposes a new algorithm for feature selection based on an artificial bee colony (ABCFS) to enhance text classification accuracy. The proposed algorithm (ABCFS) is scrutinized on real and benchmark data sets and compared against other existing feature selection approaches such as information gain and the χ2 statistic. To justify the efficiency of the proposed algorithm, the support vector machine (SVM) and an improved SVM classifier are used in this paper. Findings The experiment was conducted on real and benchmark data sets. The real data set was collected in the form of documents stored on a personal computer, and the benchmark data set was collected from the Reuters and 20 Newsgroups corpora. The results prove the performance of the proposed feature selection algorithm by enhancing text document classification accuracy. Originality/value This paper proposes a new ABCFS algorithm for feature selection, evaluates the efficiency of the ABCFS algorithm and improves the support vector machine. In this paper, the ABCFS algorithm is used to select features from text (unstructured) documents, whereas existing work has applied such feature selection only to structured data.
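The ABCFS algorithm itself is not reproduced here; the following simplified sketch only illustrates a bee-colony-style wrapper, collapsing the employed, onlooker and scout phases into a single greedy neighbourhood search with SVM cross-validation as the fitness function:

```python
# Simplified bee-colony-style wrapper selection (not the paper's ABCFS):
# binary feature masks are the "food sources", SVM accuracy is the fitness,
# and a neighbour flips one feature in or out of the mask.
import numpy as np
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

data = fetch_20newsgroups(subset="train", categories=["sci.space", "rec.autos"])
X = TfidfVectorizer(max_features=200).fit_transform(data.data)
y = data.target
rng = np.random.default_rng(0)

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(LinearSVC(), X[:, mask], y, cv=3).mean()

n_sources, n_feats = 10, X.shape[1]
sources = rng.random((n_sources, n_feats)) < 0.5   # random initial feature subsets
scores = np.array([fitness(m) for m in sources])

for _ in range(20):
    for i in range(n_sources):
        cand = sources[i].copy()
        j = rng.integers(n_feats)
        cand[j] = ~cand[j]                # flip one feature in or out
        s = fitness(cand)
        if s > scores[i]:                 # greedy replacement of the food source
            sources[i], scores[i] = cand, s

best = sources[scores.argmax()]
print("selected features:", best.sum(), "accuracy:", scores.max())
```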


Author(s):  
Hongguang Pan ◽  
Tao Su ◽  
Xiangdong Huang ◽  
Zheng Wang

To address the problems of high cost, complicated process and low accuracy of oxygen content measurement in the flue gas of coal-fired power plants, a method based on a long short-term memory (LSTM) network is proposed in this paper to replace the oxygen sensor for estimating the oxygen content in boiler flue gas. Specifically, first, the LSTM model was built with the Keras deep learning framework, and the accuracy of the model was further improved by selecting appropriate hyperparameters through experiments. Second, the flue gas oxygen content, as the leading variable, was combined with primary auxiliary variables drawn from the process mechanism and boiler operation. Based on actual production data collected from a coal-fired power plant in Yulin, China, the data sets were preprocessed. Moreover, a selection model for auxiliary variables based on grey relational analysis is proposed to construct a new data set and divide it into training and testing sets. Finally, this model is compared with traditional soft-sensing modelling methods (i.e. methods based on the support vector machine and the BP neural network). The RMSE of the LSTM model is 4.51% lower than that of the GA-SVM model and 3.55% lower than that of the PSO-BP model. The conclusions show that the oxygen content model based on LSTM has better generalisation and certain industrial value.
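A minimal Keras sketch of such an LSTM soft sensor is given below; the architecture, hyperparameters and placeholder data are illustrative rather than the paper's tuned configuration:

```python
# Sketch of an LSTM soft sensor for regression; X has shape
# (samples, timesteps, n_auxiliary_variables), y is the flue-gas oxygen content.
import numpy as np
from tensorflow import keras

timesteps, n_vars = 10, 6
X = np.random.rand(1000, timesteps, n_vars).astype("float32")   # placeholder data
y = np.random.rand(1000, 1).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(timesteps, n_vars)),
    keras.layers.LSTM(64),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse",
              metrics=[keras.metrics.RootMeanSquaredError()])
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2, verbose=0)
```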


Sensors ◽  
2020 ◽  
Vol 20 (4) ◽  
pp. 1065 ◽  
Author(s):  
Huihui Zhang ◽  
Wenqing Shao ◽  
Shanshan Qiu ◽  
Jun Wang ◽  
Zhenbo Wei

Aroma and taste are the most important attributes of alcoholic beverages. In this study, a self-developed electronic tongue (e-tongue) and electronic nose (e-nose) were used for evaluating the marked ages of rice wines. Six types of feature data sets (e-tongue data set, e-nose data set, direct-fusion data set, weighted-fusion data set, optimized direct-fusion data set, and optimized weighted-fusion data set) were used for identifying rice wines with different wine ages. Pearson coefficient analysis and variance inflation factor (VIF) analysis were used to optimize the fusion matrices by removing multicollinear information. Two discrimination methods (principal component analysis (PCA) and locality preserving projections (LPP)) were used for classifying the rice wines, and LPP performed better than PCA in the discrimination work. The best result was obtained by LPP based on the weighted-fusion data set, and all the samples could be classified clearly in the LPP plot. Therefore, the weighted-fusion data were used as independent variables of partial least squares regression, an extreme learning machine, and support vector machines (LIBSVM) for evaluating wine ages. All the methods performed well with good prediction results, and LIBSVM presented the best correlation coefficient (R2 ≥ 0.9998).
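A rough sketch of the fusion-and-pruning idea is given below, using placeholder sensor matrices and an RBF support vector regressor in place of the full PLSR/ELM/LIBSVM comparison:

```python
# Sketch: concatenate e-nose and e-tongue features, drop highly collinear
# columns by variance inflation factor, then regress wine age with an SVM.
import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
e_nose, e_tongue = rng.random((60, 10)), rng.random((60, 7))   # placeholder sensor data
age = rng.choice([1, 3, 5], size=60).astype(float)             # placeholder wine ages

fused = np.hstack([e_nose, e_tongue])                          # direct fusion
keep = [i for i in range(fused.shape[1])
        if variance_inflation_factor(fused, i) < 10]           # prune collinear columns
print(cross_val_score(SVR(kernel="rbf", C=10), fused[:, keep], age, cv=5).mean())
```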


2011 ◽  
Vol 219-220 ◽  
pp. 151-155 ◽  
Author(s):  
Hua Ji ◽  
Hua Xiang Zhang

In many real-world domains, learning from imbalanced data sets is a common challenge. Since the skewed class distribution challenges traditional classifiers, which achieve much lower accuracy on rare classes, we propose a novel classification method with local clustering based on the data distribution of the imbalanced data set. First, we divide the whole data set into several data groups based on the data distribution. Then we perform local clustering within each group, both on the normal class and on the disjointed rare class. For the rare class, over-sampling is then applied at different rates. Finally, we apply support vector machines (SVMs) for classification, using the traditional cost-matrix tactic to enhance classification accuracy. The experimental results on several UCI data sets show that this method produces much higher prediction accuracies on the rare class than state-of-the-art methods.
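A sketch of the cost-sensitive SVM step alone (the group-wise local clustering is omitted), where random duplication stands in for the rate-dependent over-sampling and class_weight plays the role of the cost matrix:

```python
# Sketch: over-sample the rare class by random duplication and train a
# cost-sensitive SVM; not the paper's local-clustering scheme.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rare = np.where(y_tr == 1)[0]
dup = np.random.default_rng(0).choice(rare, size=5 * len(rare))   # over-sample rare class
X_bal, y_bal = np.vstack([X_tr, X_tr[dup]]), np.hstack([y_tr, y_tr[dup]])

svm = SVC(kernel="rbf", class_weight={0: 1, 1: 5}).fit(X_bal, y_bal)  # higher cost on rare class
print(classification_report(y_te, svm.predict(X_te)))
```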


2017 ◽  
Vol 26 (2) ◽  
pp. 323-334 ◽  
Author(s):  
Piyabute Fuangkhon

Multiclass contour-preserving classification (MCOV) has been used to preserve the contour of a data set and improve the classification accuracy of a feed-forward neural network. It synthesizes two types of new instances, called fundamental multiclass outpost vectors (FMCOVs) and additional multiclass outpost vectors (AMCOVs), in the middle of the decision boundary between consecutive classes of data. This paper compares the generalization obtained by including FMCOVs, AMCOVs, and both MCOVs in the final training sets with a support vector machine (SVM). The experiments were carried out using MATLAB R2015a and LIBSVM v3.20 on seven types of final training sets generated from each of the synthetic and real-world data sets from the University of California Irvine machine learning repository and the ELENA project. The experimental results confirm that including FMCOVs in final training sets containing raw data can improve the SVM classification accuracy significantly.
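The FMCOV/AMCOV construction is not reproduced here; the sketch below only illustrates the evaluation pattern, with midpoints of nearest opposite-class pairs as a crude stand-in for boundary-synthesised instances:

```python
# Rough stand-in for boundary-synthesised instances (NOT the FMCOV/AMCOV
# construction): add midpoints of nearest opposite-class pairs to the training
# set and compare SVM accuracy with and without them.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors

X, y = make_moons(n_samples=600, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

a, b = X_tr[y_tr == 0], X_tr[y_tr == 1]
_, idx = NearestNeighbors(n_neighbors=1).fit(b).kneighbors(a)
mid = (a + b[idx[:, 0]]) / 2                          # crude near-boundary instances
X_aug = np.vstack([X_tr, mid])
y_aug = np.hstack([y_tr, np.zeros(len(mid), dtype=int)])   # labelled with the class they extend

for name, (Xf, yf) in [("raw", (X_tr, y_tr)), ("augmented", (X_aug, y_aug))]:
    print(name, SVC(kernel="rbf").fit(Xf, yf).score(X_te, y_te))
```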


Author(s):  
Ricardo Oliveira ◽  
Rafael Moreno

Federal, state and local government agencies in the USA are investing heavily in the dissemination of Open Data sets produced by each of them. The main driver behind this thrust is to increase agencies' transparency and accountability, as well as to improve citizens' awareness. However, not all Open Data sets are easy to access and integrate with other Open Data sets, even those available from the same agency. The City and County of Denver Open Data Portal distributes several types of geospatial datasets, one of which is the city parcel information containing 224,256 records. Although this data layer contains many pieces of information, it is incomplete for some custom purposes. Open-source software was used to first collect data from diverse City of Denver Open Data sets, then upload them to a repository in the cloud, where they were processed using a PostgreSQL installation on the cloud and Python scripts. Our method was able to extract non-spatial information from a 'not-ready-to-download' source that could then be combined with the initial data set to enhance its potential use.
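A sketch of the collect-and-load pattern follows; the endpoint URL, table name and connection string are placeholders rather than the actual Denver portal resources:

```python
# Sketch: download an open-data CSV export and load it into a cloud PostgreSQL
# database for further processing; all identifiers below are hypothetical.
import io
import requests
import pandas as pd
from sqlalchemy import create_engine

CSV_URL = "https://example.org/open-data/parcels.csv"     # hypothetical export link
resp = requests.get(CSV_URL, timeout=60)
resp.raise_for_status()
parcels = pd.read_csv(io.StringIO(resp.text))

engine = create_engine("postgresql+psycopg2://user:pass@cloud-host:5432/opendata")
parcels.to_sql("parcels_raw", engine, if_exists="replace", index=False)
```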

