Mapping Allochemical Limestone Formations in Hazara, Pakistan Using Google Cloud Architecture: Application of Machine-Learning Algorithms on Multispectral Data

Muhammad Fawad Akbar Khan; Khan Muhammad; Shahid Bashir; Shahab Ud Din; Muhammad Hanif

doi:10.3390/ijgi10020058

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10020058 ◽

2021 ◽

Vol 10 (2) ◽

pp. 58

Author(s):

Muhammad Fawad Akbar Khan ◽

Khan Muhammad ◽

Shahid Bashir ◽

Shahab Ud Din ◽

Muhammad Hanif

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Learning Algorithms ◽

Remote Sensing Data ◽

Kappa Coefficient ◽

Machine Learning Algorithms ◽

Landsat 8 ◽

Sensing Data ◽

Fossiliferous Limestone

Low-resolution Geological Survey of Pakistan (GSP) maps surrounding the region of interest show oolitic limestone occurrences in the Samanasuk formation and fossiliferous limestone occurrences in the Lockhart and Margalla hill formations in the Hazara division, Pakistan. Machine-learning algorithms (MLAs) have rarely been applied to multispectral remote sensing data to differentiate limestone formations produced by different depositional environments, such as oolitic or fossiliferous limestone. Unlike previous studies, which mostly report MLA-based lithological classification of rock types with different chemical compositions, this paper investigates the potential of MLAs for mapping subclasses within the same lithology, i.e., limestone. The selection of appropriate data labels, training algorithms, hyperparameters, and remote sensing data sources was also investigated while applying these MLAs. First, the oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) limestone-bearing formations, along with the adjoining Hazara formation, were mapped using random forest (RF), support vector machine (SVM), classification and regression tree (CART), and naïve Bayes (NB) MLAs. The RF algorithm achieved the best accuracy of 83.28% and a Kappa coefficient of 0.78. To further improve the targeted allochemical limestone formation map, annotation labels were generated by fusing maps obtained from principal component analysis (PCA), decorrelation stretching (DS), and X-means clustering applied to ASTER-L1T, Landsat-8, and Sentinel-2 datasets. These labels were used to train and validate SVM, CART, NB, and RF MLAs to obtain a binary classification map of limestone occurrences in the Hazara division, Pakistan, using the Google Earth Engine (GEE) platform. The classification of Landsat-8 data by CART achieved 99.63% accuracy, with a Kappa coefficient of 0.99, and was in good agreement with the field validation. This binary limestone map was further classified into oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) formations by all four MLAs; in this case, RF surpassed all the other algorithms with an improved accuracy of 96.36%. This improvement can be attributed to better annotation, resulting in a binary limestone classification map, which formed a mask for improved classification of oolitic and fossiliferous limestone in the area.
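The abstract above reports overall accuracy together with a Kappa coefficient. As a minimal sketch of how the two relate, the following computes both from a confusion matrix. The function name and the example counts are hypothetical and are not taken from the paper; the paper's own evaluation was carried out on the GEE platform and may differ in detail.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: sum over classes of (row share * column share).
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class counts (illustration only, not data from the paper).
example = [[45, 5],
           [10, 40]]
acc, kappa = overall_accuracy_and_kappa(example)
print(f"overall accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy for the same map.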

Machine Learning Algorithms for Automatic Lithological Mapping Using Remote Sensing Data: A Case Study from Souk Arbaa Sahel, Sidi Ifni Inlier, Western Anti-Atlas, Morocco

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi8060248 ◽

2019 ◽

Vol 8 (6) ◽

pp. 248 ◽

Cited By ~ 7

Author(s):

Imane Bachri ◽

Mustapha Hakdaoui ◽

Mohammed Raji ◽

Ana Cláudia Teodoro ◽

Abdelmajid Benbouziane

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Learning Algorithms ◽

Remote Sensing Data ◽

Machine Learning Algorithms ◽

High Dimensional ◽

Landsat 8 ◽

Landsat 8 Oli ◽

Lithological Mapping ◽

Sensing Data

Remote sensing data have proved to be a valuable resource in a variety of earth science applications. Using high-dimensional data with advanced methods such as machine learning algorithms (MLAs), a sub-domain of artificial intelligence, enhances lithological mapping by spectral classification. Support vector machines (SVM) are among the most popular MLAs, with the ability to define non-linear decision boundaries in high-dimensional feature space by solving a quadratic optimization problem. This paper describes a supervised SVM classification method for lithological mapping in the region of Souk Arbaa Sahel, which belongs to the Sidi Ifni inlier in southern Morocco (Western Anti-Atlas). The aims of this study were (1) to refine the existing lithological map of this region, and (2) to evaluate the performance of the SVM approach when spectral features of Landsat 8 OLI are combined with digital elevation model (DEM) geomorphometric attributes from ALOS/PALSAR data. We performed an SVM classification that allows the joint use of geomorphometric features and Landsat 8 OLI multispectral data. The results indicated an overall classification accuracy of 85%. From these results, we conclude that the classification approach produced an image of lithological units in which formations such as silt, alluvium, limestone, dolomite, conglomerate, sandstone, rhyolite, andesite, granodiorite, quartzite, lutite, and ignimbrite are easily identified and coincide with those on the published geological map. This result confirms the ability of SVM as a supervised learning algorithm for lithological mapping purposes.

Comparison of Machine Learning Algorithms for Wildland-Urban Interface Fuelbreak Planning Integrating ALS and UAV-Borne LiDAR Data and Multispectral Images

Drones ◽

10.3390/drones4020021 ◽

2020 ◽

Vol 4 (2) ◽

pp. 21 ◽

Cited By ~ 1

Author(s):

Francisco Rodríguez-Puerta ◽

Rafael Alonso Ponce ◽

Fernando Pérez-Rodríguez ◽

Beatriz Águeda ◽

Saray Martín-García ◽

...

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Random Forest ◽

Learning Algorithms ◽

Remote Sensing Data ◽

Machine Learning Algorithms ◽

Data Sources ◽

Lidar Data ◽

Sensing Data ◽

Sentinel 2

Controlling vegetation fuels around human settlements is a crucial strategy for reducing fire severity in forests, buildings and infrastructure, as well as for protecting human lives. Each country has its own regulations in this respect, but they all share the principle that reducing fuel load in turn reduces the intensity and severity of the fire. Data acquired by Unmanned Aerial Vehicles (UAV), combined with other passive and active remote sensing data, offer the greatest performance for planning Wildland-Urban Interface (WUI) fuelbreaks through machine learning algorithms. Nine remote sensing data sources (active and passive) and four supervised classification algorithms (Random Forest, Linear and Radial Support Vector Machine and Artificial Neural Networks) were tested to classify five fuel-area types. We used very high-density Light Detection and Ranging (LiDAR) data acquired by UAV (154 returns·m−2 and ortho-mosaic of 5-cm pixels), multispectral data from the satellites Pleiades-1B and Sentinel-2, and low-density LiDAR data acquired by Airborne Laser Scanning (ALS) (0.5 returns·m−2, ortho-mosaic of 25-cm pixels). Through the Variable Selection Using Random Forest (VSURF) procedure, a pre-selection of final variables was carried out to train the model. The four algorithms were compared, and it was concluded that the differences among them in overall accuracy (OA) on training datasets were negligible. Although the highest accuracy in the training step was obtained by SVML (OA=94.46%) and in testing by ANN (OA=91.91%), Random Forest was considered to be the most reliable algorithm, since it produced more consistent predictions, with smaller differences between training and testing performance. Using a combination of Sentinel-2 and the two LiDAR datasets (UAV and ALS), Random Forest obtained an OA of 90.66% on the training dataset and 91.80% on the testing dataset. The differences in accuracy between the data sources used are much greater than those between algorithms. LiDAR growth metrics calculated from point clouds acquired on different dates and multispectral information from different seasons of the year are the most important variables in the classification. Our results support the essential role of UAVs in fuelbreak planning and management and thus in the prevention of forest fires.

Estimation of chlorophyll content in radish leaves using hyperspectral remote sensing data and machine learning algorithms

10.1117/12.2600072 ◽

2021 ◽

Author(s):

Adenan Yandra Nofrizal ◽

Rei Sonobe ◽

Hiroto Yamashita ◽

Takashi Ikka ◽

Akio Morita

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Chlorophyll Content ◽

Learning Algorithms ◽

Remote Sensing Data ◽

Hyperspectral Remote Sensing ◽

Machine Learning Algorithms ◽

Sensing Data

Machine Learning Algorithms for Optical Remote Sensing Data Classification and Analysis

10.1007/978-981-16-5847-1_10 ◽

2021 ◽

pp. 195-220

Author(s):

G. P. Obi Reddy ◽

K. C. Arun Kumar

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Learning Algorithms ◽

Remote Sensing Data ◽

Data Classification ◽

Machine Learning Algorithms ◽

Optical Remote Sensing ◽

Sensing Data

Geological mapping using remote sensing data: A comparison of five machine learning algorithms, their response to variations in the spatial distribution of training data and the use of explicit spatial information

Computers & Geosciences ◽

10.1016/j.cageo.2013.10.008 ◽

2014 ◽

Vol 63 ◽

pp. 22-33 ◽

Cited By ~ 203

Author(s):

Matthew J. Cracknell ◽

Anya M. Reading

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Spatial Distribution ◽

Spatial Information ◽

Learning Algorithms ◽

Remote Sensing Data ◽

Geological Mapping ◽

Machine Learning Algorithms ◽

Training Data ◽

Sensing Data

Using remote sensing data for environmental monitoring of water objects using GIS and machine learning

IOP Conference Series Earth and Environmental Science ◽

10.1088/1755-1315/937/2/022051 ◽

2021 ◽

Vol 937 (2) ◽

pp. 022051

Author(s):

D Krivoguz ◽

A Semenova ◽

S Mal’ko

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Spatial Data ◽

Learning Algorithm ◽

Learning Algorithms ◽

Remote Sensing Data ◽

Environmental Variable ◽

Machine Learning Algorithms ◽

Sensing Data ◽

Water Ecosystems

The main way to understand the variability of any spatial data using remote sensing is to calculate spectral indices. At present, obtaining water surface temperature remains difficult because of specific properties of satellite sensors and their low spatial resolution. The main sources of salinity data are remote sensing data from the ESA SMOS, NASA Aquarius and SMAP satellites. Using different machine learning algorithms, we can obtain models or equations that represent the dependency between the studied environmental variable and different spectral channels of remote monitoring data. After remote sensing data are received and collected in a database, the system uses machine learning algorithms to find the dependency between collected field data and different spectral bands of the remote sensing data. Our goal was to form an analytical system based on remote sensors and machine learning algorithms to analyse, predict and evaluate water ecosystems for fisheries and environmental protection.
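The idea of an equation relating a field-measured variable to a spectral band can be illustrated with the simplest possible model. The sketch below uses ordinary least squares on a single band as a stand-in; the abstract does not say which algorithms or bands the authors used, and the function name and all numbers here are hypothetical.

```python
def fit_line(band_values, field_values):
    """Fit field = slope * band + intercept by ordinary least squares."""
    n = len(band_values)
    if n != len(field_values) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 points")

    mean_x = sum(band_values) / n
    mean_y = sum(field_values) / n

    # Sum of squared deviations of the band, and the band/field cross-deviations.
    sxx = sum((x - mean_x) ** 2 for x in band_values)
    if sxx == 0:
        raise ValueError("band values are constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(band_values, field_values))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic illustration only: reflectance in one band vs. a field measurement.
band = [0.1, 0.2, 0.3, 0.4]
field = [1.2, 1.4, 1.6, 1.8]
slope, intercept = fit_line(band, field)
print(f"field = {slope:.2f} * band + {intercept:.2f}")
```

A real system would use several bands and a more flexible model, but the output has the same form: an equation that maps spectral values to the environmental variable.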

Influence of Variable Selection and Forest Type on Forest Aboveground Biomass Estimation Using Machine Learning Algorithms

Forests ◽

10.3390/f10121073 ◽

2019 ◽

Vol 10 (12) ◽

pp. 1073 ◽

Cited By ~ 10

Author(s):

Li ◽

Liu

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Variable Selection ◽

Aboveground Biomass ◽

Forest Type ◽

Learning Algorithms ◽

Forest Biomass ◽

Machine Learning Algorithms ◽

Biomass Estimation ◽

Landsat 8

Forest biomass is a major store of carbon and plays a crucial role in the regional and global carbon cycle. Accurate forest biomass assessment is important for monitoring and mapping the status of and changes in forests. However, while remote sensing-based forest biomass estimation in general is well developed and extensively used, improving the accuracy of biomass estimation remains challenging. In this paper, we used China’s National Forest Continuous Inventory data and Landsat 8 Operational Land Imager data in combination with three algorithms, namely linear regression (LR), random forest (RF), and extreme gradient boosting (XGBoost), to establish biomass estimation models based on forest type. In the modeling process, two methods of variable selection, i.e., stepwise regression and a variable importance-based method, were used to select optimal variable subsets for LR and for the machine learning algorithms (RF and XGBoost), respectively. Encouragingly, the accuracy of the models was significantly improved, and the following conclusions were drawn: (1) Variable selection is very important for improving the performance of models, especially for machine learning algorithms, and the influence of variable selection on XGBoost is significantly greater than that on RF. (2) Machine learning algorithms have advantages in aboveground biomass (AGB) estimation, and the XGBoost and RF models significantly improved the estimation accuracy compared with the LR models. Although the problems of overestimation and underestimation were not fully eliminated, the XGBoost algorithm worked well and reduced these problems to a certain extent. (3) AGB modeling based on forest type is a very advantageous method for improving performance at the lower and higher values of AGB. Some conclusions in this paper may differ for other study areas. The methods used in this paper provide an alternative and useful approach for improving the accuracy of AGB estimation based on remote sensing data, and the AGB estimates provide a reference basis for monitoring the forest ecosystem of the study area.

Classification of land use areas using remote sensing data with machine learning

2020 IEEE International conference of Moroccan Geomatics (Morgeo) ◽

10.1109/morgeo49228.2020.9121883 ◽

2020 ◽

Author(s):

Mustapha Skittou ◽

Ouadia Madhoum ◽

Abdelouahab Khannouss ◽

Mohamed Merrouchi ◽

Taoufiq Gadi

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Land Use ◽

Remote Sensing Data ◽

Sensing Data

Comprehensive Review on Application of Machine Learning Algorithms for Water Quality Parameter Estimation Using Remote Sensing Data

Sensors and Materials ◽

10.18494/sam.2020.2953 ◽

2020 ◽

Vol 32 (11) ◽

pp. 3879

Author(s):

Nimisha Wagle ◽

Tri Dev Acharya ◽

Dong Ha Lee

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Water Quality ◽

Parameter Estimation ◽

Remote Sensing Data ◽

Quality Parameter ◽

Machine Learning Algorithms ◽

Water Quality Parameter ◽

Comprehensive Review ◽

Sensing Data

Comparative Analysis of Machine Learning Algorithms in Automatic Identification and Extraction of Water Boundaries

Applied Sciences ◽

10.3390/app112110062 ◽

2021 ◽

Vol 11 (21) ◽

pp. 10062

Author(s):

Aimin Li ◽

Meng Fan ◽

Guangduo Qin ◽

Youcheng Xu ◽

Hailong Wang

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Decision Tree ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Water Bodies ◽

Support Vector ◽

Landsat 8 ◽

Transfer Performance ◽

Remote Sensing Images

Monitoring open water bodies accurately is important for assessing the role of ecosystem services in the context of human survival and climate change. Many methods are available for water body extraction from remote sensing images, such as the normalized difference water index (NDWI), modified NDWI (MNDWI), and machine learning algorithms. Based on Landsat-8 remote sensing images, this study examines the performance of six machine learning algorithms and three threshold methods used to extract water bodies, evaluates the transfer performance of models applied to remote sensing images from different periods, and compares the differences among these models. The results are as follows. (1) Different algorithms require different numbers of samples to reach their optimal performance. The logistic regression algorithm requires a minimum of 110 samples. As the number of samples increases, the order of the optimal model is support vector machine, neural network, random forest, decision tree, and XGBoost. (2) The accuracy of each machine learning algorithm on the test set cannot represent its performance in a local area. (3) When these models are directly applied to remote sensing images from different periods, the area under curve (AUC) values of each machine learning algorithm for the three regions all show a significant decline, with decreases ranging from 0.33% to 66.52%, and the differences among the algorithms' performance in the three areas are obvious. Overall, the decision tree algorithm has good transfer performance among the machine learning algorithms, with AUC values of 0.790, 0.518, and 0.697 in the three areas, respectively, and an average of 0.668. The Otsu threshold algorithm is the best of the threshold methods, with AUC values of 0.970, 0.617, and 0.908 in the three regions, respectively, and an average AUC of 0.832.
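The index-plus-threshold approach mentioned in the abstract can be sketched as follows. This assumes the common green/near-infrared form of NDWI, (green - NIR) / (green + NIR), and a histogram-based Otsu threshold applied to the NDWI values. The abstract does not state which bands or settings the authors used, and the function names, bin count, and reflectance pairs are hypothetical.

```python
def ndwi(green, nir):
    """NDWI = (green - nir) / (green + nir); returns 0.0 if the denominator is 0."""
    denom = green + nir
    return (green - nir) / denom if denom != 0 else 0.0


def otsu_threshold(values, bins=64):
    """Return the cut that maximizes between-class variance of a histogram."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # all values identical: no meaningful split
    width = (hi - lo) / bins

    # Build the histogram; the maximum value falls into the last bin.
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1

    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_score, best_cut = -1.0, lo
    count_low, sum_low = 0, 0.0
    for i in range(bins - 1):
        count_low += hist[i]
        sum_low += centers[i] * hist[i]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score, best_cut = score, lo + (i + 1) * width
    return best_cut


# Hypothetical (green, NIR) reflectance pairs: three water-like, three land-like.
pixels = [(0.10, 0.02), (0.12, 0.03), (0.09, 0.02),
          (0.08, 0.30), (0.10, 0.35), (0.07, 0.28)]
index_values = [ndwi(g, n) for g, n in pixels]
cut = otsu_threshold(index_values)
water_mask = [v > cut for v in index_values]
print(water_mask)
```

Pixels whose NDWI exceeds the threshold are labelled as water. No training samples are needed, which is one reason threshold methods are a natural baseline against learned classifiers.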
