Using Machine Learning to Map Western Australian Landscapes

Landscapes evolve due to climatic conditions, tectonic activity, geological features, biological activity, and sedimentary dynamics. These processes link geological processes at depth to surface features. Consequently, the study of landscapes can reveal essential information about the geochemical footprint of ore deposits at depth. Advances in satellite imaging and computing power have enabled the creation of large geospatial datasets, the sheer size of which necessitates automated processing. We describe a methodology to enable the automated mapping of landscape pattern domains using machine learning (ML) algorithms. From a freely available Digital Elevation Model, derived data, and sample landclass boundaries provided by domain experts, our algorithm produces a dense map of the model region in Western Australia. Both random forest and support vector machine classification achieve about 98% classification accuracy with a reasonable runtime of 48 minutes on a single core. We discuss computational resources and study the effect of grid resolution. Larger tiles result in a more contiguous map, while smaller tiles result in a more detailed and, at some point, noisy map. Diversity and distribution of landscapes mapped in this study support previous results. In addition, our results are consistent with the geological trends and main basement features in the region.

Using Machine Learning to Map Western Australian Landscapes for Mineral Exploration

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10070459 ◽

2021 ◽

Vol 10 (7) ◽

pp. 459

Author(s):

Thomas Albrecht ◽

Ignacio González-Álvarez ◽

Jens Klump

Keyword(s):

Machine Learning ◽

Large Scale ◽

Mineral Exploration ◽

Climatic Conditions ◽

Tectonic Activity ◽

Support Vector ◽

Near Surface ◽

Geological Processes ◽

Landscape Variability ◽

Model Region

Landscapes evolve due to climatic conditions, tectonic activity, geological features, biological activity, and sedimentary dynamics. Geological processes at depth ultimately control and are linked to the resulting surface features. Large regions in Australia, West Africa, India, and China are blanketed by cover (intensely weathered surface material and/or later sediment deposition, both up to hundreds of metres thick). Mineral exploration through cover poses a significant technological challenge worldwide. Classifying and understanding landscape types and their variability is of key importance for mineral exploration in covered regions. Landscape variability expresses how near-surface geochemistry is linked to underlying lithologies. Therefore, landscape variability mapping should inform surface geochemical sampling strategies for mineral exploration. Advances in satellite imaging and computing power have enabled the creation of large geospatial data sets, the sheer size of which necessitates automated processing. In this study, we describe a methodology to enable the automated mapping of landscape pattern domains using machine learning (ML) algorithms. From a freely available digital elevation model, derived data, and sample landclass boundaries provided by domain experts, our algorithm produces a dense map of the model region in Western Australia. Both random forest and support vector machine classification achieve approximately 98% classification accuracy with a reasonable runtime of 48 minutes on a single Intel® Core™ i7-8550U CPU core. We discuss computational resources and study the effect of grid resolution. Larger tiles result in a more contiguous map, whereas smaller tiles result in a more detailed and, at some point, noisy map. Diversity and distribution of landscapes mapped in this study support previous results. In addition, our results are consistent with the geological trends and main basement features in the region. 
Mapping landscape variability at a large scale can be used globally as a fundamental tool for guiding more efficient mineral exploration programs in regions under cover.
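
The effect of grid resolution described in this abstract can be illustrated with a minimal sketch. It assumes a small hypothetical elevation grid and two illustrative per-tile features (mean elevation and relief); the function name `tile_features`, the features, and the values are assumptions for illustration, not the authors' actual feature set or code.

```python
from statistics import mean


def tile_features(dem, tile):
    """Split a 2D elevation grid into non-overlapping tile x tile blocks.

    Returns {(tile_row, tile_col): (mean_elevation, relief)} for every
    complete block; incomplete blocks at the edges are skipped.
    """
    rows, cols = len(dem), len(dem[0])
    features = {}
    for r in range(0, rows - tile + 1, tile):
        for c in range(0, cols - tile + 1, tile):
            cells = [dem[i][j]
                     for i in range(r, r + tile)
                     for j in range(c, c + tile)]
            features[(r // tile, c // tile)] = (mean(cells),
                                                max(cells) - min(cells))
    return features


# Hypothetical 4 x 4 elevation grid in metres (illustrative values only).
dem = [
    [10, 12, 30, 32],
    [11, 13, 31, 33],
    [50, 52, 70, 72],
    [51, 53, 71, 73],
]

coarse = tile_features(dem, 4)  # one large tile: a single, smoothed sample
fine = tile_features(dem, 2)    # four small tiles: more local detail
print(len(coarse), len(fine))
```

With one large tile, the whole grid collapses into a single feature vector with high relief; with smaller tiles, each block keeps its own local statistics. A classifier trained on such per-tile features would therefore produce a more contiguous map for large tiles and a more detailed one for small tiles.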

Drill-Core Mineral Abundance Estimation Using Hyperspectral and High-Resolution Mineralogical Data

Remote Sensing ◽

10.3390/rs12071218 ◽

2020 ◽

Vol 12 (7) ◽

pp. 1218

Author(s):

Laura Tuşa ◽

Mahdi Khodadadzadeh ◽

Cecilia Contreras ◽

Kasra Rafiezadeh Shahi ◽

Margret Fuchs ◽

...

Keyword(s):

Machine Learning ◽

High Resolution ◽

Ore Deposits ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

Drill Core ◽

Data Types ◽

Mineralogical Characterization ◽

Core Samples

Due to the extensive drilling performed every year in exploration campaigns for the discovery and evaluation of ore deposits, drill-core mapping is becoming an essential step. While valuable mineralogical information is extracted during core logging by on-site geologists, the process is time-consuming and dependent on the observer and their individual background. Hyperspectral short-wave infrared (SWIR) data is used in the mining industry as a tool to complement traditional logging techniques and to provide a rapid and non-invasive analytical method for mineralogical characterization. Additionally, Scanning Electron Microscopy-based image analyses using a Mineral Liberation Analyser (SEM-MLA) provide exhaustive high-resolution mineralogical maps, but can only be performed on small areas of the drill-cores. We propose to use machine learning algorithms to combine the two data types and upscale the quantitative SEM-MLA mineralogical data to drill-core scale. This way, quasi-quantitative maps over entire drill-core samples are obtained. Our upscaling approach increases result transparency and reproducibility by employing physical-based data acquisition (hyperspectral imaging) combined with mathematical models (machine learning). The procedure is tested on five drill-core samples with varying training data using random forests, support vector machines, and neural network regression models. The obtained mineral abundance maps are further used for the extraction of mineralogical parameters such as mineral association.

Predicting Forest Fires using Supervised and Ensemble Machine Learning Algorithms

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b2878.078219 ◽

2019 ◽

Vol 8 (2) ◽

pp. 3697-3705 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Forest Fires ◽

Principal Component ◽

Climatic Conditions ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Physical Factors

Forest fires have become one of the most frequently occurring disasters in recent years. They have a lasting impact on the environment because they lead to deforestation and contribute to global warming, which is in turn one of the major causes of their occurrence. Forest fires are currently dealt with by collecting satellite images of forests; if the fires cause an emergency, the authorities are notified so that they can mitigate its effects. By the time the authorities learn of a fire, it has often already caused a lot of damage. Data mining and machine learning techniques can provide an efficient prevention approach in which data associated with forests are used to predict the occurrence of forest fires. This paper uses the dataset available in the UCI Machine Learning Repository, which consists of physical factors and climatic conditions of the Montesinho park in Portugal. Various algorithms, including logistic regression, support vector machine, random forest, and K-nearest neighbors, in addition to bagging and boosting predictors, are used both with and without Principal Component Analysis (PCA). Among the models to which PCA was applied, logistic regression gave the highest F1 score of 68.26, and among the models without PCA, gradient boosting gave the highest score of 68.36.

CARTOGRAPHY OF MOROCCAN ARGAN TREE USING COMBINED OPTICAL AND SAR IMAGERY INTEGRATED WITH DIGITAL ELEVATION MODEL

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlvi-4-w5-2021-211-2021 ◽

2021 ◽

Vol XLVI-4/W5-2021 ◽

pp. 211-217

Author(s):

E. Elmoussaoui ◽

A. Moumni ◽

A. Lahrouni

Keyword(s):

Machine Learning ◽

Time Series ◽

Digital Elevation Model ◽

Optical Data ◽

Support Vector ◽

Ground Truth Data ◽

Argan Tree ◽

Digital Elevation ◽

Elevation Model ◽

Sentinel 2

Forest tree species mapping has become easier thanks to the global availability of high spatio-temporal resolution images acquired from multiple sensors. Such data can lead to better forest resource management. A machine-learning, pixel-based analysis was applied to multi-spectral Sentinel-2 and Synthetic Aperture Radar Sentinel-1 time series integrated with a Digital Elevation Model, acquired over the argan forest of Essaouira province, Morocco. The argan tree constitutes a fundamental resource for the populations of this arid area of Morocco. This research aims to use the combination of multi-sensor data to detect, map, and distinguish the argan tree from other forest species using three machine learning algorithms: Support Vector Machine (SVM), Maximum Likelihood (ML), and Artificial Neural Networks (ANN). The datasets used included Sentinel-1 (S1) and Sentinel-2 (S2) time series, a Shuttle Radar Topography Mission Digital Elevation Model (DEM) layer, and ground truth data. We tested several scenarios, including single S1-derived features, single S2 time series, and combined S1- and S2-derived layers with the DEM layer. The best results (overall accuracy OA and Kappa coefficient K) were: from the time series of optical data (NDVI), OA = 86.87%, K = 0.84; from the time series of SAR data (VV+VH/VV), OA = 45.90%, K = 0.36; from the combination of optical and SAR time series (NDVI+VH+DEM), OA = 93.01%, K = 0.914; and from the fusion of the optical time series and DEM layer (NDVI+DEM), OA = 93.25%, K = 0.91. These results indicate that a single sensor (S2) integrated with the DEM layer yielded the highest classification accuracy.
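
For readers unfamiliar with the two scores reported in this abstract, the following minimal sketch shows how overall accuracy and Cohen's kappa are commonly computed from a confusion matrix. The function name and the 2 x 2 matrix are hypothetical; the counts are not taken from the paper.

```python
def overall_accuracy_and_kappa(cm):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = sum(sum(row) for row in cm)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(len(cm))) / n
    # Chance agreement: product of row and column marginals per class.
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(col) for col in zip(*cm)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix (illustrative counts only).
cm = [[45, 5],
      [10, 40]]
oa, kappa = overall_accuracy_and_kappa(cm)
print(f"OA = {oa:.2%}, K = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, which is why it is usually lower than the overall accuracy for the same matrix.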

Comparison Between Machine Learning Models for Yield Forecast in Cocoa Crops in Santander, Colombia

Revista Facultad de Ingeniería ◽

10.19053/01211129.v29.n54.2020.10853 ◽

2020 ◽

Vol 29 (54) ◽

pp. e10853

Author(s):

Henry Lamos-Díaz ◽

David Esteban Puentes-Garzón ◽

Diego Alejandro Zarate-Caicedo

Keyword(s):

Machine Learning ◽

Influencing Factors ◽

Crop Yield ◽

Sun Exposure ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Learning Models ◽

Essential Information ◽

Significant Difference

The identification of influencing factors in crop yield (kg·ha⁻¹) provides essential information for decision-making processes related to the prediction and improvement of productivity, which gives farmers the opportunity to increase their income. The current study investigates the application of multiple machine learning algorithms for cocoa yield prediction and the identification of influencing factors. Support Vector Machines (SVM) and ensemble learning models (Random Forests, Gradient Boosting) are compared with Least Absolute Shrinkage and Selection Operator (LASSO) regression models. The predictors considered were climate conditions, cocoa variety, fertilization level, and sun exposure in an experimental crop located in Rionegro, Santander. Results showed that Gradient Boosting is the best prediction alternative, with a coefficient of determination (R²) of 68%, a Mean Absolute Error (MAE) of 13.32, and a Root Mean Square Error (RMSE) of 20.41. The crop yield variability is explained mainly by the radiation one month before harvest, the accumulated rainfall in the harvest month, and the temperature one month before harvest. Likewise, the crop yields are evaluated based on the kind of sun exposure, and it was found that radiation one month before harvest is the most influential factor in shade-grown plants. On the other hand, rainfall and soil moisture are determining variables in sun-grown plants, which is associated with their water requirements. These results suggest a differentiated management for crops depending on the kind of sun exposure to avoid compromising productivity, since there is no significant difference in the yield of both agricultural managements.
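
The three regression scores quoted in this abstract follow standard definitions, sketched below under the assumption of a small set of hypothetical observed and predicted yields. The function name and the values are illustrative and do not come from the study.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return (R^2, MAE, RMSE) for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return r2, mae, rmse


# Hypothetical observed and predicted yields (illustrative values only).
y_true = [100.0, 120.0, 140.0, 160.0]
y_pred = [110.0, 115.0, 150.0, 155.0]
r2, mae, rmse = regression_metrics(y_true, y_pred)
print(f"R2 = {r2:.3f}, MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

RMSE is never smaller than MAE, and the gap between them widens when a few predictions are far off, which matches the pattern of the values reported in the abstract.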

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽

10.32732/jmo.2020.12.2.84 ◽

2020 ◽

Vol 12 (2) ◽

pp. 84-99

Author(s):

Li-Pang Chen

Keyword(s):

Machine Learning ◽

Stock Price ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Short Term ◽

Learning Techniques ◽

Historical Database ◽

Long Short Term Memory

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in the machine learning methods, we show that the prediction accuracy is improved.

A Comparative Study of Different Machine Learning Algorithms for Disease Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i7/0177 ◽

2017 ◽

Vol 7 (7) ◽

pp. 172

Author(s):

Anantvir Singh Romana

Keyword(s):

Machine Learning ◽

Subsequent Treatment ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Classification Problems ◽

Learning Techniques ◽

Neural Network Classifiers ◽

Diagnostic Detection

Accurate diagnostic detection of disease in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may provide different accuracies, and it is therefore imperative to use the most suitable method, the one that provides the best results. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree, and neural network classifiers on breast cancer and diabetes datasets.

Structure-Based Virtual Screening of Perfluoroalkyl and Polyfluoroalkyl Substances (PFASs) as Endocrine Disruptors of Androgen Receptor Activity Using Molecular Docking and Machine Learning

10.26434/chemrxiv.11886702.v1 ◽

2020 ◽

Author(s):

Azhagiya Singam Ettayapuram Ramaprasad ◽

Phum Tachachartvanich ◽

Denis Fourches ◽

Anatoly Soshilov ◽

Jennifer C.Y. Hsieh ◽

...

Keyword(s):

Machine Learning ◽

Molecular Docking ◽

Androgen Receptor ◽

Endocrine Disruptors ◽

Hormone Receptors ◽

Steroid Hormone Receptors ◽

Machine Learning Techniques ◽

Support Vector ◽

Polyfluoroalkyl Substances ◽

Perfluoroalkyl And Polyfluoroalkyl Substances

Perfluoroalkyl and Polyfluoroalkyl Substances (PFASs) pose a substantial threat as endocrine disruptors, and thus early identification of those that may interact with steroid hormone receptors, such as the androgen receptor (AR), is critical. In this study we screened 5,206 PFASs from the CompTox database against the different binding sites on the AR using both molecular docking and machine learning techniques. We developed support vector machine models trained on Tox21 data to classify the active and inactive PFASs for AR using different chemical fingerprints as features. The best model, based on MACCS fingerprints (MACCSFP), reached an accuracy of 95.01% and a Matthews correlation coefficient (MCC) of 0.76. The combination of docking-based screening and machine learning models identified 29 PFASs that have strong potential for activity against the AR and should be considered priority chemicals for biological toxicity testing.
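
The abstract reports both accuracy and the Matthews correlation coefficient. The minimal sketch below shows why the two can differ on imbalanced data such as active/inactive screens. The counts are hypothetical and are not the study's results.

```python
from math import sqrt


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion counts.

    Returns 0.0 when any marginal total is zero (the usual convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical imbalanced screen: 10 actives, 90 inactives (illustrative only).
tp, fp, fn, tn = 5, 0, 5, 90
accuracy = (tp + tn) / (tp + fp + fn + tn)
mcc = matthews_corrcoef(tp, fp, fn, tn)
print(f"accuracy = {accuracy:.2f}, MCC = {mcc:.2f}")
```

Here half of the actives are missed, yet accuracy stays high because inactives dominate; the MCC is noticeably lower because it accounts for all four cells of the confusion matrix.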

IMPLEMENTASI SUPPORT VECTOR MACHINE PADA PREDIKSI HARGA SAHAM GABUNGAN (IHSG) [Implementation of Support Vector Machine for Predicting the Composite Stock Price Index (IHSG)]

Jurnal Ilmiah Teknologi dan Rekayasa ◽

10.35760/tr.2020.25i1.2571 ◽

2020 ◽

Vol 25 (1) ◽

pp. 24-38

Author(s):

Eka Patriya

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Support Vector Regression ◽

Support Vector

Stocks are a financial market instrument widely chosen by investors as an alternative source of funds; however, stocks traded in financial markets often experience large price fluctuations (rises and falls). Investors stand not only to make gains but also to incur losses in the future. One of the indicators that investors need to watch when investing in stocks is the movement of the Composite Stock Price Index (Indeks Harga Saham Gabungan, IHSG). Analysing the IHSG is important for investors, with the aim of finding a trend or pattern in past stock price movements that may recur, so that it can be used to predict future stock price movements. One method that can be used to predict stock price movements accurately is machine learning. In this study, a prediction model for the IHSG closing price was built using the Support Vector Regression (SVR) algorithm, which showed good predictive and generalisation ability, with training and testing RMSE values of 14.334 and 20.281, and training and testing MAPE values of 0.211% and 0.251%. The results of this study are expected to help investors make decisions when formulating stock investment strategies.

Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.
