OIL PALM PLANTATION DETECTION IN INDONESIA USING SENTINEL-2 AND LANDSAT-8 OPTICAL SATELLITE IMAGERY  (CASE STUDY: ROKAN HULU REGENCY, RIAU PROVINCE)

The objective of this work is to assess the capability of multispectral optical Landsat and Sentinel images to detect oil palm plantations in Rokan Hulu, Riau, one of the largest palm oil producers in Indonesia, by combining multispectral bands and composite indices. In addition to comparing two different sets of satellite images, we also ascertain which gives the best performance among the supervised machine learning classifiers CART Decision Tree, Random Forest, Support Vector Machine, and Naive Bayes. With the use of multispectral bands and derived composite indices, the best classifier achieved an overall accuracy of up to 92%. The findings and contributions of the study include: (1) insight into a set of feature combinations that provides the highest model accuracy, and (2) an extensive evaluation of machine learning-based classifiers on two different optical satellite imageries. Our study could further be beneficial for the government in providing more scalable plantation statistics.

Download Full-text

Comparing Supervised Machine Learning Strategies and Linguistic Features to Search for Very Negative Opinions

Information ◽

10.3390/info10010016 ◽

2019 ◽

Vol 10 (1) ◽

pp. 16 ◽

Cited By ~ 3

Author(s):

Sattam Almatarneh ◽

Pablo Gamallo

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Empirical Study ◽

Learning Strategies ◽

Supervised Machine Learning ◽

Support Vector ◽

Word Embeddings ◽

Linguistic Features ◽

Machine Learning Classifiers ◽

Supervised Machine Learning Classifiers

In this paper, we examine the performance of several classifiers in the process of searching for very negative opinions. More precisely, we do an empirical study that analyzes the influence of three types of linguistic features (n-grams, word embeddings, and polarity lexicons) and their combinations when they are used to feed different supervised machine learning classifiers: Naive Bayes (NB), Decision Tree (DT), and Support Vector Machine (SVM). The experiments we have carried out show that SVM clearly outperforms NB and DT in all datasets by taking into account all features individually as well as their combinations.

Download Full-text

Credibility Detection on Twitter News Using Machine Learning Approach

International Journal of Intelligent Systems and Applications ◽

10.5815/ijisa.2021.03.01 ◽

2021 ◽

Vol 13 (3) ◽

pp. 1-10

Author(s):

Marina Azer ◽

◽

Mohamed Taha ◽

Hala H. Zayed ◽

Mahmoud Gadallah

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Learning System ◽

Supervised Machine Learning ◽

Support Vector ◽

Sources Of Information ◽

Supervised Machine Learning Classifiers ◽

False News ◽

The Impact

Social media presence is a crucial portion of our life. It is considered one of the most important sources of information than traditional sources. Twitter has become one of the prevalent social sites for exchanging viewpoints and feelings. This work proposes a supervised machine learning system for discovering false news. One of the credibility detection problems is finding new features that are most predictive to better performance classifiers. Both features depending on new content, and features based on the user are used. The features' importance is examined, and their impact on the performance. The reasons for choosing the final feature set using the k-best method are explained. Seven supervised machine learning classifiers are used. They are Naïve Bayes (NB), Support vector machine (SVM), Knearest neighbors (KNN), Logistic Regression (LR), Random Forest (RF), Maximum entropy (ME), and conditional random forest (CRF). Training and testing models were conducted using the Pheme dataset. The feature's analysis is introduced and compared to the features depending on the content, as the decisive factors in determining the validity. Random forest shows the highest performance while using user-based features only and using a mixture of both types of features; features depending on content and the features based on the user, accuracy (82.2 %) in using user-based features only. We achieved the highest results by using both types of features, utilizing random forest classifier accuracy(83.4%). In contrast, logistic regression was the best as to using features that are based on contents. Performance is measured by different measurements accuracy, precision, recall, and F1_score. We compared our feature set with other studies' features and the impact of our new features. We found that our conclusions exhibit high enhancement concerning discovering and verifying the false news regarding the discovery and verification of false news, comparing it to the current results of how it is developed.

Download Full-text

Comparing Supervised Machine Learning Strategies and Linguistic Features to Search for Very Negative Opinions

10.20944/preprints201811.0436.v1 ◽

2018 ◽

Author(s):

Sattam Almatarneh ◽

Pablo Gamallo

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Empirical Study ◽

Learning Strategies ◽

Supervised Machine Learning ◽

Support Vector ◽

Word Embeddings ◽

Linguistic Features ◽

Machine Learning Classifiers ◽

Supervised Machine Learning Classifiers

In this paper, we examine the performance of several classifiers in the process of searching for very negative opinions. More precisely, we do an empirical study that analyzes the influence of three types of linguistic features (n-grams, word embeddings, and polarity lexicons) and their combinations when they are used to feed different supervised machine learning classifiers: Support Vector Machine (SVM), Naive Bayes (NB), and Decision Tree (DT).

Download Full-text

PIXEL BASED LANDSLIDE IDENTIFICATION USING LANDSAT 8 AND GEE

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b3-2021-721-2021 ◽

2021 ◽

Vol XLIII-B3-2021 ◽

pp. 721-726

Author(s):

P. Singh ◽

V. Maurya ◽

R. Dwivedi

Keyword(s):

Machine Learning ◽

Research Work ◽

Volcanic Eruptions ◽

Google Earth ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Geological Survey ◽

Support Vector ◽

Landsat 8 ◽

Multi Band

Abstract. Landslide is one of the most common natural disasters triggered mainly due to heavy rainfall, cloud burst, earthquake, volcanic eruptions, unorganized constructions of roads, and deforestation. In India, field surveying is the most common method used to identify potential landslide regions and update the landslide inventories maintained by the Geological Survey of India, but it is very time-consuming, costly, and inefficient. Alternatively, advanced remote sensing technologies in landslide analysis allow rapid and easy data acquisitions and help to improve the traditional method of landslide detection capabilities. Supervised Machine learning algorithms, for example, Support Vector Machine (SVM), are challenging to conventional techniques by predicting disasters with astounding accuracy. In this research work, we have utilized open-source datasets (Landsat 8 multi-band images and JAXA ALOS DSM) and Google Earth Engine (GEE) to identify landslides in Rudraprayag using machine learning techniques. Rudraprayag is a district of Uttarakhand state in India, which has always been the center of attention of geological studies due to its higher density of landslide-prone zones. For the training and validation purpose, labeled landslide locations obtained from landslide inventory (prepared by the Geological Survey of India) and layers such as NDVI, NDWI, and slope (generated from JAXA ALOS DSM and Landsat 8 satellite multi-band imagery) were used. The landslide identification has been performed using SVM, Classification and Regression Trees (CART), Minimum Distance, Random forest (RF), and Naïve Bayes techniques, in which SVM and RF outperformed all other techniques by achieving an 87.5% true positive rate (TPR).

Download Full-text

Examining the Capability of Supervised Machine Learning Classifiers in Extracting Flooded Areas from Landsat TM Imagery: A Case Study from a Mediterranean Flood

Remote Sensing ◽

10.3390/rs70303372 ◽

2015 ◽

Vol 7 (3) ◽

pp. 3372-3399 ◽

Cited By ~ 32

Author(s):

Gareth Ireland ◽

Michele Volpi ◽

George Petropoulos

Keyword(s):

Machine Learning ◽

Landsat Tm ◽

Supervised Machine Learning ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Supervised Machine Learning Classifiers ◽

Landsat Tm Imagery ◽

Flooded Areas

Download Full-text

Predictive Modelling of Employee Turnover in Indian IT Industry Using Machine Learning Techniques

Vision The Journal of Business Perspective ◽

10.1177/0972262918821221 ◽

2019 ◽

Vol 23 (1) ◽

pp. 12-21 ◽

Cited By ~ 2

Author(s):

Shikha N. Khera ◽

Divya

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Confusion Matrix ◽

Predictive Modelling ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

It Industry ◽

Knowledge Based ◽

Employee Attrition

Information technology (IT) industry in India has been facing a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based loses to the companies. The aim of this research is to develop a model to predict employee attrition and provide the organizations opportunities to address any issue and improve retention. Predictive model was developed based on supervised machine learning algorithm, support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from Human Resource databases of three IT companies in India, including their employment status (response variable) at the time of collection. Accuracy results from the confusion matrix for the SVM model showed that the model has an accuracy of 85 per cent. Also, results show that the model performs better in predicting who will leave the firm as compared to predicting who will not leave the company.

Download Full-text

Supervised Machine Learning Methods and Hyperspectral Imaging Techniques Jointly Applied for Brain Cancer Classification

Sensors ◽

10.3390/s21113827 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3827

Author(s):

Gemma Urbanos ◽

Alberto Martín ◽

Guillermo Vázquez ◽

Marta Villanueva ◽

Manuel Villa ◽

...

Keyword(s):

Machine Learning ◽

Blood Vessel ◽

Hyperspectral Imaging ◽

Imaging Techniques ◽

Venous Blood ◽

Healthy Tissue ◽

Supervised Machine Learning ◽

Support Vector ◽

Arterial Blood

Hyperspectral imaging techniques (HSI) do not require contact with patients and are non-ionizing as well as non-invasive. As a consequence, they have been extensively applied in the medical field. HSI is being combined with machine learning (ML) processes to obtain models to assist in diagnosis. In particular, the combination of these techniques has proven to be a reliable aid in the differentiation of healthy and tumor tissue during brain tumor surgery. ML algorithms such as support vector machine (SVM), random forest (RF) and convolutional neural networks (CNN) are used to make predictions and provide in-vivo visualizations that may assist neurosurgeons in being more precise, hence reducing damages to healthy tissue. In this work, thirteen in-vivo hyperspectral images from twelve different patients with high-grade gliomas (grade III and IV) have been selected to train SVM, RF and CNN classifiers. Five different classes have been defined during the experiments: healthy tissue, tumor, venous blood vessel, arterial blood vessel and dura mater. Overall accuracy (OACC) results vary from 60% to 95% depending on the training conditions. Finally, as far as the contribution of each band to the OACC is concerned, the results obtained in this work are 3.81 times greater than those reported in the literature.

Download Full-text

Financial Context News Sentiment Analysis for the Lithuanian Language

Applied Sciences ◽

10.3390/app11104443 ◽

2021 ◽

Vol 11 (10) ◽

pp. 4443

Author(s):

Rokas Štrimaitis ◽

Pavel Stefanovič ◽

Simona Ramanauskaitė ◽

Asta Slotkienė

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Experimental Investigations ◽

Support Vector ◽

Applied Machine Learning ◽

Bayes Algorithm ◽

Website Content

Financial area analysis is not limited to enterprise performance analysis. It is worth analyzing as wide an area as possible to obtain the full impression of a specific enterprise. News website content is a datum source that expresses the public’s opinion on enterprise operations, status, etc. Therefore, it is worth analyzing the news portal article text. Sentiment analysis in English texts and financial area texts exist, and are accurate, the complexity of Lithuanian language is mostly concentrated on sentiment analysis of comment texts, and does not provide high accuracy. Therefore in this paper, the supervised machine learning model was implemented to assign sentiment analysis on financial context news, gathered from Lithuanian language websites. The analysis was made using three commonly used classification algorithms in the field of sentiment analysis. The hyperparameters optimization using the grid search was performed to discover the best parameters of each classifier. All experimental investigations were made using the newly collected datasets from four Lithuanian news websites. The results of the applied machine learning algorithms show that the highest accuracy is obtained using a non-balanced dataset, via the multinomial Naive Bayes algorithm (71.1%). The other algorithm accuracies were slightly lower: a long short-term memory (71%), and a support vector machine (70.4%).

Download Full-text

Framing Twitter Public Sentiment on Nigerian Government COVID-19 Palliatives Distribution Using Machine Learning

Sustainability ◽

10.3390/su13063497 ◽

2021 ◽

Vol 13 (6) ◽

pp. 3497

Author(s):

Hassan Adamu ◽

Syaheerah Lebai Lutfi ◽

Nurul Hashimah Ahamed Hassain Malim ◽

Rohail Hassan ◽

Assunta Di Vaio ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Primary Objective ◽

Support Vector ◽

Standard English ◽

Emotion Classification ◽

K Nearest Neighbor ◽

The Public ◽

The Government

Sustainable development plays a vital role in information and communication technology. In times of pandemics such as COVID-19, vulnerable people need help to survive. This help includes the distribution of relief packages and materials by the government with the primary objective of lessening the economic and psychological effects on the citizens affected by disasters such as the COVID-19 pandemic. However, there has not been an efficient way to monitor public funds’ accountability and transparency, especially in developing countries such as Nigeria. The understanding of public emotions by the government on distributed palliatives is important as it would indicate the reach and impact of the distribution exercise. Although several studies on English emotion classification have been conducted, these studies are not portable to a wider inclusive Nigerian case. This is because Informal Nigerian English (Pidgin), which Nigerians widely speak, has quite a different vocabulary from Standard English, thus limiting the applicability of the emotion classification of Standard English machine learning models. An Informal Nigerian English (Pidgin English) emotions dataset is constructed, pre-processed, and annotated. The dataset is then used to classify five emotion classes (anger, sadness, joy, fear, and disgust) on the COVID-19 palliatives and relief aid distribution in Nigeria using standard machine learning (ML) algorithms. Six ML algorithms are used in this study, and a comparative analysis of their performance is conducted. The algorithms are Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM), Random Forest (RF), Logistics Regression (LR), K-Nearest Neighbor (KNN), and Decision Tree (DT). The conducted experiments reveal that Support Vector Machine outperforms the remaining classifiers with the highest accuracy of 88%. The “disgust” emotion class surpassed other emotion classes, i.e., sadness, joy, fear, and anger, with the highest number of counts from the classification conducted on the constructed dataset. Additionally, the conducted correlation analysis shows a significant relationship between the emotion classes of “Joy” and “Fear”, which implies that the public is excited about the palliatives’ distribution but afraid of inequality and transparency in the distribution process due to reasons such as corruption. Conclusively, the results from this experiment clearly show that the public emotions on COVID-19 support and relief aid packages’ distribution in Nigeria were not satisfactory, considering that the negative emotions from the public outnumbered the public happiness.

Download Full-text

Optimizing machine learning models for granular NdFeB magnets by very fast simulated annealing

Scientific Reports ◽

10.1038/s41598-021-83315-9 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Hyeon-Kyu Park ◽

Jae-Hyeok Lee ◽

Jehyun Lee ◽

Sang-Koog Kim

Keyword(s):

Machine Learning ◽

Simulated Annealing ◽

Permanent Magnets ◽

Supervised Machine Learning ◽

Support Vector ◽

Micromagnetic Simulations ◽

Ndfeb Magnets ◽

Average Grain Size ◽

Macroscopic Properties ◽

Very Fast Simulated Annealing

AbstractThe macroscopic properties of permanent magnets and the resultant performance required for real implementations are determined by the magnets’ microscopic features. However, earlier micromagnetic simulations and experimental studies required relatively a lot of work to gain any complete and comprehensive understanding of the relationships between magnets’ macroscopic properties and their microstructures. Here, by means of supervised learning, we predict reliable values of coercivity (μ0Hc) and maximum magnetic energy product (BHmax) of granular NdFeB magnets according to their microstructural attributes (e.g. inter-grain decoupling, average grain size, and misalignment of easy axes) based on numerical datasets obtained from micromagnetic simulations. We conducted several tests of a variety of supervised machine learning (ML) models including kernel ridge regression (KRR), support vector regression (SVR), and artificial neural network (ANN) regression. The hyper-parameters of these models were optimized by a very fast simulated annealing (VFSA) algorithm with an adaptive cooling schedule. In our datasets of randomly generated 1,000 polycrystalline NdFeB cuboids with different microstructural attributes, all of the models yielded similar results in predicting both μ0Hc and BHmax. Furthermore, some outliers, which deteriorated the normality of residuals in the prediction of BHmax, were detected and further analyzed. Based on all of our results, we can conclude that our ML approach combined with micromagnetic simulations provides a robust framework for optimal design of microstructures for high-performance NdFeB magnets.

Download Full-text