scholarly journals Machine Learning Classification and Accuracy Assessment from High-Resolution Images of Coastal Wetlands

2021 ◽  
Vol 13 (18) ◽  
pp. 3669
Author(s):  
Ricardo Martínez Prentice ◽  
Miguel Villoslada Peciña ◽  
Raymond D. Ward ◽  
Thaisa F. Bergamo ◽  
Chris B. Joyce ◽  
...  

High-resolution images obtained by multispectral cameras mounted on Unmanned Aerial Vehicles (UAVs) are helping to capture the heterogeneity of the environment in images that can be discretized in categories during a classification process. Currently, there is an increasing use of supervised machine learning (ML) classifiers to retrieve accurate results using scarce datasets with samples with non-linear relationships. We compared the accuracies of two ML classifiers using a pixel and object analysis approach in six coastal wetland sites. The results show that the Random Forest (RF) performs better than K-Nearest Neighbors (KNN) algorithm in the classification of pixels and objects and the classification based on pixel analysis is slightly better than the object-based analysis. The agreement between the classifications of objects and pixels is higher in Random Forest. This is likely due to the heterogeneity of the study areas, where pixel-based classifications are most appropriate. In addition, from an ecological perspective, as these wetlands are heterogeneous, the pixel-based classification reflects a more realistic interpretation of plant community distribution.

2020 ◽  
Vol 14 ◽  

Breast Cancer (BC) is amongst the most common and leading causes of deaths in women throughout the world. Recently, classification and data analysis tools are being widely used in the medical field for diagnosis, prognosis and decision making to help lower down the risks of people dying or suffering from diseases. Advanced machine learning methods have proven to give hope for patients as this has helped the doctors in early detection of diseases like Breast Cancer that can be fatal, in support with providing accurate outcomes. However, the results highly depend on the techniques used for feature selection and classification which will produce a strong machine learning model. In this paper, a performance comparison is conducted using four classifiers which are Multilayer Perceptron (MLP), Support Vector Machine (SVM), K-Nearest Neighbors (KNN) and Random Forest on the Wisconsin Breast Cancer dataset to spot the most effective predictors. The main goal is to apply best machine learning classification methods to predict the Breast Cancer as benign or malignant using terms such as accuracy, f-measure, precision and recall. Experimental results show that Random forest is proven to achieve the highest accuracy of 99.26% on this dataset and features, while SVM and KNN show 97.78% and 97.04% accuracy respectively. MLP shows the least accuracy of 94.07%. All the experiments are conducted using RStudio as the data mining tool platform.


2020 ◽  
Vol 7 (1) ◽  
pp. 28-32
Author(s):  
Andre Rusli ◽  
Alethea Suryadibrata ◽  
Samiaji Bintang Nusantara ◽  
Julio Christian Young

The advancement of machine learning and natural language processing techniques hold essential opportunities to improve the existing software engineering activities, including the requirements engineering activity. Instead of manually reading all submitted user feedback to understand the evolving requirements of their product, developers could use the help of an automatic text classification program to reduce the required effort. Many supervised machine learning approaches have already been used in many fields of text classification and show promising results in terms of performance. This paper aims to implement NLP techniques for the basic text preprocessing, which then are followed by traditional (non-deep learning) machine learning classification algorithms, which are the Logistics Regression, Decision Tree, Multinomial Naïve Bayes, K-Nearest Neighbors, Linear SVC, and Random Forest classifier. Finally, the performance of each algorithm to classify the feedback in our dataset into several categories is evaluated using three F1 Score metrics, the macro-, micro-, and weighted-average F1 Score. Results show that generally, Logistics Regression is the most suitable classifier in most cases, followed by Linear SVC. However, the performance gap is not large, and with different configurations and requirements, other classifiers could perform equally or even better.


Author(s):  
Ye Lv ◽  
Guofeng Wang ◽  
Xiangyun Hu

At present, remote sensing technology is the best weapon to get information from the earth surface, and it is very useful in geo- information updating and related applications. Extracting road from remote sensing images is one of the biggest demand of rapid city development, therefore, it becomes a hot issue. Roads in high-resolution images are more complex, patterns of roads vary a lot, which becomes obstacles for road extraction. In this paper, a machine learning based strategy is presented. The strategy overall uses the geometry features, radiation features, topology features and texture features. In high resolution remote sensing images, the images cover a great scale of landscape, thus, the speed of extracting roads is slow. So, roads’ ROIs are firstly detected by using Houghline detection and buffering method to narrow down the detecting area. As roads in high resolution images are normally in ribbon shape, mean-shift and watershed segmentation methods are used to extract road segments. Then, Real Adaboost supervised machine learning algorithm is used to pick out segments that contain roads’ pattern. At last, geometric shape analysis and morphology methods are used to prune and restore the whole roads’ area and to detect the centerline of roads.


2021 ◽  
Vol 1 (2) ◽  
pp. 106-118
Author(s):  
Bahzad Taha Chicho ◽  
Adnan Mohsin Abdulazeez ◽  
Diyar Qader Zeebaree ◽  
Dilovan Assad Zebari

Classification is the most widely applied machine learning problem today, with implementations in face recognition, flower classification, clustering, and other fields. The goal of this paper is to organize and identify a set of data objects. The study employs K-nearest neighbors, decision tree (j48), and random forest algorithms, and then compares their performance using the IRIS dataset. The results of the comparison analysis showed that the K-nearest neighbors outperformed the other classifiers. Also, the random forest classifier worked better than the decision tree (j48). Finally, the best result obtained by this study is 100% and there is no error rate for the classifier that was obtained.


Author(s):  
Shashank Iyer ◽  
Chirag Variawa

Supervised Machine Learning classification algorithms are used to analyze the potential inclination of the undecided/undeclared first-year engineering students.  The data exploration task is possible by building a dataset that comprises of questions based on significant attributes.  These attributes hover around different disciplines of engineering being offered at the University of Toronto. This qualitative survey is distributed to upperclassmen students (3rd, 4th year and graduate students, N = 54) and undecided first-year engineering students (N = 29) Multi-class classification is a technique that is used to categorize the data into two or more classes, in this case, the different disciplines of engineering at University of Toronto. The dataset that is built, based on the answers provided by the upperclassmen, is programmed into different classification algorithms such as Logistic Regression, KNN (K-nearest neighbors), Decision Tree and Random Forest classifier. The algorithms are compared so as to identify the most appropriate one that can determine the specific class label of the upperclassmen based on the answers provided in the qualitative survey.  The accuracy of the various algorithms is an indicator of the favorable algorithm that can serve as a tool to suggest the potential majors that could be pursued by the undecided/undeclared students. Moreover, the answers given by the upperclassmen is visually analyzed for identifying the patterns of inclination of the students belonging to different disciplines of engineering.  


Author(s):  
Ye Lv ◽  
Guofeng Wang ◽  
Xiangyun Hu

At present, remote sensing technology is the best weapon to get information from the earth surface, and it is very useful in geo- information updating and related applications. Extracting road from remote sensing images is one of the biggest demand of rapid city development, therefore, it becomes a hot issue. Roads in high-resolution images are more complex, patterns of roads vary a lot, which becomes obstacles for road extraction. In this paper, a machine learning based strategy is presented. The strategy overall uses the geometry features, radiation features, topology features and texture features. In high resolution remote sensing images, the images cover a great scale of landscape, thus, the speed of extracting roads is slow. So, roads’ ROIs are firstly detected by using Houghline detection and buffering method to narrow down the detecting area. As roads in high resolution images are normally in ribbon shape, mean-shift and watershed segmentation methods are used to extract road segments. Then, Real Adaboost supervised machine learning algorithm is used to pick out segments that contain roads’ pattern. At last, geometric shape analysis and morphology methods are used to prune and restore the whole roads’ area and to detect the centerline of roads.


2020 ◽  
Vol 14 (2) ◽  
pp. 140-159
Author(s):  
Anthony-Paul Cooper ◽  
Emmanuel Awuni Kolog ◽  
Erkki Sutinen

This article builds on previous research around the exploration of the content of church-related tweets. It does so by exploring whether the qualitative thematic coding of such tweets can, in part, be automated by the use of machine learning. It compares three supervised machine learning algorithms to understand how useful each algorithm is at a classification task, based on a dataset of human-coded church-related tweets. The study finds that one such algorithm, Naïve-Bayes, performs better than the other algorithms considered, returning Precision, Recall and F-measure values which each exceed an acceptable threshold of 70%. This has far-reaching consequences at a time where the high volume of social media data, in this case, Twitter data, means that the resource-intensity of manual coding approaches can act as a barrier to understanding how the online community interacts with, and talks about, church. The findings presented in this article offer a way forward for scholars of digital theology to better understand the content of online church discourse.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sunhae Kim ◽  
Hye-Kyung Lee ◽  
Kounseok Lee

AbstractMinnesota Multiphasic Personality Inventory-2 (MMPI-2) is a widely used tool for early detection of psychological maladjustment and assessing the level of adaptation for a large group in clinical settings, schools, and corporations. This study aims to evaluate the utility of MMPI-2 in assessing suicidal risk using the results of MMPI-2 and suicidal risk evaluation. A total of 7,824 datasets collected from college students were analyzed. The MMPI-2-Resturcutred Clinical Scales (MMPI-2-RF) and the response results for each question of the Mini International Neuropsychiatric Interview (MINI) suicidality module were used. For statistical analysis, random forest and K-Nearest Neighbors (KNN) techniques were used with suicidal ideation and suicide attempt as dependent variables and 50 MMPI-2 scale scores as predictors. On applying the random forest method to suicidal ideation and suicidal attempts, the accuracy was 92.9% and 95%, respectively, and the Area Under the Curves (AUCs) were 0.844 and 0.851, respectively. When the KNN method was applied, the accuracy was 91.6% and 94.7%, respectively, and the AUCs were 0.722 and 0.639, respectively. The study confirmed that machine learning using MMPI-2 for a large group provides reliable accuracy in classifying and predicting the subject's suicidal ideation and past suicidal attempts.


Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1578
Author(s):  
Daniel Szostak ◽  
Adam Włodarczyk ◽  
Krzysztof Walkowiak

Rapid growth of network traffic causes the need for the development of new network technologies. Artificial intelligence provides suitable tools to improve currently used network optimization methods. In this paper, we propose a procedure for network traffic prediction. Based on optical networks’ (and other network technologies) characteristics, we focus on the prediction of fixed bitrate levels called traffic levels. We develop and evaluate two approaches based on different supervised machine learning (ML) methods—classification and regression. We examine four different ML models with various selected features. The tested datasets are based on real traffic patterns provided by the Seattle Internet Exchange Point (SIX). Obtained results are analyzed using a new quality metric, which allows researchers to find the best forecasting algorithm in terms of network resources usage and operational costs. Our research shows that regression provides better results than classification in case of all analyzed datasets. Additionally, the final choice of the most appropriate ML algorithm and model should depend on the network operator expectations.


Sign in / Sign up

Export Citation Format

Share Document