A Comparison of the Performance of Supervised Learning Algorithms for Solar Power Prediction

Energies ◽  
2021 ◽  
Vol 14 (15) ◽  
pp. 4424
Author(s):  
Leidy Gutiérrez ◽  
Julian Patiño ◽  
Eduardo Duque-Grisales

Science seeks strategies to mitigate global warming and reduce the negative impacts of the long-term use of fossil fuels for power generation. Implementing and promoting renewable energy in different ways is therefore one of the most effective solutions. Inaccurate prediction of power generation from photovoltaic (PV) systems is a significant concern for the planning and operational stages of interconnected electric networks and for the promotion of large-scale PV installations. This study proposes the use of Machine Learning techniques to model the photovoltaic power production of a system in Medellín, Colombia. Four forecasting models were generated using Machine Learning and Artificial Intelligence methods: K-Nearest Neighbors (KNN), Linear Regression (LR), Artificial Neural Networks (ANN) and Support Vector Machines (SVM). The results indicate that all four methods produced adequate estimations of photovoltaic energy generation. However, the best estimate according to RMSE and MAE came from the ANN forecasting model. The proposed Machine Learning-based models were shown to be practical and effective solutions for forecasting PV power generation in Medellín.
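The abstract ranks the four models by RMSE and MAE without showing how those metrics are computed. As a minimal sketch, assuming hypothetical measured values and forecasts (the numbers and the names `model_a` and `model_b` are illustrative and do not come from the study), the two error measures could be computed like this:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large individual errors more heavily."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: the average size of the errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical PV power readings (kW) and forecasts from two made-up models.
measured = [0.0, 1.0, 2.0, 3.0]
forecasts = {
    "model_a": [0.0, 1.0, 2.0, 5.0],  # one large miss
    "model_b": [1.0, 2.0, 3.0, 4.0],  # small miss on every sample
}

for name, predicted in forecasts.items():
    print(name, "RMSE:", rmse(measured, predicted), "MAE:", mae(measured, predicted))
```

In this toy data the two models tie on RMSE but differ on MAE, which is one reason a comparison may report both metrics rather than either one alone.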

Algorithms ◽  
2020 ◽  
Vol 13 (12) ◽  
pp. 334
Author(s):  
Joshua Vendrow ◽  
Jamie Haddock ◽  
Deanna Needell ◽  
Lorraine Johnson

Lyme disease is a rapidly growing illness that remains poorly understood within the medical community. Critical questions about when and why patients respond to treatment or stay ill, what kinds of treatments are effective, and even how to properly diagnose the disease remain largely unanswered. We investigate these questions by applying machine learning techniques to a large scale Lyme disease patient registry, MyLymeData, developed by the nonprofit LymeDisease.org. We apply various machine learning methods in order to measure the effect of individual features in predicting participants’ answers to the Global Rating of Change (GROC) survey questions that assess the self-reported degree to which their condition improved, worsened, or remained unchanged following antibiotic treatment. We use basic linear regression, support vector machines, neural networks, entropy-based decision tree models, and k-nearest neighbors approaches. We first analyze the general performance of the model and then identify the most important features for predicting participant answers to GROC. After we identify the “key” features, we separate them from the dataset and demonstrate the effectiveness of these features at identifying GROC. In doing so, we highlight possible directions for future study both mathematically and clinically.


2018 ◽  
Vol 7 (2.8) ◽  
pp. 684 ◽  
Author(s):  
V V. Ramalingam ◽  
Ayantan Dandapath ◽  
M Karthik Raja

Heart-related diseases, or Cardiovascular Diseases (CVDs), have been the main cause of a huge number of deaths worldwide over the last few decades and have emerged as the most life-threatening diseases, not only in India but in the whole world. There is therefore a need for a reliable, accurate and feasible system to diagnose such diseases in time for proper treatment. Machine Learning algorithms and techniques have been applied to various medical datasets to automate the analysis of large and complex data. In recent times, many researchers have used machine learning techniques to help the health care industry and its professionals diagnose heart-related diseases. This paper presents a survey of various models based on such algorithms and techniques and analyzes their performance. Models based on supervised learning algorithms such as Support Vector Machines (SVM), K-Nearest Neighbour (KNN), Naïve Bayes, Decision Trees (DT), Random Forest (RF) and ensemble models are found to be very popular among researchers.


Advancement in medical science has always been one of the most vital concerns of the human race. As technology progresses, modern techniques and equipment are increasingly applied to treatment. Machine learning techniques are now widely used in medical science to improve accuracy. In this work, we constructed computational models for the accurate prediction of liver disease. We used several efficient classification algorithms to predict liver diseases: Random Forest, Perceptron, Decision Tree, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). Our work provides an implementation of hybrid model construction and a comparative analysis aimed at improving prediction performance. First, the classification algorithms are applied to the original liver patient datasets collected from the UCI repository. We then analyzed and adjusted the features to improve the performance of our predictor and carried out a comparative analysis among the classifiers. We found that the KNN algorithm with feature selection outperformed all other techniques.


2020 ◽  
Vol 493 (3) ◽  
pp. 3429-3441
Author(s):  
Paulo A A Lopes ◽  
André L B Ribeiro

We introduce a new method to determine galaxy cluster membership based solely on photometric properties. We adopt a machine learning approach to recover a cluster membership probability from galaxy photometric parameters and finally derive a membership classification. After testing several machine learning techniques (such as stochastic gradient boosting, model averaged neural network and k-nearest neighbours), we found the support vector machine algorithm to perform better when applied to our data. Our training and validation data are from the Sloan Digital Sky Survey main sample. Hence, to be complete to $M_r^* + 3$, we limit our work to 30 clusters with $z_{\text{phot-cl}} \le 0.045$. Masses ($M_{200}$) are larger than $\sim 0.6\times 10^{14}\,\mathrm{M}_{\odot}$ (most above $3\times 10^{14}\,\mathrm{M}_{\odot}$). Our results are derived taking into account all galaxies in the line of sight of each cluster, with no photometric redshift cuts or background corrections. Our method is non-parametric, making no assumptions on the number density or luminosity profiles of galaxies in clusters. Our approach delivers extremely accurate results (completeness, $C \sim 92$ per cent and purity, $P \sim 87$ per cent) within $R_{200}$, so that we named our code reliable photometric membership. We discuss possible dependencies on magnitude, colour, and cluster mass. Finally, we present some applications of our method, stressing its impact on galaxy evolution and cosmological studies based on future large-scale surveys, such as eROSITA, EUCLID, and LSST.


2019 ◽  
Vol 18 (27) ◽  
pp. 2347-2354 ◽  
Author(s):  
Juan Alberto Castillo-Garit ◽  
Naivi Flores-Balmaseda ◽  
Orlando Álvarez ◽  
Hai Pham-The ◽  
Virginia Pérez-Doñate ◽  
...  

Leishmaniasis is a poverty-related disease endemic in 98 countries worldwide, with morbidity and mortality increasing daily. All currently used first-line and second-line drugs for the treatment of leishmaniasis exhibit several drawbacks, including toxicity, high costs and route of administration. Consequently, the development of new treatments for leishmaniasis is a priority in the field of neglected tropical diseases. The aim of this work is to develop computational models that allow the identification of new chemical compounds with potential anti-leishmanial activity. A data set of 116 organic chemicals, assayed against promastigotes of Leishmania amazonensis, is used to develop the theoretical models. The cutoff value for considering a compound active was IC50 ≤ 1.5 μM. For this study, we employed Dragon software to calculate the molecular descriptors and WEKA to obtain machine learning (ML) models. All ML models showed accuracy values between 82% and 91% for the training set. The models developed with k-nearest neighbors and classification trees showed sensitivity values of 97% and 100%, respectively, while the models developed with artificial neural networks and support vector machine showed specificity values of 94% and 92%, respectively. To validate our models, an external test set was evaluated, with good behavior for all models. A virtual screening was performed and 156 compounds were identified as potential anti-leishmanial agents by all the ML models. This investigation highlights the merits of ML-based techniques as an alternative to more traditional methods for finding new chemical compounds with anti-leishmanial activity.
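The abstract labels compounds as active when IC50 ≤ 1.5 μM and reports sensitivity and specificity for each model. A minimal sketch of that labeling and of the two metrics follows. Only the 1.5 μM cutoff comes from the abstract. The IC50 values, the predictions, and the function names are hypothetical and chosen purely for illustration.

```python
ACTIVE_CUTOFF_UM = 1.5  # cutoff stated in the abstract: active if IC50 <= 1.5 uM


def is_active(ic50_um):
    """Label a compound as active from its IC50 value in micromolar."""
    return ic50_um <= ACTIVE_CUTOFF_UM


def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for boolean active/inactive labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one active and one inactive compound")
    sensitivity = tp / (tp + fn)  # fraction of truly active compounds found
    specificity = tn / (tn + fp)  # fraction of truly inactive compounds rejected
    return sensitivity, specificity


# Hypothetical IC50 measurements (uM) and hypothetical model predictions.
ic50_values = [0.5, 1.5, 2.0, 10.0, 0.9]
truth = [is_active(v) for v in ic50_values]
predicted = [True, False, False, True, True]

sens, spec = sensitivity_specificity(truth, predicted)
print("sensitivity:", sens, "specificity:", spec)
```

A model tuned for high sensitivity misses few active compounds, while one tuned for high specificity raises few false alarms, which is why the abstract reports the two measures for different model families.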


2021 ◽  
pp. 1-29
Author(s):  
Ahmed Alsaihati ◽  
Mahmoud Abughaban ◽  
Salaheldin Elkatatny ◽  
Abdulazeez Abdulraheem

Fluid loss into formations is a common operational issue that is frequently encountered when drilling across naturally fractured or induced-fracture formations. It can pose significant operational risks, such as well-control problems, stuck pipe, and wellbore instability, which, in turn, increase well time and cost. This research aims to use and evaluate different machine learning techniques, namely support vector machines, random forests, and K-nearest neighbors, in detecting loss circulation occurrences while drilling using solely drilling surface parameters. Actual field data from seven wells, which had suffered partial or severe loss circulation, were used to build predictive models, while Well-8 was used to compare the performance of the developed models. Recall, precision, and F1-score measures were used to evaluate the ability of the developed models to detect loss circulation occurrences. The results showed that the K-nearest neighbors classifier achieved a high F1-score of 0.912 in detecting loss circulation occurrences in the testing set, while the random forests classifier was second best with almost the same F1-score of 0.910. The support vector machines achieved an F1-score of 0.83 in predicting loss circulation occurrences in the testing set. The K-nearest neighbors classifier outperformed the other models in detecting loss circulation occurrences in Well-8, with an F1-score of 0.80. The main contribution of this research compared to previous studies is that it identifies loss events based on real-time measurements of the active pit volume.
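The classifiers above are compared by precision, recall, and F1-score. As a minimal sketch of how these measures relate, assuming hypothetical confusion counts (the counts below are invented and are not taken from the wells in the study):

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts of detected loss circulation events.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it stays low unless both precision and recall are high, which makes it a reasonable single score when loss events are rare compared with normal drilling samples.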


2018 ◽  
Vol 10 (8) ◽  
pp. 1253 ◽  
Author(s):  
Hironori Matsumoto ◽  
Adam Young

Cobbles (64–256 mm) are found on beaches throughout the world, influence beach morphology, and can provide shoreline stability. Detailed, frequent, and spatially large-scale quantitative cobble observations at beaches are vital toward a better understanding of sand-cobble beach systems. This study used a truck-mounted mobile terrestrial LiDAR system and a raster-based classification approach to map cobbles automatically. Rasters of LiDAR intensity, intensity deviation, topographic roughness, and slope were utilized for cobble classification. Four machine learning techniques including maximum likelihood, decision tree, support vector machine, and k-nearest neighbors were tested on five raster resolutions ranging from 5–50 cm. The cobble mapping capability varied depending on pixel size, classification technique, surface cobble density, and beach setting. The best performer was a maximum likelihood classification using 20 cm raster resolution. Compared to manual mapping at 15 control sites (size ranging from a few to several hundred square meters), automated mapping errors were <12% (best fit line). This method mapped the spatial location of dense cobble regions more accurately compared to sparse and moderate density cobble areas. The method was applied to a ~40 km section of coast in southern California, and successfully generated temporal and spatial cobble distributions consistent with previous observations.


2019 ◽  
Author(s):  
Lucas Carvalho ◽  
Maycon Silva ◽  
Edimilson Santos ◽  
Daniel Guidoni

Problems related to traffic congestion and management have become common in many cities. Vehicle re-routing methods have therefore been proposed to minimize congestion. Some of these methods apply machine learning techniques, more specifically classifiers, to verify road conditions and detect congestion. However, better results may be obtained by applying a classifier better suited to the domain. This paper therefore presents an evaluation of different classifiers applied to identifying the level of road congestion. Our main goal is to analyze the characteristics of each classifier in this task. The classifiers involved in the experiments are: Multilayer Perceptron (MLP) neural network, K-Nearest Neighbors (KNN), Decision Trees (J48), Support Vector Machines (SVM), Naive Bayes and Tree Augmented Naive Bayes.


Author(s):  
Lorenzo Cesaretti ◽  
Laura Screpanti ◽  
David Scaradozzi ◽  
Eleni Mangina

This paper presents the preliminary results of using machine learning techniques to analyze educational robotics activities. An experiment was conducted with 197 secondary school students in Italy: the authors updated Lego Mindstorms EV3 programming blocks to record log files with coding sequences students had designed in teams. The activities were part of a preliminary robotics exercise. We used four machine learning techniques, namely logistic regression, support-vector machine (SVM), K-nearest neighbors and random forests, to predict the students’ performance, comparing a supervised approach (using twelve indicators extracted from the log files as input for the algorithms) and a mixed approach (applying a k-means algorithm to calculate the machine learning features). The results showed that the mixed approach with SVM outperformed the other techniques, and that three predominant learning styles emerged from the data mining analysis.


2019 ◽  
Vol 11 (7) ◽  
pp. 819 ◽  
Author(s):  
Julia Marrs ◽  
Wenge Ni-Meister

The use of light detection and ranging (LiDAR) techniques for recording and analyzing tree and forest structural variables shows strong promise for improving established hyperspectral-based tree species classifications; however, previous multi-sensor projects were often limited by error resulting from seasonal or flight path differences. The National Aeronautics and Space Administration (NASA) Goddard’s LiDAR, hyperspectral, and thermal imager (G-LiHT) is now providing co-registered data on experimental forests in the United States, which are associated with established ground truths from existing forest plots. Free, user-friendly machine learning applications like the Orange Data Mining Extension for Python have recently simplified the process of combining datasets, handling variable redundancy and noise, and reducing dimensionality in remotely sensed datasets. Neural networks, CN2 rules, and support vector machine methods are used here to achieve a final classification accuracy of 67% for dominant tree species in experimental plots of Howland Experimental Forest, a mixed coniferous–deciduous forest with ten dominant tree species, and 59% for plots in Penobscot Experimental Forest, a mixed coniferous–deciduous forest with 15 dominant tree species. These accuracies are higher than those produced using LiDAR or hyperspectral datasets separately, suggesting that combined spectral and structural data have a greater richness of complementary information than either dataset alone. Using greatly simplified datasets created by our dimensionality reduction methodology, machine learner performance remains comparable to or higher than that using the full dataset. Across forests, the identification of shared structural and spectral variables suggests that this methodology can successfully identify parameters with high explanatory power for differentiating among tree species, and opens the possibility of addressing large-scale forestry questions using optimized remote sensing workflows.

