Comparison of Implicit vs. Explicit Regime Identification in Machine Learning Methods for Solar Irradiance Prediction

This work compares the solar power forecasting performance of tree-based methods that include implicit regime-based models with that of explicit regime separation methods that use both unsupervised and supervised machine learning techniques. Previous studies have shown an improvement from using a regime-based machine learning approach in a climate with diverse cloud conditions. This study compares the machine learning approaches for solar power prediction at the Shagaya Renewable Energy Park in Kuwait, which has an arid desert climate characterized by abundant sunshine. The regime-dependent artificial neural network models undergo a comprehensive parameter and hyperparameter tuning analysis to minimize the prediction errors on a test dataset. The final results that compare the different methods are computed on an independent validation dataset. The results show that the tree-based method, the regression model tree approach, performs better than the explicit regime-dependent approach. These results appear to be a function of the predominantly sunny conditions, which limit the ability of an unsupervised technique to separate regimes for which the relationship between the predictors and the predictand would differ for the supervised learning technique.
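
The explicit regime-dependent idea described above (identify regimes with an unsupervised step, then train one supervised model per regime) can be sketched in a few lines. This is only an illustration under stated assumptions: it swaps in one-dimensional k-means and a per-regime least-squares line in place of the study's clustering and artificial neural networks, and the single predictor, the function names, and the data values are all hypothetical.

```python
def nearest(value, centroids):
    """Index of the centroid closest to a scalar value."""
    return min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))


def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on scalars with a deterministic, evenly spread start."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centroids)].append(v)
        # Keep the old centroid if a group ends up empty.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y  # constant predictor: fall back to the mean
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return a, mean_y - a * mean_x


def fit_regime_models(xs, ys, k=2):
    """Unsupervised step (regimes) followed by one supervised model per regime."""
    centroids = kmeans_1d(xs, k)
    models = {}
    for j in range(k):
        idx = [i for i, x in enumerate(xs) if nearest(x, centroids) == j]
        if idx:
            models[j] = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
    return centroids, models


def predict(x, centroids, models):
    """Route the input to its regime, then apply that regime's model."""
    a, b = models[nearest(x, centroids)]
    return a * x + b


# Hypothetical data: a clearness-like predictor and a power-like target
# whose relationship differs between a cloudy and a clear regime.
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [10.0, 20.0, 30.0, 680.0, 770.0, 860.0]
centroids, models = fit_regime_models(xs, ys)
```

When nearly all samples fall in one regime, as in a predominantly sunny climate, the clustering step has little to separate, which is consistent with the explanation the abstract offers for the weaker performance of the explicit approach.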


Machine Learning Approaches to Traffic Accident Analysis and Hotspot Prediction

Computers ◽

10.3390/computers10120157 ◽

2021 ◽

Vol 10 (12) ◽

pp. 157

Author(s):

Daniel Santos ◽

José Saias ◽

Paulo Quaresma ◽

Vítor Beires Nogueira

Keyword(s):

Machine Learning ◽

Predictive Model ◽

Road Accident ◽

Influential Factors ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Economic Losses ◽

Learning Approaches ◽

Road Accidents ◽

Accident Data

Traffic accidents are one of the most important concerns of the world, since they result in numerous casualties, injuries, and fatalities each year, as well as significant economic losses. There are many factors that are responsible for causing road accidents. If these factors can be better understood and predicted, it might be possible to take measures to mitigate the damage and its severity. The purpose of this work is to identify these factors using accident data from 2016 to 2019 from the district of Setúbal, Portugal. This work aims at developing models that can select a set of influential factors that may be used to classify the severity of an accident, supporting an analysis of the accident data. In addition, this study also proposes a predictive model for future road accidents based on past data. Various machine learning approaches are used to create these models. Supervised machine learning methods such as decision trees (DT), random forests (RF), logistic regression (LR), and naive Bayes (NB) are used, as well as unsupervised machine learning techniques including DBSCAN and hierarchical clustering. Results show that a rule-based model using the C5.0 algorithm is capable of accurately detecting the most relevant factors describing the severity of a road accident. Further, the results of the predictive model suggest that the RF model could be a useful tool for forecasting accident hotspots.


Machine Learning Frameworks in Cancer Detection

E3S Web of Conferences ◽

10.1051/e3sconf/202129701073 ◽

2021 ◽

Vol 297 ◽

pp. 01073

Author(s):

Sabyasachi Pramanik ◽

K. Martin Sagayam ◽

Om Prakash Jena

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Cancer Development ◽

Support Vector ◽

Learning Approaches ◽

Learning Techniques ◽

Fact Finding ◽

Risk Of Cancer

Cancer has been described as a diverse illness with several distinct subtypes that may occur simultaneously. As a result, early detection and forecasting of cancer types have become essential in cancer research, since they may help to improve the clinical treatment of cancer survivors. The significance of categorizing cancer sufferers into higher- or lower-threat categories has prompted numerous research groups from the bioscience and genomics fields to investigate the use of machine learning (ML) algorithms in cancer diagnosis and treatment. Because of this, these methods have been used with the goal of simulating the development and treatment of malignant diseases in humans. Furthermore, the capacity of machine learning techniques to identify important characteristics from complicated datasets demonstrates the significance of these technologies. These technologies include Bayesian networks and artificial neural networks, along with a number of other approaches. Decision Trees and Support Vector Machines, which have already been extensively used in cancer research for the creation of predictive models, also lead to accurate decision making. The application of machine learning techniques may undoubtedly enhance our knowledge of cancer development; nevertheless, a sufficient degree of validation is required before these approaches can be considered for use in daily clinical practice. An overview of current machine learning approaches used in the simulation of cancer development is presented in this paper. All of the supervised machine learning approaches described here, along with a variety of input characteristics and data samples, are used to build the prediction models. In light of the increasing trend towards the use of machine learning methods in biomedical research, we present the most recent papers that have used these approaches to predict the risk of cancer or patient outcomes in order to better understand cancer.


Machine Learning Approaches to Retrieve High-Quality, Clinically Relevant, Evidence from the Biomedical Literature: A Systematic Review (Preprint)

10.2196/preprints.30401 ◽

2021 ◽

Author(s):

Wael Abdelkader ◽

Tamara Navarro ◽

Rick Parrish ◽

Chris Cotoi ◽

Federico Germini ◽

...

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Strong Evidence ◽

Clinical Care ◽

Biomedical Literature ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Learning Approaches ◽

High Quality ◽

Applied Machine Learning

BACKGROUND The rapid growth of the biomedical literature makes identifying strong evidence a time-consuming task. Applying machine learning to the process could be a viable solution that limits effort while maintaining accuracy. OBJECTIVE To summarize the nature and comparative performance of machine learning approaches that have been applied to retrieve high-quality evidence for clinical consideration from the biomedical literature. METHODS We conducted a systematic review of studies that applied machine learning techniques to identify high-quality clinical articles in the biomedical literature. Multiple databases were searched to July 2020. Extracted data focused on the applied machine learning model, steps in the development of the models, and model performance. RESULTS From 3918 retrieved studies, 10 met our inclusion criteria. All followed a supervised machine learning approach and applied, from a limited range of options, a high-quality standard for the training of their model. The results show that machine learning can achieve a sensitivity of 95% while maintaining a high precision of 86%. CONCLUSIONS Applying machine learning to distinguish studies with strong evidence for clinical care has the potential to decrease the workload of manually identifying these. The evidence base is active and evolving. Reported methods were variable across the studies but focused on supervised machine learning approaches. Performance may improve by applying more sophisticated approaches such as active learning, auto-machine learning, and unsupervised machine learning approaches.
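
For readers less familiar with the two measures reported above, the following sketch computes sensitivity (recall) and precision from confusion-matrix counts. The counts are hypothetical, chosen only so that the ratios match the reported 95% and 86%; they are not taken from the reviewed studies, and the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Share of truly high-quality articles that the classifier retrieves."""
    if true_pos + false_neg == 0:
        raise ValueError("no positive examples")
    return true_pos / (true_pos + false_neg)


def precision(true_pos, false_pos):
    """Share of retrieved articles that are truly high quality."""
    if true_pos + false_pos == 0:
        raise ValueError("no retrieved examples")
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, for illustration only.
tp, fn, fp = 817, 43, 133
sens = sensitivity(tp, fn)  # 817 / 860
prec = precision(tp, fp)    # 817 / 950
```

High sensitivity limits the number of relevant articles that are missed, while high precision limits the number of irrelevant articles a reviewer still has to screen by hand.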


Machine Learning in KM3NeT

EPJ Web of Conferences ◽

10.1051/epjconf/201920705004 ◽

2019 ◽

Vol 207 ◽

pp. 05004 ◽

Cited By ~ 1

Author(s):

Chiara De Sio

Keyword(s):

Machine Learning ◽

Network Models ◽

High Energy ◽

Particle Identification ◽

Machine Learning Techniques ◽

Neutrino Interaction ◽

Mass Hierarchy ◽

Reconstruction Algorithms ◽

Neural Network Models ◽

Neutrino Mass Hierarchy

The KM3NeT Collaboration is building a network of underwater Cherenkov telescopes at two sites in the Mediterranean Sea, with the main goals of investigating astrophysical sources of high-energy neutrinos (ARCA) and of determining the neutrino mass hierarchy (ORCA). Various Machine Learning techniques, such as Random Forests, BDTs, and Shallow and Deep Networks, are being used for diverse tasks, such as event-type and particle identification, energy/direction estimation, source identification, signal/background discrimination, and data analysis, with sound results as well as promising research paths. The main focus of this work is the application of Convolutional Neural Network models to the tasks of neutrino interaction classification, as well as the estimation of energy and direction of the propagating particles. Their performance is also compared with that of the standard reconstruction algorithms used in the Collaboration.


Revisiting the Contested Role of Natural Resources in Violent Conflict Risk through Machine Learning

Sustainability ◽

10.3390/su12166574 ◽

2020 ◽

Vol 12 (16) ◽

pp. 6574

Author(s):

Marie K. Schellens ◽

Salim Belyazid

Keyword(s):

Machine Learning ◽

Natural Resources ◽

Natural Resource ◽

Network Models ◽

Violent Conflict ◽

Machine Learning Techniques ◽

Economic Conditions ◽

Neural Network Models ◽

Logistic Regression Models ◽

Conflict Risk

The integrated character of the sustainable development goals in Agenda 2030, as well as research in environmental security, flag that sustainable peace requires sustainable and conflict-sensitive natural resource use. The precise relationship between the risk for violent conflict and natural resources remains contested because of the interplay with socio-economic variables. This paper aims to improve the understanding of natural resources’ role in the risk of violent conflicts by accounting for complex interactions with socio-economic conditions. Conflict data was analysed with machine learning techniques, which can account for complex patterns, such as variable interactions. More commonly used logistic regression models are compared with neural network models and random forest models. The results indicate that a country’s natural resource features are important predictors of its risk for violent conflict and that they interact with socio-economic conditions. Based on these empirical results and the existing literature, we interpret that natural resources can be root causes of violent intrastate conflict, and that signals from natural resources leading to conflict risk are reflected in and influenced by interacting socio-economic conditions. More specifically, the results show that variables such as access to water and food security are important predictors of conflict, while resource rents and oil and ore exports are relatively less important than other natural resource variables, contrasting what prior research has suggested. Given the potential of natural resource features to act as an early warning for violent conflict, we argue that natural resources should be included in conflict risk models for conflict prevention.


Identification of early liver toxicity gene biomarkers using comparative supervised machine learning

Scientific Reports ◽

10.1038/s41598-020-76129-8 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Brandi Patrice Smith ◽

Loretta Sue Auvil ◽

Michael Welge ◽

Colleen Bannon Bushell ◽

Rohit Bhargava ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Liver Toxicity ◽

Toxicity Testing ◽

High Specificity ◽

Supervised Machine Learning ◽

Validation Dataset ◽

Learning Approaches ◽

Liver Necrosis ◽

Microarray Quality Control

Screening agrochemicals and pharmaceuticals for potential liver toxicity is required for regulatory approval and is an expensive and time-consuming process. The identification and utilization of early exposure gene signatures and robust predictive models in regulatory toxicity testing has the potential to reduce time and costs substantially. In this study, comparative supervised machine learning approaches were applied to the rat liver TG-GATEs dataset to develop feature selection and predictive testing. We identified ten gene biomarkers using three different feature selection methods that predicted liver necrosis with high specificity and selectivity in an independent validation dataset from the Microarray Quality Control (MAQC)-II study. Nine of the ten genes that were selected with the supervised methods are involved in metabolism and detoxification (Car3, Crat, Cyp39a1, Dcd, Lbp, Scly, Slc23a1, and Tkfc) and transcriptional regulation (Ablim3). Several of these genes are also implicated in liver carcinogenesis, including Crat, Car3 and Slc23a1. Our biomarker gene signature provides high statistical accuracy and a manageable number of genes to study as indicators to potentially accelerate toxicity testing based on their ability to induce liver necrosis and, eventually, liver cancer.


Machine Learning in Agriculture Application: Algorithms and Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f3713.049620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 1140-1146

Keyword(s):

Machine Learning ◽

Deep Learning ◽

High Performance ◽

Network Models ◽

Processing Technique ◽

Machine Learning Techniques ◽

Image Processing Technique ◽

Neural Network Models ◽

Survey Paper ◽

Learning Techniques

Machine learning techniques combined with high performance computing technologies can create various new opportunities in the agriculture domain. This paper presents a comprehensive review of papers that concentrate on machine learning (ML) and deep learning applications in agriculture. The review is organized into three sections: a) yield prediction using machine learning techniques, b) price prediction, and c) leaf disease detection using neural networks. We compare neural network models with existing models. The findings of this survey indicate that deep learning models give high accuracy and outperform traditional image processing techniques, and that ML techniques outperform various traditional techniques in prediction.


Disease Identification in Chilli Leaves using Machine Learning Techniques

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a1061.1291s319 ◽

2019 ◽

Vol 9 (1S3) ◽

pp. 325-329

Keyword(s):

Neural Network ◽

Machine Learning ◽

Detection System ◽

Network Models ◽

Detection Algorithm ◽

Machine Learning Techniques ◽

Neural Network Models ◽

The Past ◽

Learning Techniques ◽

Proposed Model

Crop diseases reduce the yield of the crop or may even kill it. Over the past two years, according to the I.C.A.R, the production of chilies in the state of Goa has fallen drastically due to the presence of a virus. Most of the plants flower very little or stop flowering completely. In the rare cases when a plant manages to flower, the yield is substantially reduced. The proposed model detects the presence of disease in crops by examining the symptoms. The model uses an object detection algorithm together with supervised image recognition and feature extraction using a convolutional neural network to classify crops as infected or healthy. The Google machine learning libraries TensorFlow and Keras are used to build the neural network models. An Android application is developed around the model to make the disease detection system easy to use.


Prediction of drug synergy in cancer using ensemble-based machine learning techniques

Modern Physics Letters B ◽

10.1142/s0217984918501324 ◽

2018 ◽

Vol 32 (11) ◽

pp. 1850132 ◽

Cited By ~ 9

Author(s):

Harpreet Singh ◽

Prashant Singh Rana ◽

Urvinder Singh

Keyword(s):

Machine Learning ◽

Fuzzy Inference System ◽

Fuzzy Inference ◽

Machine Learning Techniques ◽

Support Vector ◽

Prediction Errors ◽

Learning Approaches ◽

Inference System ◽

Drug Synergy ◽

Learning Techniques

Drug synergy prediction plays a significant role in the medical field for inhibiting specific cancer agents. It can be developed as a pre-processing tool for therapeutic success. Different drug–drug interactions can be examined through the drug synergy score. This requires efficient regression-based machine learning approaches that minimize the prediction errors. Numerous machine learning techniques, such as neural networks, support vector machines, random forests, LASSO, and Elastic Nets, have been used in the past to meet this requirement. However, these techniques individually do not provide significant accuracy in predicting the drug synergy score. Therefore, the primary objective of this paper is to design a neuro-fuzzy-based ensembling approach. To achieve this, nine well-known machine learning techniques have been implemented on the drug synergy data. Based on the accuracy of each model, four techniques with high accuracy are selected to develop an ensemble-based machine learning model. These models are Random forest, Fuzzy Rules Using Genetic Cooperative-Competitive Learning method (GFS.GCCL), Adaptive-Network-Based Fuzzy Inference System (ANFIS) and Dynamic Evolving Neural-Fuzzy Inference System method (DENFIS). Ensembling is achieved by evaluating the biased weighted aggregation (i.e. adding more weight to the model with a higher prediction score) of the data predicted by the selected models. The proposed and existing machine learning techniques have been evaluated on drug synergy score data. The comparative analysis reveals that the proposed method outperforms the others in terms of accuracy, root mean square error and coefficient of correlation.
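
The biased weighted aggregation mentioned above can be illustrated with a short sketch. It assumes that each model's weight is its prediction score normalized so that the weights sum to one; the abstract does not state the exact weighting formula, and the function name, scores, and predictions below are hypothetical.

```python
def weighted_ensemble(predictions, scores):
    """Combine per-model predictions with weights proportional to each score.

    predictions: one list of predicted values per model, all the same length
    scores: one non-negative score per model; a higher score gets more weight
    """
    if not predictions or len(predictions) != len(scores):
        raise ValueError("need exactly one score per model")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    weights = [s / total for s in scores]
    # Weighted average of the models' predictions, sample by sample.
    return [sum(w * p[i] for w, p in zip(weights, predictions)) for i in range(n)]


# Hypothetical synergy-score predictions from three models for two drug pairs.
preds = [[10.0, 20.0], [14.0, 24.0], [12.0, 16.0]]
scores = [0.5, 0.25, 0.25]
combined = weighted_ensemble(preds, scores)
```

With equal scores this reduces to a plain average; unequal scores pull the combined prediction toward the better-scoring model.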


The Effectiveness of Feature Selection Method in Solar Power Prediction

Journal of Renewable Energy ◽

10.1155/2013/952613 ◽

2013 ◽

Vol 2013 ◽

pp. 1-9 ◽

Cited By ~ 4

Author(s):

Md Rahat Hossain ◽

Amanullah Maung Than Oo ◽

A. B. M. Shawkat Ali

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Prediction Accuracy ◽

Solar Power ◽

Feature Subset Selection ◽

Machine Learning Techniques ◽

Support Vector ◽

Selection Methods ◽

Power Prediction ◽

Learning Techniques

This paper empirically shows that applying selected feature subsets to machine learning techniques significantly improves the accuracy of solar power prediction. Experiments are performed using five well-known wrapper feature selection methods to obtain the solar power prediction accuracy of machine learning techniques with selected feature subsets. For all the experiments, the machine learning techniques least median square (LMS), multilayer perceptron (MLP), and support vector machine (SVM) are used. Afterwards, these results are compared with the solar power prediction accuracy of the same machine learning techniques (i.e., LMS, MLP, and SVM) without applying feature selection methods (WAFS). Experiments are carried out using reliable, real-life historical meteorological data. The comparison between the results clearly shows that LMS, MLP, and SVM provide better prediction accuracy (i.e., reduced MAE and MASE) with selected feature subsets than without them. The experimental results support the conclusion that devoting more attention and effort to feature subset selection (e.g., the effect of selected feature subsets on prediction accuracy, which is investigated in this paper) can significantly improve the accuracy of solar power prediction.
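
As a point of reference for the two error measures named above, the following is a minimal sketch of MAE and MASE. It assumes the common definition of MASE, which scales the forecast MAE by the in-sample MAE of a one-step naive forecast; the abstract does not give the paper's exact definition, and the function names and sample values are hypothetical.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mase(actual, predicted, training):
    """Forecast MAE divided by the in-sample MAE of a one-step naive forecast.

    This scaling is an assumed (commonly used) definition of MASE.
    """
    if len(training) < 2:
        raise ValueError("need at least two training points")
    naive_mae = sum(
        abs(training[i] - training[i - 1]) for i in range(1, len(training))
    ) / (len(training) - 1)
    if naive_mae == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")
    return mae(actual, predicted) / naive_mae


# Hypothetical values, for illustration only.
training = [2.0, 4.0, 3.0, 5.0]
actual = [6.0, 4.0]
predicted = [5.0, 5.0]
error_mae = mae(actual, predicted)
error_mase = mase(actual, predicted, training)
```

Under this definition, a MASE below 1 means the model beats the naive forecast on average, which makes the measure comparable across datasets with different scales.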
