Evaluation of Machine Learning Techniques for Daily Reference Evapotranspiration Estimation

Author(s):  
Ali Rashid Niaghi ◽  
Oveis Hassanijalilian ◽  
Jalal Shiri

The ASCE-EWRI reference evapotranspiration (ETo) equation is recommended as a standardized method for reference crop ETo estimation. However, the variety of climate data required as input to the standardized ETo method is a limiting factor in most cases and restricts ETo estimation. This paper assessed the potential of different machine learning (ML) models for ETo estimation using limited meteorological data. The ML models used to estimate daily ETo included Gene Expression Programming (GEP), Support Vector Machine (SVM), Multiple Linear Regression (LR), and Random Forest (RF). Three input combinations were considered: daily maximum and minimum temperature (Tmax and Tmin), wind speed (W) with Tmax and Tmin, and solar radiation (Rs) with Tmax and Tmin, using meteorological data from 2003–2016 at six weather stations in the Red River Valley. To understand the performance of the applied models with the various combinations, station-based and year-based tests were assessed with local and spatial approaches. Across the local and spatial analyses, the LR and RF models showed the lowest rate of improvement compared to GEP and SVM. The spatial RF and SVM approaches showed the lowest and highest values of the scatter index, 0.333 and 0.457, respectively. Overall, the radiation-based combination and the RF model showed the best performance, with higher accuracy for all stations either locally or spatially, while the spatial SVM and GEP showed the lowest performance among models and approaches.
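The local and spatial approaches named in the abstract can be read as two ways of splitting station data. The sketch below is one plausible reading, not the authors' procedure: it assumes "local" means holding out one year at a time within a single station, and "spatial" means training on all other stations and testing on the held-out one. The record layout and the function names `local_splits` and `spatial_splits` are hypothetical.

```python
def local_splits(records):
    """Yield (station, test_year, train, test), holding out one year within one station."""
    for station in sorted({r["station"] for r in records}):
        own = [r for r in records if r["station"] == station]
        for year in sorted({r["year"] for r in own}):
            train = [r for r in own if r["year"] != year]
            test = [r for r in own if r["year"] == year]
            yield station, year, train, test


def spatial_splits(records):
    """Yield (station, train, test), training on every station except the target one."""
    for station in sorted({r["station"] for r in records}):
        train = [r for r in records if r["station"] != station]
        test = [r for r in records if r["station"] == station]
        yield station, train, test


# Hypothetical records: three stations, two years each (no real measurements).
records = [{"station": s, "year": y} for s in ("A", "B", "C") for y in (2003, 2004)]
local = list(local_splits(records))
spatial = list(spatial_splits(records))
```

Under this reading, a spatial model never sees data from the station it is tested on, which is what would let it be applied where local records are missing.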

2014 ◽  
Vol 627 ◽  
pp. 97-100 ◽  
Author(s):  
R. Fernandez-Martinez ◽  
R. Hernandez ◽  
J. Ibarretxe ◽  
Pello Jimbert ◽  
M. Iturrondobeitia ◽  
...  

Mastering the relationship between the final mechanical properties of carbon black reinforced rubber blends and their composition is a key advantage for efficient design of the blend composition. In this work, models to predict three relevant physical attributes of rubber blends (modulus at 100% deformation, Shore A hardness, and tensile strength) are built by machine learning methods and subsequently evaluated. Linear regression, artificial neural networks, support vector machines, and regression trees are used to generate the models. The number of samples and the values of the input variables are determined by a Taguchi design of experiments, and the uncertainty of the experimental data is analyzed prior to modeling.


Hydrology ◽  
2021 ◽  
Vol 8 (1) ◽  
pp. 25
Author(s):  
Ali Rashid Niaghi ◽  
Oveis Hassanijalilian ◽  
Jalal Shiri

Evapotranspiration (ET) is widely employed to measure total water loss between land and atmosphere because of its major contribution to the water balance on both regional and global scales. Given the challenges of quantifying nonlinear ET processes, machine learning (ML) techniques have been increasingly used to estimate ET because of their ability to capture complex nonlinear structures and characteristics. However, few studies have been conducted in subhumid climates to simulate local and spatial ETo using common ML methods. The current study aims to present a methodology that dispenses with local data in ETo simulation. The present study, therefore, seeks to estimate and compare reference ET (ETo) using four common ML methods with local and spatial approaches based on continuous 17-year daily climate data from six weather stations across the Red River Valley, which has a subhumid climate. The four ML models were Gene Expression Programming (GEP), Support Vector Machine (SVM), Multiple Linear Regression (LR), and Random Forest (RF), with three input combinations: temperature-based (Tmax, Tmin: maximum and minimum air temperature), mass transfer-based (Tmax, Tmin, U: wind speed), and radiation-based (Rs: solar radiation, Tmax, Tmin) measurements. The estimates yielded by the four ML models were compared against each other under the spatial and local approaches using four statistical indicators: the root mean square error (RMSE), the mean absolute error (MAE), the correlation coefficient (r2), and the scatter index (SI). The comparison between combinations showed the lowest SI and RMSE values for the RF model with the radiation-based combination. Furthermore, the RF model showed the best performance for all combinations among the four defined models, either spatially or locally. In general, the LR, GEP, and SVM models improved when a local approach was used.
The results showed the best performance for the radiation-based combination and the RF model, with higher accuracy for all stations either locally or spatially, while the spatial SVM and GEP showed the lowest performance among the models and approaches.
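The four indicators named in the abstract can be computed as in the minimal sketch below. It assumes the commonly used definitions: SI as RMSE divided by the mean of the observed values, and r2 as the squared Pearson correlation. The abstract itself does not spell these out. The function name `evaluate` and the sample values are hypothetical.

```python
import math


def evaluate(observed, predicted):
    """Return RMSE, MAE, squared Pearson correlation (r2), and scatter index (SI)."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov * cov / (var_o * var_p)  # undefined when either series is constant
    si = rmse / mean_o  # assumed definition: RMSE normalised by the observed mean
    return {"rmse": rmse, "mae": mae, "r2": r2, "si": si}


# Hypothetical daily ETo values in mm/day, made up purely for illustration.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = evaluate(observed, predicted)
```

Because SI is dimensionless under this definition, it allows comparison across stations whose mean ETo differs, which RMSE alone does not.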


2019 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Luciano Cavalcante Siebert ◽  
José Francisco Bianchi Filho ◽  
Eunelson José da Silva Júnior ◽  
Eduardo Kazumi Yamakawa ◽  
Angela Catapan

Purpose: This study aims to support electricity distribution companies in measuring and predicting customer satisfaction. Design/methodology/approach: The developed methodology selects and applies machine learning techniques such as decision trees, support vector machines and ensemble learning to predict customer satisfaction from service data, power outage data and reliability indices. Findings: The predicted main indicator diverged by only 1.36 per cent from the results obtained by the survey of company customers. Research limitations/implications: Social, economic and political conjunctures of the regional and national scenario can influence the indicators beyond the input variables considered in this paper. Practical implications: Currently, the actions taken to increase customer satisfaction are based on the track record of a yearly survey; therefore, the methodology may assist in identifying disturbances in customer satisfaction, enabling timely decision-making to deal with them. Originality/value: Development of an intelligent algorithm that can improve its performance with time. Understanding customer satisfaction may improve companies’ performance.


2020 ◽  
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and trading volume as predictors in the machine learning methods, it can be shown that the prediction accuracy is improved.


Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of disease in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may provide different accuracies, so it is imperative to use the most suitable method, the one that provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree and neural network classifiers on breast cancer and diabetes datasets.


2020 ◽  
Author(s):  
Azhagiya Singam Ettayapuram Ramaprasad ◽  
Phum Tachachartvanich ◽  
Denis Fourches ◽  
Anatoly Soshilov ◽  
Jennifer C.Y. Hsieh ◽  
...  

Perfluoroalkyl and Polyfluoroalkyl Substances (PFASs) pose a substantial threat as endocrine disruptors, and thus early identification of those that may interact with steroid hormone receptors, such as the androgen receptor (AR), is critical. In this study we screened 5,206 PFASs from the CompTox database against the different binding sites on the AR using both molecular docking and machine learning techniques. We developed support vector machine models trained on Tox21 data to classify the active and inactive PFASs for AR using different chemical fingerprints as features. The maximum accuracy was 95.01% and the Matthews correlation coefficient (MCC) was 0.76, both based on MACCS fingerprints (MACCSFP). The combination of docking-based screening and machine learning models identified 29 PFASs that have strong potential for activity against the AR and should be considered priority chemicals for biological toxicity testing.
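Accuracy and MCC are both reported above, and the two can diverge when one class is much rarer than the other. The sketch below computes both from binary confusion-matrix counts using the standard MCC formula. The counts are hypothetical and are not taken from the study; they only illustrate how a high accuracy can coexist with a more moderate MCC.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denom


# Hypothetical imbalanced screen: 14 actives and 186 inactives (made-up counts).
tp, tn, fp, fn = 8, 180, 6, 6
acc = accuracy(tp, tn, fp, fn)
score = mcc(tp, tn, fp, fn)
```

With these made-up counts the accuracy is 0.94 while the MCC is roughly 0.54, because most correct predictions come from the abundant inactive class.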


2020 ◽  
Author(s):  
Nalika Ulapane ◽  
Karthick Thiyagarajan ◽  
Sarath Kodagoda

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.


2020 ◽  
Vol 21 ◽  
Author(s):  
Sukanya Panja ◽  
Sarra Rahem ◽  
Cassandra J. Chu ◽  
Antonina Mitrofanova

Background: In recent years, the availability of high-throughput technologies, the establishment of large molecular patient data repositories, and advances in computing power and storage have allowed elucidation of complex mechanisms implicated in therapeutic response in cancer patients. The breadth and depth of such data, alongside experimental noise and missing values, require a sophisticated human-machine interaction that would allow effective learning from complex data and accurate forecasting of future outcomes, ideally embedded in the core of machine learning design. Objective: In this review, we discuss machine learning techniques utilized for modeling treatment response in cancer, including random forests, support vector machines, neural networks, and linear and logistic regression. We overview their mathematical foundations and discuss their limitations and alternative approaches, all in light of their application to therapeutic response modeling in cancer. Conclusion: We hypothesize that the increase in the number of patient profiles and potential temporal monitoring of patient data will define even more complex techniques, such as deep learning and causal analysis, as central players in therapeutic response modeling.


Author(s):  
Amandeep Kaur ◽  
Sushma Jain ◽  
Shivani Goel ◽  
Gaurav Dhiman

Context: Code smells are symptoms that something may be wrong in a software system and can cause complications in maintaining software quality. Many code smells are described in the literature, and their identification is far from trivial. Thus, several techniques have been proposed to automate code smell detection in order to improve software quality. Objective: This paper presents an up-to-date review of simple and hybrid machine learning-based code smell detection techniques and tools. Methods: We collected all the relevant research published in this field up to 2020. We extracted the data from those articles and classified them into two major categories. In addition, we compared the selected studies on several aspects, such as code smells, machine learning techniques, datasets, programming languages used by datasets, dataset size, evaluation approach, and statistical testing. Results: The majority of empirical studies have proposed machine learning-based code smell detection tools. Support vector machine and decision tree algorithms are frequently used by researchers. In addition, a major proportion of the research is conducted on open source software (OSS) such as Xerces, Gantt Project and ArgoUml. Furthermore, researchers paid more attention to the Feature Envy and Long Method code smells. Conclusion: We identified several areas of open research, such as the need for code smell detection techniques using hybrid approaches and the need for validation employing industrial datasets.


2019 ◽  
Vol 23 (1) ◽  
pp. 12-21 ◽  
Author(s):  
Shikha N. Khera ◽  
Divya

The information technology (IT) industry in India has been facing a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based losses to companies. The aim of this research is to develop a model to predict employee attrition and give organizations opportunities to address any issues and improve retention. A predictive model was developed based on a supervised machine learning algorithm, the support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from the Human Resource databases of three IT companies in India, including employment status (the response variable) at the time of collection. Results from the confusion matrix for the SVM model showed an accuracy of 85 per cent. The results also show that the model performs better at predicting who will leave the firm than at predicting who will not leave.
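An overall accuracy can hide the difference between the two classes that the abstract points to. The sketch below derives overall accuracy and per-class recall from a confusion matrix. The counts are hypothetical, chosen only so that the overall accuracy comes to 85 per cent; they are not the study's data. Reading "performs better in predicting who will leave" as higher recall for leavers is also an assumption.

```python
def accuracy(confusion):
    """Overall accuracy from a nested dict confusion[actual][predicted] -> count."""
    correct = sum(row[label] for label, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total


def per_class_recall(confusion):
    """Share of each actual class that the model labelled correctly."""
    return {label: row[label] / sum(row.values()) for label, row in confusion.items()}


# Hypothetical counts for 100 employees (made up for illustration).
confusion = {
    "left": {"left": 45, "stayed": 5},
    "stayed": {"left": 10, "stayed": 40},
}
overall = accuracy(confusion)
recall = per_class_recall(confusion)
```

In this made-up case the model recovers 90 per cent of leavers but only 80 per cent of stayers, even though the single accuracy figure is 85 per cent.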

