Evapotranspiration Estimation Using Support Vector Machines and Hargreaves-Samani Equation for St. Johns, FL, USA

Author(s):  
Fatih Ünes ◽  
Yunus Ziya Kaya ◽  
Mustafa Mamak ◽  
Mustafa Demirci

Evapotranspiration (ET) is an important part of the hydrological cycle, yet methods for calculating it remain unclear. Many parameters affect ET directly or indirectly, such as Solar Radiation (SR) and Air Temperature (AT). In this study, the authors focus on modelling ET using the Support Vector Machines (SVM) method because of its ability to solve nonlinear problems. The SVM is trained on 1158 daily records of AT, SR, Wind Speed (U) and Relative Humidity (RH), and the model is tested on 385 daily records. The data set is taken from the St. Johns, Florida, USA weather station. To assess the ability of SVM to predict ET relative to the Hargreaves-Samani formula, the test set is also applied to this empirical equation. The determination coefficient between SVM predictions and observed daily ET is 0.913, and that between the Hargreaves-Samani formula and observed daily ET is 0.910. The two methods are compared using Mean Square Error (MSE), Mean Absolute Error (MAE) and determination coefficient statistics. The results indicate that the SVM method is more reliable than the Hargreaves-Samani formula for daily ET prediction.
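As a rough illustration of the empirical baseline and the comparison statistics, the sketch below assumes the commonly cited form of the Hargreaves-Samani equation, ET0 = 0.0023 · Ra · (Tmean + 17.8) · sqrt(Tmax − Tmin), with temperatures in °C and the extraterrestrial radiation Ra expressed in mm/day of equivalent evaporation. The function names are hypothetical, the determination coefficient is assumed to be the squared Pearson correlation, and none of this is taken from the authors' implementation.

```python
import math

def hargreaves_samani(t_max, t_min, ra):
    """Reference ET (mm/day), assuming the commonly cited Hargreaves-Samani form.

    t_max, t_min: daily maximum/minimum air temperature in deg C.
    ra: extraterrestrial radiation in mm/day of equivalent evaporation.
    """
    t_mean = (t_max + t_min) / 2.0
    # Guard against t_max < t_min in noisy records.
    t_range = max(t_max - t_min, 0.0)
    return 0.0023 * ra * (t_mean + 17.8) * math.sqrt(t_range)

def mse(obs, pred):
    """Mean Square Error."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)

def mae(obs, pred):
    """Mean Absolute Error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Determination coefficient, assumed here to be the squared Pearson correlation."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)
```

The same three statistics can be applied to the SVM predictions and to the Hargreaves-Samani estimates on the test set, so that both methods are scored on an equal footing.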

Author(s):  
Mohammad Reza Daliri

In this article, we propose a feature selection strategy using a binary particle swarm optimization algorithm for the diagnosis of different medical diseases. Support vector machines were used as the fitness function of the binary particle swarm optimization. We evaluated the proposed method on four databases from the machine learning repository: the single photon emission computed tomography (SPECT) heart database, the Wisconsin breast cancer data set, the Pima Indians diabetes database, and the Dermatology data set. The results indicate that, with fewer selected features, we obtained a higher accuracy in diagnosing heart, cancer, diabetes, and erythematosquamous diseases. The results were compared with traditional feature selection methods, namely the F-score and the information gain, and our method obtained a superior accuracy. Compared with the genetic algorithm for feature selection, the proposed method shows a higher accuracy on all of the data sets except one. In addition, in comparison with other methods that used the same data, our approach achieves a higher performance using fewer features.
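The general idea of binary particle swarm optimization for feature selection can be sketched as follows. This is a minimal version assuming the standard sigmoid transfer rule, where each bit of a particle marks a feature as selected or not. The function names, parameter values and `toy_fitness` are hypothetical; in the study the fitness comes from support vector machines, for which `toy_fitness` is only a stand-in.

```python
import math
import random

def binary_pso(fitness, n_features, n_particles=12, n_iters=40,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximise fitness(mask) over 0/1 feature masks with a basic binary PSO."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Velocity update pulls towards personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))
                # Sigmoid turns the velocity into a probability of the bit being 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy: features 0 and 2 are
    'informative', and every selected feature carries a small cost."""
    informative = {0, 2}
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in informative)
    return hits - 0.1 * sum(mask)

best_mask, best_score = binary_pso(toy_fitness, n_features=5)
```

Replacing `toy_fitness` with a cross-validated classifier score would give a wrapper-style selector of the kind the abstract describes.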


Kybernetes ◽  
2014 ◽  
Vol 43 (8) ◽  
pp. 1150-1164 ◽  
Author(s):  
Bilal M’hamed Abidine ◽  
Belkacem Fergani ◽  
Mourad Oussalah ◽  
Lamya Fergani

Purpose – Identifying activity classes from sensor information in a smart home is very challenging because such data sets are imbalanced: some activities occur more frequently than others. Probabilistic models such as the Hidden Markov Model (HMM) and Conditional Random Fields (CRF) are commonly employed for this purpose. The paper aims to discuss these issues. Design/methodology/approach – In this work, the authors propose a robust strategy that combines the Synthetic Minority Over-sampling Technique (SMOTE) with Cost-Sensitive Support Vector Machines (CS-SVM), using adaptive tuning of the cost parameter, in order to handle the imbalanced data problem. Findings – The results demonstrate the usefulness of the approach through comparison with state-of-the-art approaches, including HMM, CRF, the traditional C-Support Vector Machines (C-SVM) and the Cost-Sensitive SVM (CS-SVM), for classifying activities using binary and ubiquitous sensors. Originality/value – Performance metrics in the experiment/simulation include Accuracy, Precision/Recall and F-measure.
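The oversampling step can be illustrated with a minimal sketch of the basic SMOTE idea: each synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. The function name, the default `k` and the sample points are hypothetical, and the sketch does not reproduce the authors' adaptive cost tuning.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation.

    minority: list of feature vectors (lists of floats), at least two of them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic

# Hypothetical minority-class points in a 2-D feature space.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(points, n_new=10)
```

Because every synthetic point is a convex combination of two real minority samples, it always stays inside the region those samples span.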


2020 ◽  
Vol 10 (2) ◽  
Author(s):  
Mahmood Umar ◽  
Nor Bahiah Ahmad ◽  
Anazida Zainal

This study investigates the performance of machine learning algorithms for sentiment analysis of students’ opinions on programming assessment. Previous research shows that Support Vector Machines (SVM) perform best among all techniques in sentiment analysis, followed by Naïve Bayes (NB). This study proposes a framework for classifying sentiments as positive or negative using the NB algorithm and a Lexicon-based approach on a small data set. The performance of the NB algorithm was evaluated against SVM. NB and SVM outperform the Lexicon-based (opinion lexicon) technique in terms of accuracy in the specific domain for which they are trained. The Lexicon-based technique, on the other hand, avoids the difficult steps needed to train a classifier. Data were collected from 75 first-year undergraduate students taking a programming subject in the School of Computing, Universiti Teknologi Malaysia. The students’ sentiments were gathered from their opinions on the zero-score policy for unsuccessful compilation of a program during a skill-based test. The results reveal that the students tend to have negative sentiments towards programming assessment because it frightens them. In the experiments, the NB algorithm yields a prediction accuracy of 85%, which outperforms both SVM with 70% and the Lexicon-based approach with 60% accuracy. The results show that NB works better than SVM and the Lexicon-based approach on a small data set.
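To make the NB step concrete, here is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing. The function names and the four training sentences are hypothetical and are not drawn from the study's data.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label) pairs. Returns the counts NB needs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the label with the highest log-posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)          # log prior
        total_words = sum(word_counts[label].values())
        for t in tokens:
            if t not in vocab:
                continue                               # ignore unseen words
            # Add-one smoothing keeps zero counts from zeroing the product.
            score += math.log((word_counts[label][t] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy opinions, for illustration only.
train = [
    ("test was unfair and scary".split(), "neg"),
    ("zero score policy is harsh".split(), "neg"),
    ("policy is fair and motivating".split(), "pos"),
    ("good practice for real coding".split(), "pos"),
]
model = train_nb(train)
```

With so few parameters to estimate, a model of this kind can be fitted on a small corpus, which is consistent with the small-data setting the abstract emphasises.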


Machine learning is one of the fastest-growing fields today. Machine learning (ML) and Artificial Neural Networks (ANN) are helpful in the detection and diagnosis of various heart diseases. Naïve Bayes classification is an important classification approach in machine learning. Heart disease comprises a range of disorders affecting the heart, including blood vessel problems, irregular heartbeat, weak heart muscles, congenital heart defects, cardiovascular disease and coronary artery disease. Coronary heart disease is a common type of heart disease. It reduces the blood flow to the heart, which can lead to a heart attack. In this paper, a data set from the UCI machine learning repository consisting of patients suffering from heart disease is analyzed using Naïve Bayes classification and support vector machines, and the classification accuracy of both methods is evaluated. The implementation is done in the R language.


2018 ◽  
Vol 200 ◽  
pp. 00004
Author(s):  
Guermah Bassma ◽  
El Ghazi Hassan ◽  
Sadiki Tayeb

Since the Middle Ages, identifying a vehicle's position in its local environment has been both a necessity and a challenge. Today, GNSS-based positioning systems have penetrated several fields that require high positioning accuracy, such as land transport, emergency systems and civil aviation. However, the performance of GNSS-based systems can be degraded in harsh environments by non-line-of-sight (NLOS), multipath and masking effects. In this paper, to improve vehicle localization in urban canyons, we address the very challenging problem of detecting the GNSS signal reception state (LOS, NLOS or multipath). An SVM-based system for GNSS multipath detection that fuses information provided by two GNSS antennas is proposed. In this work, we aim to explore the potential of machine learning, and more precisely Support Vector Machines (SVM), to identify the reception state of GNSS signals. The SVM-based system developed in this work uses the C/N0 of signals provided by RHCP and LHCP antennas, together with satellite elevation, as classification criteria. The training data set is built from several experimental studies carried out in real environments in Calais, France. Furthermore, four SVM kernel functions are tested, namely Linear, Gaussian, Cubic and Quadratic. GNSS signal reception state detection with the proposed SVM-based classifier is demonstrated on real GPS signals, and the efficiency of the system is shown. Empirically, we obtain a signal detection accuracy of about 93%.
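The four kernel families named above can be written out as a minimal sketch. It assumes the usual textbook forms: the dot product for Linear, (1 + x·y)^p with p = 2 or 3 for Quadratic and Cubic, and exp(−γ‖x − y‖²) for Gaussian. The exact parameterisation used in the paper is not stated, and the two sample feature vectors (C/N0 from the RHCP antenna, C/N0 from the LHCP antenna, satellite elevation) are hypothetical values.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, degree):
    """degree=2 gives the quadratic kernel, degree=3 the cubic kernel."""
    return (1.0 + dot(x, y)) ** degree

def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; gamma controls how fast similarity decays."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical observations: (C/N0 RHCP in dB-Hz, C/N0 LHCP in dB-Hz, elevation in deg).
sample_a = [45.0, 30.0, 60.0]
sample_b = [38.0, 36.0, 15.0]

similarities = {
    "linear": linear_kernel(sample_a, sample_b),
    "quadratic": polynomial_kernel(sample_a, sample_b, 2),
    "cubic": polynomial_kernel(sample_a, sample_b, 3),
    "gaussian": gaussian_kernel(sample_a, sample_b, gamma=0.001),
}
```

In practice the features would normally be scaled before any of these kernels is applied, since raw C/N0 and elevation values sit on different ranges.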


2011 ◽  
Vol 474-476 ◽  
pp. 1-6
Author(s):  
Guo Xing Peng ◽  
Bei Li

An improved branch-and-bound learning algorithm for semi-supervised support vector machines (IBBS3VM) is proposed, motivated by the large differences between the solutions that different semi-supervised support vector machines reach on the same data set because of local optimization. The lower bound of a node in the IBBS3VM algorithm is redefined as the value of a pseudo-dual function, which avoids the heavy computation of 0-1 quadratic programming and reduces the time complexity of computing each node's lower bound. At the same time, branch nodes are determined based only on the credibility of the unlabeled samples, without repeatedly training support vector machines, which increases the training speed of the algorithm. Simulation analysis shows that the IBBS3VM presented in this paper trains faster than the BBS3VM algorithm and achieves higher precision and stronger robustness than other semi-supervised support vector machines.


Author(s):  
Jasleen Kaur ◽  
Khushdeep Dharni

Uniqueness in economies and stock markets has given rise to an interesting domain: exploring data mining techniques across global indices. Previously, very few studies have attempted to compare the performance of data mining techniques in diverse markets. The current study adds to the understanding of variations in the performance of data mining techniques across global stock indices. We compared the performance of Neural Networks (NN) and Support Vector Machines (SVM) using the accuracy measures Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) across seven major stock markets. For prediction, technical analysis was applied to selected indicators based on daily index values spanning a period of 12 years. We created 196 data sets for model building, spanning time periods of 1 year, 2 years, 3 years, 4 years, 6 years and 12 years for the seven selected stock indices. Based on prediction models built using NN and SVM, the findings indicate a significant difference, for both MAE and RMSE, across the selected global indices. Also, the MAE and RMSE of models built using NN were greater than those of models built using SVM.
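The figure of 196 data sets is consistent with one plausible reading of the design, which the abstract does not state explicitly: the 12-year span is cut into non-overlapping windows of each listed length, and this is repeated for each of the seven indices. The short sketch below checks that arithmetic under this assumption; the function name is hypothetical.

```python
def count_windows(total_years, window_lengths):
    """Number of non-overlapping windows of each length that tile the full span."""
    return {w: total_years // w for w in window_lengths if total_years % w == 0}

windows = count_windows(12, [1, 2, 3, 4, 6, 12])
per_index = sum(windows.values())   # 12 + 6 + 4 + 3 + 2 + 1
total = per_index * 7               # seven stock indices
print(per_index, total)             # prints: 28 196
```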


2020 ◽  
Author(s):  
Andrea Ferrario ◽  
Burcu Demiray ◽  
Kristina Yordanova ◽  
Minxia Luo ◽  
Mike Martin

BACKGROUND Reminiscence is the act of thinking or talking about personal experiences that occurred in the past. It is a central task of old age that is essential for healthy aging, and it serves multiple functions, such as decision-making and introspection, transmitting life lessons, and bonding with others. The study of social reminiscence behavior in everyday life can be used to generate data and detect reminiscence from general conversations. OBJECTIVE The aims of this original paper are to (1) preprocess coded transcripts of conversations in German of older adults with natural language processing (NLP), and (2) implement and evaluate learning strategies using different NLP features and machine learning algorithms to detect reminiscence in a corpus of transcripts. METHODS The methods in this study comprise (1) collecting and coding of transcripts of older adults’ conversations in German, (2) preprocessing transcripts to generate NLP features (bag-of-words models, part-of-speech tags, pretrained German word embeddings), and (3) training machine learning models to detect reminiscence using random forests, support vector machines, and adaptive and extreme gradient boosting algorithms. The data set comprises 2214 transcripts, including 109 transcripts with reminiscence. Due to class imbalance in the data, we introduced three learning strategies: (1) class-weighted learning, (2) a meta-classifier consisting of a voting ensemble, and (3) data augmentation with the Synthetic Minority Oversampling Technique (SMOTE) algorithm. For each learning strategy, we performed cross-validation on a random sample of the training data set of transcripts. We computed the area under the curve (AUC), the average precision (AP), precision, recall, as well as F1 score and specificity measures on the test data, for all combinations of NLP features, algorithms, and learning strategies. 
RESULTS Class-weighted support vector machines on bag-of-words features outperformed all other classifiers (AUC=0.91, AP=0.56, precision=0.5, recall=0.45, F1=0.48, specificity=0.98), followed by support vector machines on SMOTE-augmented data and word embeddings features (AUC=0.89, AP=0.54, precision=0.35, recall=0.59, F1=0.44, specificity=0.94). For the meta-classifier strategy, adaptive and extreme gradient boosting algorithms trained on word embeddings and bag-of-words outperformed all other classifiers and NLP features; however, the performance of the meta-classifier learning strategy was lower than that of the other strategies, with highly imbalanced precision-recall trade-offs. CONCLUSIONS This study provides evidence of the applicability of NLP and machine learning pipelines for the automated detection of reminiscence in older adults’ everyday conversations in German. The methods and findings of this study could be relevant for designing unobtrusive computer systems for the real-time detection of social reminiscence in the everyday life of older adults and classifying their functions. With further improvements, these systems could be deployed in health interventions aimed at improving older adults’ well-being by promoting self-reflection and suggesting coping strategies to be used in cases of dysfunctional reminiscence, which can undermine physical and mental health.
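To give a sense of the imbalance involved, the sketch below computes class weights for the reported 109 reminiscence transcripts out of 2214, together with the F1 score as the harmonic mean of precision and recall. The weighting rule n_samples / (n_classes × class count) is the common "balanced" heuristic and is an assumption here, since the abstract does not say which weighting the authors used; the function names are hypothetical.

```python
def balanced_class_weights(counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Class counts taken from the abstract: 109 of 2214 transcripts contain reminiscence.
counts = {"reminiscence": 109, "other": 2214 - 109}
weights = balanced_class_weights(counts)
```

Under this rule the minority class is weighted roughly twenty times as heavily as the majority class. Recomputing F1 from the rounded precision and recall in the abstract can differ in the last digit from the reported F1, which was presumably derived from unrounded values.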

