Exploring the long-term changes in the Madden Julian Oscillation using machine learning

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Panini Dasgupta ◽  
Abirlal Metya ◽  
C. V. Naidu ◽  
Manmeet Singh ◽  
M. K. Roxy

Abstract: The Madden Julian Oscillation (MJO), the dominant subseasonal variability in the tropics, is widely represented using the Real-time Multivariate MJO (RMM) index. The index is limited to the satellite era (post-1974) as its calculation relies on satellite-based observations. Oliver and Thompson (J Clim 25:1996–2019, 2012) extended the RMM index for the twentieth century, employing a multilinear regression on the sea level pressure (SLP) from the NOAA twentieth century reanalysis. They obtained an 82.5% correspondence with the index in the satellite era. In this study, we show that the historical MJO index can be successfully reconstructed using machine learning techniques and improved upon. We obtain a significant improvement of up to 4%, using the support vector regressor (SVR) and convolutional neural network (CNN) methods on the same set of predictors used by Oliver and Thompson. Based on the improved RMM indices, we explore the long-term changes in the intensity, phase occurrences, and frequency of the winter MJO events during 1905–2015. We show an increasing trend in MJO intensity (22–27%) during this period. We also find a multidecadal change in MJO phase occurrence and periodicity corresponding to the Pacific Decadal Oscillation (PDO), while the role of anthropogenic warming cannot be ignored.
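The "correspondence" the abstract reports can be quantified as a correlation between the reconstructed index and the satellite-era index. A minimal sketch, assuming the metric is a Pearson correlation on one RMM component (the series values below are invented for illustration; the paper's exact metric may differ):

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical daily values of one RMM component:
# observed (satellite era) vs. reconstructed (regression/ML)
observed      = [0.5, 1.2, -0.3, -1.1, 0.8, 1.5]
reconstructed = [0.4, 1.0, -0.2, -0.9, 0.7, 1.6]
print(round(pearson_r(observed, reconstructed), 3))
```

A reconstruction that tracked the observed index exactly would give a correlation of 1.0; the reported improvements correspond to pushing this figure a few percentage points higher.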

Energies ◽  
2021 ◽  
Vol 14 (18) ◽  
pp. 5947
Author(s):  
William Mounter ◽  
Chris Ogwumike ◽  
Huda Dawood ◽  
Nashwan Dawood

Advances in metering technologies and emerging energy forecast strategies provide opportunities and challenges for predicting both short- and long-term building energy usage. Machine learning is an important energy prediction technique and is gaining significant research attention. Using different machine learning techniques within a rolling-horizon framework can help to reduce the prediction error over time. Because error increases significantly beyond short-term forecasts, most reported energy forecasts based on statistical and machine learning techniques stay within a range of one week. The aim of this study was to investigate how facility managers can improve the accuracy of their building's long-term energy forecasts. This paper presents an extensive study of machine learning and data processing techniques and how accurately they predict within different forecast ranges. The Clarendon building of Teesside University was selected as a case study to demonstrate the prediction of overall energy usage with different machine learning techniques such as polynomial regression (PR), support vector regression (SVR) and artificial neural networks (ANNs). This study further examined how preprocessing the training data for prediction models can impact overall accuracy, such as by segmenting the training data by building mode (active and dormant) or by day of the week (weekdays and weekends). The results presented in this paper illustrate a significant reduction in the mean absolute percentage error (MAPE) for segmented (weekday and weekend) energy usage prediction when compared to unsegmented monthly predictions. A reduction in MAPE of 5.27%, 11.45%, and 12.03% was achieved with PR, SVR and ANN, respectively.
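The headline metric above is MAPE. A minimal sketch of its standard computation, with invented energy readings (all values hypothetical):

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 / len(actual) * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted))

# Hypothetical hourly building energy readings (kWh) and model predictions
actual    = [120.0, 135.0, 150.0, 110.0]
predicted = [114.0, 140.0, 147.0, 118.0]
print(round(mape(actual, predicted), 2))
```

A "reduction in MAPE of 11.45%" for SVR then means the segmented model's MAPE is 11.45 percentage points lower than the unsegmented model's on the same test data.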


Author(s):  
William Mounter ◽  
Huda Dawood ◽  
Nashwan Dawood

Abstract: Advances in metering technologies and machine learning methods provide both opportunities and challenges for predicting building energy usage in the short and long term. However, few studies compare machine learning techniques across a rolling horizon rather than at a single forecast range, and the majority of reported forecasts stay within one week because error increases significantly beyond short-term prediction. The aim of this paper is to investigate how the accuracy of long-term building energy predictions can be improved, as part of a larger study into which machine learning techniques predict more accurately within different forecast ranges. In this case study, the Clarendon building of Teesside University was selected, using its Building Management System (BMS) data to predict the building's overall energy usage with support vector regression. We examine how altering the data used to train the models impacts their overall accuracy, such as by segmenting the training data by building mode (active and dormant) or by day of the week (weekdays and weekends). Modelling weekday and weekend energy usage separately led to an average reduction of 11% in MAPE compared with unsegmented predictions.
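The weekday/weekend segmentation step described above can be sketched minimally, assuming timestamped meter readings (the data and function names are hypothetical, not from the paper):

```python
from datetime import datetime

def segment_by_weekend(readings):
    """Split (timestamp, kWh) readings into weekday and weekend subsets,
    so a separate model can be trained on each segment."""
    weekday, weekend = [], []
    for ts, kwh in readings:
        # datetime.weekday(): Monday=0 ... Sunday=6
        (weekend if ts.weekday() >= 5 else weekday).append((ts, kwh))
    return weekday, weekend

# Hypothetical readings around a weekend
readings = [
    (datetime(2021, 9, 3, 9), 140.0),  # Friday
    (datetime(2021, 9, 4, 9), 65.0),   # Saturday
    (datetime(2021, 9, 5, 9), 60.0),   # Sunday
    (datetime(2021, 9, 6, 9), 150.0),  # Monday
]
weekday, weekend = segment_by_weekend(readings)
print(len(weekday), len(weekend))
```

Each subset then feeds its own regression model, so the weekend model is never forced to fit weekday load patterns.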


2020 ◽  
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate the analysis and prediction of time-dependent data, focusing on four stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). By treating the close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors, we show that prediction accuracy is improved.
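Using the six quantities listed as predictors amounts to building a supervised dataset where one day's record predicts a later close price. A minimal sketch of that feature construction (the records and the one-day horizon are invented for illustration; the paper's exact setup may differ):

```python
def make_features(rows, horizon=1):
    """Build (features, target) pairs from daily records of
    [open, high, low, close, adj_close, volume]: day i's record
    predicts the close price `horizon` days ahead."""
    X, y = [], []
    for i in range(len(rows) - horizon):
        X.append(rows[i])               # six predictors for day i
        y.append(rows[i + horizon][3])  # close price of day i + horizon
    return X, y

# Hypothetical daily records: [open, high, low, close, adj_close, volume]
rows = [
    [10.0, 10.5,  9.8, 10.2, 10.2, 1000],
    [10.2, 10.8, 10.1, 10.6, 10.6, 1200],
    [10.6, 10.9, 10.3, 10.4, 10.4,  900],
]
X, y = make_features(rows)
print(len(X), y)
```

The resulting (X, y) pairs can then be fed to any of the three regressors compared in the paper.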


Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of disease in a patient is critical, as it may alter subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems due to their accurate prediction performance. Different techniques can yield different accuracies, so it is important to select the method best suited to the problem. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 decision tree and neural network classifiers on breast cancer and diabetes datasets.


2020 ◽  
Author(s):  
Azhagiya Singam Ettayapuram Ramaprasad ◽  
Phum Tachachartvanich ◽  
Denis Fourches ◽  
Anatoly Soshilov ◽  
Jennifer C.Y. Hsieh ◽  
...  

Perfluoroalkyl and Polyfluoroalkyl Substances (PFASs) pose a substantial threat as endocrine disruptors, and thus early identification of those that may interact with steroid hormone receptors, such as the androgen receptor (AR), is critical. In this study we screened 5,206 PFASs from the CompTox database against the different binding sites on the AR using both molecular docking and machine learning techniques. We developed support vector machine models trained on Tox21 data to classify the active and inactive PFASs for AR using different chemical fingerprints as features. The best model, based on MACCS fingerprints (MACCSFP), achieved an accuracy of 95.01% and a Matthews correlation coefficient (MCC) of 0.76. The combination of docking-based screening and machine learning models identified 29 PFASs that have strong potential for activity against the AR and should be considered priority chemicals for biological toxicity testing.
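The abstract reports both accuracy and MCC, which is informative here because active PFASs are likely much rarer than inactive ones, and MCC penalizes a classifier that simply predicts the majority class. A minimal sketch of both metrics from binary confusion counts (the counts below are invented, not the paper's):

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Hypothetical counts for an active/inactive PFAS classifier
# on an imbalanced test set (actives are the minority class)
tp, tn, fp, fn = 40, 900, 20, 30
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(round(accuracy, 3), round(mcc(tp, tn, fp, fn), 3))
```

On counts like these, accuracy is high mostly because inactives dominate, while MCC stays noticeably below 1, which is why the paper quotes both.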


2020 ◽  
Author(s):  
Nalika Ulapane ◽  
Karthick Thiyagarajan ◽  
Sarath Kodagoda

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the "Binary Spectrum". Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation's potential for broader usage.


2020 ◽  
Vol 21 ◽  
Author(s):  
Sukanya Panja ◽  
Sarra Rahem ◽  
Cassandra J. Chu ◽  
Antonina Mitrofanova

Background: In recent years, the availability of high throughput technologies, establishment of large molecular patient data repositories, and advancement in computing power and storage have allowed elucidation of complex mechanisms implicated in therapeutic response in cancer patients. The breadth and depth of such data, alongside experimental noise and missing values, requires a sophisticated human-machine interaction that would allow effective learning from complex data and accurate forecasting of future outcomes, ideally embedded in the core of machine learning design. Objective: In this review, we will discuss machine learning techniques utilized for modeling of treatment response in cancer, including Random Forests, support vector machines, neural networks, and linear and logistic regression. We will overview their mathematical foundations and discuss their limitations and alternative approaches all in light of their application to therapeutic response modeling in cancer. Conclusion: We hypothesize that the increase in the number of patient profiles and potential temporal monitoring of patient data will define even more complex techniques, such as deep learning and causal analysis, as central players in therapeutic response modeling.


Author(s):  
Amandeep Kaur ◽  
Sushma Jain ◽  
Shivani Goel ◽  
Gaurav Dhiman

Context: Code smells are symptoms that something may be wrong in a software system and can cause complications in maintaining software quality. In the literature, many code smells exist, and their identification is far from trivial. Thus, several techniques have been proposed to automate code smell detection in order to improve software quality. Objective: This paper presents an up-to-date review of simple and hybrid machine-learning-based code smell detection techniques and tools. Methods: We collected all the relevant research published in this field up to 2020. We extracted the data from those articles and classified them into two major categories. In addition, we compared the selected studies on several aspects: code smells, machine learning techniques, datasets, programming languages used by the datasets, dataset size, evaluation approach, and statistical testing. Results: The majority of empirical studies have proposed machine-learning-based code smell detection tools. Support vector machine and decision tree algorithms are frequently used by researchers. A major proportion of the research is conducted on open source software (OSS) such as Xerces, GanttProject and ArgoUML. Furthermore, researchers have paid more attention to the Feature Envy and Long Method code smells. Conclusion: We identified several areas of open research, such as the need for code smell detection techniques using hybrid approaches and for validation employing industrial datasets.


2019 ◽  
Vol 23 (1) ◽  
pp. 12-21 ◽  
Author(s):  
Shikha N. Khera ◽  
Divya

The information technology (IT) industry in India has been facing a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based losses to companies. The aim of this research is to develop a model to predict employee attrition and give organizations the opportunity to address issues and improve retention. A predictive model was developed based on a supervised machine learning algorithm, the support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from the Human Resource databases of three IT companies in India, including each employee's employment status (response variable) at the time of collection. Accuracy results from the confusion matrix showed that the SVM model has an accuracy of 85 per cent. The results also show that the model performs better at predicting who will leave the firm than at predicting who will stay.
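Both claims above, the 85 per cent accuracy and the stronger performance on leavers, come straight out of a binary confusion matrix. A minimal sketch with invented counts chosen to match that pattern (these are not the study's numbers):

```python
def confusion_metrics(tp, tn, fp, fn):
    """Accuracy and per-class recall from a binary confusion matrix,
    treating 'leaver' as the positive class."""
    accuracy      = (tp + tn) / (tp + tn + fp + fn)
    leaver_recall = tp / (tp + fn)  # fraction of actual leavers caught
    stayer_recall = tn / (tn + fp)  # fraction of actual stayers caught
    return accuracy, leaver_recall, stayer_recall

# Hypothetical counts from an SVM attrition model
acc, leave_r, stay_r = confusion_metrics(tp=90, tn=80, fp=20, fn=10)
print(acc, leave_r, stay_r)
```

Here overall accuracy is 0.85 while leaver recall (0.90) exceeds stayer recall (0.80), the same shape of result the abstract describes.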


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Helder Sebastião ◽  
Pedro Godinho

Abstract: This study examines the predictability of three major cryptocurrencies—bitcoin, ethereum, and litecoin—and the profitability of trading strategies devised upon machine learning techniques (e.g., linear models, random forests, and support vector machines). The models are validated in a period characterized by unprecedented turmoil and tested in a period of bear markets, allowing the assessment of whether the predictions are good even when the market direction changes between the validation and test periods. The classification and regression methods use attributes from trading and network activity for the period from August 15, 2015 to March 03, 2019, with the test sample beginning on April 13, 2018. For the test period, five out of 18 individual models have success rates of less than 50%. The trading strategies are built on model assembling. The ensemble assuming that five models produce identical signals (Ensemble 5) achieves the best performance for ethereum and litecoin, with annualized Sharpe ratios of 80.17% and 91.35% and annualized returns (after proportional round-trip trading costs of 0.5%) of 9.62% and 5.73%, respectively. These positive results support the claim that machine learning provides robust techniques for exploring the predictability of cryptocurrencies and for devising profitable trading strategies in these markets, even under adverse market conditions.
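One plausible reading of "Ensemble 5" is a rule that only trades when at least five individual models emit the same signal, and stays flat otherwise. A minimal sketch under that assumption (the threshold semantics and signal encoding are assumptions, not confirmed by the abstract):

```python
def ensemble_signal(signals, k=5):
    """Emit a trade signal (+1 long, -1 short) only when at least k of
    the individual model signals agree; otherwise stay out (0)."""
    if signals.count(1) >= k:
        return 1
    if signals.count(-1) >= k:
        return -1
    return 0

# Hypothetical daily signals from six individual models
print(ensemble_signal([1, 1, 1, 1, 1, -1]))   # five longs agree
print(ensemble_signal([1, 1, -1, -1, 1, 0]))  # no consensus
```

Requiring broad agreement before trading reduces trade frequency, which also limits the drag from the 0.5% round-trip costs the abstract accounts for.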

