Combine Clustering and Machine Learning for Enhancing the Efficiency of Energy Baseline of Chiller System

An energy baseline is an important tool for measuring the energy-saving benefits of a chiller system; the benefits are calculated by comparing the predictions of a model with actual results. Currently, machine learning is often adopted as the prediction model for energy baselines. Common models include regression, ensemble learning, and deep learning models. In this study, we first reviewed several machine learning algorithms that were used to establish prediction models. We then adopted clustering to preprocess the chiller data. Data mining, K-means clustering, and the gap statistic were used to identify the critical variables for clustering chiller modes. Applying these key variables enhanced the quality of the chiller data, and combining the clustering results with the machine learning model improved the prediction accuracy of the model and the reliability of the energy baselines.
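
The clustering step described in this abstract can be illustrated with a minimal sketch. It assumes a plain Lloyd's K-means and the usual gap-statistic rule (pick the smallest k with Gap(k) >= Gap(k+1) - s(k+1), using uniform reference data drawn over the bounding box of the observations). The two synthetic features, the function names, and all parameter values are hypothetical and are not taken from the paper.

```python
import math
import random


def kmeans(points, k, iters=50):
    """Lloyd's K-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
    return labels, centroids


def log_dispersion(points, k):
    """log W_k, the log of the within-cluster sum of squared distances."""
    labels, centroids = kmeans(points, k)
    w = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return math.log(max(w, 1e-12))  # guard against log(0)


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return Gap(k) and its error term s(k) for k = 1..k_max."""
    rng = random.Random(seed)
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        ref_logs = []
        for _ in range(n_refs):
            # Reference data: uniform over the bounding box of the observations.
            ref = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs)) for _ in points]
            ref_logs.append(log_dispersion(ref, k))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_dispersion(points, k))
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    return gaps, errs


def choose_k(gaps, errs):
    """Smallest k with Gap(k) >= Gap(k+1) - s(k+1); falls back to k_max."""
    for i in range(len(gaps) - 1):
        if gaps[i] >= gaps[i + 1] - errs[i + 1]:
            return i + 1
    return len(gaps)


# Hypothetical data: two well-separated operating modes in two scaled features.
rng = random.Random(1)
points = [(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)) for _ in range(20)]
points += [(10 + rng.uniform(-0.5, 0.5), 10 + rng.uniform(-0.5, 0.5)) for _ in range(20)]
gaps, errs = gap_statistic(points, k_max=3)
print(choose_k(gaps, errs), [round(g, 2) for g in gaps])
```

In a pipeline of the kind the abstract outlines, the cluster label of each record could then be used to fit one prediction model per operating mode; that step is omitted here.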

Data science in economics: comprehensive review of advanced machine learning and deep learning methods

10.21203/rs.3.rs-91905/v1 ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Puhong Duan ◽

Pedram Ghamisi ◽

Filip Ferdinand ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Prediction Accuracy ◽

Data Science ◽

State Of The Art ◽

Hybrid Models ◽

The Other ◽

Learning Models ◽

Comprehensive Review

This paper presents the state of the art of data science in economics. Advances in data science are investigated through a novel taxonomy of applications and methods, in three individual classes: deep learning models, ensemble models, and hybrid models. Application domains include the stock market, marketing, e-commerce, corporate banking, and cryptocurrency. The PRISMA method, a systematic literature review methodology, is used to ensure the quality of the survey. The findings reveal that the trend is toward hybrid models, as more than 51% of the reviewed articles applied a hybrid model. Moreover, based on the RMSE accuracy metric, hybrid models had higher prediction accuracy than other algorithms. The trend is nevertheless expected to move toward deep learning models.

Effectiveness, Explainability and Reliability of Machine Meta-Learning Methods for Predicting Mortality in Patients with COVID-19: Results of the Brazilian COVID-19 Registry

10.1101/2021.11.01.21265527 ◽

2021 ◽

Author(s):

Bruno Barbosa Miranda de Paiva ◽

Polianna Delfino Pereira ◽

Claudio Moises Valiense de Andrade ◽

Virginia Mara Reis Gomes ◽

Maria Clara Pontello Barbosa Lima ◽

...

Keyword(s):

Machine Learning ◽

Prediction Models ◽

State Of The Art ◽

Laboratory Data ◽

Machine Learning Algorithms ◽

Training Data ◽

Learning Models ◽

Learning Methods ◽

Meta Learning ◽

Machine Learning Models

Objective: To provide a thorough comparative study of state-of-the-art machine learning methods and statistical methods for determining in-hospital mortality in COVID-19 patients using data available upon hospital admission; to study the reliability of the predictions of the most effective methods by correlating the probability of the outcome with the accuracy of the methods; and to investigate how explainable the predictions produced by the most effective methods are. Materials and Methods: De-identified data were obtained from COVID-19 positive patients in 36 participating hospitals, from March 1 to September 30, 2020. Demographic, comorbidity, clinical presentation, and laboratory data were used as training data to develop COVID-19 mortality prediction models. Multiple machine learning and traditional statistics models were trained on this prediction task using a folded cross-validation procedure, from which we assessed performance and interpretability metrics. Results: Stacking of machine learning models improved over the previous state-of-the-art results by more than 26% in predicting the class of interest (death), achieving an AUROC of 87.1% and a macro-F1 of 73.9%. We also show that some machine learning models can be very interpretable and reliable, yielding more accurate predictions while providing a good explanation of why. Conclusion: The best results were obtained using the meta-learning ensemble model Stacking. State-of-the-art explainability techniques such as SHAP values can be used to draw useful insights into the patterns learned by machine learning algorithms. Machine learning models can be more explainable than traditional statistics models while also yielding highly reliable predictions. Key words: COVID-19; prognosis; prediction model; machine learning
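
Because the class of interest (death) is a minority class, the abstract reports macro-F1 alongside AUROC. As a minimal sketch of that metric, assuming the standard definition (the unweighted mean of per-class F1 scores), the following uses made-up labels; the function name and the data are illustrative only.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up outcome labels: 0 = survived, 1 = died (minority class).
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred))
```

Here accuracy is 0.75 while macro-F1 is about 0.67, because the minority class (F1 = 0.5) counts as much as the majority class (F1 of about 0.83).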

Data Science in Economics: Comprehensive Review of Advanced Machine Learning and Deep Learning Methods

10.20944/preprints202010.0263.v1 ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Puhong Duan ◽

Pedram Ghamisi ◽

Filip Ferdinand ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Prediction Accuracy ◽

Data Science ◽

State Of The Art ◽

Hybrid Models ◽

The Other ◽

Learning Models ◽

Comprehensive Review

This paper presents the state of the art of data science in economics. Advances in data science are investigated through a novel taxonomy of applications and methods, in three individual classes: deep learning models, ensemble models, and hybrid models. Application domains include the stock market, marketing, e-commerce, corporate banking, and cryptocurrency. The PRISMA method, a systematic literature review methodology, is used to ensure the quality of the survey. The findings reveal that the trend is toward hybrid models, as more than 51% of the reviewed articles applied a hybrid model. Moreover, based on the RMSE accuracy metric, hybrid models had higher prediction accuracy than other algorithms. The trend is nevertheless expected to move toward deep learning models.

Intelligent Personalized Abnormality Detection for Remote Health Monitoring

International Journal of Intelligent Information Technologies ◽

10.4018/ijiit.2020040105 ◽

2020 ◽

Vol 16 (2) ◽

pp. 87-109 ◽

Cited By ~ 1

Author(s):

Poorani Marimuthu ◽

Varalakshmi Perumal ◽

Vaidehi Vijayakumar

Keyword(s):

Machine Learning ◽

Prediction Accuracy ◽

Prediction Models ◽

False Negative ◽

False Negative Rate ◽

Area Under The Curve ◽

Ground Truth ◽

Machine Learning Algorithms ◽

Abnormality Detection ◽

Remote Healthcare

Machine learning algorithms are extensively used in healthcare analytics to learn normal and abnormal patterns automatically. The detection and prediction accuracy of any machine learning model depends on many factors, such as ground truth instances, attribute relationships, model design, the size of the dataset, the percentage of uncertainty, and the training and testing environment. Prediction models in healthcare should generate minimal false positive and false negative rates. To achieve high classification or prediction accuracy, the screening of health status needs to be personalized rather than following general clinical practice guidelines (CPG), which fit an average population. Hence, a personalized screening model for remote healthcare (IPAD – Intelligent Personalized Abnormality Detection) is proposed that is tailored to the specific individual. The severity level of the abnormal status is derived using personalized health values, and the IPAD model obtains an area under the curve (AUC) of 0.907.
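
The three quantities this abstract relies on (false positive rate, false negative rate, and AUC) can be computed as in the following minimal sketch. It assumes binary labels with 1 marking the abnormal class and uses the rank-based (Mann-Whitney) formulation of AUC; the scores, the 0.5 threshold, and the function names are invented for illustration and do not come from the IPAD model.

```python
def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for binary labels."""
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    return fp / (fp + tn), fn / (fn + tp)


def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented abnormality scores; 1 = abnormal, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold
fpr, fnr = error_rates(y_true, y_pred)
print(fpr, fnr, auc(y_true, scores))
```

The rates depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.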

Predicting Intraday Prices in the Frontier Stock Market of Romania Using Machine Learning Algorithms

International Journal of Economics and Financial Research ◽

10.32861/ijefr.67.170.179 ◽

2020 ◽

pp. 170-179

Author(s):

Dan Gabriel ANGHEL

Keyword(s):

Machine Learning ◽

Stock Market ◽

Stock Prices ◽

Prediction Accuracy ◽

Prediction Models ◽

State Of The Art ◽

Predictive Ability ◽

Weak Form ◽

Machine Learning Algorithms ◽

Forecasting Models

This paper investigates whether forecasting models based on Machine Learning (ML) algorithms can predict intraday prices in the small, frontier stock market of Romania. The results show that this is indeed the case. Moreover, the prediction accuracy of the various models improves as the forecasting horizon increases. Overall, ML forecasting models are superior to the passive buy-and-hold strategy, as well as to a naïve strategy that always predicts that the last known price action will continue. However, we also show that this superior predictive ability cannot be converted into “abnormal”, economically significant profits after considering transaction costs. This implies that intraday stock prices incorporate information within the accepted bounds of weak-form market efficiency, and cannot be “timed” even by sophisticated investors equipped with state-of-the-art ML prediction models.
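
The naïve benchmark mentioned in this abstract can be sketched as follows. The sketch assumes one plausible reading, in which "price action" means the direction of the most recent price change and the forecast is scored by its directional hit rate; the paper's exact definition is not given here, and the price series and function names are made up.

```python
def sign(x):
    """Return 1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def naive_hit_rate(prices):
    """Share of moves whose direction matches the direction of the previous move."""
    moves = [sign(b - a) for a, b in zip(prices, prices[1:])]
    predicted = moves[:-1]  # forecast: the last observed direction continues
    actual = moves[1:]
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Made-up intraday prices.
prices = [10.0, 11.0, 12.0, 11.0, 12.0, 13.0]
print(naive_hit_rate(prices))
```

A forecasting model would need a hit rate clearly above this figure on the same data, and profits net of transaction costs, before it could be called economically useful.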

A deep learning-based quality assessment model of collaboratively edited documents: A case study of Wikipedia

Journal of Information Science ◽

10.1177/0165551519877646 ◽

2019 ◽

pp. 016555151987764

Author(s):

Ping Wang ◽

Xiaodan Li ◽

Renli Wu

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Complete Information ◽

Classification Performance ◽

Machine Learning Algorithms ◽

Assessment Model ◽

Learning Models ◽

Proposed Model

Wikipedia is becoming increasingly critical in helping people obtain information and knowledge. Its leading advantage is that users can not only access information but also modify it. However, this presents a challenging issue: how can we measure the quality of a Wikipedia article? The existing approaches assess Wikipedia quality by statistical models or traditional machine learning algorithms. However, their performance is not satisfactory. Moreover, most existing models fail to extract complete information from articles, which degrades the model’s performance. In this article, we first survey related works and summarise a comprehensive feature framework. Then, state-of-the-art deep learning models are introduced and applied to assess Wikipedia quality. Finally, a comparison among deep learning models and traditional machine learning models is conducted to validate the effectiveness of the proposed model. The models are compared extensively in terms of their training and classification performance. Moreover, the importance of each feature and the importance of different feature sets are analysed separately.

COVID-19 detection using federated machine learning

PLoS ONE ◽

10.1371/journal.pone.0252573 ◽

2021 ◽

Vol 16 (6) ◽

pp. e0252573

Author(s):

Mustafa Abdul Salam ◽

Sanaa Taha ◽

Mohamed Ramadan

Keyword(s):

Machine Learning ◽

Model Prediction ◽

Prediction Accuracy ◽

Activation Function ◽

Learning Model ◽

Learning Models ◽

X Ray ◽

Machine Learning Model ◽

Chest X Ray ◽

Machine Learning Models

The current COVID-19 pandemic threatens human life, health, and productivity. AI plays an essential role in COVID-19 case classification, as machine learning models can be applied to COVID-19 case data, including chest X-rays, to predict infectious cases and recovery rates. Accessing a patient’s private data violates patient privacy, and a traditional machine learning model requires accessing or transferring the whole dataset to train the model. In recent years, there has been increasing interest in federated machine learning, as it provides an effective solution for data privacy, centralized computation, and high computation power. In this paper, we studied the efficacy of federated learning versus traditional learning by developing two machine learning models (a federated learning model and a traditional machine learning model) using Keras and TensorFlow Federated, with a descriptive dataset and chest X-ray (CXR) images from COVID-19 patients. During the model training stage, we recorded and plotted the model loss and prediction accuracy for each training round to identify which factors affect model performance: activation function, model optimizer, learning rate, number of rounds, and data size. We found that the softmax activation function and the SGD optimizer give better prediction accuracy and loss; changing the number of rounds and the learning rate has a slight effect on prediction accuracy and loss, whereas increasing the data size had no effect on either. Finally, we compared the proposed models’ loss, accuracy, and performance speed. The results demonstrate that the federated machine learning model has better prediction accuracy and loss but a higher performance time than the traditional machine learning model.
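
The training rounds referred to in this abstract can be illustrated with a minimal federated-averaging sketch. This is an assumption-laden toy, not the authors' Keras and TensorFlow Federated setup: the abstract does not name the aggregation rule, so a sample-size-weighted average of client weights is assumed, and the one-parameter linear model, the client data, and the hyperparameters are all hypothetical.

```python
def local_update(w, data, lr=0.05, steps=1):
    """Gradient descent on mean squared error for the model y = w * x, using local data only."""
    for _ in range(steps):
        grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(client_weights, client_sizes):
    """Aggregate client weights, weighting each client by its number of samples."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total


# Hypothetical clients; each keeps its (x, y) pairs private. True relation: y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
]

w_global = 0.0
for _ in range(20):  # communication rounds
    # Each client starts from the current global weight and trains locally.
    local = [local_update(w_global, data) for data in clients]
    # Only the updated weights, never the raw data, are sent for aggregation.
    w_global = federated_average(local, [len(d) for d in clients])

print(round(w_global, 4))
```

The number of rounds, the learning rate, and the local step count in this toy correspond loosely to the factors the abstract reports varying.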

Prediction of Earnings Manipulation on Malaysian Listed Firms: A Comparison between Linear and Tree-based Machine Learning

International Journal of Emerging Technology and Advanced Engineering ◽

10.46338/ijetae0821_13 ◽

2021 ◽

Vol 11 (8) ◽

pp. 111-120

Author(s):

Rahayu Abdul Rahman ◽

Suraya Masrom ◽

Nor Balkish Zakaria ◽

Enny Nurdin

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Earnings Management ◽

Generalized Linear Model ◽

Prediction Models ◽

Financial Economic ◽

Machine Learning Algorithms ◽

Earnings Manipulation ◽

Listed Firms

Predicting earnings manipulation is an inseparable part of financial-economic analysis, helping shareholders, investors, creditors, and outsiders acquire high-quality financial information about a firm. The aim of this paper is therefore to compare earnings manipulation prediction models developed using two types of machine learning algorithms: linear and tree-based. The linear algorithms are Logistic Regression and the Generalized Linear Model, while the tree-based algorithms are Decision Tree and Random Forest. All of the algorithms were tested on a dataset of earnings manipulation comprising 1874 firm-year observations of firms listed on Bursa Malaysia. The results indicate that the performance of the two kinds of machine learning is not extremely different, except for the Decision Tree. Furthermore, the best-performing algorithm was a linear one, which produced the best accuracy in the shortest total completion time. All the models, mainly the tree-based ones, are better at detecting false cases of earnings manipulation than true cases. Keywords: Earnings Manipulation, Earnings Management, Machine Learning, Malaysia

Damaged/missing proximity sensor induces screen mistouch when answering calls: Prediction of smartphone answering status by posture data

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210646 ◽

2021 ◽

pp. 1-12

Author(s):

Xing Tang ◽

Suihuai Yu ◽

Jianjie Chu ◽

Hao Fan

Keyword(s):

Machine Learning ◽

User Experience ◽

Prediction Accuracy ◽

Phone Call ◽

Learning Models ◽

Proximity Sensor ◽

Human Ear ◽

Compensation Approach ◽

Machine Learning Models

When the proximity sensor of a smartphone is impaired, screen mistouch can easily occur during a conversation, which significantly affects the user experience. However, relatively few studies have focused on the quality of user experience following sensor impairment. The purpose of this study was to compare and evaluate different machine learning models in forecasting the user’s posture during a phone call, thereby providing a compensation approach for detecting proximity to the human ear during a phone call following sensor damage. The built-in accelerometer sensors of smartphones were employed to collect posture data while users were using their smartphones. Three main postures (holding, moving, and answering) were identified; the posture data were used for training and prediction with five machine learning models. The results showed that the model that utilized triaxial data had better prediction accuracy than the model that used single-axis data. Furthermore, models with time-domain features had a higher accuracy rate. Among the five models, neural networks had the best prediction accuracy (0.982). The proposed approach could be of immense benefit to users following proximity sensor damage, and would be advantageous in the design of the smartphone, particularly in the early stages of the design process.
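
Time-domain features over triaxial accelerometer data, as mentioned in this abstract, might be extracted per window along the following lines. The specific features (per-axis mean, standard deviation, and range, plus mean signal magnitude), the window contents, and the function name are assumptions for illustration; the abstract does not list the features the study used.

```python
import math
import statistics


def time_domain_features(window):
    """Summarize one window of (x, y, z) accelerometer samples as a feature dict."""
    features = {}
    for name, axis in zip("xyz", zip(*window)):
        features[f"mean_{name}"] = statistics.fmean(axis)
        features[f"std_{name}"] = statistics.pstdev(axis)
        features[f"range_{name}"] = max(axis) - min(axis)
    # Orientation-independent summary: mean magnitude of the acceleration vector.
    features["mean_magnitude"] = statistics.fmean(
        math.sqrt(x * x + y * y + z * z) for x, y, z in window
    )
    return features


# Made-up window of samples, in units of g.
window = [(0.02, 0.10, 0.98), (0.05, 0.15, 0.95), (0.30, 0.40, 0.80), (0.45, 0.55, 0.70)]
for key, value in time_domain_features(window).items():
    print(key, round(value, 3))
```

Feature vectors of this kind, one per window, would then be passed to a classifier to predict the posture label (holding, moving, or answering).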

Development of Rainfall Prediction Models Using Machine Learning Approaches for Different Agro-Climatic Zones

Advances in Data Mining and Database Management - Handbook of Research on Automated Feature Engineering and Advanced Applications in Data Science ◽

10.4018/978-1-7998-6659-6.ch005 ◽

2021 ◽

pp. 72-94

Author(s):

Diwakar Naidu ◽

Babita Majhi ◽

Surendra Kumar Chandniha

Keyword(s):

Neural Network ◽

Machine Learning ◽

Large Scale ◽

Prediction Models ◽

Machine Learning Algorithms ◽

Learning Approaches ◽

Learning Models ◽

Climatic Zones ◽

Environmental Prediction ◽

Machine Learning Models

This study focuses on modelling the changes in rainfall patterns in different agro-climatic zones (ACZs) due to climate change through statistical downscaling of large-scale climate variables using machine learning approaches. The potential of three machine learning algorithms is investigated: the multilayer artificial neural network (MLANN), the radial basis function neural network (RBFNN), and the least squares support vector machine (LS-SVM). The large-scale climate variables are obtained from the National Centre for Environmental Prediction (NCEP) reanalysis product and used as predictors for model development. The proposed machine learning models are applied to generate projected time series of rainfall for the period 2021-2050, using the Hadley Centre coupled model (HadCM3) B2 emission scenario data as predictors. An increasing trend in anticipated rainfall is observed during 2021-2050 in all the ACZs of Chhattisgarh State. Among the machine learning models, RBFNN was found to be the most feasible technique for modelling monthly rainfall in this region.
