Energy Audit System for Households using Machine Learning

With the growth in population and economies, global demand for energy has increased considerably. A large share of this demand comes from houses, so household energy efficiency is considered one of the most important aspects of global sustainability. Machine learning algorithms have contributed heavily to predicting the amount of energy consumed at the household level. In this paper, energy audit systems using machine learning are developed to estimate the amount of energy consumed at the household level in order to identify probable areas where wasted energy can be reduced. Each energy audit system is trained using one machine learning algorithm on the previous power consumption history in the training data. By converting this data into knowledge, a satisfactory analysis of energy consumption is attained. The reported performance in predicting energy consumption is 82% for the Linear Regression energy audit system, 86% for the Decision Tree system, and 91% for the Random Forest system. The learning methods were evaluated based on their predictive accuracy, ease of learning, and user-friendliness. The Random Forest energy audit system is superior to the other energy audit systems.

PSIX-15 Assessment of machine learning algorithms for prediction of Aleutian disease in American mink

Journal of Animal Science ◽

10.1093/jas/skab235.484 ◽

2021 ◽

Vol 99 (Supplement_3) ◽

pp. 264-265

Author(s):

Duy Ngoc Do ◽

Guoyu Hu ◽

Younes Miar

Keyword(s):

Machine Learning ◽

Random Forest ◽

Linear Models ◽

American Mink ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Training Data ◽

Enzyme Linked Immunosorbent Assay ◽

Linear Discriminant ◽

Machine Learning Classification

American mink (Neovison vison) is the major source of fur for the fur industries worldwide, and Aleutian disease (AD) is causing severe financial losses to the mink industry. Different methods have been used to diagnose AD in mink, but a combination of several methods can be the most appropriate approach for the selection of AD-resilient mink. The iodine agglutination test (IAT) and counterimmunoelectrophoresis (CIEP) methods are commonly employed in the test-and-remove strategy, while the enzyme-linked immunosorbent assay (ELISA) and packed-cell volume (PCV) methods are complementary. However, using multiple methods is expensive and therefore hinders the correct use of AD tests in selection. This research assessed AD classification based on machine learning algorithms. A total of 1,830 individuals were tested for Aleutian disease using these tests on an AD-positive mink farm (Canadian Centre for Fur Animal Research, NS, Canada). The accuracy of classification for CIEP was evaluated from sex information and IAT, ELISA, and PCV test results, using seven machine learning classification algorithms (Random Forest, Artificial Neural Networks, C50Tree, Naive Bayes, Generalized Linear Models, Boost, and Linear Discriminant Analysis) implemented with the Caret package in R. The accuracy of prediction varied among the methods. Overall, Random Forest was the best-performing algorithm for the current dataset, with an accuracy of 0.89 in the training data and 0.94 in the testing data. Our work demonstrated the utility and relative ease of using machine learning algorithms to assess the CIEP information, consequently reducing the cost of AD tests. However, further work requires including production and reproduction information in the models and extending phenotypic collection to increase the accuracy of current methods.

Predicting Bank Operational Efficiency Using Machine Learning Algorithm: Comparative Study of Decision Tree, Random Forest, and Neural Networks

Advances in Fuzzy Systems ◽

10.1155/2020/8581202 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Peter Appiahene ◽

Yaw Marfo Missah ◽

Ussiph Najim

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Banking Sector ◽

Banking Industry ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Machine Learning Algorithm ◽

And Performance

The financial crisis that hit Ghana from 2015 to 2018 has raised various issues with respect to the efficiency of banks and the safety of depositors in the banking industry. As part of measures to improve the banking sector and restore customers’ confidence, efficiency and performance analysis in the banking industry has become a hot issue, because stakeholders have to detect the underlying causes of inefficiencies within the industry. Nonparametric methods such as Data Envelopment Analysis (DEA) have been suggested in the literature as a good measure of banks’ efficiency and performance. Machine learning algorithms have also been viewed as a good tool for estimating various nonparametric and nonlinear problems. This paper combines DEA with three machine learning approaches to evaluate bank efficiency and performance using 444 Ghanaian bank branches as Decision Making Units (DMUs). The results were compared with the corresponding efficiency ratings obtained from the DEA. Finally, the prediction accuracies of the three machine learning models were compared. The results suggested that the decision tree (DT) with its C5.0 algorithm provided the best predictive model. It had 100% accuracy in predicting the 134-branch holdout sample (30% of banks) and a P value of 0.00. The DT was followed closely by the random forest algorithm, with a predictive accuracy of 98.5% and a P value of 0.00, and finally the neural network (86.6% accuracy) with a P value of 0.66. The study concluded that banks in Ghana can use the results of this study to predict their respective efficiencies. All experiments were performed within a simulation environment and conducted in RStudio using R code.

Reverse-engineering human olfactory perception from chemical features of odor molecules

10.1101/082495 ◽

2016 ◽

Cited By ~ 2

Author(s):

Andreas Keller ◽

Richard C. Gerkin ◽

Yuanfang Guan ◽

Amit Dhurandhar ◽

Gabor Turu ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Molecular Mechanisms ◽

Linear Models ◽

Predictive Accuracy ◽

High Accuracy ◽

Machine Learning Algorithms ◽

Olfactory Perception ◽

Theoretical Limit ◽

Reverse Engineer

Despite 25 years of progress in understanding the molecular mechanisms of olfaction, it is still not possible to predict whether a given molecule will have a perceived odor, or what olfactory percept it will produce. To address this stimulus-percept problem for olfaction, we organized the crowd-sourced DREAM Olfaction Prediction Challenge. Working from a large olfactory psychophysical dataset, teams developed machine learning algorithms to predict sensory attributes of molecules based on their chemoinformatic features. The resulting models predicted odor intensity and pleasantness with high accuracy, and also successfully predicted eight semantic descriptors (“garlic”, “fish”, “sweet”, “fruit”, “burnt”, “spices”, “flower”, “sour”). Regularized linear models performed nearly as well as random-forest-based approaches, with a predictive accuracy that closely approaches a key theoretical limit. The models presented here make it possible to predict the perceptual qualities of virtually any molecule with an impressive degree of accuracy to reverse-engineer the smell of a molecule.

One Sentence Summary: Results of a crowdsourcing competition show that it is possible to accurately predict and reverse-engineer the smell of a molecule.

A Novel GIS-Based Random Forest Machine Algorithm for the Spatial Prediction of Shallow Landslide Susceptibility

Forests ◽

10.3390/f11010118 ◽

2020 ◽

Vol 11 (1) ◽

pp. 118 ◽

Cited By ~ 6

Author(s):

Viet-Hung Dang ◽

Nhat-Duc Hoang ◽

Le-Mai-Duyen Nguyen ◽

Dieu Tien Bui ◽

Pijush Samui

Keyword(s):

Machine Learning ◽

Random Forest ◽

Landslide Susceptibility ◽

Spatial Prediction ◽

Shallow Landslide ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

Conditioning Factors ◽

Susceptibility Modeling

This study developed and verified a new hybrid machine learning model, named random forest machine (RFM), for the spatial prediction of shallow landslides. RFM is a hybridization of two state-of-the-art machine learning algorithms, random forest classifier (RFC) and support vector machine (SVM), in which RFC is used to generate subsets from training data and SVM is used to build decision functions for these subsets. To construct and verify the hybrid RFM model, a shallow landslide database of the Lang Son area (northern Vietnam) was prepared. The database consisted of 101 shallow landslide polygons and 14 conditioning factors. The relevance of these factors for shallow landslide susceptibility modeling was assessed using the ReliefF method. Experimental results pointed out that the proposed RFM can help to achieve the desired prediction with an F1 score of roughly 0.96. The performance of the RFM was better than those of benchmark approaches, including the SVM, RFC, and logistic regression. Thus, the newly developed RFM is a promising tool to help local authorities in shallow landslide hazard mitigations.
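
The F1 score quoted above is the harmonic mean of precision and recall. As a minimal sketch of how such a score is computed, the following Python function works from confusion-matrix counts; the counts used in the example are hypothetical and are not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall."""
    if tp == 0:
        # No true positives means precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not the study's data).
score = f1_score(tp=48, fp=2, fn=2)
print(round(score, 2))  # prints 0.96
```

Because F1 ignores true negatives, it is often preferred over plain accuracy when the positive class (here, landslide cells) is the one of interest.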

Predicting Limit-Setting Behavior of Gamblers Using Machine Learning Algorithms: A Real-World Study of Norwegian Gamblers Using Account Data

International Journal of Mental Health and Addiction ◽

10.1007/s11469-019-00166-2 ◽

2019 ◽

Cited By ~ 1

Author(s):

Michael Auer ◽

Mark D. Griffiths

Keyword(s):

Machine Learning ◽

Random Forest ◽

Test Data ◽

Predictive Analytics ◽

Learning Algorithm ◽

Responsible Gambling ◽

Machine Learning Algorithms ◽

Training Data ◽

Training Dataset ◽

Limit Setting

Player protection and harm minimization have become increasingly important in the gambling industry along with the promotion of responsible gambling (RG). Among the most widespread RG tools that gaming operators provide are limit-setting tools that help players limit the amount of time and/or money they spend gambling. Research suggests that limit-setting significantly reduces the amount of money that players spend. If limit-setting is to be encouraged as a way of facilitating responsible gambling, it is important to know what variables are important in getting individuals to set and change limits in the first place. In the present study, 33 variables assessing the player behavior among Norsk Tipping clientele (N = 70,789) from January to March 2017 were computed. The 33 variables which reflect the players’ behavior were then used to predict the likelihood of gamblers changing their monetary limit between April and June 2017. The 70,789 players were randomly split into a training dataset of 56,532 and an evaluation set of 14,157 players (corresponding to an 80/20 split). The results demonstrated that it is possible to predict future limit-setting based on player behavior. The random forest algorithm appeared to predict limit-changing behavior much better than the other algorithms. However, on the independent test data, the random forest algorithm’s accuracy dropped significantly. The best performance on the test data along with a small decrease in accuracy in comparison to the training data was delivered by the gradient boost machine learning algorithm. The most important variables predicting future limit-setting using the gradient boost machine algorithm were players receiving feedback that they had reached 80% of their personal monthly global loss limit, personal monthly loss limit, the amount bet, theoretical loss, and whether the players had increased their limits in the past. With the help of predictive analytics, players with a high likelihood of changing their limits can be proactively approached.
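
A random 80/20 split like the one described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `train_test_split`, the fixed seed, and the 1,000 hypothetical player IDs are not from the study, and the exact set sizes in any real split depend on how the cut point is rounded.

```python
import random


def train_test_split(ids, train_fraction=0.8, seed=0):
    """Shuffle the IDs reproducibly and cut them into train and test lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)        # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)    # index where the test set starts
    return shuffled[:cut], shuffled[cut:]


# Hypothetical player IDs, for illustration only.
player_ids = range(1000)
train, test = train_test_split(player_ids)
print(len(train), len(test))  # prints 800 200
```

Keeping the test players entirely out of model fitting is what makes a drop in accuracy on the test data, as reported for the random forest, visible at all.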

Coupling simulation with artificial neural networks for the optimisation of HVAC controls in manufacturing environments

Optimization and Engineering ◽

10.1007/s11081-020-09567-y ◽

2020 ◽

Author(s):

Victoria Jayne Mawson ◽

Ben Richard Hughes

Keyword(s):

Machine Learning ◽

Energy Consumption ◽

Energy Demand ◽

Large Data ◽

Electrical Energy ◽

Building Energy ◽

Machine Learning Algorithms ◽

Data Sets ◽

Unseen Data ◽

Manufacturing Environments

Manufacturing remains one of the most energy-intensive sectors; additionally, the energy used within buildings for heating, ventilation and air conditioning (HVAC) is responsible for almost half of the UK’s energy demand. Commonly, these are analysed in isolation from one another. Use of machine learning is gaining popularity due to its ability to solve non-linear problems with large data sets and little knowledge about relationships between parameters. Such models use relationships between inputs and outputs to make further predictions on unseen data, without requiring any understanding of the system, making them highly suited to dealing with the stochastic data sets found in a manufacturing environment. In the literature, this has been applied to determining electrical energy demand for residential or commercial buildings rather than manufacturing environments. This study proposes a novel method of coupling simulation with machine learning to predict indoor workshop conditions and building energy demand in response to production schedules, outdoor conditions, building behaviour and use. Such predictions can subsequently allow for more efficient management of HVAC systems. Based upon predicted energy consumption, potential spikes were identified and manufacturing schedules subsequently optimised to reduce peak energy demand. Coupling simulation techniques with machine learning algorithms eliminates the requirement for costly and intrusive methods of data collection, providing a method of predicting and optimising building energy consumption in the manufacturing sector.

Prediction of Potential Future IT Personnel in Bangladesh using Machine Learning Classifier

Global Disclosure of Economics and Business ◽

10.18034/gdeb.v6i1.112 ◽

2017 ◽

Vol 6 (1) ◽

pp. 7-18

Author(s):

Md. Hasnat Parvez ◽

Most. Moriom Khatun ◽

Sayed Mohsin Reza ◽

Md. Mahfujur Rahman ◽

Md. Fazlul Karim Patwary

Keyword(s):

Machine Learning ◽

Random Forest ◽

Direct Analysis ◽

Machine Learning Algorithms ◽

Training Data ◽

Accuracy Measurement ◽

Learning Classifier ◽

Future Potential ◽

Roc Area ◽

It Personnel

Bangladesh is one of the most promising developing countries in the IT sector, where people from several disciplines and with varied experience are involved. However, no direct analysis of this sector has yet been published that provides a proper guideline for predicting future IT personnel. Because this is not a simple problem, training data from the real IT sector are needed, and several classifiers must be trained to find the best results. Machine learning algorithms can be used to predict potential future IT personnel. In this paper, four classifiers (Naive Bayes, J48, Bagging, and Random Forest) are tested in five different folds for that prediction. The results show that Random Forest achieves better accuracy than the other tested classifiers for future IT personnel prediction. Standard accuracy measures such as Precision, Recall, F-Measure, and ROC Area are used to evaluate the results.

Machine Learning Assisted Design for Active Cathode Materials

Volume 3: Advanced Materials: Design, Processing, Characterization, and Applications ◽

10.1115/imece2020-23963 ◽

2020 ◽

Author(s):

Sihan Yong ◽

Zhuoyuan Zheng ◽

Pingfeng Wang ◽

Yumeng Li

Keyword(s):

Machine Learning ◽

Random Forest ◽

Mean Squared Error ◽

Computational Simulation ◽

Material Design ◽

Machine Learning Algorithms ◽

Training Data ◽

Coefficient Of Determination ◽

Crystal System ◽

Wide Range

The traditional ways of designing materials, including experimental measurement and computational simulation, are not efficient. Machine learning has been considered a promising solution for material design in recent years. By observing previous data, machine learning finds patterns, learns from them, and predicts material properties. In this study, machine learning methods are used to discover new cathodes with better properties, including crystal system learning and property prediction. K-fold cross-validation is used to find the best training data with a limited dataset; nevertheless, increasing the percentage of training data would ultimately result in better prediction performance. It is found that random forest gives the highest average accuracy in crystal system classification, while the extra randomized tree algorithm provides a higher average coefficient of determination and lower mean squared error in the regression model predicting electrical properties of cathodes. The random forest algorithm is chosen from a wide range of machine learning algorithms with the implementation of Monte Carlo validation. Based on the feature importance evaluation, oxygen content is found to have the greatest effect in determining capacity gravity and volume change in property prediction.
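
The evaluation tools named above (k-fold cross-validation, mean squared error, and the coefficient of determination) can be sketched in plain Python. This is a minimal illustration, not the study's pipeline: the contiguous, unshuffled folds, the mean-value baseline predictor, and the toy target values are all assumptions made for the example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, unshuffled folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy targets (hypothetical); the "model" simply predicts the training mean.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
for train_idx, val_idx in k_fold_indices(len(y), k=3):
    baseline = sum(y[i] for i in train_idx) / len(train_idx)
    y_val = [y[i] for i in val_idx]
    print(val_idx, mean_squared_error(y_val, [baseline] * len(y_val)))
```

Each sample is used for validation exactly once, which is why k-fold cross-validation is a common choice when the dataset is limited.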

Predicting Solid Particle Erosion and Uncertainty in Elbows by Artificial Intelligence Methods

Volume 2: Fluid Mechanics; Multiphase Flows ◽

10.1115/fedsm2020-20458 ◽

2020 ◽

Author(s):

Soroor Karimi ◽

Bohan Xu ◽

Alireza Asgharpour ◽

Siamack A. Shirazi ◽

Sandip Sen

Keyword(s):

Machine Learning ◽

Random Forest ◽

Elastic Net ◽

Machine Learning Algorithms ◽

Superficial Velocity ◽

Training Data ◽

Support Vector ◽

Good Prediction ◽

Data Set ◽

Erosion Prediction

AI approaches include machine learning algorithms in which models are trained from existing data to predict the behavior of the system for previously unseen cases. Recent studies at the Erosion/Corrosion Research Center (E/CRC) have shown that these methods can be quite effective in predicting erosion. However, these methods are not widely used in the engineering industries due to the lack of work and information in this area. Moreover, in most of the available literature, the reported models and results have not been rigorously tested. This fact suggests that these models cannot be fully trusted for the applications for which they are trained. Therefore, in this study three machine learning models, including Elastic Net, Random Forest and Support Vector Machine (SVM), are utilized to increase the confidence in these tools. First, these models are trained with a training data set. Next, the model hyper-parameters are optimized by using nested cross validation. Finally, the results are verified with a test data set. This process is repeated several times to assure the accuracy of the results. In order to be able to predict the erosion under different conditions with these three models, six main variables are considered in the training data set. These variables include material hardness, pipe diameter, particle size, liquid viscosity, liquid superficial velocity, and gas superficial velocity. All three studied models show good prediction performance. The Random Forest and SVM approaches, however, show slightly better results compared to Elastic Net. The performance of these models is compared to both CFD erosion simulation results and Sand Production Pipe Saver (SPPS) results, SPPS being a mechanistic erosion prediction software developed at the E/CRC. The comparison shows that the SVM prediction has a better match with both CFD and SPPS. The application of AI models to determine the uncertainty of calculated erosion is also discussed.
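
The nested cross-validation step mentioned above can be sketched as two loops: an inner loop that picks a hyper-parameter using development data only, and an outer loop that scores the tuned model on data never seen during tuning. The Python sketch below is an illustration under loud assumptions: a one-dimensional k-nearest-neighbour regressor stands in for the study's Elastic Net, Random Forest and SVM models, the data are synthetic, and all function names are hypothetical.

```python
def folds(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def knn_predict(train_x, train_y, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def mse(xs, ys, train_x, train_y, k):
    """Mean squared error of k-NN predictions on the points (xs, ys)."""
    errs = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errs) / len(errs)


def cv_error(xs, ys, k_neighbors, n_folds):
    """Mean validation MSE of k-NN across n_folds folds."""
    total = 0.0
    for val in folds(len(xs), n_folds):
        val_set = set(val)
        tr = [i for i in range(len(xs)) if i not in val_set]
        total += mse([xs[i] for i in val], [ys[i] for i in val],
                     [xs[i] for i in tr], [ys[i] for i in tr], k_neighbors)
    return total / n_folds


def nested_cv(xs, ys, candidates, outer_folds=3, inner_folds=3):
    """Return one (chosen hyper-parameter, outer test MSE) pair per outer fold."""
    results = []
    for test in folds(len(xs), outer_folds):
        test_set = set(test)
        dev = [i for i in range(len(xs)) if i not in test_set]
        dev_x, dev_y = [xs[i] for i in dev], [ys[i] for i in dev]
        # Inner loop: choose the hyper-parameter using development data only.
        best_k = min(candidates, key=lambda k: cv_error(dev_x, dev_y, k, inner_folds))
        # Outer loop: score the tuned model on the held-out outer fold.
        score = mse([xs[i] for i in test], [ys[i] for i in test], dev_x, dev_y, best_k)
        results.append((best_k, score))
    return results


# Synthetic data for illustration only (not erosion measurements).
xs = [float(i) for i in range(18)]
ys = [2.0 * x + 1.0 for x in xs]
for best_k, score in nested_cv(xs, ys, candidates=[1, 3, 5]):
    print(best_k, round(score, 3))
```

Separating tuning from scoring in this way keeps the reported error from being optimistically biased by the hyper-parameter search.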

Machine Learning Algorithms in Fraud Detection: Case Study on Retail Consumer Financing Company

Asia Pacific Fraud Journal ◽

10.21532/apfjournal.v6i2.216 ◽

2021 ◽

Vol 6 (2) ◽

pp. 213

Author(s):

Nadya Intan Mustika ◽

Bagus Nenda ◽

Dona Ramadhan

Keyword(s):

Machine Learning ◽

Random Forest ◽

Historical Data ◽

Learning Algorithm ◽

Fraud Detection ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

Random Forest Algorithm ◽

Data Set

This study aims to implement a machine learning algorithm for detecting fraud based on a historical data set from a retail consumer financing company. The output of the machine learning model is used as samples for the fraud detection team. Data analysis is performed through data processing, feature selection, hold-out methods, and accuracy testing. Five machine learning methods are applied in this study: Logistic Regression, K-Nearest Neighbor (KNN), Decision Tree, Random Forest, and Support Vector Machine (SVM). Historical data are divided into two groups: training data and test data. The results show that the Random Forest algorithm has the highest accuracy, with a training score of 0.994999 and a test score of 0.745437. This means that the Random Forest algorithm is the most accurate method for detecting fraud. Further research is suggested to add more predictor variables to increase accuracy and to apply this method to different financial institutions and different industries.
