Data-Driven Modelling of Smart Building Ventilation Subsystem

2019 ◽  
Vol 2019 ◽  
pp. 1-14 ◽  
Author(s):  
Grigore Stamatescu ◽  
Iulia Stamatescu ◽  
Nicoleta Arghira ◽  
Ioana Fagarasan

Considering the advances in building monitoring and control through networks of interconnected devices, effective handling of the associated rich data streams is becoming an important challenge. In many situations, the application of conventional system identification or approximate grey-box models, partly theoretic and partly data driven, is either unfeasible or unsuitable. The paper discusses and illustrates an application of black-box modelling achieved using data mining techniques for the purpose of smart building ventilation subsystem control. We present the implementation and evaluation of a data mining methodology on data collected over more than one year of operation. The case study is carried out on four air handling units of a modern campus building, as preliminary decision support for facility managers. The data processing and learning framework has two steps: raw data streams are compressed using the Symbolic Aggregate Approximation method, and the resulting segments are then fed into a Support Vector Machine algorithm. The results are useful for deriving the behaviour of each piece of equipment in its various modes of operation and can be built upon for fault detection or energy efficiency applications. Challenges related to online operation within a commercial Building Management System are also discussed, as the approach shows promise for deployment.
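The compression step named in the abstract, Symbolic Aggregate Approximation (SAX), can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sax`, the four-letter alphabet, and the requirement that the series length be divisible by the number of segments are all assumptions made for illustration. The subsequent Support Vector Machine step is omitted.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series into a SAX word, one letter per segment.

    Hypothetical helper for illustration; parameters are assumptions.
    """
    # 1. Z-normalise so that standard-normal breakpoints apply.
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        z = [0.0 for _ in series]
    else:
        z = [(x - mu) / sigma for x in series]

    # 2. Piecewise Aggregate Approximation: mean of each equal-width segment.
    #    This sketch only handles lengths divisible by n_segments.
    if n_segments <= 0 or len(z) % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")
    width = len(z) // n_segments
    paa = [mean(z[i * width:(i + 1) * width]) for i in range(n_segments)]

    # 3. Breakpoints that split N(0, 1) into equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # 4. Map each segment mean to the letter of the region it falls in.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


if __name__ == "__main__":
    # Hypothetical signal: a low plateau followed by a high plateau.
    print(sax([0.0] * 4 + [10.0] * 4, n_segments=2))
```

Each SAX word is a compact, discrete summary of one window of sensor data. Words like these (or counts of their letters) could then serve as input features for a classifier.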

Plants ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 95
Author(s):  
Heba Kurdi ◽  
Amal Al-Aldawsari ◽  
Isra Al-Turaiki ◽  
Abdulrahman S. Aldawood

In the past 30 years, the red palm weevil (RPW), Rhynchophorus ferrugineus (Olivier), a pest that is highly destructive to all types of palms, has rapidly spread worldwide. However, detecting infestation with the RPW is highly challenging because symptoms are not visible until the death of the palm tree is inevitable. In addition, the use of automated RPW identification tools to predict infestation is complicated by a lack of RPW datasets. In this study, we assessed the capability of 10 state-of-the-art data mining classification algorithms, Naive Bayes (NB), KSTAR, AdaBoost, bagging, PART, J48 Decision tree, multilayer perceptron (MLP), support vector machine (SVM), random forest, and logistic regression, to use plant-size and temperature measurements collected from individual trees to predict RPW infestation in its early stages, before significant damage is caused to the tree. The performance of the classification algorithms was evaluated in terms of accuracy, precision, recall, and F-measure using a real RPW dataset. The experimental results showed that infestations with RPW can be predicted using data mining with an accuracy of up to 93%, precision above 87%, recall of 100%, and F-measure greater than 93%. Additionally, we found that temperature and circumference are the most important features for predicting RPW infestation. However, we strongly call for collecting and aggregating more RPW datasets to run more experiments to validate these results and provide more conclusive findings.
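The four evaluation measures named in the abstract have standard definitions in terms of a binary confusion matrix. The sketch below shows those definitions. The function name and the example counts are hypothetical and are not taken from the study's dataset.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F-measure for a binary classifier.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical counts: every infested tree is found (fn = 0),
    # at the cost of a few false alarms.
    print(classification_metrics(tp=8, fp=2, fn=0, tn=10))
```

A recall of 100% means no infested tree was missed. Precision below 100% means some healthy trees were flagged, which matches the pattern of results reported in the abstract.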


Author(s):  
Efat Jabarpour ◽  
Amin Abedini ◽  
Abbasali Keshtkar

Introduction: Osteoporosis is a disease that reduces bone density and degrades the quality of bone microstructure, leading to an increased risk of fractures. It is one of the major causes of disability and death in elderly people. The current study aims to determine the factors influencing the incidence of osteoporosis and to provide a predictive model for diagnosing the disease, in order to increase diagnostic speed and reduce diagnostic costs. Methods: Individuals' data, including personal information, lifestyle, and disease information, were reviewed. A new model is presented based on the Cross-Industry Standard Process (CRISP) methodology. Support Vector Machine (SVM) and Bayesian methods (Tree Augmented Naïve Bayes, TAN) were used as data mining techniques, with Clementine12 as the data mining tool. Results: Several features were found to affect the disease. Rules were extracted that can be used as a pattern for predicting patients' status. Classification precision was 88.39% for SVM and 91.29% for TAN, making TAN more precise than the other methods. Conclusion: The most influential factors in osteoporosis were identified and can be used, for a new sample with defined characteristics, to predict the likelihood of osteoporosis in a person.


2021 ◽  
Vol 2 (2) ◽  
Author(s):  
Farid Sartipi ◽  

With the growing attention to smart buildings, local governments are seeking practical ways to optimize the energy consumption of commercial buildings. An ideal smart building is capable of monitoring its own energy consumption and adjusting the operation of electric devices, such as lighting and air conditioners, based on occupant behaviour. In this study, data were obtained from the monitoring sensors in a commercial building located in the heart of Sydney, from 2013 until 2020, at 15-minute intervals. Data derivation and analysis are currently static, which makes it difficult for building management to make instantaneous decisions about the measures to be taken to lower energy consumption. Using data analysis and visualization tools in Tableau, this study provides detailed insights into the trends in energy consumption in the given building. The outcomes facilitate decision making for building management and can be seen as a milestone towards a dynamic optimization protocol, which is introduced in the second part of this study.


Author(s):  
Mohammad M. Masud ◽  
Latifur Khan ◽  
Bhavani Thuraisingham

This chapter applies data mining techniques to detect email worms. Email messages contain a number of different features, such as the total number of words in the message body/subject, presence/absence of binary attachments, type of attachments, and so on. The goal is to obtain an efficient classification model based on these features. The solution consists of several steps. First, the number of features is reduced using two different approaches: feature selection and dimension reduction. This step is necessary to reduce noise and redundancy in the data. The feature-selection technique is called Two-phase Selection (TPS), which is a novel combination of a decision tree and a greedy selection algorithm. The dimension reduction is performed by Principal Component Analysis. Second, the reduced data is used to train a classifier. Different classification techniques have been used, such as Support Vector Machine (SVM), Naïve Bayes, and their combination. Finally, the trained classifiers are tested on a dataset containing both known and unknown types of worms. These results have been compared with published results. It is found that the proposed TPS selection along with SVM classification achieves the best accuracy in detecting both known and unknown types of worms.


2013 ◽  
Vol 16 (3) ◽  
pp. 671-689 ◽  
Author(s):  
Daniel J. Karran ◽  
Efrat Morin ◽  
Jan Adamowski

Despite the popularity of data-driven non-linear methods for forecasting streamflow, there has been no exploration of how well such models perform in climate regimes with differing hydrological characteristics, nor has the performance of these models, coupled with wavelet transforms, been compared for lead times of less than 1 month. This study compares the use of four different models, namely artificial neural networks (ANNs), support vector regression (SVR), wavelet-ANN, and wavelet-SVR, in a Mediterranean, an Oceanic, and a Hemiboreal watershed. Model performance was tested for 1-, 2- and 3-day forecasting lead times and measured by fractional standard error, the coefficient of determination, Nash–Sutcliffe model efficiency, multiplicative bias, probability of detection and false alarm rate. SVR-based models performed best overall, but no one model outperformed the others in more than one watershed, suggesting that some models may be more suitable for certain types of data. Overall model performance varied greatly between climate regimes, suggesting that higher persistence and slower hydrological processes (i.e. snowmelt, glacial runoff, and subsurface flow) support reliable forecasting using daily and multi-day lead times.
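Of the performance measures listed, the Nash–Sutcliffe model efficiency has a compact standard definition: one minus the ratio of the squared forecast error to the variance of the observations around their mean. A minimal sketch follows. The function name and the flow values are hypothetical and do not come from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2).

    A value of 1 is a perfect fit; 0 means the model is no better than
    always predicting the observed mean; negative values are worse than that.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    obs_mean = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - obs_mean) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("efficiency is undefined for constant observations")
    return 1.0 - residual / variance


if __name__ == "__main__":
    # Hypothetical daily streamflow values, for illustration only.
    obs = [1.0, 2.0, 3.0, 4.0]
    sim = [1.1, 1.9, 3.2, 3.8]
    print(nash_sutcliffe(obs, sim))
```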


Author(s):  
M. Jupri ◽  
Riyanarto Sarno

Achieving optimal tax revenue requires effective and efficient tax supervision, which can be supported by classifying taxpayers' compliance with tax regulations. Considering this issue, this paper proposes the classification of taxpayer compliance using data mining algorithms, i.e. C4.5, Support Vector Machine, K-Nearest Neighbor, Naive Bayes, and Multilayer Perceptron, based on taxpayer compliance data. Taxpayer compliance can be classified into four classes: (1) formal and material compliant taxpayers, (2) formal compliant taxpayers, (3) material compliant taxpayers, and (4) formal and material non-compliant taxpayers. Furthermore, the results of the data mining algorithms are compared using Fuzzy AHP and TOPSIS to determine the best-performing classifier based on the criteria of Accuracy, F-Score, and Time required. The priority of taxpayers for more detailed supervision at each level of taxpayer compliance is ranked using Fuzzy AHP and TOPSIS based on criteria drawn from the dataset variables. The results show that C4.5 is the best-performing classifier and achieves a preference value of 0.998, whereas the MLP algorithm yields the lowest preference value of 0.131. Alternative taxpayer A233 is the top priority taxpayer with a preference value of 0.433, whereas alternative taxpayer A051 is the lowest priority taxpayer with a preference value of 0.036.
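The preference values reported above are the kind of closeness score that TOPSIS produces. The sketch below shows the classical (crisp) TOPSIS procedure only. The Fuzzy AHP weighting step is not reproduced, so the criterion weights are simply passed in. The function name, the weights, and the example figures are hypothetical and are not taken from the paper.

```python
from math import sqrt


def topsis(matrix, weights, is_benefit):
    """Rank alternatives by closeness to the ideal solution (classical TOPSIS).

    matrix[i][j] is the score of alternative i on criterion j;
    is_benefit[j] is True when larger values are better, False for costs.
    Returns one preference value in [0, 1] per alternative.
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply its weight.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] if norms[j] else 0.0
         for j in range(n_crit)]
        for row in matrix
    ]
    # Ideal and anti-ideal value per criterion.
    cols = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(cols, is_benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(cols, is_benefit)]
    scores = []
    for row in weighted:
        d_pos = sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_neg = sqrt(sum((v - a) ** 2 for v, a in zip(row, anti)))
        total = d_pos + d_neg
        # If all alternatives are identical, both distances are zero;
        # this sketch then returns 0.0 by convention.
        scores.append(d_neg / total if total else 0.0)
    return scores


if __name__ == "__main__":
    # Hypothetical classifiers scored on (accuracy, F-score, time in seconds).
    table = [[0.90, 0.88, 1.0], [0.85, 0.80, 3.0], [0.70, 0.65, 9.0]]
    print(topsis(table, weights=[0.4, 0.4, 0.2],
                 is_benefit=[True, True, False]))
```

An alternative that is best on every criterion scores 1, and one that is worst on every criterion scores 0.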


2021 ◽  
Vol 11 (19) ◽  
pp. 9096
Author(s):  
Idongesit Ekerete ◽  
Matias Garcia-Constantino ◽  
Alexandros Konios ◽  
Mustafa A. Mustafa ◽  
Yohanca Diaz-Skeete ◽  
...  

This paper proposes the fusion of Unobtrusive Sensing Solutions (USSs) for human Activity Recognition and Classification (ARC) in home environments. It also considers the use of data mining models and methods for cluster-based analysis of datasets obtained from the USSs. The ability to recognise and classify activities performed in home environments can help monitor health parameters in vulnerable individuals. This study addresses five principal concerns in ARC: (i) users’ privacy, (ii) wearability, (iii) data acquisition in a home environment, (iv) actual recognition of activities, and (v) classification of activities from single to multiple users. Timestamp information from contact sensors mounted at strategic locations in a kitchen environment helped obtain the time, location, and activity of 10 participants during the experiments. A total of 11,980 thermal blobs gleaned from privacy-friendly USSs such as ceiling and lateral thermal sensors were fused using data mining models and methods. Experimental results demonstrated cluster-based activity recognition, classification, and fusion of the datasets with an average regression coefficient of 0.95 for tested features and clusters. In addition, a pooled mean accuracy of 96.5% was obtained on the evaluation test using classification-by-clustering and statistical methods for models such as Neural Network, Support Vector Machine, K-Nearest Neighbour, and Stochastic Gradient Descent.


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Rebecca Wolf ◽  
Joseph M. Reilly ◽  
Steven M. Ross

Purpose: This article informs school leaders and staffs about existing research findings on the use of data-driven decision-making in creating class rosters. Given that teachers are the most important school-based educational resource, decisions regarding the assignment of students to particular classes and teachers are highly impactful for student learning. Classroom compositions of peers can also influence student learning. Design/methodology/approach: A literature review was conducted on the use of data-driven decision-making in the rostering process. The review addressed the merits of using various quantitative metrics in the rostering process. Findings: Findings revealed that, despite often being purposeful about rostering, school leaders and staffs have generally not engaged in data-driven decision-making in creating class rosters. Using data-driven rostering may have benefits, such as limiting the questionable practice of assigning the least effective teachers in the school to the youngest or lowest performing students. School leaders and staffs may also work to minimize negative peer effects due to concentrating low-achieving, low-income, or disruptive students in any one class. Any data-driven system used in rostering, however, would need to be adequately complex to account for multiple influences on student learning. Based on the research reviewed, quantitative data alone may not be sufficient for effective rostering decisions. Practical implications: Given the rich data available to school leaders and staffs, data-driven decision-making could inform rostering and contribute to more efficacious and equitable classroom assignments. Originality/value: This article is the first to summarize relevant research across multiple bodies of literature on the opportunities for and challenges of using data-driven decision-making in creating class rosters.

