Unleashing analytics to reduce electricity consumption using incremental clustering algorithm

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Archana Yashodip Chaudhari ◽  
Preeti Mulay

Purpose To reduce electricity consumption in our homes, a first step is to make the user aware of it. Reading a meter once a month is not enough; instead, real-time meter readings are required. A smart electricity meter (SEM) is capable of providing quick and exact meter readings in real time at regular intervals. An SEM generates a considerable amount of household electricity consumption data in an incremental manner. Such data embeds load patterns and hidden information from which consumer behavior can be extracted and learned. The load patterns extracted by data clustering should be updated because consumer behavior may change over time. The purpose of this study is to update the clustering results based on the old ones rather than to re-cluster all of the data from scratch. Design/methodology/approach This paper proposes an incremental clustering with nearness factor (ICNF) algorithm to update load patterns without clustering the overall daily load curves again. Findings Extensive experiments are conducted on real-world SEM data from the Irish Social Science Data Archive (Ireland) data set. The results are evaluated by both accuracy measures and clustering validity indices, which indicate that the proposed method is useful for exploiting the enormous amount of smart meter data to understand customers' electricity consumption behavior. Originality/value ICNF can provide an efficient response for electricity consumption pattern analysis to end consumers via SEMs.
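To illustrate the core idea of updating clusters from new data alone, here is a minimal Python sketch of nearness-factor-driven incremental clustering. The nearness definition, the threshold value and the synthetic load profiles are illustrative assumptions, not the published ICNF algorithm.

```python
# A minimal sketch of incremental clustering driven by a nearness factor.
# The nearness computation and threshold are illustrative assumptions,
# not the ICNF algorithm as published.
import numpy as np

def update_clusters(centroids, counts, new_points, nearness_threshold=0.4):
    """Assign each incoming point to its nearest centroid if the nearness
    factor exceeds the threshold; otherwise start a new cluster."""
    for x in new_points:
        if not centroids:
            centroids.append(x.copy())
            counts.append(1)
            continue
        dists = [np.linalg.norm(x - c) for c in centroids]
        i = int(np.argmin(dists))
        nearness = 1.0 / (1.0 + dists[i])        # assumed nearness factor
        if nearness >= nearness_threshold:
            counts[i] += 1                       # incremental centroid update,
            centroids[i] += (x - centroids[i]) / counts[i]  # no re-clustering
        else:
            centroids.append(x.copy())
            counts.append(1)
    return centroids, counts

# Usage: stream synthetic daily load curves chunk by chunk.
rng = np.random.default_rng(0)
profiles = rng.normal(scale=5.0, size=(3, 48))   # three household profiles
cents, cnts = [], []
for _ in range(3):                               # three incoming data chunks
    chunk = np.repeat(profiles, 30, axis=0) + rng.normal(scale=0.1, size=(90, 48))
    cents, cnts = update_clusters(cents, cnts, chunk)
print(len(cents), "clusters after streaming updates")  # expected: 3
```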

Kybernetes ◽  
2016 ◽  
Vol 45 (8) ◽  
pp. 1273-1291 ◽  
Author(s):  
Runhai Jiao ◽  
Shaolong Liu ◽  
Wu Wen ◽  
Biying Lin

Purpose The large volume of big data makes traditional clustering algorithms, which are usually designed for an entire data set, impractical. The purpose of this paper is to focus on incremental clustering, which divides data into a series of data chunks so that only a small amount of data needs to be clustered at each time. Few studies on incremental clustering address the problems of optimizing cluster center initialization for each data chunk and selecting multiple passing points for each cluster. Design/methodology/approach Through optimizing the initial cluster centers, the quality of the clustering results is improved for each data chunk, and the quality of the final clustering results is thereby enhanced. Moreover, through selecting multiple passing points, more accurate information is passed down to improve the final clustering results. A method has been proposed to solve those two problems and is applied in the proposed algorithm based on the streaming kernel fuzzy c-means (stKFCM) algorithm. Findings Experimental results show that the proposed algorithm achieves higher accuracy and better performance than the stKFCM algorithm. Originality/value This paper addresses the problem of improving the performance of incremental clustering through optimizing cluster center initialization and selecting multiple passing points. The paper analyzes the performance of the proposed scheme and proves its effectiveness.
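A minimal sketch of the chunked workflow described above, assuming plain k-means as a stand-in for the kernel fuzzy c-means step: each chunk is clustered starting from the previous chunk's centers (optimized initialization), and the few points nearest each center are carried forward as passing points. The selection rule and parameter values are assumptions.

```python
# A minimal sketch of incremental clustering over data chunks with optimized
# center initialization and multiple passing points per cluster. Plain k-means
# stands in for the paper's stKFCM-based method.
import numpy as np
from sklearn.cluster import KMeans

def cluster_stream(chunks, k=3, passes_per_cluster=5, seed=0):
    carried = None                       # passing points from earlier chunks
    centers = None
    for chunk in chunks:
        data = chunk if carried is None else np.vstack([carried, chunk])
        init = "k-means++" if centers is None else centers   # optimized init
        km = KMeans(n_clusters=k, init=init, n_init=1, random_state=seed).fit(data)
        centers = km.cluster_centers_
        reps = []                        # keep points nearest each center
        for j in range(k):
            members = data[km.labels_ == j]
            d = np.linalg.norm(members - centers[j], axis=1)
            reps.append(members[np.argsort(d)[:passes_per_cluster]])
        carried = np.vstack(reps)        # pass them to the next chunk
    return centers

rng = np.random.default_rng(1)
means = np.array([[0, 0], [8, 8], [0, 8]])
chunks = [rng.normal(size=(300, 2)) + means[rng.integers(0, 3, 300)]
          for _ in range(4)]
print(cluster_stream(chunks, k=3))
```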


2015 ◽  
Vol 9 (4) ◽  
pp. 451-470
Author(s):  
Desh Deepak Sharma ◽  
S.N. Singh

Purpose – This paper aims to detect abnormal energy use related to undetected consumption, theft, measurement errors, etc. The detection of irregular power consumption, with variation in the irregularities, helps electric utilities in planning and making strategies to transfer reliable and efficient electricity from generators to end-users. Abnormal peak load demand is one kind of aberration that needs to be detected. Design/methodology/approach – This paper proposes a Density-Based Micro Spatial Clustering of Applications with Noise (DBMSCAN) clustering algorithm, which is implemented for the identification of ranked irregular electricity consumption and the occurrence of peak and valley loads. In the proposed algorithm, two parameters, α and β, are introduced; on tuning these parameters after setting the global parameters, a varied number of micro-clusters and ranked irregular consumptions, respectively, are obtained. A new term, Irregularity Variance, is introduced into the suggested algorithm to capture variation in the irregular consumptions according to anomalous behaviors. Findings – No single set of global parameters in DBSCAN is found to be adequate for clustering the load pattern data of a practical system. The proposed DBMSCAN approach finds clustering results and ranked irregular consumption, such as different types of abnormal peak demands, sudden changes in demand, nearly zero demand, etc., with computational ease and without any iterative control method. Originality/value – DBMSCAN can be applied to any data set to find ranked outliers. It is an unsupervised clustering technique that finds the clustering results and ranked irregular consumptions while focusing on the analysis of, and variations in, anomalous behaviors in electricity consumption.
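As a rough illustration of density-based irregularity ranking, the sketch below uses standard DBSCAN (not the paper's DBMSCAN) on synthetic daily load data and ranks the noise points by their distance to the nearest core point. That score is an assumed stand-in for the paper's Irregularity Variance.

```python
# A minimal sketch of density-based detection and ranking of irregular
# consumption, using standard DBSCAN as a stand-in for DBMSCAN. The ranking
# score (distance to the nearest core point) is an illustrative assumption.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(2)
normal_days = rng.normal(loc=1.0, scale=0.05, size=(200, 24))  # hourly loads
anomalies = np.vstack([
    np.zeros((3, 24)),                                 # nearly zero demand
    rng.normal(loc=3.0, scale=0.05, size=(3, 24)),     # abnormal peak demand
])
X = np.vstack([normal_days, anomalies])

db = DBSCAN(eps=0.5, min_samples=5).fit(X)
core = X[db.core_sample_indices_]
noise_idx = np.where(db.labels_ == -1)[0]

# rank each irregular day by how far it sits from the nearest dense region
scores = [np.min(np.linalg.norm(core - X[i], axis=1)) for i in noise_idx]
for i, s in sorted(zip(noise_idx, scores), key=lambda t: -t[1]):
    print(f"day {i}: irregularity score {s:.2f}")
```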


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Rajit Nair ◽  
Santosh Vishwakarma ◽  
Mukesh Soni ◽  
Tejas Patel ◽  
Shubham Joshi

Purpose The novel coronavirus disease 2019 (COVID-19), which first appeared in December 2019 in the city of Wuhan, China, rapidly spread around the world and became a pandemic. It has had a devastating impact on daily lives, public health and the global economy. Positive cases must be identified as soon as possible to avoid further dissemination of the disease and to provide swift care to affected patients. The need for supportive diagnostic instruments has increased, as no specific automated toolkits are available. The latest results from radiology imaging techniques indicate that these images provide valuable details on the COVID-19 virus. Using advanced artificial intelligence (AI) technologies and radiological imagery can help diagnose this condition accurately and help compensate for the lack of specialist doctors in isolated areas. In this research, a new paradigm for the automatic detection of COVID-19 from raw chest X-ray images is presented. The proposed model, DarkCovidNet, is designed to provide accurate diagnostics for both binary classification (COVID vs no findings) and multi-class classification (COVID vs no findings vs pneumonia). The implemented model computed an average precision of 98.46% and 91.352% for the binary and multi-class classification, respectively, and an average accuracy of 98.97% and 87.868%. The DarkNet model, which serves as the classifier of the you-only-look-once (YOLO) real-time object detection system, was used as the basis of this research. A total of 17 convolutional layers, with different filters on each layer, have been implemented. This platform can be used by radiologists to verify their initial screening and can also be used for screening patients through the cloud. Design/methodology/approach This study uses the CNN-based model named Darknet-19, which acts as the platform for a real-time object detection system whose architecture is designed to detect objects in real time. The DarkCovidNet model was developed on the basis of the Darknet architecture with fewer layers and filters. Typically, the DarkNet architecture consists of five max-pooling layers and 19 convolution layers. Findings The work discussed in this paper is used to analyze various radiology images and to develop a model that can accurately predict or classify the disease. The data set used in this work consists of COVID-19 and non-COVID-19 images taken from various sources. The deep learning model named DarkCovidNet is applied to the data set and shows significant performance in both binary and multi-class classification. In binary classification, the model achieved an average accuracy of 98.97% for the detection of COVID-19, whereas in multi-class classification it achieved an average accuracy of 87.868% when classifying COVID-19, no findings and pneumonia. Research limitations/implications One significant limitation of this work is that a limited number of chest X-ray images was used, while the number of patients with COVID-19 is increasing rapidly. In the future, the model will be trained on a larger data set generated from local hospitals, and its performance on that data will be checked.
Originality/value Deep learning technology has made significant changes in the field of AI by generating good results, especially in pattern recognition. A typical CNN structure has a convolution layer that extracts features from the input with the filters it applies, a pooling layer that reduces the size for computational performance and a fully connected layer, which is a neural network. A CNN model is created by combining one or more such layers, and its internal parameters are adjusted to accomplish a particular task, such as classification or object recognition.
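For orientation, here is a heavily condensed PyTorch sketch of a DarkNet-style convolutional classifier for chest X-rays. The layer counts, channel widths and input size are illustrative assumptions; the published DarkCovidNet uses 17 convolutional layers with varying numbers of filters.

```python
# A condensed sketch of a DarkNet-style classifier for chest X-ray images.
# Layer counts and sizes are illustrative, not the published DarkCovidNet.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    """Darknet-style unit: 3x3 conv -> batch norm -> LeakyReLU."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.1),
    )

class TinyDarkNet(nn.Module):
    def __init__(self, n_classes=3):        # COVID vs no findings vs pneumonia
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, 8), nn.MaxPool2d(2),
            conv_block(8, 16), nn.MaxPool2d(2),
            conv_block(16, 32), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )

    def forward(self, x):
        return self.head(self.features(x))

model = TinyDarkNet()
xray = torch.randn(4, 1, 256, 256)           # a batch of grayscale X-rays
print(model(xray).shape)                     # torch.Size([4, 3])
```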


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Farnoush Bayatmakou ◽  
Azadeh Mohebi ◽  
Abbas Ahmadi

Purpose Query-based summarization approaches might not be able to provide summaries compatible with the user's information need, as they mostly rely on a limited source of information, usually represented as a single query from the user. This issue becomes even more challenging when dealing with scientific documents, as they contain more specific subject-related terms, while the user may not be able to express his/her specific information need in a query with limited terms. This study aims to propose an interactive multi-document text summarization approach that generates an eligible summary more compatible with the user's information need. The approach allows the user to interactively specify the composition of a multi-document summary. Design/methodology/approach The approach exploits the user's opinion in two stages. The initial query is refined by user-selected keywords/keyphrases and complete sentences extracted from the set of retrieved documents. This is followed by a novel method for sentence expansion using a genetic algorithm, and ranking of the final set of sentences using the maximal marginal relevance (MMR) method. For implementation, the Web of Science data set in the artificial intelligence (AI) category is used. Findings The proposed approach receives feedback from the user in terms of favorable keywords and sentences, and this feedback ultimately improves the summary. To assess the performance of the proposed system, 45 users, all graduate students in the field of AI, were asked to fill out a questionnaire. The quality of the final summary has also been evaluated from the user's perspective and in terms of information redundancy. The results show that the proposed approach leads to higher degrees of user satisfaction compared to variants with no interaction or only one step of interaction. Originality/value The interactive summarization approach goes beyond the initial user query by including the user's preferred keywords/keyphrases and sentences through a systematic interaction. Through these interactions, the system gains a clearer idea of the information the user is looking for and consequently adjusts the final result to the ultimate information need. Such interaction allows the summarization system to achieve a comprehensive understanding of the user's information needs while expanding context-based knowledge and guiding the user along his/her information journey.
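A minimal sketch of the MMR ranking step named above, with TF-IDF cosine similarity standing in for whatever sentence representation the authors used; the λ weight and the toy sentences are assumptions.

```python
# A minimal sketch of maximal marginal relevance (MMR): greedily pick sentences
# that are relevant to the query but not redundant with already-picked ones.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def mmr_rank(query, sentences, k=2, lam=0.7):
    vec = TfidfVectorizer().fit([query] + sentences)
    q = vec.transform([query])
    S = vec.transform(sentences)
    rel = cosine_similarity(S, q).ravel()      # relevance to the query
    sim = cosine_similarity(S)                 # sentence-to-sentence redundancy
    selected = []
    while len(selected) < k:
        rest = [i for i in range(len(sentences)) if i not in selected]
        scores = [lam * rel[i]
                  - (1 - lam) * (sim[i, selected].max() if selected else 0.0)
                  for i in rest]
        selected.append(rest[int(np.argmax(scores))])
    return [sentences[i] for i in selected]

sents = ["Genetic algorithms evolve candidate summaries.",
         "Genetic algorithms evolve candidate summary sentences.",
         "MMR balances relevance against redundancy."]
print(mmr_rank("summarization with genetic algorithms", sents))
```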


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Jiawei Lian ◽  
Junhong He ◽  
Yun Niu ◽  
Tianze Wang

Purpose Current popular image processing technologies based on convolutional neural networks have the characteristics of heavy computation, high storage cost and low accuracy for tiny defect detection, which conflicts with the high real-time performance and accuracy demanded by industrial applications under limited computing and storage resources. Therefore, an improved YOLOv4, named YOLOv4-Defect, is proposed to solve the above problems. Design/methodology/approach On the one hand, this study performs multi-dimensional compression processing on the feature extraction network of YOLOv4 to simplify the model, and improves the feature extraction ability of the compressed model through knowledge distillation. On the other hand, a prediction scale with a more detailed receptive field is added to optimize the model structure, which can improve the detection performance for tiny defects. Findings The effectiveness of the method is verified on the public data sets NEU-CLS and DAGM 2007, and on a steel ingot data set collected in an actual industrial setting. The experimental results demonstrate that the proposed YOLOv4-Defect method can greatly improve recognition efficiency and accuracy while reducing the size and computational cost of the model. Originality/value This paper proposes an improved YOLOv4, named YOLOv4-Defect, for surface defect detection, which is conducive to application in various industrial scenarios with limited storage and computing resources and meets the requirements of high real-time performance and precision.
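A minimal sketch of the knowledge distillation step mentioned above: the compressed student network is trained to match the temperature-softened outputs of the full teacher. The temperature, loss weighting and logit shapes are assumptions, not the paper's training setup.

```python
# A minimal sketch of knowledge distillation: blend hard-label cross-entropy
# with KL divergence to the teacher's temperature-softened distribution.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """alpha weights the hard-label term; the soft term is scaled by T*T,
    as is conventional, to keep gradient magnitudes comparable."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    return alpha * hard + (1 - alpha) * soft

student = torch.randn(8, 5, requires_grad=True)   # compressed-model logits
teacher = torch.randn(8, 5)                       # full-model logits
labels = torch.randint(0, 5, (8,))
print(distillation_loss(student, teacher, labels))
```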


2021 ◽  
Vol 11 (22) ◽  
pp. 10596
Author(s):  
Chung-Hong Lee ◽  
Hsin-Chang Yang ◽  
Yenming J. Chen ◽  
Yung-Lin Chuang

Recently, detecting real-world events in real time from Twitter messages through algorithmic computation has become a new paradigm in the field of data science applications. During a high-impact event, people may want to know the latest information about its development so they can better understand the situation and likely trends when making decisions. However, in emergencies the government or enterprises are often unable to notify people in time to provide early warnings and avoid risks. A sensible solution is to integrate real-time event monitoring and intelligence gathering functions into their decision support systems. Such a system can provide real-time event summaries, which are updated whenever important new events are detected. Therefore, in this work, we combine a previously developed Twitter-based real-time event detection algorithm with pre-trained language models for summarizing emergent events. We used an online text-stream clustering algorithm and a self-adaptive method to gather Twitter data for the detection of emerging events. Subsequently, we used the XSum data set with a pre-trained language model, namely the T5 model, to train the summarization model. ROUGE metrics were used to compare the summarization performance of the various models. We then used the trained model to summarize the incoming Twitter data set for experimentation. In particular, we provide a real-world case study, namely the COVID-19 pandemic, to verify the applicability of the proposed method. Finally, we conducted a survey in which human judges assessed the quality of example generated summaries. The case study and experimental results demonstrate that our summarization method gives users a feasible way to quickly understand updates in a specific event based on the real-time summary of the event story.
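The paper fine-tunes T5 on XSum; the snippet below merely shows the inference interface by loading an off-the-shelf T5 checkpoint through the Hugging Face transformers pipeline. The model name and example tweets are placeholders.

```python
# A minimal sketch of T5-based summarization of an aggregated tweet stream,
# using a plain pre-trained checkpoint rather than the paper's fine-tuned model.
from transformers import pipeline

summarizer = pipeline("summarization", model="t5-small")

tweets = " ".join([
    "Health officials report a sharp rise in COVID-19 cases downtown.",
    "Hospitals are expanding capacity as admissions climb.",
    "Authorities urge residents to follow updated safety guidance.",
])
print(summarizer(tweets, max_length=30, min_length=10)[0]["summary_text"])
```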


2020 ◽  
Vol 13 (3) ◽  
pp. 261-282
Author(s):  
Mohammad Khalid Pandit ◽  
Roohie Naaz Mir ◽  
Mohammad Ahsan Chishti

Purpose The intelligence in the Internet of Things (IoT) can be embedded by analyzing the huge volumes of data it generates in an ultralow-latency environment. The computational latency incurred by a cloud-only solution can be significantly brought down by a fog computing layer, which offers a computing infrastructure that minimizes the latency of service delivery and execution. For this purpose, a task scheduling policy based on reinforcement learning (RL) is developed that can achieve optimal resource utilization as well as minimum task execution time, and significantly reduce communication costs during distributed execution. Design/methodology/approach To realize this, the authors propose a two-level neural network (NN)-based task scheduling system, where the first-level NN (feed-forward neural network/convolutional neural network [FFNN/CNN]) determines whether a data stream should be analyzed (executed) in the resource-constrained environment (edge/fog) or be forwarded directly to the cloud. The second-level NN (RL module) schedules all the tasks sent by the level-1 NN to the fog layer among the available fog devices. This real-time task assignment policy is used to minimize the total computational latency (makespan) as well as communication costs. Findings Experimental results indicate that the RL technique works better than a computationally infeasible greedy approach for task scheduling, and the combination of RL and a task clustering algorithm reduces the communication costs significantly. Originality/value The proposed algorithm fundamentally solves the problem of task scheduling in real-time fog-based IoT with best resource utilization, minimum makespan and minimum communication cost between the tasks.
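A minimal tabular sketch of the second-level RL scheduler, assuming Q-learning over a discretized load state with negative latency as reward; the state, reward and load dynamics are illustrative assumptions, not the paper's NN-based module.

```python
# A minimal tabular Q-learning sketch: an agent assigns each incoming task to
# one of several fog devices and is rewarded for low completion latency.
import numpy as np

rng = np.random.default_rng(3)
n_devices, n_load_levels = 4, 5
Q = np.zeros((n_load_levels, n_devices))      # state = discretized mean load
loads = np.zeros(n_devices)

def step(action, task_size):
    latency = task_size * (1.0 + loads[action])    # busier device -> slower
    loads[action] = 0.9 * loads[action] + 0.1 * task_size
    return -latency                                 # reward: negative latency

eps, lr, gamma = 0.1, 0.5, 0.9
for _ in range(5000):
    s = min(int(loads.mean() * n_load_levels), n_load_levels - 1)
    a = rng.integers(n_devices) if rng.random() < eps else int(np.argmax(Q[s]))
    r = step(a, task_size=rng.uniform(0.5, 1.5))
    s2 = min(int(loads.mean() * n_load_levels), n_load_levels - 1)
    Q[s, a] += lr * (r + gamma * Q[s2].max() - Q[s, a])   # Q-learning update

print("learned device preference per load level:", Q.argmax(axis=1))
```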


2020 ◽  
Vol 22 (1) ◽  
pp. 119-134
Author(s):  
Theodoros Anagnostopoulos ◽  
Chu Luo ◽  
Jino Ramson ◽  
Klimis Ntalianis ◽  
Vassilis Kostakos ◽  
...  

Purpose The purpose of this paper is to propose a distributed smartphone sensing-enabled system, which assumes an intelligent transport signaling (ITS) infrastructure that operates the traffic lights in a smart city (SC). The system is able to handle priorities between groups of cyclists (crowd-cycling) and traffic when approaching traffic lights at road junctions. Design/methodology/approach The system takes into consideration the normal probability density function (PDF) and analytics computed for a certain group of cyclists (i.e. crowd-cycling). An inference model is built based on real-time spatiotemporal data of the cyclists. As the system is highly distributed, both physically (i.e. the locations of the cyclists) and logically (i.e. different threads), the problem is treated under the umbrella of multi-agent systems (MAS) modeling. The proposed model is experimentally evaluated by incorporating a real GPS trace data set from the SC of Melbourne, Australia. The MAS model is applied to the data set according to the quantitative and qualitative criteria adopted. Cyclist satisfaction (CS) is defined as a function that measures the satisfaction of the cyclists: the case where the cyclists wait the least amount of time at traffic lights and move as fast as they can toward their destination. ITS system satisfaction (SS) is defined as a function that measures the satisfaction of the ITS system: the case where the system serves the maximum number of cyclists with the fewest transitions between the lights. Smart city satisfaction (SCS) is defined as a function that measures the overall satisfaction of the cyclists and the ITS system in the SC based on CS and SS. SCS defines three SC policies (SCP): if CS is maximum and SS is minimum, the SC is cyclist-friendly (SCP1); if CS and SS are both average, the SC is equally cyclist- and ITS-system-friendly (SCP2); and if CS is minimum and SS is maximum, the SC is ITS-system-friendly (SCP3). Findings Results are promising for the integration of the proposed system with contemporary SCs, as stakeholders are able to choose between the proposed SCPs according to the SC infrastructure. More specifically, cyclist-friendly SCs can adopt SCP1, SCs that treat cyclists and the ITS system equally can adopt SCP2 and ITS-friendly SCs can adopt SCP3. Originality/value The proposed approach uses the internet connectivity available in modern smartphones, which gives users control over the data they provide, to obviate the installation of additional sensing infrastructure. It extends related work by assuming an ITS system that turns traffic lights green by considering the normal PDF and the analytics computed for a certain group of cyclists. The inference model is built based on the real-time spatiotemporal data of the cyclists. As the system is highly distributed, both physically (i.e. the locations of the cyclists) and logically (i.e. different threads), it is treated under the umbrella of MAS. MAS has been used in the literature to model complex systems by incorporating intelligent agents. In this study, the authors treat agents as proxy threads running in the cloud, as they require computational power not available on smartphones.
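A toy sketch of how the three satisfaction measures could drive policy selection; the concrete formulas and thresholds below are invented for illustration, since the paper defines CS, SS and SCS over real spatiotemporal cyclist data.

```python
# A toy sketch of CS/SS-driven policy selection. Formulas and thresholds are
# illustrative assumptions, not the paper's definitions.
def cyclist_satisfaction(wait_s, max_wait_s=120.0):
    """Higher when cyclists wait less at the junction (range 0..1)."""
    return max(0.0, 1.0 - wait_s / max_wait_s)

def system_satisfaction(cyclists_served, light_transitions):
    """Higher when more cyclists are served per signal transition."""
    return cyclists_served / (cyclists_served + light_transitions)

def smart_city_policy(cs, ss):
    if cs > 0.66 and ss < 0.33:
        return "SCP1: cyclist-friendly"
    if ss > 0.66 and cs < 0.33:
        return "SCP3: ITS-system-friendly"
    return "SCP2: equally cyclist- and ITS-friendly"

cs = cyclist_satisfaction(wait_s=20)                       # short waits
ss = system_satisfaction(cyclists_served=30, light_transitions=70)
print(round(cs, 2), round(ss, 2), "->", smart_city_policy(cs, ss))
```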


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Shuai Luo ◽  
Hongwei Liu ◽  
Ershi Qi

Purpose The purpose of this paper is to recognize and label the faults in wind turbines with a new density-based clustering algorithm, named the contour density scanning clustering (CDSC) algorithm. Design/methodology/approach The algorithm includes four components: (1) computation of neighborhood density, (2) selection of core and noise data, (3) scanning of core data and (4) updating of clusters. The proposed algorithm considers the relationship between neighborhood data points according to a contour density scanning strategy. Findings The first experiment is conducted with artificial data to validate that the proposed CDSC algorithm is suitable for handling data points with arbitrary shapes. The second experiment, with industrial gearbox vibration data, demonstrates the time complexity and accuracy of the proposed CDSC algorithm in comparison with conventional clustering algorithms, including k-means, density-based spatial clustering of applications with noise, density peak clustering, neighborhood grid clustering, support vector clustering, random forest, core fusion-based density peak clustering, AdaBoost and extreme gradient boosting. The third experiment, conducted with an industrial bearing vibration data set, highlights that the CDSC algorithm can automatically track the emerging fault patterns of bearings in wind turbines over time. Originality/value Data points with different densities are clustered using three strategies: direct density reachability, density reachability and density connectivity. A contour density scanning strategy is proposed to determine whether data points with the same density belong to one cluster. The proposed CDSC algorithm achieves automatic clustering, which means that the trends of the fault pattern can be tracked.
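A minimal sketch of component (1), neighborhood density computation, using a fixed-radius neighbor count as the density estimate and a quantile cut for component (2); the radius, quantile and synthetic vibration features are assumptions, and the contour scanning strategy itself is not reproduced.

```python
# A minimal sketch of neighborhood density computation and core/noise
# selection, the first two components of a density-based clustering pipeline.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.3, size=(100, 2)),    # dense fault pattern
               rng.normal(4, 1.5, size=(100, 2))])   # sparser pattern

radius = 0.5
nn = NearestNeighbors(radius=radius).fit(X)
density = np.array([len(idx) - 1 for idx in          # neighbors within radius,
                    nn.radius_neighbors(X)[1]])      # excluding the point itself

core = density >= np.quantile(density, 0.5)          # core vs low-density split
print("core points:", core.sum(), "low-density points:", (~core).sum())
```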


2013 ◽  
Vol 380-384 ◽  
pp. 753-756
Author(s):  
Xiao Feng Li ◽  
Wei Wei Gao ◽  
Xue Mei Wang

The use of spatial clustering technology has important practical significance for obtaining useful information. Based on the characteristics of real-time city tunnel traffic, this paper puts forward ECRT (Entropy-based City Tunnel Real-time clustering). The algorithm treats the objects associated with each city tunnel as real-time traffic attributes, calculates the information entropy between city tunnels and clusters the real-time urban tunnel traffic according to changes in information entropy. The ECRT algorithm was tested on an actual data set, and the results show that it is effective.
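A toy sketch of the entropy computation that could drive such clustering: tunnels whose traffic-level distributions barely change the joint entropy when merged are grouped together. The counts and merge rule are invented for illustration; the paper's exact formulation is not reproduced.

```python
# A toy sketch of entropy-driven grouping: merging similar traffic
# distributions barely changes entropy; merging dissimilar ones raises it.
import numpy as np

def entropy(counts):
    """Shannon entropy (bits) of a discrete traffic-level distribution."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    return float(-(p * np.log2(p)).sum())

tunnel_a = [90, 5, 5]      # counts of low/medium/high traffic readings
tunnel_b = [88, 6, 6]      # nearly identical distribution to a
tunnel_c = [5, 5, 90]      # very different distribution

for name, other in [("b", tunnel_b), ("c", tunnel_c)]:
    gain = entropy(np.add(tunnel_a, other)) - entropy(tunnel_a)
    print(f"merge a+{name}: entropy change {gain:+.3f}")
# merging a with b changes entropy little (same cluster); merging a with c
# increases entropy sharply (different clusters).
```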

