NILMPEds: A Performance Evaluation Dataset for Event Detection Algorithms in Non-Intrusive Load Monitoring

Data ◽  
2019 ◽  
Vol 4 (3) ◽  
pp. 127 ◽  
Author(s):  
Lucas Pereira

Datasets are important for researchers to build and test models, as well as to reproduce the experiments of others. This data paper presents the NILM Performance Evaluation dataset (NILMPEds), which is aimed primarily at research reproducibility in the field of non-intrusive load monitoring (NILM). This initial release of NILMPEds is dedicated to event detection algorithms and comprises ground-truth data for four test datasets, the specification of 47,950 event detection models, the power events returned by each model on the four test datasets, and the performance of each individual model according to 31 performance metrics.
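To illustrate how detected power events can be scored against ground truth with metrics like those collected in NILMPEds, here is a minimal sketch; the greedy matching rule and the one-second tolerance are illustrative assumptions, not the dataset's official evaluation procedure.

```python
# Illustrative sketch of scoring event-detection output against ground
# truth; the matching rule and tolerance are assumptions for illustration.
def score_events(detected, truth, tolerance=1.0):
    """Match detected event timestamps to ground-truth timestamps
    within `tolerance` seconds and return precision, recall, F1."""
    truth_left = sorted(truth)
    tp = 0
    for t in sorted(detected):
        # Greedily pair each detection with an unmatched truth event.
        match = next((g for g in truth_left if abs(g - t) <= tolerance), None)
        if match is not None:
            tp += 1
            truth_left.remove(match)
    fp = len(detected) - tp
    fn = len(truth) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: three detections, one spurious; three truth events, one missed.
print(score_events(detected=[10.2, 35.0, 61.5], truth=[10.0, 35.4, 48.0]))
```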

2021 ◽  
pp. 151-161
Author(s):  
Dominik Filipiak ◽  
Anna Fensel ◽  
Agata Filipowska

Knowledge graphs are used as a source of prior knowledge in numerous computer vision tasks. However, such an approach requires a mapping between ground-truth data labels and the target knowledge graph. We linked the ILSVRC 2012 dataset (often simply referred to as ImageNet) labels to Wikidata entities. This enables using the rich knowledge graph structure and contextual information for several computer vision tasks traditionally benchmarked with ImageNet and its variations. For instance, in few-shot learning classification scenarios with neural networks, this mapping can be leveraged for weight initialisation, which can improve the final performance metrics. We mapped all 1000 ImageNet labels: 461 were already directly linked with the exact match property (P2888), 467 have exact match candidates, and 72 cannot be matched directly. For these 72 labels, we discuss different problem categories stemming from the inability to find an exact match. Semantically close non-exact match candidates are presented as well. The mapping is publicly available at https://github.com/DominikFilipiak/imagenet-to-wikidata-mapping.
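As a companion illustration, the following sketch queries the public Wikidata SPARQL endpoint for items whose exact match property (P2888) points at a given external URI; the synset URI below is a hypothetical example, so consult the published mapping for the exact form it uses.

```python
# A minimal sketch of looking up Wikidata items via the "exact match"
# property (P2888). The synset URI format is an assumption for illustration.
import requests

WDQS = "https://query.wikidata.org/sparql"

def items_matching(uri):
    query = """
    SELECT ?item ?itemLabel WHERE {
      ?item wdt:P2888 <%s> .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }""" % uri
    r = requests.get(WDQS, params={"query": query, "format": "json"},
                     headers={"User-Agent": "imagenet-mapping-example/0.1"})
    r.raise_for_status()
    return [(b["item"]["value"], b["itemLabel"]["value"])
            for b in r.json()["results"]["bindings"]]

# Hypothetical synset URI; WordNet offset n02084071 denotes "dog".
print(items_matching("http://image-net.org/synset/n02084071"))
```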


Author(s):  
Himansu Sekhar Pattanayak ◽  
Harsh K. Verma ◽  
Amrit Lal Sangal

Community detection is a pivotal part of network analysis and is classified as an NP-hard problem. In this paper, a novel community detection algorithm is proposed, which probabilistically predicts community diameters using the local information of random seed nodes. The gravitation method is then applied to discover the communities surrounding the seed nodes, and the individual communities are combined to obtain the community structure of the whole network. The proposed algorithm, named the Local Gravitational Community Detection Algorithm (LGCDA), can also handle overlapping communities. LGCDA is evaluated based on quality metrics and ground-truth data by comparing it with some of the most widely used community detection algorithms on synthetic and real-world networks.
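The paper's exact formulation is not reproduced here, but a toy sketch conveys the gravitational intuition behind seed-based community growth: node "mass" is approximated by degree, and attraction decays with the square of the shortest-path distance. The radius and threshold parameters are arbitrary illustrative choices, not values from the paper.

```python
# A toy sketch of the gravitational idea behind seed-based community
# detection (not LGCDA itself): each node is attracted to a seed with
# force ~ mass_seed * mass_node / distance^2, where mass is node degree
# and distance is shortest-path length.
import networkx as nx

def gravity_community(G, seed, radius=2, threshold=8.0):
    """Grow a community around `seed` from nodes within `radius` hops
    whose gravitational pull toward the seed exceeds `threshold`."""
    lengths = nx.single_source_shortest_path_length(G, seed, cutoff=radius)
    community = {seed}
    for node, dist in lengths.items():
        if node == seed:
            continue
        force = G.degree(seed) * G.degree(node) / dist ** 2
        if force >= threshold:
            community.add(node)
    return community

G = nx.karate_club_graph()
print(sorted(gravity_community(G, seed=0)))
```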


2021 ◽  
Vol 13 (13) ◽  
pp. 2619
Author(s):  
Joao Fonseca ◽  
Georgios Douzas ◽  
Fernando Bacao

In remote sensing, Active Learning (AL) has become an important technique to collect informative ground-truth data "on demand" for supervised classification tasks. Despite its effectiveness, it remains significantly reliant on user interaction, which makes it both expensive and time-consuming to implement. Most of the current literature focuses on optimising AL by modifying the selection criteria and the classifiers used. Although improvements in these areas will result in more effective data collection, the use of artificial data sources to reduce human-computer interaction remains unexplored. In this paper, we introduce a new component to the typical AL framework: the data generator, a source of artificial data that reduces the amount of user-labeled data required in AL. The proposed AL framework is implemented using Geometric SMOTE as the data generator. We compare the new AL framework to the original one using similar acquisition functions and classifiers over three AL-specific performance metrics on seven benchmark datasets. We show that this modification of the AL framework significantly reduces the cost and time requirements for a successful AL implementation in all of the datasets used in the experiment.
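A schematic sketch of the augmented AL loop follows: after each round of oracle labelling, a data generator oversamples the labelled pool before retraining. Plain SMOTE stands in here for Geometric SMOTE, and the uncertainty-based acquisition is just one of the acquisition functions the paper considers; the dataset, pool sizes, and oversampling targets are all illustrative.

```python
# Schematic AL loop with a data-generator step (SMOTE as a stand-in for
# Geometric SMOTE); all sizes and parameters are illustrative assumptions.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
labelled = list(range(30))                      # small initial labelled pool
pool = [i for i in range(len(X)) if i not in labelled]

clf = LogisticRegression(max_iter=1000)
for _ in range(5):                              # five AL iterations
    # Data generator step: synthesise extra labelled points per class.
    X_aug, y_aug = SMOTE(sampling_strategy={0: 40, 1: 40}, k_neighbors=3,
                         random_state=0).fit_resample(X[labelled], y[labelled])
    clf.fit(X_aug, y_aug)
    # Acquisition: query the pool sample the classifier is least sure about.
    probs = clf.predict_proba(X[pool])
    query = pool[int(np.argmin(np.abs(probs[:, 1] - 0.5)))]
    labelled.append(query)                      # oracle provides y[query]
    pool.remove(query)
```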


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Mehta ◽  
J Avila ◽  
S Niklitschek ◽  
F Fernandez ◽  
C Villagran ◽  
...  

Abstract
Background: As EKG interpretation paradigms shift to a physician-free milieu, accumulating massive quantities of distilled, pre-processed data becomes a must for machine learning techniques. In our pursuit of reducing ischemic times in STEMI management, we have improved our Artificial Intelligence (AI)-guided diagnostic tool by following a three-step approach: 1) increase accuracy by adding larger clusters of data; 2) increase the breadth of EKG classifications to provide more precise feedback and further refine the inputs, which ultimately reflects in better and more accurate outputs; 3) improve the algorithm's ability to discern between cardiovascular entities reflected in the EKG records.
Purpose: To bolster our algorithm's accuracy and reliability for electrocardiographic STEMI recognition.
Methods: Dataset: a total of 7,286 12-lead EKG records of 10-second length with a sampling frequency of 500 Hz, obtained from the Latin America Telemedicine Infarct Network from April 2014 to December 2019. This included the following balanced classes: angiographically confirmed STEMI, branch blocks, non-specific ST-T abnormalities, normal, and abnormal (200+ CPT codes, excluding those included in other classes). Labels of each record were manually checked by cardiologists to ensure precision (ground truth). Pre-processing: the first and last 250 samples were discarded to avoid a standardization pulse; an order-5 digital low-pass filter with a 35 Hz cut-off was applied; and, for each record, the mean was subtracted from each individual lead. Classification: the classes were "STEMI" and "Not-STEMI" (a combination of randomly sampled normal, branch block, non-specific ST-T abnormality, and abnormal records; 25% of each subclass). Training and testing: a 1-D convolutional neural network was trained and tested with a 90/10 dataset split, respectively. The last dense layer outputs a probability for each record of being STEMI or Not-STEMI. Additional testing was performed with a subset of the original complete dataset of unconfirmed STEMI. Performance indicators (accuracy, sensitivity, and specificity) were calculated for each model and compared with our findings from past experiments.
Results: Complete STEMI data: accuracy 95.9%, sensitivity 95.7%, specificity 96.5%. Confirmed STEMI: accuracy 98.1%, sensitivity 98.1%, specificity 98.1%. Data obtained in our previous experiments are shown below for comparison.
Conclusion(s): After the addition of clustered pre-processed data, all performance indicators for STEMI detection increased considerably between both confirmed STEMI datasets, while the complete STEMI dataset kept a strong and steady set of performance metrics compared with past results. These findings not only validate the consistency and reliability of our algorithm but also underscore the importance of curating a pristine dataset for this and any other AI-derived medical tool.
Funding Acknowledgement: Type of funding source: None
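The Methods section fixes enough parameters (500 Hz sampling, 250 samples trimmed at each end, an order-5 low-pass filter at 35 Hz, per-lead mean removal) that the pre-processing step can be sketched; the zero-phase filtfilt choice and the array shapes are assumptions for illustration, not the authors' exact code.

```python
# A sketch of the stated pre-processing pipeline under assumed shapes.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 500                                   # sampling frequency (Hz)

def preprocess(record):
    """record: array of shape (12, n_samples), one row per lead."""
    x = record[:, 250:-250]                # drop standardization pulse
    b, a = butter(5, 35, btype="low", fs=FS)
    x = filtfilt(b, a, x, axis=1)          # order-5 low-pass, 35 Hz cut-off
    return x - x.mean(axis=1, keepdims=True)  # remove per-lead mean

ekg = np.random.randn(12, 10 * FS)         # stand-in for a 10 s, 12-lead record
print(preprocess(ekg).shape)               # (12, 4500)
```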


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 279
Author(s):  
Rafael Padilla ◽  
Wesley L. Passos ◽  
Thadeu L. B. Dias ◽  
Sergio L. Netto ◽  
Eduardo A. B. da Silva

Recent outstanding results of supervised object detection in competitions and challenges are often associated with specific metrics and datasets. The evaluation of such methods applied in different contexts has increased the demand for annotated datasets. Annotation tools represent the location and size of objects in distinct formats, leading to a lack of consensus on the representation. Such a scenario often complicates the comparison of object detection methods. This work alleviates this problem along the following lines: (i) it provides an overview of the most relevant evaluation methods used in object detection competitions, highlighting their peculiarities, differences, and advantages; (ii) it examines the most commonly used annotation formats, showing how different implementations may influence the assessment results; and (iii) it provides a novel open-source toolkit supporting different annotation formats and 15 performance metrics, making it easy for researchers to evaluate the performance of their detection algorithms on most known datasets. In addition, this work proposes a new metric, also included in the toolkit, for evaluating object detection in videos, based on the spatio-temporal overlap between the ground-truth and detected bounding boxes.
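Most of the metrics such a toolkit implements reduce to the intersection-over-union (IoU) between ground-truth and detected boxes; a minimal sketch follows, using the (x1, y1, x2, y2) corner format, which is only one of the annotation conventions the paper surveys.

```python
# Minimal IoU computation for axis-aligned boxes in corner format.
def iou(box_a, box_b):
    """Both boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two partially overlapping boxes: intersection 25, union 175.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143
```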


Energies ◽  
2020 ◽  
Vol 13 (24) ◽  
pp. 6737
Author(s):  
Mohamed Aymane Ahajjam ◽  
Daniel Bonilla Licea ◽  
Chaimaa Essayeh ◽  
Mounir Ghogho ◽  
Abdellatif Kobbane

This paper consists of two parts: an overview of existing open datasets of electricity consumption and a description of the Moroccan Buildings' Electricity Consumption Dataset, the first of its kind, coined MORED. The new dataset comprises electricity consumption data from various Moroccan premises. Unlike existing datasets, MORED provides three main data components: whole-premises (WP) electricity consumption, individual-load (IL) ground-truth consumption, and fully labeled IL signatures, from affluent and disadvantaged neighborhoods. The WP consumption data were acquired at low rates (1/5 or 1/10 samples/s) from 12 households; the IL ground-truth data were acquired at similar rates from five households for extended durations; and the IL signature data were acquired at high and low rates (50,000 and 4 samples/s) from 37 different residential and industrial loads. In addition, the dataset encompasses non-intrusive load monitoring (NILM) metadata.


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 662
Author(s):  
Jonathan Moeyersons ◽  
John Morales ◽  
Amalia Villa ◽  
Ivan Castro ◽  
Dries Testelmans ◽  
...  

The electrocardiogram (ECG) is an important diagnostic tool for identifying cardiac problems. Nowadays, new ways to record ECG signals outside of the hospital are being investigated. A promising technique is capacitively coupled ECG (ccECG), which allows ECG signals to be recorded through insulating materials. However, as the ECG is no longer recorded in a controlled environment, this inevitably implies the presence of more artefacts. Artefact detection algorithms are used to detect and remove these artefacts. Typically, training a new algorithm requires a lot of ground-truth data, which is costly to obtain. As many labelled contact-ECG datasets exist, we can avoid labelling new ccECG signals by reusing this prior knowledge. Transfer learning can be used for this purpose. Here, we applied transfer learning to optimise the performance of an artefact detection model, trained on contact ECG, towards ccECG. We used ECG recordings from three different datasets, recorded with three recording devices. We showed that the accuracy of a contact-ECG classifier improved by 5 to 8% by means of transfer learning when tested on a ccECG dataset. Furthermore, we showed that only 20 segments of the ccECG dataset are sufficient to significantly increase the accuracy.
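A schematic sketch of this transfer-learning step: a classifier trained on contact ECG is adapted to ccECG by freezing its feature extractor and re-fitting only the final layer on a handful of labelled ccECG segments. The 1-D CNN architecture, the freezing strategy, and the segment length are illustrative assumptions, not the paper's exact model.

```python
# Sketch of fine-tuning a contact-ECG artefact classifier on ccECG data;
# the architecture and training details are assumptions for illustration.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(16, 2),                        # clean vs. artefact
)
# ... assume `model` was already trained on labelled contact ECG ...

for p in model[:4].parameters():             # freeze the feature extractor
    p.requires_grad = False

opt = torch.optim.Adam(model[4].parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# 20 labelled ccECG segments (random stand-in data), 1 channel each.
x = torch.randn(20, 1, 1000)
y = torch.randint(0, 2, (20,))
for _ in range(30):                          # brief fine-tuning loop
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```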

