NILMPEds: A Performance Evaluation Dataset for Event Detection Algorithms in Non-Intrusive Load Monitoring

Data ◽  
2019 ◽  
Vol 4 (3) ◽  
pp. 127 ◽  
Author(s):  
Lucas Pereira

Datasets are important for researchers to build and test models, as well as to reproduce the experiments of others. This data paper presents the NILM Performance Evaluation dataset (NILMPEds), which is aimed primarily at research reproducibility in the field of non-intrusive load monitoring (NILM). This initial release of NILMPEds is dedicated to event detection algorithms and comprises ground-truth data for four test datasets, the specification of 47,950 event detection models, the power events returned by each model on the four test datasets, and the performance of each individual model according to 31 performance metrics.
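To illustrate how detected power events can be scored against ground truth with metrics like those collected in NILMPEds, here is a minimal sketch; the greedy matching rule and the one-second tolerance are illustrative assumptions, not the dataset's official evaluation procedure.

```python
# Illustrative sketch of scoring event-detection output against ground
# truth; the matching rule and tolerance are assumptions for illustration.
def score_events(detected, truth, tolerance=1.0):
    """Match detected event timestamps to ground-truth timestamps
    within `tolerance` seconds and return precision, recall, F1."""
    truth_left = sorted(truth)
    tp = 0
    for t in sorted(detected):
        # Greedily pair each detection with an unmatched truth event.
        match = next((g for g in truth_left if abs(g - t) <= tolerance), None)
        if match is not None:
            tp += 1
            truth_left.remove(match)
    fp = len(detected) - tp
    fn = len(truth) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: three detections, one spurious; three truth events, one missed.
print(score_events(detected=[10.2, 35.0, 61.5], truth=[10.0, 35.4, 48.0]))
```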

2021 ◽  
pp. 151-161
Author(s):  
Dominik Filipiak ◽  
Anna Fensel ◽  
Agata Filipowska

Knowledge graphs are used as a source of prior knowledge in numerous computer vision tasks. However, such an approach requires a mapping between ground-truth data labels and the target knowledge graph. We linked the ILSVRC 2012 dataset (often simply referred to as ImageNet) labels to Wikidata entities. This enables using the rich knowledge graph structure and contextual information for several computer vision tasks traditionally benchmarked with ImageNet and its variations. For instance, in few-shot learning classification scenarios with neural networks, this mapping can be leveraged for weight initialisation, which can improve the final performance metrics. We mapped all 1000 ImageNet labels: 461 were already directly linked with the exact match property (P2888), 467 have exact match candidates, and 72 cannot be matched directly. For these 72 labels, we discuss different problem categories stemming from the inability to find an exact match. Semantically close non-exact match candidates are presented as well. The mapping is publicly available at https://github.com/DominikFilipiak/imagenet-to-wikidata-mapping.
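As a companion illustration, the following sketch queries the public Wikidata SPARQL endpoint for items whose exact match property (P2888) points at a given external URI; the synset URI below is a hypothetical example, so consult the published mapping for the exact form it uses.

```python
# A minimal sketch of looking up Wikidata items via the "exact match"
# property (P2888). The synset URI format is an assumption for illustration.
import requests

WDQS = "https://query.wikidata.org/sparql"

def items_matching(uri):
    query = """
    SELECT ?item ?itemLabel WHERE {
      ?item wdt:P2888 <%s> .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }""" % uri
    r = requests.get(WDQS, params={"query": query, "format": "json"},
                     headers={"User-Agent": "imagenet-mapping-example/0.1"})
    r.raise_for_status()
    return [(b["item"]["value"], b["itemLabel"]["value"])
            for b in r.json()["results"]["bindings"]]

# Hypothetical synset URI; WordNet offset n02084071 denotes "dog".
print(items_matching("http://image-net.org/synset/n02084071"))
```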


Author(s):  
Himansu Sekhar Pattanayak ◽  
Harsh K. Verma ◽  
Amrit Lal Sangal

Community detection is a pivotal part of network analysis and is classified as an NP-hard problem. In this paper, a novel community detection algorithm is proposed, which probabilistically predicts community diameters using the local information of random seed nodes. The gravitation method is then applied to discover the communities surrounding the seed nodes, and the individual communities are combined to obtain the community structure of the whole network. The proposed algorithm, named the Local Gravitational Community Detection Algorithm (LGCDA), can also handle overlapping communities. LGCDA is evaluated based on quality metrics and ground-truth data by comparing it with some of the most widely used community detection algorithms on synthetic and real-world networks.
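The paper's exact formulation is not reproduced here, but a toy sketch conveys the gravitational intuition behind seed-based community growth: node "mass" is approximated by degree, and attraction decays with the square of the shortest-path distance. The radius and threshold parameters are arbitrary illustrative choices, not values from the paper.

```python
# A toy sketch of the gravitational idea behind seed-based community
# detection (not LGCDA itself): each node is attracted to a seed with
# force ~ mass_seed * mass_node / distance^2, where mass is node degree
# and distance is shortest-path length.
import networkx as nx

def gravity_community(G, seed, radius=2, threshold=8.0):
    """Grow a community around `seed` from nodes within `radius` hops
    whose gravitational pull toward the seed exceeds `threshold`."""
    lengths = nx.single_source_shortest_path_length(G, seed, cutoff=radius)
    community = {seed}
    for node, dist in lengths.items():
        if node == seed:
            continue
        force = G.degree(seed) * G.degree(node) / dist ** 2
        if force >= threshold:
            community.add(node)
    return community

G = nx.karate_club_graph()
print(sorted(gravity_community(G, seed=0)))
```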


2021 ◽  
Vol 13 (13) ◽  
pp. 2619
Author(s):  
Joao Fonseca ◽  
Georgios Douzas ◽  
Fernando Bacao

In remote sensing, Active Learning (AL) has become an important technique to collect informative ground-truth data "on demand" for supervised classification tasks. Despite its effectiveness, it remains significantly reliant on user interaction, which makes it both expensive and time-consuming to implement. Most of the current literature focuses on optimising AL by modifying the selection criteria and the classifiers used. Although improvements in these areas will result in more effective data collection, the use of artificial data sources to reduce human-computer interaction remains unexplored. In this paper, we introduce a new component to the typical AL framework: the data generator, a source of artificial data that reduces the amount of user-labeled data required in AL. The proposed AL framework is implemented using Geometric SMOTE as the data generator. We compare the new AL framework to the original one using similar acquisition functions and classifiers over three AL-specific performance metrics on seven benchmark datasets. We show that this modification of the AL framework significantly reduces the cost and time requirements for a successful AL implementation in all of the datasets used in the experiment.
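A schematic sketch of the augmented AL loop follows: after each round of oracle labelling, a data generator oversamples the labelled pool before retraining. Plain SMOTE stands in here for Geometric SMOTE, and the uncertainty-based acquisition is just one of the acquisition functions the paper considers; the dataset, pool sizes, and oversampling targets are all illustrative.

```python
# Schematic AL loop with a data-generator step (SMOTE as a stand-in for
# Geometric SMOTE); all sizes and parameters are illustrative assumptions.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
labelled = list(range(30))                      # small initial labelled pool
pool = [i for i in range(len(X)) if i not in labelled]

clf = LogisticRegression(max_iter=1000)
for _ in range(5):                              # five AL iterations
    # Data generator step: synthesise extra labelled points per class.
    X_aug, y_aug = SMOTE(sampling_strategy={0: 40, 1: 40}, k_neighbors=3,
                         random_state=0).fit_resample(X[labelled], y[labelled])
    clf.fit(X_aug, y_aug)
    # Acquisition: query the pool sample the classifier is least sure about.
    probs = clf.predict_proba(X[pool])
    query = pool[int(np.argmin(np.abs(probs[:, 1] - 0.5)))]
    labelled.append(query)                      # oracle provides y[query]
    pool.remove(query)
```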


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Mehta ◽  
J Avila ◽  
S Niklitschek ◽  
F Fernandez ◽  
C Villagran ◽  
...  

Abstract
Background: As EKG interpretation paradigms shift to a physician-free milieu, accumulating massive quantities of distilled, pre-processed data becomes a must for machine learning techniques. In our pursuit of reducing ischemic times in STEMI management, we have improved our Artificial Intelligence (AI)-guided diagnostic tool by following a three-step approach: 1) increase accuracy by adding larger clusters of data; 2) increase the breadth of EKG classifications to provide more precise feedback and further refine the inputs, which ultimately reflects in better and more accurate outputs; 3) improve the algorithm's ability to discern between cardiovascular entities reflected in the EKG records.
Purpose: To bolster our algorithm's accuracy and reliability for electrocardiographic STEMI recognition.
Methods: Dataset: a total of 7,286 12-lead EKG records of 10-second length with a sampling frequency of 500 Hz, obtained from the Latin America Telemedicine Infarct Network from April 2014 to December 2019. This included the following balanced classes: angiographically confirmed STEMI, branch blocks, non-specific ST-T abnormalities, normal, and abnormal (200+ CPT codes, excluding those included in other classes). Labels of each record were manually checked by cardiologists to ensure precision (ground truth). Pre-processing: the first and last 250 samples were discarded to avoid a standardization pulse; an order-5 digital low-pass filter with a 35 Hz cut-off was applied; and, for each record, the mean was subtracted from each individual lead. Classification: the classes were "STEMI" and "Not-STEMI" (a combination of randomly sampled normal, branch block, non-specific ST-T abnormality, and abnormal records; 25% of each subclass). Training and testing: a 1-D convolutional neural network was trained and tested with a 90/10 dataset split, respectively. The last dense layer outputs a probability for each record of being STEMI or Not-STEMI. Additional testing was performed with a subset of the original complete dataset of unconfirmed STEMI. Performance indicators (accuracy, sensitivity, and specificity) were calculated for each model and compared with our findings from past experiments.
Results: Complete STEMI data: accuracy 95.9%, sensitivity 95.7%, specificity 96.5%. Confirmed STEMI: accuracy 98.1%, sensitivity 98.1%, specificity 98.1%. Data obtained in our previous experiments are shown below for comparison.
Conclusion(s): After the addition of clustered pre-processed data, all performance indicators for STEMI detection increased considerably between both confirmed STEMI datasets, while the complete STEMI dataset kept a strong and steady set of performance metrics compared with past results. These findings not only validate the consistency and reliability of our algorithm but also underscore the importance of curating a pristine dataset for this and any other AI-derived medical tool.
Funding Acknowledgement: Type of funding source: None
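The Methods section fixes enough parameters (500 Hz sampling, 250 samples trimmed at each end, an order-5 low-pass filter at 35 Hz, per-lead mean removal) that the pre-processing step can be sketched; the zero-phase filtfilt choice and the array shapes are assumptions for illustration, not the authors' exact code.

```python
# A sketch of the stated pre-processing pipeline under assumed shapes.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 500                                   # sampling frequency (Hz)

def preprocess(record):
    """record: array of shape (12, n_samples), one row per lead."""
    x = record[:, 250:-250]                # drop standardization pulse
    b, a = butter(5, 35, btype="low", fs=FS)
    x = filtfilt(b, a, x, axis=1)          # order-5 low-pass, 35 Hz cut-off
    return x - x.mean(axis=1, keepdims=True)  # remove per-lead mean

ekg = np.random.randn(12, 10 * FS)         # stand-in for a 10 s, 12-lead record
print(preprocess(ekg).shape)               # (12, 4500)
```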


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 279
Author(s):  
Rafael Padilla ◽  
Wesley L. Passos ◽  
Thadeu L. B. Dias ◽  
Sergio L. Netto ◽  
Eduardo A. B. da Silva

Recent outstanding results of supervised object detection in competitions and challenges are often associated with specific metrics and datasets. The evaluation of such methods applied in different contexts has increased the demand for annotated datasets. Annotation tools represent the location and size of objects in distinct formats, leading to a lack of consensus on the representation. Such a scenario often complicates the comparison of object detection methods. This work alleviates this problem along the following lines: (i) it provides an overview of the most relevant evaluation methods used in object detection competitions, highlighting their peculiarities, differences, and advantages; (ii) it examines the most commonly used annotation formats, showing how different implementations may influence the assessment results; and (iii) it provides a novel open-source toolkit supporting different annotation formats and 15 performance metrics, making it easy for researchers to evaluate the performance of their detection algorithms on most known datasets. In addition, this work proposes a new metric, also included in the toolkit, for evaluating object detection in videos, based on the spatio-temporal overlap between the ground-truth and detected bounding boxes.
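Most of the metrics such a toolkit implements reduce to the intersection-over-union (IoU) between ground-truth and detected boxes; a minimal sketch follows, using the (x1, y1, x2, y2) corner format, which is only one of the annotation conventions the paper surveys.

```python
# Minimal IoU computation for axis-aligned boxes in corner format.
def iou(box_a, box_b):
    """Both boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two partially overlapping boxes: intersection 25, union 175.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143
```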


Energies ◽  
2020 ◽  
Vol 13 (24) ◽  
pp. 6737
Author(s):  
Mohamed Aymane Ahajjam ◽  
Daniel Bonilla Licea ◽  
Chaimaa Essayeh ◽  
Mounir Ghogho ◽  
Abdellatif Kobbane

This paper consists of two parts: an overview of existing open datasets of electricity consumption and a description of the Moroccan Buildings' Electricity Consumption Dataset, the first of its kind, coined MORED. The new dataset comprises electricity consumption data from various Moroccan premises. Unlike existing datasets, MORED provides three main data components: whole-premises (WP) electricity consumption, individual-load (IL) ground-truth consumption, and fully labeled IL signatures, from affluent and disadvantaged neighborhoods. The WP consumption data were acquired at low rates (1/5 or 1/10 samples/s) from 12 households; the IL ground-truth data were acquired at similar rates from five households for extended durations; and the IL signature data were acquired at high and low rates (50,000 and 4 samples/s) from 37 different residential and industrial loads. In addition, the dataset encompasses non-intrusive load monitoring (NILM) metadata.


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 662
Author(s):  
Jonathan Moeyersons ◽  
John Morales ◽  
Amalia Villa ◽  
Ivan Castro ◽  
Dries Testelmans ◽  
...  

The electrocardiogram (ECG) is an important diagnostic tool for identifying cardiac problems. Nowadays, new ways to record ECG signals outside of the hospital are being investigated. A promising technique is capacitively coupled ECG (ccECG), which allows ECG signals to be recorded through insulating materials. However, as the ECG is no longer recorded in a controlled environment, this inevitably implies the presence of more artefacts. Artefact detection algorithms are used to detect and remove these artefacts. Typically, training a new algorithm requires a lot of ground-truth data, which is costly to obtain. As many labelled contact-ECG datasets exist, we can avoid labelling new ccECG signals by reusing this prior knowledge. Transfer learning can be used for this purpose. Here, we applied transfer learning to optimise the performance of an artefact detection model, trained on contact ECG, towards ccECG. We used ECG recordings from three different datasets, recorded with three recording devices. We showed that the accuracy of a contact-ECG classifier improved by 5 to 8% by means of transfer learning when tested on a ccECG dataset. Furthermore, we showed that only 20 segments of the ccECG dataset are sufficient to significantly increase the accuracy.
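A schematic sketch of this transfer-learning step: a classifier trained on contact ECG is adapted to ccECG by freezing its feature extractor and re-fitting only the final layer on a handful of labelled ccECG segments. The 1-D CNN architecture, the freezing strategy, and the segment length are illustrative assumptions, not the paper's exact model.

```python
# Sketch of fine-tuning a contact-ECG artefact classifier on ccECG data;
# the architecture and training details are assumptions for illustration.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(16, 2),                        # clean vs. artefact
)
# ... assume `model` was already trained on labelled contact ECG ...

for p in model[:4].parameters():             # freeze the feature extractor
    p.requires_grad = False

opt = torch.optim.Adam(model[4].parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# 20 labelled ccECG segments (random stand-in data), 1 channel each.
x = torch.randn(20, 1, 1000)
y = torch.randint(0, 2, (20,))
for _ in range(30):                          # brief fine-tuning loop
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```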

