Analysis of Industrial and Household IoT Data Using Computationally Intelligent Algorithm

In this chapter, data mining approaches are applied to a standard IoT dataset to identify relationships among its attributes. Data mining applies to the IoT domain as it does elsewhere. Various rule-based and unsupervised classifiers are implemented, and the relations between IoT features are determined from classification properties such as support and confidence. The classifiers are run on a real-time IoT dataset of household readings collected from various sources over a long duration. The classification approaches are briefly compared on this dataset, and the kappa coefficient is calculated for each technique to measure its robustness. A standard, widely used household power-utilization dataset is used to show the associations that arise from intra-data dependencies. A classification accuracy of more than 86% is obtained on the Almanac of Minutely Power Dataset (AMPds) in the present work.
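
The support and confidence measures mentioned above can be illustrated with a minimal sketch. The item names and the five readings below are hypothetical, invented for illustration; they are not taken from AMPds, and the chapter's own rule-mining procedure is not reproduced here.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: Transaction) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: List[Transaction],
               antecedent: Transaction,
               consequent: Transaction) -> float:
    """Support of (antecedent and consequent) divided by support of antecedent."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


# Hypothetical discretised household readings (assumed labels, not real data).
readings = [
    frozenset({"evening", "oven_on", "high_load"}),
    frozenset({"evening", "oven_on", "high_load"}),
    frozenset({"evening", "high_load"}),
    frozenset({"morning", "oven_on"}),
    frozenset({"morning"}),
]

# Rule under test: oven_on -> high_load
rule_support = support(readings, frozenset({"oven_on", "high_load"}))
rule_confidence = confidence(readings,
                             frozenset({"oven_on"}),
                             frozenset({"high_load"}))
```

Here the rule holds in two of five readings (support 0.4) and in two of the three readings where the oven is on (confidence about 0.67).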

EXPLORING MACHINE LEARNING CLASSIFICATION ALGORITHMS FOR CROP CLASSIFICATION USING SENTINEL 2 DATA

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-3-w6-573-2019 ◽

2019 ◽

Vol XLII-3/W6 ◽

pp. 573-578 ◽

Cited By ~ 3

Author(s):

◽

S. S. Ray

Keyword(s):

Machine Learning ◽

Satellite Data ◽

Classification Accuracy ◽

Ground Truth ◽

Kappa Coefficient ◽

Ground Truth Data ◽

Classification Techniques ◽

Machine Learning Classification ◽

Crop Classification ◽

Sentinel 2

Crop classification and recognition is an important application of remote sensing. In the last few years, machine learning classification techniques have emerged for crop classification. Google Earth Engine (GEE) is a platform for exploring multiple satellite data sets with advanced classification techniques without downloading the satellite data. The main objective of this study is to explore the ability of different machine learning classification techniques, namely Random Forest (RF), Classification And Regression Trees (CART) and Support Vector Machine (SVM), for crop classification. High-resolution optical data from Sentinel-2 MSI (10 m) were used to classify the major crops of the Indian Agricultural Research Institute (IARI) farm for the Rabi season 2016. Around 100 crop fields (~400 hectares) in IARI were analysed. Smartphone-based ground truth data were collected. The best cloud-free Sentinel-2 MSI image (5 Feb 2016) was selected in GEE by automatic filtering on the percentage cloud cover property and used for classification. Polygons were used as the feature space for the training data sets, based on the ground truth data, for crop classification with the machine learning techniques. After classification, accuracy was assessed through the confusion matrix (producer and user accuracy), kappa coefficient and F value. The study found that, using GEE as a cloud platform, satellite data access, filtering and pre-processing could be done very efficiently. In terms of overall classification accuracy and kappa coefficient, Random Forest (93.3%, 0.9178) and CART (73.4%, 0.6755) classifiers performed better than SVM (74.3%, 0.6867) classifier. For validation, data from the Field Operation Service Unit (FOSU) division of IARI were used, and encouraging results were obtained.
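
The accuracy measures named in this abstract (overall accuracy, producer and user accuracy, kappa coefficient) can all be derived from a confusion matrix. The sketch below assumes rows hold the reference (ground truth) classes and columns hold the predicted classes; the two-class matrix is hypothetical and is not data from the study.

```python
from typing import List

Matrix = List[List[int]]


def overall_accuracy(cm: Matrix) -> float:
    """Correctly classified samples divided by all samples."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def producer_accuracy(cm: Matrix) -> List[float]:
    """Per class: correct / reference total (row sum)."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


def user_accuracy(cm: Matrix) -> List[float]:
    """Per class: correct / predicted total (column sum)."""
    return [cm[j][j] / sum(row[j] for row in cm) for j in range(len(cm))]


def cohen_kappa(cm: Matrix) -> float:
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_o = overall_accuracy(cm)
    row_tot = [sum(cm[i]) for i in range(n)]
    col_tot = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_e = sum(row_tot[k] * col_tot[k] for k in range(n)) / total ** 2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical matrix: rows = reference classes, columns = predicted classes.
cm = [
    [45, 5],
    [10, 40],
]
accuracy = overall_accuracy(cm)
kappa = cohen_kappa(cm)
producers = producer_accuracy(cm)
users = user_accuracy(cm)
```

For this matrix the overall accuracy is 0.85 and the chance agreement is 0.5, so kappa is 0.7; kappa is lower than accuracy because it discounts agreement expected by chance.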

Application of rule-based data mining techniques to real time ATLAS Grid job monitoring data

Journal of Physics Conference Series ◽

10.1088/1742-6596/396/3/032060 ◽

2012 ◽

Vol 396 (3) ◽

pp. 032060

Author(s):

R Ahrens ◽

T Harenberg ◽

S Kalinin ◽

P Mättig ◽

M Sandhoff ◽

...

Keyword(s):

Data Mining ◽

Real Time ◽

Monitoring Data ◽

Rule Based ◽

Data Mining Techniques ◽

Job Monitoring

Survey of Data Mining Techniques Used for Real Time Churn Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i3/01319 ◽

2017 ◽

Vol 7 (3) ◽

pp. 219-222 ◽

Cited By ~ 1

Author(s):

Jean Claude Turiho ◽

◽

Wilson Cheruiyot ◽

Anne Kibe ◽

Irénée Mungwarakarama ◽

...

Keyword(s):

Data Mining ◽

Real Time ◽

Churn Prediction ◽

Data Mining Techniques

Data Mining-based Financial Statement Fraud Detection: Systematic Literature Review and Meta-analysis to Estimate Data Sample Mapping of Fraudulent Companies Against Non-fraudulent Companies

Global Business Review ◽

10.1177/0972150920984857 ◽

2021 ◽

pp. 097215092098485

Author(s):

Sonika Gupta ◽

Sushil Kumar Mehta

Keyword(s):

Machine Learning ◽

Data Mining ◽

Literature Review ◽

Systematic Literature Review ◽

Classification Accuracy ◽

Meta Analysis ◽

Financial Statement ◽

Research Articles ◽

Financial Statement Fraud ◽

Data Mining Techniques

Data mining techniques have proven quite effective not only in detecting financial statement fraud but also in discovering other financial crimes, such as credit card fraud, loan and security fraud, corporate fraud, and bank and insurance fraud. In recent years, data mining classification techniques have been accepted as one of the most credible methodologies for detecting symptoms of financial statement fraud by scanning the published financial statements of companies. The retrieved literature that uses data mining classification techniques can be broadly categorized, by the type of technique applied, into statistical techniques and machine learning techniques. The biggest challenge in executing the classification process lies in collecting the data sample of fraudulent companies and mapping it against a sample of non-fraudulent companies. In this article, a systematic literature review (SLR) of studies on financial statement fraud detection has been conducted. The review considers research articles published between 1995 and 2020. Further, a meta-analysis has been performed to establish the effect that the mapping of fraudulent against non-fraudulent companies has on the classification methods, by comparing the overall classification accuracy reported in the literature. The retrieved literature indicates that a fraudulent sample can either be paired equally with a non-fraudulent sample (1:1 data mapping) or be mapped unequally using a 1:many ratio to increase the sample size proportionally. Based on the meta-analysis of the research articles, it can be concluded that machine learning approaches, in comparison to statistical approaches, can achieve better classification accuracy, particularly when little sample data is available. High classification accuracy can be obtained with even a 1:1 mapping data set using machine learning classification approaches.
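
The difference between 1:1 and 1:many data mapping can be sketched as follows. The company identifiers are hypothetical, and random selection is an assumption made for brevity; it stands in for whatever matching criteria an individual study applies.

```python
import random
from typing import Dict, List


def map_samples(fraud: List[str], non_fraud: List[str],
                ratio: int = 1, seed: int = 0) -> Dict[str, List[str]]:
    """Pair each fraudulent company with `ratio` distinct non-fraudulent ones.

    ratio=1 gives a 1:1 mapping; ratio>1 gives a 1:many mapping.
    Selection is random here (an assumption for illustration only).
    """
    needed = len(fraud) * ratio
    if needed > len(non_fraud):
        raise ValueError("not enough non-fraudulent samples for this ratio")
    rng = random.Random(seed)
    picked = rng.sample(non_fraud, needed)  # sampled without replacement
    return {f: picked[i * ratio:(i + 1) * ratio] for i, f in enumerate(fraud)}


# Hypothetical identifiers, not real companies.
fraud_firms = ["F1", "F2", "F3"]
clean_firms = [f"N{i}" for i in range(1, 13)]

one_to_one = map_samples(fraud_firms, clean_firms, ratio=1)
one_to_three = map_samples(fraud_firms, clean_firms, ratio=3)
```

With three fraudulent firms, the 1:1 mapping yields six companies in total and the 1:3 mapping yields twelve, which is how the 1:many ratio enlarges the sample.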

OFCOD: On the Fly Clustering Based Outlier Detection Framework

Data ◽

10.3390/data6010001 ◽

2020 ◽

Vol 6 (1) ◽

pp. 1

Author(s):

Ahmed Elmogy ◽

Hamada Rizk ◽

Amany M. Sarhan

Keyword(s):

Data Mining ◽

Image Processing ◽

Intrusion Detection ◽

Real Time ◽

Outlier Detection ◽

Real World ◽

Medical Data ◽

Experimental Results ◽

Real Time Applications ◽

Real World Datasets

In data mining, outlier detection is a major challenge because it plays an important role in many applications such as medical data, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers effectively and on request, in a timely manner, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.
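
As a generic illustration of clustering-based outlier detection, the sketch below runs a small k-means and flags points that lie unusually far from their assigned centroid. This is not the OFCOD framework, whose internals the abstract does not describe; the farthest-point initialisation, the `factor` cutoff rule and the sample points are all assumptions chosen for illustration.

```python
import math
from statistics import median
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def farthest_point_init(points: Sequence[Point], k: int) -> List[Point]:
    """Deterministic seeding: start at points[0], then add the farthest point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def kmeans(points: Sequence[Point], k: int, max_iter: int = 50):
    """Plain Lloyd iterations; stops when assignments no longer change."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return centroids, labels


def cluster_outliers(points: Sequence[Point], k: int,
                     factor: float = 3.0) -> List[Point]:
    """Flag points farther from their centroid than factor * median distance."""
    centroids, labels = kmeans(points, k)
    dists = [math.dist(p, centroids[l]) for p, l in zip(points, labels)]
    cutoff = factor * median(dists)
    return [p for p, d in zip(points, dists) if d > cutoff]


# Hypothetical 2-D data: two tight groups plus one stray point.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (5, 20)]
outliers = cluster_outliers(data, k=2)
```

On this data the stray point `(5, 20)` is absorbed into the second cluster but sits well beyond the cutoff, so it is the only point flagged. The full re-clustering on every call is the cost that makes such approaches slow for real-time use.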

Wearable Sensor-Based Real-Time Gait Detection: A Systematic Review

Sensors ◽

10.3390/s21082727 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2727

Author(s):

Hari Prasanth ◽

Miroslav Caban ◽

Urs Keller ◽

Grégoire Courtine ◽

Auke Ijspeert ◽

...

Keyword(s):

Systematic Review ◽

Gait Analysis ◽

Real Time ◽

Wearable Sensors ◽

Pressure Sensors ◽

Clinical Applications ◽

Performance Comparison ◽

Heel Strike ◽

Rule Based ◽

Pathological Gait

Gait analysis has traditionally been carried out in a laboratory environment using expensive equipment, but, recently, reliable, affordable, and wearable sensors have enabled integration into clinical applications as well as use during activities of daily living. Real-time gait analysis is key to the development of gait rehabilitation techniques and assistive devices such as neuroprostheses. This article presents a systematic review of wearable sensors and techniques used in real-time gait analysis, and their application to pathological gait. From four major scientific databases, we identified 1262 articles of which 113 were analyzed in full-text. We found that heel strike and toe off are the most sought-after gait events. Inertial measurement units (IMU) are the most widely used wearable sensors and the shank and foot are the preferred placements. Insole pressure sensors are the most common sensors for ground-truth validation for IMU-based gait detection. Rule-based techniques relying on threshold or peak detection are the most widely used gait detection method. The heterogeneity of evaluation criteria prevented quantitative performance comparison of all methods. Although most studies predicted that the proposed methods would work on pathological gait, less than one third were validated on such data. Clinical applications of gait detection algorithms were considered, and we recommend a combination of IMU and rule-based methods as an optimal solution.
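
A rule-based detector of the threshold-and-peak kind described above can be sketched in a few lines. The signal values, the threshold and the minimum gap are hypothetical; the review does not prescribe particular values, and which sensor axis marks a given gait event depends on the study.

```python
from typing import List, Sequence


def detect_events(signal: Sequence[float], threshold: float,
                  min_gap: int) -> List[int]:
    """Return indices of local maxima above `threshold`.

    A candidate is ignored if it falls fewer than `min_gap` samples after
    the previously accepted event, which suppresses double detections.
    """
    events: List[int] = []
    for i in range(1, len(signal) - 1):
        is_peak = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_peak and signal[i] > threshold:
            if not events or i - events[-1] >= min_gap:
                events.append(i)
    return events


# Hypothetical sensor trace (arbitrary units), processed sample by sample.
trace = [0.1, 0.3, 1.8, 0.4, 0.2, 0.9, 0.3, 0.1, 2.1, 0.5, 0.2, 1.9, 0.3]
events = detect_events(trace, threshold=1.5, min_gap=3)
```

The detector accepts the peaks at indices 2, 8 and 11 and rejects the smaller peak at index 5 because it stays below the threshold. Each decision needs only the neighbouring samples, which is one reason such rules suit real-time use.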

Data mining-based air pollution characteristics and real-time monitoring of college students’ physical and mental health

Arabian Journal of Geosciences ◽

10.1007/s12517-021-07926-2 ◽

2021 ◽

Vol 14 (15) ◽

Author(s):

Xiaomei Wu ◽

Xuejun Ma

Keyword(s):

Mental Health ◽

College Students ◽

Data Mining ◽

Air Pollution ◽

Real Time ◽

Physical And Mental Health ◽

Real Time Monitoring ◽

Pollution Characteristics

A study on classification techniques in data mining

2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT) ◽

10.1109/icccnt.2013.6726842 ◽

2013 ◽

Cited By ~ 42

Author(s):

G. Kesavaraj ◽

S. Sukumaran

Keyword(s):

Data Mining ◽

Classification Techniques

Dynamic optimization for real-time rule-based systems using predicate dependency

Proceedings Sixth IEEE Real-Time Technology and Applications Symposium. RTAS 2000 ◽

10.1109/rttas.2000.852459 ◽

2002 ◽

Cited By ~ 3

Author(s):

Y.-H. Lee ◽

A.M.K. Cheng

Keyword(s):

Real Time ◽

Dynamic Optimization ◽

Rule Based ◽

Rule Based Systems

Improved differentiation classification of variable precision artificial intelligence higher education management

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219036 ◽

2021 ◽

pp. 1-10

Author(s):

Chao Dong ◽

Yan Guo

Keyword(s):

Artificial Intelligence ◽

Higher Education ◽

Data Mining ◽

Decision Tree ◽

Classification Accuracy ◽

Attribute Selection ◽

Higher Education Management ◽

Education Management ◽

Decision Tree Classification

The wide application of artificial intelligence technology in various fields has accelerated the exploration of the information hidden in large amounts of data. Data mining methods are expected to support effective research on higher education management. The decision tree classification algorithm is a data analysis method within data mining whose high classification accuracy, intuitive decision results, and high generalization ability make it well suited to higher education management. To address the sensitivity of data processing and decision tree classification to noisy data, this paper proposes corresponding improvements, including a variable precision rough set attribute selection standard based on a scale function, which considers both the weighted approximation accuracy and the number of attribute values of the attribute. This improves robustness to noisy data, reduces the bias in attribute selection, and improves the classification accuracy. At the same time, a suppression factor threshold, support and confidence are introduced in the tree pre-pruning process, which simplifies the tree structure. Comparative experiments on standard data sets show that the improved algorithm proposed in this paper is better than other decision tree algorithms and can effectively realize the differentiated classification of higher education management.
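
One way support and confidence can act as pre-pruning rules is sketched below. This is an assumed reading, not the paper's algorithm: the suppression factor is left out because the abstract does not define it, the rough set attribute selection is not reproduced, and the threshold values and labels are hypothetical.

```python
from collections import Counter
from typing import Sequence


def should_stop_split(node_labels: Sequence[str], total_samples: int,
                      min_support: float = 0.05,
                      min_confidence: float = 0.9) -> bool:
    """Decide whether a tree node should become a leaf before splitting.

    support    = share of all training samples that reach this node
    confidence = share of the node's samples in its majority class
    Stop when the node is too small or already pure enough.
    """
    if not node_labels:
        return True
    node_support = len(node_labels) / total_samples
    majority_count = Counter(node_labels).most_common(1)[0][1]
    node_confidence = majority_count / len(node_labels)
    return node_support < min_support or node_confidence >= min_confidence


# Hypothetical nodes from a tree trained on 100 samples.
tiny_node = ["pass"] * 3                    # support 0.03: too few samples
pure_node = ["pass"] * 19 + ["fail"]        # confidence 0.95: pure enough
mixed_node = ["pass"] * 12 + ["fail"] * 8   # confidence 0.60: keep splitting

stop_tiny = should_stop_split(tiny_node, 100)
stop_pure = should_stop_split(pure_node, 100)
stop_mixed = should_stop_split(mixed_node, 100)
```

Stopping early at small or nearly pure nodes is what keeps the resulting tree compact.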
