Mining Environmental Data in the ADMIRE Project Using New Advanced Methods and Tools

Author(s):  
Ondrej Habala ◽  
Martin Šeleng ◽  
Viet Tran ◽  
Branislav Šimo ◽  
Ladislav Hluchý

The Advanced Data Mining and Integration Research for Europe (ADMIRE) project is designing new methods and tools for convenient mining and integration of large, distributed data sets. One prospective application domain for these methods and tools is environmental applications, which often combine data sets from different vendors and where data mining is becoming increasingly popular as more computing power becomes available. The authors present a set of experimental environmental scenarios and the application of ADMIRE technology to them. The scenarios use data mining of distributed data sets from several providers in Slovakia to try to predict meteorological and hydrological phenomena that currently cannot be, or are not, predicted. The scenarios were designed by environmental experts; apart from serving as a testing ground for the ADMIRE technology, their results are of particular interest to the experts who designed them.


2016 ◽  
Vol 2016 ◽  
pp. 1-11 ◽  
Author(s):  
Ivan Kholod ◽  
Ilya Petukhov ◽  
Andrey Shorov

This paper describes the construction of a Cloud for Distributed Data Analysis (CDDA) based on the actor model. The design maps data mining algorithms onto decomposed functional blocks, which are assigned to actors. Using actors allows users to move the computation close to the stored data. The process does not require loading data sets into the cloud and allows users to analyze confidential information locally. Experimental results show that the proposed approach is more efficient than established solutions.


Acta Numerica ◽  
2001 ◽  
Vol 10 ◽  
pp. 313-355 ◽  
Author(s):  
Markus Hegland

Methods for knowledge discovery in data bases (KDD) have been studied for more than a decade. New methods are required owing to the size and complexity of data collections in administration, business and science. They include procedures for data query and extraction, for data cleaning, data analysis, and methods of knowledge representation. The part of KDD dealing with the analysis of the data has been termed data mining. Common data mining tasks include the induction of association rules, the discovery of functional relationships (classification and regression) and the exploration of groups of similar data objects in clustering. This review provides a discussion of and pointers to efficient algorithms for the common data mining tasks in a mathematical framework. Because of the size and complexity of the data sets, efficient algorithms and often crude approximations play an important role.


2021 ◽  
Vol 10 (3) ◽  
pp. 121-127
Author(s):  
Bareen Haval ◽  
Karwan Jameel Abdulrahman ◽  
Araz Rajab

This article presents the results of applying educational data mining techniques to the academic performance of students. Three classification models (Decision Tree, Random Forest and Deep Learning) were developed to analyze data sets and predict student performance. The predictive performance of the three classifiers was calculated and compared. The students' academic history and data from the Office of the Registrar were used to train the models. The analysis aims to evaluate student results using various variables, such as the student's grade. Data from 221 students with 9 different attributes were used. The results provide a better understanding of student success assessments and stress the importance of data mining in education. The main purpose of the study is to show how student success can be forecast using data mining techniques in order to improve academic programs. The results indicate that the Decision Tree classifier outperforms the other two classifiers, achieving a total prediction accuracy of 97%.


Big Data ◽  
2016 ◽  
pp. 1347-1366
Author(s):  
Lucía Serrano-Luján ◽  
Jose Manuel Cadenas ◽  
Antonio Urbina

Data mining techniques have been used on data collected from a photovoltaic system to predict its generation and performance. To date, however, this computing approach has required the simultaneous measurement of environmental parameters collected by an array of sensors. This chapter presents the application of several computing learning techniques to electrical data in order to detect and classify the occurrence of failures (e.g. shadows, bad weather conditions) without using environmental data. The results of a 222 kWp (CdTe) case study show how computing learning algorithms can be used to improve the management and performance of photovoltaic generators without relying on environmental parameters.


2017 ◽  
Vol 106 (11) ◽  
pp. 3270-3279 ◽  
Author(s):  
Maulik K. Nariya ◽  
Jae Hyun Kim ◽  
Jian Xiong ◽  
Peter A. Kleindl ◽  
Asha Hewarathna ◽  
...  

2018 ◽  
Vol 7 (4.19) ◽  
pp. 806
Author(s):  
Zahraa Shams Alden ◽  
Ayad Hameed Mousa

Data mining, usually known as knowledge elicitation in the database field of computer science, is the process of finding important relationships and useful patterns in large amounts of raw data. Many sectors, such as healthcare and industry, have adopted data mining in their applications. In the healthcare sector, data mining can help determine the probability of particular health cases for which the related variables are known in advance, as well as predict future events. Medical data available for data mining usually exists in raw form and therefore needs preparation and exploration before it is ready to use. This paper introduces an analysis of medical data to support prosthetics service centers in extracting significant information from limbless medical cases and in providing a comprehensive understanding of amputation, its types, and the level of amputation. To extract meaningful information from the intended data sets and to follow a systematic approach, the CRISP-DM model was adopted. The findings show the importance and usefulness of analyzing the data using data mining models.

