Mining Data with Group Theoretical Means

Author(s):  
Gabriele Kern-Isberner

Knowledge discovery refers to the process of extracting new, interesting, and useful knowledge from data and presenting it in an intelligible way to the user. Roughly, knowledge discovery can be considered a three-step process: preprocessing data; data mining, in which the actual exploratory work is done; and interpreting the results to the user. Here, I focus on the data-mining step, assuming that a suitable set of data has been chosen properly. The patterns that we search for in the data are plausible relationships, which agents may use to establish cognitive links for reasoning. Such plausible relationships can be expressed via association rules. Usually, the criteria to judge the relevance of such rules are either frequency based (Bayardo & Agrawal, 1999) or causality based (for Bayesian networks, see Spirtes, Glymour, & Scheines, 1993). Here, I will pursue a different approach that aims at extracting what can be regarded as structures of knowledge — relationships that may support the inductive reasoning of agents and whose relevance is founded on information theory. The method that I will sketch in this article takes numerical relationships found in data and interprets these relationships as structural ones, using mostly algebraic techniques to elaborate structural information.


2018 ◽  
Vol 7 (2.6) ◽  
pp. 93 ◽  
Author(s):  
Deepali R Vora ◽  
Kamatchi Iyer

Educational Data Mining (EDM) is a new area of research within data mining and Knowledge Discovery in Databases (KDD). It focuses on mining useful patterns and discovering useful knowledge from the educational information systems of schools, colleges, and universities. Analysing students’ data to perform tasks such as classifying students or building decision trees or association rules, so as to make better decisions or enhance students’ performance, is an interesting field of research. The paper presents a survey of the tasks performed in EDM and the algorithms (methods) used for them. It identifies the gaps and challenges in the algorithms applied, the performance factors considered, and the data used in EDM.


Author(s):  
Andi Baritchi

In today’s business world, the use of computers for everyday business processes and data recording has become virtually ubiquitous. With the advent of this electronic age comes one priceless by-product: data. As more and more executives are discovering each day, companies can harness data to gain valuable insights into their customer base. Data mining is the process used to take these immense streams of data and reduce them to useful knowledge. Data mining has a wide range of applications, including sales and marketing, customer support, knowledge-base development, and fraud detection in virtually any field. “Data mining,” a bit of a misnomer, refers to mining the data to find the gems hidden inside it, and it is the most commonly used name for this process. It is important to note, however, that data mining is only one part of the Knowledge Discovery in Databases process, albeit the workhorse. In this chapter, we provide a concise description of the Knowledge Discovery process, from domain analysis and data selection, to data preprocessing and transformation, to the data mining itself, and finally the interpretation and evaluation of the results as applied to the domain. We describe the different flavors of data mining, including association rules, classification and prediction, clustering and outlier analysis, and customer profiling, and how each of these can be used in practice to improve a business’ understanding of its customers. We introduce the reader to some of today’s hot data mining resources, and, for those who are interested, at the end of the chapter we provide a concise technical overview of how each data-mining technology works.


2008 ◽  
pp. 2105-2120
Author(s):  
Kesaraporn Techapichetvanich ◽  
Amitava Datta

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association-rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the association-rule mining process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, users gain insight into and a deeper understanding of their data sets, as well as control over mining meaningful association rules.


Author(s):  
Edgard Benítez-Guerrero ◽  
Omar Nieva-García

The vast amounts of digital information stored in databases and other repositories represent a challenge for finding useful knowledge. Traditional methods for turning data into knowledge based on manual analysis reach their limits in this context, and for this reason, computer-based methods are needed. Knowledge Discovery in Databases (KDD) is the semi-automatic, nontrivial process of identifying valid, novel, potentially useful, and understandable knowledge (in the form of patterns) in data (Fayyad, Piatetsky-Shapiro, Smyth & Uthurusamy, 1996). KDD is an iterative and interactive process with several steps: understanding the problem domain, data preprocessing, pattern discovery, and pattern evaluation and usage. For discovering patterns, Data Mining (DM) techniques are applied.


2010 ◽  
Vol 108-111 ◽  
pp. 50-56 ◽  
Author(s):  
Liang Zhong Shen

Due to the popularity of knowledge discovery and data mining, in practice as well as among academic and corporate professionals, association rule mining is receiving increasing attention. Data mining technology is applied to analyzing data in databases. This paper puts forward a new method that is suited to the design of distributed databases.


2001 ◽  
Vol 10 (01n02) ◽  
pp. 107-135 ◽  
Author(s):  
ISTVAN JONYER ◽  
LAWRENCE B. HOLDER ◽  
DIANE J. COOK

Hierarchical conceptual clustering has proven to be a useful, although greatly under-explored, data mining technique. A graph-based representation of structural information combined with a substructure discovery technique has been shown to be successful in knowledge discovery. The SUBDUE substructure discovery system provides the advantages of both approaches. This work presents SUBDUE and the development of its clustering functionalities. Several examples are used to illustrate the validity of the approach in both structured and unstructured domains, and to compare SUBDUE with earlier clustering algorithms. Results show that SUBDUE successfully discovers hierarchical clusterings in both structured and unstructured data.


2014 ◽  
Vol 543-547 ◽  
pp. 3569-3572
Author(s):  
Tian Xiang Zhu ◽  
Xiao Lan Tian ◽  
Shu Hui Sun ◽  
Shu Jie Sun

Cloud computing is the latest trend in IT development, and the importance of cloud databases has been widely acknowledged. Cloud databases hold large amounts of data, in which much potentially valuable knowledge is implicit. The key is to discover and extract this useful knowledge automatically. Association rules are one of the main models for mining such data, and they focus mainly on the relationships among different areas of the data. This paper puts forward a basic model of data mining based on association rules in cloud databases and introduces the corresponding mining algorithms.

