Interactive Visual Analytics of Databases and Frequent Sets

2013 ◽  
Vol 3 (4) ◽  
pp. 120-140 ◽  
Author(s):  
Carson K.S. Leung ◽  
Christopher L. Carmichael ◽  
Patrick Johnstone ◽  
David Sonny Hung-Cheung Yuen

In numerous real-life applications, large databases can be easily generated. Implicitly embedded in these databases is previously unknown and potentially useful knowledge such as frequently occurring sets of items, merchandise, or events. Different algorithms have been proposed for managing and retrieving useful information from these databases. Various algorithms have also been proposed for mining these databases to find frequent sets, which are usually presented in a lengthy textual list. As “a picture is worth a thousand words”, the use of visual representations can enhance user understanding of the inherent relationships among the mined frequent sets. Many of the existing visualizers were not designed to visualize these mined frequent sets. In this journal article, an interactive visual analytic system is proposed for providing visual analytic solutions to the frequent set mining problem. The system enables the management, visualization, and advanced analysis of the original transaction databases as well as the frequent sets mined from these databases.
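As a minimal illustration of what "frequent sets" means in the abstract above, the following sketch counts every itemset that occurs in at least `minsup` transactions. It is a brute-force enumeration for toy data only, not the mining algorithm or the visual system described in the article; the function name `frequent_sets`, the `minsup` parameter, and the sample transactions are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(transactions, minsup):
    """Return {itemset: support count} for itemsets found in >= minsup transactions.

    Brute force: enumerates every subset of every transaction, so it is
    only suitable for small toy examples (hypothetical helper).
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # canonical order so each itemset has one key
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                counts[subset] += 1
    return {s: c for s, c in counts.items() if c >= minsup}


# Hypothetical transaction database (each set is one transaction).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

result = frequent_sets(transactions, minsup=2)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

Even on four transactions the output is a flat textual list of itemsets and counts, which is the kind of presentation the abstract argues a visual representation can improve on.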

Author(s):  
Carson K.-S. Leung ◽  
Christopher L. Carmichael ◽  
Patrick Johnstone ◽  
Roy Ruokun Xing ◽  
David Sonny Hung-Cheung Yuen

High volumes of a wide variety of data can be easily generated at a high velocity in many real-life applications. Implicitly embedded in these big data is previously unknown and potentially useful knowledge such as frequently occurring sets of items, merchandise, or events. Different algorithms have been proposed for either retrieving information about the data or mining the data to find frequent sets, which are usually presented in a lengthy textual list. As “a picture is worth a thousand words”, the use of visual representations can enhance user understanding of the inherent relationships among the mined frequent sets. However, many of the existing visualizers were not designed to visualize these mined frequent sets. This book chapter presents an interactive next-generation visual analytic system. The system enables the management, visualization, and advanced analysis of the original big data and the frequent sets mined from the data.


Author(s):  
Carson Kai-Sang Leung ◽  
Christopher Carmichael

Nowadays, various data, text, and web mining applications can easily generate large volumes of data. Embedded within these data is previously unknown and potentially useful knowledge such as frequently occurring sets of items, merchandise, or events. Hence, numerous algorithms have been proposed for finding these frequent sets, which are usually presented in a lengthy textual list. However, “a picture is worth a thousand words”. The use of visual representations can enhance user understanding of the inherent relations among the frequent sets. Although a few visualizers have been developed, most of them were not designed for visualizing the mined frequent sets. In this chapter, an interactive visual analytic system called iVAS is proposed for providing visual analytic solutions to the frequent set mining problem. The system enables the visualization and advanced analysis of the original transaction databases as well as the frequent sets mined from these databases.


Author(s):  
Carson K.S. Leung ◽  
Yibin Zhang

In the current era of big data, high volumes of a wide variety of valuable data—which may be of different veracities—can be easily generated or collected at a high speed in various real-life applications related to art, culture, design, engineering, mathematics, science, and technology. A data science solution helps manage, analyze, and mine these big data—such as musical data—for the discovery of interesting information and useful knowledge. As “a picture is worth a thousand words,” a visual representation provided by the data science solution helps visualize the big data and comprehend the mined information and discovered knowledge. This journal article presents a visual analytic system—which uses a hue-saturation-value (HSV) color model to represent big data—for data science on musical data and beyond (e.g., other types of big data).


Author(s):  
Ivan Bruha

Research in intelligent information systems investigates the possibilities of enhancing their overall performance, particularly their prediction accuracy and time complexity. One such discipline, data mining (DM), usually processes very large databases in a profound and robust way (Fayyad et al., 1996). DM refers to the overall process of deriving useful knowledge from databases, that is, extracting high-level knowledge from low-level data in the context of large databases. This article discusses two newer directions in this field, namely knowledge combination and meta-learning (Vilalta & Drissi, 2002). There exist approaches that combine various paradigms into one robust (hybrid, multistrategy) system which utilizes the advantages of each subsystem and tries to eliminate their drawbacks. There is a general belief that integrating results obtained from multiple lower-level decision-making systems, each usually (but not necessarily) based on a different paradigm, produces better performance. Such multi-level knowledge-based systems are usually referred to as knowledge integration systems. One subset of these systems is called knowledge combination (Fan et al., 1996). We focus on a common topology of the knowledge combination strategy with base learners and base classifiers (Bruha, 2004). Meta-learning investigates how learning systems may improve their performance through experience in order to become flexible. Its goal is to search dynamically for the best learning strategy. We define the fundamental characteristics of meta-learning, such as bias and hypothesis space. Section 2 surveys the various directions in algorithms and topologies utilized in knowledge combination and meta-learning. Section 3 represents the main focus of this article: a description of knowledge combination techniques, meta-learning, and a particular application, including the corresponding flow charts. The last section presents the future trends in these topics.


2008 ◽  
pp. 2105-2120
Author(s):  
Kesaraporn Techapichetvanich ◽  
Amitava Datta

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association-rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the mining association rule process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, the users gain insight and deeper understanding of their data sets, as well as control over mining meaningful association rules.
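To make the notion of an "interesting" association rule concrete, the sketch below computes the two standard measures, support and confidence, for a single rule of the form antecedent → consequent. It does not reproduce the chapter's interactive technique; the helper name `rule_metrics` and the sample baskets are assumptions for illustration only.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    support    = fraction of transactions containing antecedent and consequent
    confidence = fraction of antecedent-containing transactions that also
                 contain the consequent
    (hypothetical helper, not taken from the chapter)
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n if n else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

support, confidence = rule_metrics(baskets, {"bread"}, {"milk"})
print(f"support={support:.2f} confidence={confidence:.2f}")
```

An interactive tool of the kind described could let a user adjust thresholds on such measures while watching which rules remain, rather than filtering a long generated list afterwards.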


2010 ◽  
Vol 25 (3) ◽  
pp. 247-248
Author(s):  
Roman Barták ◽  
Amedeo Cesta ◽  
Lee McCluskey ◽  
Miguel A. Salido

Planning, scheduling and constraint satisfaction are important areas in artificial intelligence (AI) with broad practical applicability. Many real-world problems can be formulated as AI planning and scheduling (P&S) problems, where resources must be allocated to optimize overall performance objectives. Frequently, solving these problems requires an adequate mixture of planning, scheduling and resource allocation to competing goal activities over time in the presence of complex state-dependent constraints. Constraint satisfaction plays an important role in solving such real-life problems, and integrated techniques that manage P&S with constraint satisfaction are particularly useful. Knowledge engineering supports the solution of such problems by providing adequate modelling techniques and knowledge extraction techniques for improving the performance of planners and schedulers. Briefly speaking, knowledge engineering tools serve as a bridge between the real world and P&S systems.


2021 ◽  
Author(s):  
Mikhail Andronov ◽  
Maxim Fedorov ◽  
Sergey Sosnin

Humans prefer visual representations for the analysis of large databases. In this work, we suggest a method for the visualization of the chemical reaction space. Our technique uses the t-SNE approach that is parameterized by a deep neural network (parametric t-SNE). We demonstrated that the parametric t-SNE combined with reaction difference fingerprints can provide a tool for the projection of chemical reactions onto a low-dimensional manifold for easy exploration of reaction space. We showed that the global reaction landscape, when projected onto a 2D plane, corresponds well with already known reaction types. The application of a pretrained parametric t-SNE model to new reactions allows chemists to study these reactions in a global reaction space. We validated the feasibility of this approach for two marketed drugs: darunavir and oseltamivir. We believe that our method can help explore reaction space and inspire chemists to find new reactions and synthetic routes.


Author(s):  
Ivan Bruha

A ‘traditional’ learning algorithm that can induce a set of decision rules usually represents a robust and comprehensive system that discovers knowledge from usually large datasets. We call this discipline Data Mining (DM). Any classifier, expert system, or generally a decision-supporting system can then utilize this decision set to derive a decision (prediction) about given problems, observations, or diagnostics. DM can be defined as a nontrivial process of identifying valid, novel, and ultimately understandable knowledge in data. It is understood that DM, as a multidisciplinary activity, refers to the overall process of deriving useful knowledge from databases, i.e. extracting high-level knowledge from low-level data in the context of large databases. A rule-inducing learning algorithm may yield either an ordered or unordered set of decision rules. The latter seems to be more understandable by humans and directly applicable in most expert systems or decision-supporting ones. However, classification utilizing the unordered-mode decision rules may be accompanied by some conflict situations, particularly when several rules belonging to different classes match (‘fire’ for) an input to-be-classified (unseen) object. One of the possible solutions to this conflict is to associate each decision rule induced by a learning algorithm with a numerical factor which is commonly called the rule quality. The chapter first surveys empirical and statistical formulas of the rule quality and compares their characteristics. Statistical tools such as contingency tables, rule consistency, completeness, quality, measures of association, and measures of agreement are introduced as suitable vehicles for describing the behaviour of a decision rule. After that, a very brief theoretical methodology for defining rule qualities is introduced. The chapter then concludes with an analysis of the formulas for rule qualities, and presents a list of future trends in this discipline.
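The idea of attaching a numerical quality to each rule can be sketched from a rule's contingency counts. The example below computes consistency (the share of covered objects that belong to the rule's class) and completeness (the share of the class's objects that the rule covers), then combines them with a weighted sum. The weighted-sum form, the equal default weights, and all names and counts are assumptions chosen for illustration; the chapter surveys several empirical and statistical formulas, and this is not claimed to be any particular one of them.

```python
def rule_quality(covered_in_class, covered_other, missed_in_class,
                 w_consistency=0.5, w_completeness=0.5):
    """Return (consistency, completeness, quality) for one decision rule.

    covered_in_class: objects the rule covers that belong to its class
    covered_other:    objects the rule covers that belong to other classes
    missed_in_class:  objects of the rule's class that the rule does not cover
    The weighted-sum combination is an illustrative assumption.
    """
    covered = covered_in_class + covered_other
    in_class = covered_in_class + missed_in_class
    consistency = covered_in_class / covered if covered else 0.0
    completeness = covered_in_class / in_class if in_class else 0.0
    quality = w_consistency * consistency + w_completeness * completeness
    return consistency, completeness, quality


# Hypothetical counts: the rule covers 50 objects, 40 of them in its class,
# and misses another 40 objects of that class.
consistency, completeness, quality = rule_quality(40, 10, 40)
print(consistency, completeness, quality)
```

When several unordered rules from different classes fire for the same unseen object, such per-rule scores give one possible basis for resolving the conflict, for instance by preferring the class whose matching rules have the highest quality.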


2003 ◽  
Vol 8 (3) ◽  
pp. 219-238 ◽  
Author(s):  
Morten Levin

Can universities ever become a greenhouse for education in Action Research? Would it be possible to create Ph.D. programs in Action Research that are loyal to the genuine characteristics of Action Research? The hegemony of conventional researcher education has dominated university activities. Action Research has inherent characteristics that break radically with the academic tradition. The core challenge is to assess whether high-level training in Action Research can find a home in universities. Training action researchers in conventional academic institutions will in itself be an action research project. The paper presents three different AR projects, all aimed at training cohorts of students to become professional Action Researchers through obtaining a Ph.D. The first program started in 1989, the second in 1995, and the new program began in May 2003. The main conclusion is that it is a feasible strategy to create action research learning opportunities within a conventional academic context. This is partly due to a change in conceptualization of what constitutes knowledge, adding onto a stronger demand for practical and useful knowledge. At the local design and implementation level, curriculum design — both collective learning processes and theses that were closely connected to real life change activities — were important factors for success.

