Knowledge Discovery from Very Large Databases Using Frequent Concept Lattices

Author(s):  
Kitsana Waiyamai ◽  
Lotfi Lakhal
1994 ◽  
Vol 9 (1) ◽  
pp. 57-60 ◽  
Author(s):  
Gregory Piatetsky-Shapiro

As the number and size of very large databases continue to grow rapidly, so does the need to make sense of them. This need is addressed by the field of Knowledge Discovery in Databases (KDD), which combines approaches from machine learning, statistics, intelligent databases, and knowledge acquisition. KDD encompasses a number of different discovery methods, such as clustering, data summarization, learning classification rules, finding dependency networks, analyzing changes, and detecting anomalies (Matheus et al., 1993).


Author(s):  
Gautam Das

In recent years, advances in data collection and management technologies have led to a proliferation of very large databases. These large data repositories typically are created in the hope that, through analysis such as data mining and decision support, they will yield new insights into the data and the real-world processes that created them. In practice, however, while the collection and storage of massive datasets has become relatively straightforward, effective data analysis has proven more difficult to achieve. One reason that data analysis successes have proven elusive is that most analysis queries, by their nature, require aggregation or summarization of large portions of the data being analyzed. For multi-gigabyte data repositories, this means that processing even a single analysis query involves accessing enormous amounts of data, leading to prohibitively expensive running times. This severely limits the feasibility of many types of analysis applications, especially those that depend on timeliness or interactivity.
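The cost argument above can be made concrete with a toy approximate-aggregation sketch: instead of scanning every row to answer an aggregate query, estimate it from a small uniform sample and scale the result up. This is an illustrative sketch only; the function names, the 1% sampling fraction, and the in-memory list standing in for a large table are assumptions, not the abstract's actual techniques.

```python
import random

def exact_sum(rows):
    # Full scan: touches every row, which is what makes aggregation
    # over multi-gigabyte tables prohibitively expensive.
    return sum(rows)

def sampled_sum(rows, sample_fraction=0.01, seed=42):
    # Uniform-sample estimate: scan only a small fraction of rows
    # and scale the partial sum back up by 1/sample_fraction.
    rng = random.Random(seed)
    k = max(1, int(len(rows) * sample_fraction))
    sample = rng.sample(rows, k)
    return sum(sample) * (len(rows) / k)

data = list(range(1_000_000))   # stand-in for a large fact table
print(exact_sum(data))          # exact answer, full scan
print(round(sampled_sum(data))) # approximate answer from ~1% of the data
```

The trade-off is exactly the one the abstract describes: the sampled estimate carries some error, but its running time is proportional to the sample size rather than to the full data size, which restores interactivity.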


Author(s):  
Andrew Borthwick ◽  
Stephen Ash ◽  
Bin Pang ◽  
Shehzad Qureshi ◽  
Timothy Jones

Author(s):  
Ivan Bruha

Research in intelligent information systems investigates the possibilities of enhancing their overall performance, particularly their prediction accuracy and time complexity. One such discipline, data mining (DM), typically processes very large databases in a profound and robust way (Fayyad et al., 1996). DM refers to the overall process of deriving useful knowledge from databases, that is, extracting high-level knowledge from low-level data in the context of large databases. This article discusses two newer directions in this field, namely knowledge combination and meta-learning (Vilalta & Drissi, 2002). There exist approaches that combine various paradigms into one robust (hybrid, multistrategy) system which utilizes the advantages of each subsystem and tries to eliminate their drawbacks. There is a general belief that integrating results obtained from multiple lower-level decision-making systems, each usually (but not necessarily) based on a different paradigm, produces better performance. Such multi-level knowledge-based systems are usually referred to as knowledge integration systems. One subset of these systems is called knowledge combination (Fan et al., 1996). We focus on a common topology of the knowledge combination strategy with base learners and base classifiers (Bruha, 2004). Meta-learning investigates how learning systems may improve their performance through experience in order to become flexible. Its goal is to search dynamically for the best learning strategy. We define the fundamental characteristics of meta-learning, such as bias and hypothesis space. Section 2 surveys the various directions in algorithms and topologies utilized in knowledge combination and meta-learning. Section 3 presents the main focus of this article: a description of knowledge combination techniques, meta-learning, and a particular application, including the corresponding flow charts. The last section presents future trends in these topics.
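The knowledge-combination topology described above — several base classifiers whose decisions a higher-level combiner integrates — can be sketched minimally. The base classifiers and the plurality-vote combiner below are hypothetical stand-ins for illustration, not Bruha's actual components:

```python
from collections import Counter

# Hypothetical base classifiers, standing in for learners built on
# different paradigms (rule-based, threshold-based, heuristic).
def rule_based(x):
    return "positive" if x["score"] > 0.5 else "negative"

def threshold_based(x):
    return "positive" if x["votes"] >= 3 else "negative"

def heuristic(x):
    return "positive" if x["score"] + 0.1 * x["votes"] > 0.7 else "negative"

def combine(classifiers, x):
    # Knowledge combination by plurality vote: each base classifier
    # contributes one decision; the combiner returns the most common one.
    decisions = [clf(x) for clf in classifiers]
    return Counter(decisions).most_common(1)[0][0]

example = {"score": 0.6, "votes": 2}
print(combine([rule_based, threshold_based, heuristic], example))
```

A meta-learner would go one step further than this fixed vote: it would learn, from experience, which combiner or which weighting of base classifiers performs best.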


2008 ◽  
pp. 3235-3251
Author(s):  
Yongqiao Xiao ◽  
Jenq-Foung Yao ◽  
Guizhen Yang

Recent years have witnessed a surge of research interest in knowledge discovery from data domains with complex structures, such as trees and graphs. In this paper, we address the problem of mining maximal frequent embedded subtrees, which is motivated by such important applications as mining “hot” spots of Web sites from Web usage logs and discovering significant “deep” structures from tree-like bioinformatic data. One major challenge arises from the fact that embedded subtrees are no longer ordinary subtrees, but preserve only part of the ancestor-descendant relationships in the original trees. To solve the embedded subtree mining problem, we propose a novel algorithm, called TreeGrow, which is optimized in two important respects. First, it obtains frequency counts of root-to-leaf paths through efficient compression of trees, thereby being able to quickly grow an embedded subtree pattern path by path instead of node by node. Second, candidate subtree generation is highly localized so as to avoid unnecessary computational overhead. Experimental results on benchmark synthetic data sets have shown that our algorithm can outperform unoptimized methods by up to 20 times.
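The first optimization — growing patterns path by path from frequency counts of root-to-leaf paths — rests on counting how often each root-to-leaf label path occurs across a forest of trees. A minimal sketch of that counting step, using a simple nested-tuple tree encoding as an assumption rather than TreeGrow's compressed representation:

```python
from collections import Counter

# A tree is a (label, children) pair; children is a list of subtrees.
# This plain nested-tuple encoding is a hypothetical stand-in, not the
# compressed representation TreeGrow itself uses.
def root_to_leaf_paths(tree):
    label, children = tree
    if not children:
        return [(label,)]
    paths = []
    for child in children:
        for p in root_to_leaf_paths(child):
            paths.append((label,) + p)
    return paths

def frequent_paths(forest, min_support):
    # Count each distinct root-to-leaf label path once per tree it
    # occurs in, then keep those meeting the minimum support threshold.
    counts = Counter()
    for tree in forest:
        counts.update(set(root_to_leaf_paths(tree)))
    return {p: c for p, c in counts.items() if c >= min_support}

forest = [
    ("a", [("b", []), ("c", [("d", [])])]),
    ("a", [("c", [("d", [])])]),
    ("a", [("b", [])]),
]
print(frequent_paths(forest, min_support=2))
```

In TreeGrow these frequent paths become the units by which an embedded subtree pattern is extended, one whole path at a time, instead of extending node by node.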


Author(s):  
Rashed Mustafa ◽  
Md Javed Hossain ◽  
Thomas Chowdhury

Distributed Database Management System (DDBMS) is one of the prime concerns in distributed computing. The driving force behind the development of DDBMSs is the demand from applications that need to query very large databases (on the order of terabytes). Traditional client-server database systems are too slow to handle such applications. This paper presents a better way to find the optimal number of nodes in a distributed database management system.

Keywords: DDBMS, Data Fragmentation, Linear Search, RMI.

DOI: 10.3329/diujst.v4i2.4362
Daffodil International University Journal of Science and Technology Vol.4(2) 2009 pp.19-22
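The linear-search idea from the keywords can be illustrated with a toy cost model: per-node scan work shrinks as nodes are added, while coordination overhead grows, and a linear search over candidate cluster sizes picks the minimum. The cost function and its constants below are hypothetical assumptions for illustration, not measurements or formulas from the paper:

```python
def total_cost(n_nodes, scan_cost=1000.0, overhead_per_node=5.0):
    # Hypothetical cost model: scan work parallelizes across nodes,
    # while coordination/communication overhead grows with node count.
    return scan_cost / n_nodes + overhead_per_node * n_nodes

def optimal_nodes(max_nodes=100):
    # Linear search over candidate cluster sizes, keeping the size
    # with the lowest modeled total cost.
    best_n = 1
    for n in range(1, max_nodes + 1):
        if total_cost(n) < total_cost(best_n):
            best_n = n
    return best_n

print(optimal_nodes())
```

Under this model the optimum sits where the two terms balance, near the square root of the ratio of scan cost to per-node overhead; any real DDBMS would need measured costs in place of these constants.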

