From data to knowledge mining

Author(s):  
Ana Cristina Bicharra Garcia ◽  
Inhauma Ferraz ◽  
Adriana S. Vivacqua

AbstractMost past approaches to data mining have been based on association rules. However, the simple application of association rules usually only changes the user's problem from dealing with millions of data points to dealing with thousands of rules. Although this may somewhat reduce the scale of the problem, it is not a completely satisfactory solution. This paper presents a new data mining technique, called knowledge cohesion (KC), which takes into account a domain ontology and the user's interest in exploring certain data sets to extract knowledge, in the form of semantic nets, from large data sets. The KC method has been successfully applied to mine causal relations from oil platform accident reports. In a comparison with association rule techniques for the same domain, KC has shown a significant improvement in the extraction of relevant knowledge, using processing complexity and knowledge manageability as the evaluation criteria.

2008 ◽  
pp. 2105-2120
Author(s):  
Kesaraporn Techapichetvanich ◽  
Amitava Datta

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association-rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the mining association rule process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, the users gain insight and deeper understanding of their data sets, as well as control over mining meaningful association rules.


Author(s):  
Kesaraporn Techapichetvanich ◽  
Amitava Datta

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association-rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the mining association rule process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, the users gain insight and deeper understanding of their data sets, as well as control over mining meaningful association rules.


Author(s):  
Md. Zakir Hossain ◽  
Md.Nasim Akhtar ◽  
R.B. Ahmad ◽  
Mostafijur Rahman

<span>Data mining is the process of finding structure of data from large data sets. With this process, the decision makers can make a particular decision for further development of the real-world problems. Several data clusteringtechniques are used in data mining for finding a specific pattern of data. The K-means method isone of the familiar clustering techniques for clustering large data sets.  The K-means clustering method partitions the data set based on the assumption that the number of clusters are fixed.The main problem of this method is that if the number of clusters is to be chosen small then there is a higher probability of adding dissimilar items into the same group. On the other hand, if the number of clusters is chosen to be high, then there is a higher chance of adding similar items in the different groups. In this paper, we address this issue by proposing a new K-Means clustering algorithm. The proposed method performs data clustering dynamically. The proposed method initially calculates a threshold value as a centroid of K-Means and based on this value the number of clusters are formed. At each iteration of K-Means, if the Euclidian distance between two points is less than or equal to the threshold value, then these two data points will be in the same group. Otherwise, the proposed method will create a new cluster with the dissimilar data point. The results show that the proposed method outperforms the original K-Means method.</span>


Author(s):  
LAWRENCE MAZLACK

Determining causality has been a tantalizing goal throughout human history. Proper sacrifices to the gods were thought to bring rewards; failure to make suitable observations were thought to lead to disaster. Today, data mining holds the promise of extracting unsuspected information from very large databases. Methods have been developed to build association rules from large data sets. Association rules indicate the strength of association of two or more data attributes. In many ways, the interest in association rules is that they offer the promise (or illusion) of causal, or at least, predictive relationships. However, association rules only calculate a joint probability; they do not express a causal relationship. If causal relationships could be discovered, it would be very useful. Our goal is to explore causality in the data mining context.


Author(s):  
Ratchakoon Pruengkarn ◽  
◽  
Kok Wai Wong ◽  
Chun Che Fung

Data mining is the analytics and knowledge discovery process of analyzing large volumes of data from various sources and transforming the data into useful information. Various disciplines have contributed to its development and is becoming increasingly important in the scientific and industrial world. This article presents a review of data mining techniques and applications from 1996 to 2016. Techniques are divided into two main categories: predictive methods and descriptive methods. Due to the huge number of publications available on this topic, only a selected number are used in this review to highlight the developments of the past 20 years. Applications are included to provide some insights into how each data mining technique has evolved over the last two decades. Recent research trends focus more on large data sets and big data. Recently there have also been more applications in area of health informatics with the advent of newer algorithms.


2008 ◽  
Vol 17 (06) ◽  
pp. 1109-1129 ◽  
Author(s):  
BASILIS BOUTSINAS ◽  
COSTAS SIOTOS ◽  
ANTONIS GEROLIMATOS

One of the most important data mining problems is learning association rules of the form "90% of the customers that purchase product x also purchase product y". Discovering association rules from huge volumes of data requires substantial processing power. In this paper we present an efficient distributed algorithm for mining association rules that reduces the time complexity in a magnitude that renders as suitable for scaling up to very large data sets. The proposed algorithm is based on partitioning the initial data set into subsets and processing each subset in parallel. The proposed algorithm can maintain the set of association rules that are extracted when applying an association rule mining algorithm to all the data, by reducing the support threshold during processing the subsets. The above are confirmed by empirical tests that we present and which also demonstrate the utility of the method.


JURTEKSI ◽  
2019 ◽  
Vol 5 (1) ◽  
pp. 89-96
Author(s):  
Edi Kurniawan

Abstract: The library is one of the most important means to add insight and knowledge to everyone. In general, borrowing transaction data books that exist in a library are only left to accumulate by the library in the database without any utilization or further processing of the data that has long been stored. By utilizing the Data Mining technique using association rules with FP-Growth, these data will be very useful. Because from the data lending books to the library, new information can be gleaned about what books are often borrowed and know the pattern of relationships between books that have been borrowed together so that later it can be used to compile books in accordance with the existing borrowing patterns so that they can facilitate library visitors in the process of finding books. Keywords: Data Mining, Association Rule, FP-Growth, Library Abstrak: Perpustakaan merupakan salah satu sarana yang sangat penting untuk menambah wawasan dan keilmuan setiap orang. Pada umumnya data transaksi peminjaman buku yang ada pada sebuah perpustakaan hanya dibiarkan saja menumpuk oleh pihak perpustakaan di dalam database tanpa ada pemanfaatan atau pengolahan lebih lanjut dari data-data yang telah lama tersimpan tersebut. Dengan melakukan pemanfaatan menggunakan Teknik Data Mining metode association rules dengan FP-Growth, data-data tersebut akan jadi sangat bermanfaat. Karena dari data peminjaman buku pada perpustakaan tersebut dapat diggali informasi baru tentang buku-buku apa yang sering dipinjam dan mengetahui pola hubungan antara buku yang telah dipinjam secara bersama-sama sehingga nantinya dapat dimanfaatkan untuk melakukan penyusunan buku sesuai dengan pola peminjaman buku yang ada sehingga dapat mempermudah para pengunjung perpustakaan dalam proses pencarian buku. Kata Kunci : Data Mining, Asociation Rule, FP-Growth, Perpustakaan


Sign in / Sign up

Export Citation Format

Share Document