Implementation of K-Means Algorithm using Clustering Rules on Medical Data Sets

Author(s):  
N. Raga Chandrika ◽  
Vipparla Aruna

When mining frequent itemsets with a low minimum support, generating candidate sets is a time-consuming and frequently repeated operation. The K-Means algorithm does not need to generate candidate sets: the database that supplies the frequent itemsets is compressed into a frequent pattern tree (FP-tree), and the frequent itemsets are mined from that tree. Such algorithms are considered efficient because of their compact structure and because they generate fewer candidate itemsets than Apriori and Apriori-like algorithms. This paper therefore presents the basic concepts of several algorithms (K-Means, COFI-Tree, CT-PRO) that build on FP-tree-like structures for mining frequent itemsets, along with their capabilities and a comparison. The main concern of this study is applying data mining to spatial data to generate rules and patterns with the Frequent Pattern (FP)-Growth algorithm. The paper shows how data mining can be applied to spatial data.
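The compression step described in this abstract can be illustrated with a minimal sketch of FP-tree construction. It covers only the two database passes that build the tree, not the mining step, and it is not the authors' implementation. The names `FPNode` and `build_fp_tree` and the sample transactions are hypothetical.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, its count, and its children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    # Pass 1: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode(None)
    # Pass 2: insert each transaction, keeping only frequent items,
    # ordered by descending frequency (ties broken alphabetically).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


# Hypothetical sample data, for illustration only.
transactions = [
    ["fever", "cough", "fatigue"],
    ["fever", "cough"],
    ["fever", "headache"],
    ["cough", "fatigue"],
]
root, frequent = build_fp_tree(transactions, min_support=2)
```

Transactions that share a prefix of frequent items share a path in the tree, which is why the structure is more compact than the raw database and why no candidate sets need to be enumerated up front.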

Author(s):  
Anusha Viswanadapalli ◽  
Praveen Kumar Nelapati

When mining frequent itemsets with a low minimum support, generating candidate sets is a time-consuming and frequently repeated operation. The APRIORI growth algorithm does not need to generate candidate sets: the database that supplies the frequent itemsets is compressed into a frequent pattern tree (or APRIORI tree), and the frequent itemsets are mined from that tree. Such algorithms are considered efficient because of their compact structure and because they generate fewer candidate itemsets than Apriori and Apriori-like algorithms. This paper therefore presents the basic concepts of several algorithms (APRIORI-Growth, COFI-Tree, CT-PRO) that build on APRIORI-tree-like structures for mining frequent itemsets, along with their capabilities and a comparison. The main concern of this study is applying data mining to medical data to generate rules and patterns with the Frequent Pattern (APRIORI)-Growth algorithm. The paper shows how data mining can be applied to medical data.


Author(s):  
Gebeyehu Belay Gebremeskel ◽  
Chai Yi ◽  
Zhongshi He

Data Mining (DM) is a rapidly expanding field in many disciplines, and it holds great promise for analyzing massive data of many types, including geospatial, image, and other data sets. Such fast-growing data, characterized by high volume, velocity, variety, variability, value, and other properties, is collected and generated from various sources and is too large and complex to capture, store, and analyze with traditional tools. Spatial data mining (SDM) is therefore the process of searching for and discovering valuable information and knowledge in large volumes of spatial data. It draws its basic principles from databases, machine learning, statistics, pattern recognition, and "soft" computing. Using DM techniques enables more efficient use of the data warehouse. SDM is thus becoming an emerging research field in the geosciences because the increasing amount of data leads to promising new applications. The aspect of SDM this chapter focuses on is inference over geospatial and GIS data.


2017 ◽  
Vol 8 (1) ◽  
pp. 31-43
Author(s):  
Zuber Shaikh ◽  
Antara Mohadikar ◽  
Rachana Nayak ◽  
Rohith Padamadan

Frequent itemsets are sets of data values (e.g., product items) whose number of co-occurrences exceeds a given threshold. The challenge is that the design of proofs and verification objects has to be customized for each data mining algorithm. The proposed method implements a basic completeness verification and authentication approach in which the client uses a set of frequent itemsets as evidence and checks whether the server has omitted any of them from its returned result. This helps the client detect an untrusted server and makes the system more efficient by reducing processing time. For authentication, CaRP is both a CAPTCHA and a graphical password scheme. CaRP addresses several security problems at once, such as online guessing attacks, relay attacks, and, if combined with dual-view technologies, shoulder-surfing attacks.


Author(s):  
Moch. Syahrir ◽  
Fatimatuzzahra Fatimatuzzahra

Data mining with association rules is widely used in business, and one of the algorithms most often used for association rules is Apriori. However, Apriori has a performance weakness because it must scan the database each time it determines the frequent k-itemsets. This becomes a problem when the candidate k-itemsets have many dimensions: scanning a large database takes a long time and affects memory and processor usage. Apriori has been extended many times, and one popular variant is Frequent Pattern growth (FP-growth). Apriori and FP-growth are both association rule algorithms, but FP-growth takes a different approach based on the Frequent Pattern Tree (FP-tree). Although FP-growth performs well when scanning the database, the rules it produces are not as good as those produced by Apriori. Another alternative is hashing, which can address the problem of searching for and determining frequent k-itemsets so that the database scan is faster. The aim of this research is to improve Apriori's performance in counting itemset frequencies so that database scan time is reduced.
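The abstract does not say which hashing scheme the authors use. As one illustration of how hashing can cut the work of finding frequent 2-itemsets, the sketch below hashes every pair into a small bucket table during the first scan and, in the second scan, counts only pairs whose bucket reached the minimum support. This bucket-pruning idea is an assumption chosen for illustration, not the paper's method. The names `bucket_of` and `frequent_pairs_with_hashing`, the bucket count, and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations
import zlib


def bucket_of(pair, n_buckets):
    # Deterministic hash of a sorted item pair
    # (Python's built-in hash() for strings is salted per process).
    return zlib.crc32("|".join(pair).encode("utf-8")) % n_buckets


def frequent_pairs_with_hashing(transactions, min_support, n_buckets=8):
    item_counts = Counter()
    bucket_counts = [0] * n_buckets

    # Scan 1: count single items and hash every pair into a bucket.
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair, n_buckets)] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Scan 2: count a pair only if both items are frequent and its
    # bucket total reached min_support. A bucket total is an upper bound
    # on the count of every pair in it, so no frequent pair is lost.
    pair_counts = Counter()
    for t in transactions:
        items = sorted(i for i in set(t) if i in frequent_items)
        for pair in combinations(items, 2):
            if bucket_counts[bucket_of(pair, n_buckets)] >= min_support:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Hypothetical sample data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "eggs"],
]
pairs = frequent_pairs_with_hashing(transactions, min_support=2)
```

The bucket table is far smaller than a table of all candidate pairs, so the pruning costs little memory. The final result does not depend on how pairs collide, only the amount of pruning does.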


Smart systems are among the most significant inventions of our time. These systems rely on powerful data mining techniques to achieve intelligent decision making. Frequent itemset mining (FIM) has become one of the most significant research areas in data mining. The information in databases is in general ambiguous and uncertain. In such databases, one should consider weighted FIM to discover itemsets that are significant from the end user's perspective. However, once a weight factor is introduced into FIM, weighted frequent itemsets may no longer satisfy the downward closure property. As a result, the search space of frequent itemsets cannot be narrowed using the downward closure property, which leads to poor time efficiency. In this paper, we introduce two properties for FIM: the first is the weight judgment downward closure property (WD-FIM) for weighted FIM, and the second is the existence property of its subsets. Based on these two properties, the WD-FIM algorithm is proposed to narrow the search space of weighted frequent itemsets and improve time efficiency. In addition, the completeness and time efficiency of the WD-FIM algorithm are analyzed theoretically. Finally, the performance of the proposed WD-FIM algorithm is verified on both synthetic and real data sets.
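A small worked example shows why adding weights can break downward closure. The abstract does not give its definition of weighted support, so the sketch assumes one common form: the mean weight of the items in the itemset multiplied by the ordinary support. The function names, weights, threshold, and transactions are all hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def weighted_support(itemset, transactions, weights):
    """Assumed definition: mean item weight times ordinary support."""
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return mean_weight * support(itemset, transactions)


# Hypothetical data: "a" is common but lightly weighted,
# "b" is rarer but heavily weighted.
transactions = [["a", "b"], ["a", "b"], ["a"], ["a"]]
weights = {"a": 0.125, "b": 0.875}
min_wsup = 0.2

ws_a = weighted_support(["a"], transactions, weights)        # 0.125 * 1.0
ws_ab = weighted_support(["a", "b"], transactions, weights)  # 0.5 * 0.5

# {"a"} falls below the threshold while its superset {"a", "b"} passes,
# so pruning supersets of infrequent itemsets would be unsafe here.
closure_violated = ws_a < min_wsup <= ws_ab
```

Under this assumed definition, an Apriori-style prune that discards every superset of `{"a"}` would wrongly discard `{"a", "b"}`, which is the loss of pruning power the abstract describes.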


Author(s):  
Hanane Menad ◽  
Abdelmalek Amine

Medical data mining has great potential for exploring the hidden patterns in the data sets of the medical domain. These patterns can be utilized for clinical diagnosis. Bio-inspired algorithms are a new field of research. Their main advantage is knitting together subfields related to connectionism, social behavior, and emergence. Briefly put, the field uses computers to model living phenomena and, at the same time, studies life to improve the use of computers. In this chapter, the authors present an application of four bio-inspired algorithms and metaheuristics to the classification of seven different real medical data sets. Two of these algorithms are based on similarity calculation between training and test data, while the other two are based on random generation of a population to construct classification rules. The results showed that bio-inspired algorithms are very effective for supervised classification of medical data.

