Review and comparison of Apriori algorithm implementations on Hadoop-MapReduce and Spark

Improved Classification Techniques to Predict the Co-disease in Diabetic Mellitus Patients using Discretization and Apriori Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1434.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 730-733

Keyword(s):

Data Mining ◽

Association Rules ◽

Census Data ◽

Early Stage ◽

Research Work ◽

Numerical Data ◽

Medical Data ◽

Data Sets ◽

Apriori Algorithm ◽

Data Set

Data mining has become indispensable in the medical industry because of its many applications in predicting diseases at an early stage. Data mining methods make it easy to extract useful patterns and quick to recognize task-based outcomes. Classification models are useful for building classes from medical data sets so that future analysis is accurate. In addition, association rules are a promising technique for finding hidden patterns in a medical data set and have been successfully applied to market basket data, census data and financial data. The Apriori algorithm, considered a classic algorithm, is useful for mining frequent item sets in a database containing a large number of transactions, and it also yields the relevant association rules. Association rules capture the relationships among items present in a data set. When the data set contains continuous attributes, the existing algorithms may not work; in that case, discretization can be applied so that association rules can still be mined to find the relations between patterns in the data set. In this paper, Discretized Apriori is used to predict the co-disease in people found to have diabetic syndrome, and the extracted rules are analyzed. In the discretization step, numerical data is discretized and fed to the Apriori algorithm to obtain better association rules for predicting the diseases.
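
The discretize-then-mine pipeline described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the patient records, the bin edges (125 for glucose, 29.9 for BMI), the item labels, and the helper names `discretize` and `apriori` are all hypothetical, chosen only to show how continuous attributes become items that Apriori can count.

```python
from itertools import combinations

def discretize(value, edges, labels):
    """Return the label of the first bin whose upper edge is >= value.

    `labels` must have one more entry than `edges`; the last label
    covers everything above the final edge.
    """
    for edge, label in zip(edges, labels):
        if value <= edge:
            return label
    return labels[-1]

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level-1 candidates: every distinct item on its own.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical (glucose, BMI) readings, invented purely for illustration.
records = [(160, 33), (155, 31), (95, 22), (170, 35), (100, 24)]

transactions = [
    {
        discretize(glucose, [125], ["glucose=normal", "glucose=high"]),
        discretize(bmi, [29.9], ["bmi=normal", "bmi=high"]),
    }
    for glucose, bmi in records
]

frequent = apriori(transactions, min_support=0.5)
for itemset, supp in sorted(frequent.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), supp)
```

Equal-width or clinically motivated bins are both plausible choices for the discretization step; the abstract does not say which the paper uses, so the thresholds here are placeholders.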

Mining association rules based on an improved Apriori Algorithm

2010 International Conference on Audio, Language and Image Processing ◽

10.1109/icalip.2010.5684546 ◽

2010 ◽

Author(s):

Yanfei Zhou ◽

Wanggen Wan ◽

Junwei Liu ◽

Long Cai

Keyword(s):

Association Rules ◽

Apriori Algorithm ◽

Mining Association Rules

Finding Persistent Strong Rules

Knowledge Discovery Practices and Emerging Applications of Data Mining - Advances in Data Mining and Database Management ◽

10.4018/978-1-60960-067-9.ch005 ◽

2010 ◽

pp. 85-107

Author(s):

Anthony Scime ◽

Karthik Rajasethupathy ◽

Kulathur S. Rajasethupathy ◽

Gregg R. Murray

Keyword(s):

Data Mining ◽

Association Rules ◽

Strong Association ◽

National Election ◽

Data Sets ◽

Rule Discovery ◽

Discovery Process ◽

Data Set ◽

Rule Sets ◽

Election Studies

Data mining is a collection of algorithms for finding interesting and unknown patterns or rules in data. However, different algorithms can result in different rules from the same data. The process presented here exploits these differences to find particularly robust, consistent, and noteworthy rules among much larger potential rule sets. More specifically, this research focuses on using association rules and classification mining to select the persistently strong association rules. Persistently strong association rules are association rules that are verifiable by classification mining the same data set. The process for finding persistent strong rules was executed against two data sets obtained from the American National Election Studies. Analysis of the first data set resulted in one persistent strong rule and one persistent rule, while analysis of the second data set resulted in 11 persistent strong rules and 10 persistent rules. The persistent strong rule discovery process suggests these rules are the most robust, consistent, and noteworthy among the much larger potential rule sets.

Models for Internal Clustering Validation Indexes Based on Hadoop-MapReduce

International Journal of Distributed Systems and Technologies ◽

10.4018/ijdst.2020070103 ◽

2020 ◽

Vol 11 (3) ◽

pp. 42-67

Author(s):

Soumeya Zerabi ◽

Souham Meshoul ◽

Samia Chikhi Boucherkha

Keyword(s):

Clustering Algorithms ◽

Large Data ◽

Optimal Number ◽

Data Sets ◽

Data Set ◽

Number Of Clusters ◽

Distributed Models ◽

Hadoop Mapreduce ◽

Distributed Solutions ◽

Clustering Validation

Cluster validation aims both to evaluate the results of clustering algorithms and to predict the number of clusters. It is usually achieved using several indexes. Traditional internal clustering validation indexes (CVIs) are mainly based on computing pairwise distances, which results in quadratic complexity for the related algorithms. The existing CVIs cannot handle large data sets properly and need to be revisited to take account of ever-increasing data set volumes. Therefore, parallel and distributed solutions for implementing these indexes are required. To cope with this issue, the authors propose two parallel and distributed models for internal CVIs, namely the Silhouette and Dunn indexes, using the MapReduce framework under Hadoop. The proposed models, termed MR_Silhouette and MR_Dunn, have been tested on both evaluating the clustering results and identifying the optimal number of clusters. The results of the experimental study are very promising and show that the proposed parallel and distributed models achieve the expected tasks successfully.
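
To make the quadratic cost mentioned in this abstract concrete, here is a minimal single-machine sketch of the standard Silhouette index. It is not MR_Silhouette and does not use MapReduce; the function name `silhouette` and the sample points are assumptions made for illustration. Each point is compared against every other point, which is the pairwise-distance work a distributed model would have to split up.

```python
from math import dist

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (needs at least 2 clusters)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # By convention a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        # (dist(p, p) is 0, so dividing by len(own) - 1 excludes p itself).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical 2-D points forming two well-separated clusters.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
score = silhouette(points, labels)
print(round(score, 3))
```

A score near 1 indicates compact, well-separated clusters; repeating the computation for several candidate cluster counts and keeping the best score is one common way to estimate the number of clusters.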

Improved Apriori Algorithm for Mining Association Rules

DEStech Transactions on Materials Science and Engineering ◽

10.12783/dtmse/icmsa2018/20584 ◽

2018 ◽

Cited By ~ 1

Author(s):

Yu-gang DAI ◽

Xiang ZHANG ◽

Tao XU ◽

Lin YE ◽

Ya-jing MA

Keyword(s):

Association Rules ◽

Apriori Algorithm ◽

Mining Association Rules

The Application of Apriori Algorithm in Analysis on Admitted Students of Colleges and Universities

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.321-324.2578 ◽

2013 ◽

Vol 321-324 ◽

pp. 2578-2582

Author(s):

Qian Zhang

Keyword(s):

Data Mining ◽

Association Rules ◽

Colleges And Universities ◽

Apriori Algorithm ◽

Data Mining Techniques ◽

Minimum Support ◽

Sample Data ◽

Mining Association Rules

This paper examined the application of the Apriori algorithm to extracting association rules in data mining, using sample data on student enrollments. It studied data mining techniques for extracting association rules, analyzed the correlation between specialties and characteristics of admitted students, and evaluated the algorithm for mining association rules with a minimum support of 30% and a minimum confidence of 40%.
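
The two thresholds quoted in this abstract can be illustrated with a short sketch. The enrollment records, item names, and the helpers `support` and `rule_passes` are hypothetical; only the 30% minimum support and 40% minimum confidence come from the abstract. For a rule A → B, support is the fraction of records containing both A and B, and confidence is that support divided by the support of A.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def rule_passes(antecedent, consequent, transactions,
                min_support=0.30, min_confidence=0.40):
    """Return (passes, support, confidence) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    ante = support(antecedent, transactions)
    confidence = both / ante if ante else 0.0
    return both >= min_support and confidence >= min_confidence, both, confidence

# Hypothetical enrollment records, invented for illustration only.
transactions = (
    [frozenset({"major=CS", "region=urban"})] * 4
    + [frozenset({"major=CS", "region=rural"})] * 2
    + [frozenset({"major=Arts", "region=urban"})] * 4
)

ok, supp, conf = rule_passes({"major=CS"}, {"region=urban"}, transactions)
print(ok, supp, round(conf, 3))

rare_ok, rare_supp, rare_conf = rule_passes({"region=rural"}, {"major=CS"}, transactions)
print(rare_ok, rare_supp, rare_conf)
```

The second rule has perfect confidence in this toy data but fails the support threshold, which shows why both filters are applied together.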

A Data Warehouse Cleansing Approach Based on Mathematical Association Rules

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.490-495.1878 ◽

2012 ◽

Vol 490-495 ◽

pp. 1878-1882

Author(s):

Yu Xiang Song

Keyword(s):

Data Mining ◽

Data Warehouse ◽

Association Rules ◽

Data Sets ◽

Mathematical Association ◽

Manual Intervention ◽

Mining Association Rules ◽

High Degree

The alliance rules stated above, based on the principle of data mining association rules, provide a solution for detecting errors in data sets. The errors are detected automatically. The manual intervention required by the proposed algorithm is negligible, resulting in a high degree of automation and accuracy. Duplication in the names field of the data warehouse has been cleansed to a remarkable extent. Domain independence has been achieved using the concept of an integer domain, which also adds to the memory-saving capability of the algorithm.

Research and Application of Apriori Algorithm for Mining Association Rules

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1079-1080.737 ◽

2014 ◽

Vol 1079-1080 ◽

pp. 737-742

Author(s):

Yi Yong Ye

Keyword(s):

Association Analysis ◽

Association Rules ◽

Web Mining ◽

Recommendation System ◽

Apriori Algorithm ◽

Mining Algorithm ◽

Common Technique ◽

Mining Association Rules

For the large amounts of data generated by e-commerce platforms, and in light of the actual needs of e-commerce recommendation systems, this paper studies a common association rule technique for e-commerce-oriented Web mining and association analysis. It introduces the Apriori algorithm for mining association rules, analyzes a specific application of the Apriori algorithm through a practical example, and finally points out the shortcomings of the classical Apriori algorithm and gives directions for improvement.

Improved Apriori Algorithm for Mining Association Rules

International Journal of Information Technology and Computer Science ◽

10.5815/ijitcs.2014.07.03 ◽

2014 ◽

Vol 6 (7) ◽

pp. 15-23 ◽

Cited By ~ 8

Author(s):

Darshan M. Tank

Keyword(s):

Association Rules ◽

Apriori Algorithm ◽

Mining Association Rules
