Analysis of Data on Staff Turnover Using Association Rules and Predictive Techniques

Purpose: The purpose of this paper is to present the results of an analysis and evaluation of data on employee turnover in a specific organisation, based on deep data mining using association rules and decision trees. Methodology/Approach: For the analysis, we chose deep data mining methods, primarily a search for association rules using the Apriori algorithm in the R programming language. To supplement and compare the results, the data were also analysed using predictive decision trees, applying the C5.0, rpart and ctree algorithms in R. Findings: The results of the analyses showed that observing the basic principles of correct communication from the beginning of an employment relationship, or during hiring, is justified. Communication and regular conversations between a superior and employees can help identify problems earlier, address them and reduce the number of people leaving the company. The results of the analysis helped the organisation to set measures to reduce the number of employees leaving. Research Limitation/Implication: A limiting factor in performing such analyses is the availability of quality data in the required quantity. Our most significant advantage when performing our analysis was that quality data were available. To create the final structure of the required data set, we used data from the organisation’s internal information systems. Originality/Value of paper: This contribution offers a new approach to analysing data on employee turnover, whose essence is finding the most interesting and frequent correlations in a significant amount of data.

Finding Persistent Strong Rules

Knowledge Discovery Practices and Emerging Applications of Data Mining - Advances in Data Mining and Database Management ◽ 10.4018/978-1-60960-067-9.ch005 ◽ 2010 ◽ pp. 85-107

Author(s): Anthony Scime ◽ Karthik Rajasethupathy ◽ Kulathur S. Rajasethupathy ◽ Gregg R. Murray

Keyword(s): Data Mining ◽ Association Rules ◽ Strong Association ◽ National Election ◽ Data Sets ◽ Rule Discovery ◽ Discovery Process ◽ Data Set ◽ Rule Sets ◽ Election Studies

Data mining is a collection of algorithms for finding interesting and unknown patterns or rules in data. However, different algorithms can result in different rules from the same data. The process presented here exploits these differences to find particularly robust, consistent, and noteworthy rules among much larger potential rule sets. More specifically, this research focuses on using association rules and classification mining to select the persistently strong association rules. Persistently strong association rules are association rules that are verifiable by classification mining the same data set. The process for finding persistent strong rules was executed against two data sets obtained from the American National Election Studies. Analysis of the first data set resulted in one persistent strong rule and one persistent rule, while analysis of the second data set resulted in 11 persistent strong rules and 10 persistent rules. The persistent strong rule discovery process suggests these rules are the most robust, consistent, and noteworthy among the much larger potential rule sets.

Association Rules Mining Based on Adaptive Fuzzy Clustering Algorithm

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.998-999.842 ◽ 2014 ◽ Vol 998-999 ◽ pp. 842-845 ◽ Cited By ~ 1

Author(s): Jia Mei Guo ◽ Yin Xiang Pei

Keyword(s): Data Mining ◽ Association Rules ◽ Clustering Algorithm ◽ Original Data ◽ Data Set ◽ Association Rules Mining ◽ Fuzzy Association Rules ◽ Redundant Data ◽ Fuzzy Partitions ◽ Rules Extraction

Association rule extraction is one of the important goals of data mining and analysis. To address the information loss caused by crisp partitioning of numerical attributes, this article puts forward a fuzzy association rule mining method based on fuzzy logic. First, we use c-means clustering to generate fuzzy partitions and eliminate redundant data; we then map the original data set into fuzzy intervals; finally, we extract fuzzy association rules from the fuzzy data set to provide a basis for proper decision-making. Results show that this method can effectively improve the efficiency of data mining and the semantic visualization and credibility of association rules.
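
The fuzzy-partition step can be illustrated with a minimal sketch. It assumes the cluster centers have already been found and uses the standard fuzzy c-means membership formula; the function name, the fuzzifier `m = 2` and the example centers are assumptions for illustration, not details taken from the paper.

```python
def fuzzy_memberships(value, centers, m=2.0):
    """Return the fuzzy c-means membership of `value` in each cluster center."""
    distances = [abs(value - c) for c in centers]
    # A value that sits exactly on a center belongs fully to that cluster.
    if 0.0 in distances:
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for d_i in distances:
        # Standard FCM formula: u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)).
        total = sum((d_i / d_k) ** exponent for d_k in distances)
        memberships.append(1.0 / total)
    return memberships


# Hypothetical cluster centers for a numeric "age" attribute.
centers = [25.0, 45.0, 65.0]
u = fuzzy_memberships(35.0, centers)
```

Unlike a crisp partition, the value 35 belongs partly to the first two clusters instead of being forced into one interval, and the memberships sum to 1.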

Association Rules Analysis on FP-Growth Method in Predicting Sales

10.31227/osf.io/8m57c ◽ 2017

Author(s): Andysah Putera Utama Siahaan ◽ Mesran Mesran ◽ Andre Hasudungan Lubis ◽ Ali Ikhwan ◽ Supiyandi

Keyword(s): Data Mining ◽ Association Rules ◽ Frequent Itemset ◽ Frequent Pattern ◽ Data Set ◽ Pattern Processing ◽ Large Databases ◽ Growth Method ◽ Association Rules Analysis ◽ A Company

A company's sales transaction data continue to grow day by day. Large amounts of data can become a problem for a company if they are not managed properly. Data mining is a field of science that unifies techniques from machine learning, pattern processing, statistics, databases, and visualization to handle the problem of retrieving information from large databases. The relationship sought in data mining can be a relationship between two or more items in one dimension. Among the association rule algorithms in data mining, the Frequent Pattern Growth (FP-Growth) algorithm is one alternative that can be used to determine the most frequent itemset in a data set.

Soil Data Analysis and Crop Yield Prediction in Data Mining using R – Programming

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.c8683.019320 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 1857-1860

Keyword(s): Data Mining ◽ Data Analysis ◽ Decision Tree ◽ Crop Yield ◽ Climatic Condition ◽ Research Work ◽ Yield Prediction ◽ Decision Tree Algorithm ◽ Data Set ◽ R Programming

Data mining is a good choice in the emerging research field of soil data analysis. Crop yield prediction is an important issue in selecting a crop. Previously, crop prediction relied on the farmer's experience with a particular type of field and crop, based on factors such as soil type, climatic conditions, season, weather, rainfall and irrigation facilities. Data mining techniques are a better choice for predicting the crop. Soil analysis plays an important role in agriculture, and soil fertility prediction is one of its most important factors. This research work predicts crop yield using a decision tree algorithm. The aim of this research is to assess the accuracy of yield prediction using a decision tree and the C4.5 algorithm, implemented in R programming, and also to find the range of magnesium in the collected soil data set. This prediction will be very useful for farmers in predicting crop yield for cultivation.

Finding Persistent Strong Rules

Data Mining ◽ 10.4018/978-1-4666-2455-9.ch002 ◽ 2013 ◽ pp. 28-49

Author(s): Anthony Scime ◽ Karthik Rajasethupathy ◽ Kulathur S. Rajasethupathy ◽ Gregg R. Murray

Keyword(s): Data Mining ◽ Association Rules ◽ Strong Association ◽ National Election ◽ Data Sets ◽ Rule Discovery ◽ Discovery Process ◽ Data Set ◽ Rule Sets ◽ Election Studies

Study on Data Mining Techniques and Algorithms of Association Rules Data Mining

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.543-547.2040 ◽ 2014 ◽ Vol 543-547 ◽ pp. 2040-2044

Author(s): Yan Bo Wang

Keyword(s): Data Mining ◽ Association Rules ◽ Granular Computing ◽ Rapid Development ◽ System Structure ◽ Data Set ◽ Mining Technology ◽ Technology System ◽ Operation Process ◽ Database Technology

With the rapid development of network and database technology, the amount of data to be processed has increased massively, and how to carry out effective data mining has become a serious problem. The mature development of granular computing algorithms provides new ideas and methods for the study of data mining. Association rules based on granular computing can reduce the number of scans of the data set and improve the efficiency of the algorithm. In this paper we introduce the data sources, classification, technology, system structure, operation process, and applications in other areas of data mining technology. Based on association rules of granular computing, data mining technology can provide a quantitative basis for enterprises in screening assessment, so that the service object has a stronger competitive advantage and focuses more on its problems.

Identifying small groups of foods that can predict achievement of key dietary recommendations: data mining of the UK National Diet and Nutrition Survey, 2008–12

Public Health Nutrition ◽ 10.1017/s1368980016000185 ◽ 2016 ◽ Vol 19 (9) ◽ pp. 1543-1551 ◽ Cited By ~ 15

Author(s): Philippe J Giabbanelli ◽ Jean Adams

Keyword(s): Data Mining ◽ Decision Trees ◽ Dietary Assessment ◽ Saturated Fat ◽ Fruit And Vegetables ◽ Dietary Recommendations ◽ Nutrition Survey ◽ Data Set ◽ Redundant Data ◽ Dietary Assessment Methods

Objective: Many dietary assessment methods attempt to estimate total food and nutrient intake. If the intention is simply to determine whether participants achieve dietary recommendations, this leads to much redundant data. We used data mining techniques to explore the number of foods that intake information was required on to accurately predict achievement, or not, of key dietary recommendations. Design: We built decision trees for achievement of recommendations for fruit and vegetables, sodium, fat, saturated fat and free sugars using data from a national dietary surveillance data set. Decision trees describe complex relationships between potential predictor variables (age, sex and all foods listed in the database) and outcome variables (achievement of each of the recommendations). Setting: UK National Diet and Nutrition Survey (NDNS, 2008–12). Subjects: The analysis included 4156 individuals. Results: Information on consumption of 113 out of 3911 (3 %) foods, plus age and sex was required to accurately categorize individuals according to all five recommendations. The best trade-off between decision tree accuracy and number of foods included occurred at between eleven (for fruit and vegetables) and thirty-two (for fat, plus age) foods, achieving an accuracy of 72 % (for fat) to 83 % (for fruit and vegetables), with similar values for sensitivity and specificity. Conclusions: Using information on intake of 113 foods, it is possible to predict with 72–83 % accuracy whether individuals achieve key dietary recommendations. Substantial further research is required to make use of these findings for dietary assessment.

Improved Classification Techniques to Predict the Co-disease in Diabetic Mellitus Patients using Discretization and Apriori Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.k1434.0981119 ◽ 2019 ◽ Vol 8 (11) ◽ pp. 730-733

Keyword(s): Data Mining ◽ Association Rules ◽ Census Data ◽ Early Stage ◽ Research Work ◽ Numerical Data ◽ Medical Data ◽ Data Sets ◽ Apriori Algorithm ◽ Data Set

The demand for data mining is now unavoidable in the medical industry due to its various applications and uses in predicting diseases at an early stage. Data mining methods make it easy to extract useful patterns and quick to recognize task-based outcomes. Classification models are useful for building classes for medical data sets so that future analysis is accurate. In addition, association rules are a promising technique for finding hidden patterns in a medical data set and have been successfully applied to market basket data, census data and financial data. The Apriori algorithm, considered a classic algorithm, is useful for mining frequent item sets in a database containing a large number of transactions, and it also yields the relevant association rules. Association rules capture the relationships between items present in data sets. When the data set contains continuous attributes, existing algorithms may not work; discretization can therefore be applied before association rule mining in order to find relations between various patterns in the data set. In this paper, Discretized Apriori is used to predict the co-disease in people found to have diabetic syndrome, and the extracted rules are analyzed. In the discretization step, numerical data are discretized and fed to the Apriori algorithm to obtain better association rules for predicting the diseases.
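
As a rough illustration of why discretization comes first, the sketch below bins a continuous attribute into labeled intervals and then measures the support and confidence of a single rule. The records, cut points and item names are invented for the example (they are not clinical thresholds or data from the paper), and only the rule-evaluation step is shown, not the full Apriori search.

```python
def discretize(value, cut_points, labels):
    """Map a numeric value to the label of the interval it falls in."""
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Made-up records: (glucose value, whether a co-condition is present).
records = [(95, False), (150, True), (180, True), (110, False), (160, False)]
cut_points = [100, 140]  # hypothetical thresholds, for illustration only
labels = ["glucose=low", "glucose=mid", "glucose=high"]

# Turn each record into a set of categorical items that a rule miner can use.
transactions = []
for glucose, has_co_condition in records:
    items = {discretize(glucose, cut_points, labels)}
    if has_co_condition:
        items.add("co_condition=yes")
    transactions.append(frozenset(items))

support, confidence = rule_metrics(
    transactions, frozenset({"glucose=high"}), frozenset({"co_condition=yes"})
)
```

Without the binning step, nearly every glucose value would be a distinct item and no itemset would reach a useful support level.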

Implementasi Data Mining Menggunakan Algoritma Apriori Untuk Meningkatkan Pola Penjualan Obat (Implementation of Data Mining Using the Apriori Algorithm to Improve Drug Sales Patterns)

JATISI (Jurnal Teknik Informatika dan Sistem Informasi) ◽ 10.35957/jatisi.v7i2.195 ◽ 2020 ◽ Vol 7 (2) ◽ pp. 262-276

Author(s): Alexander J.P. Sibarani

Keyword(s): Data Mining ◽ Association Rules ◽ Association Rule ◽ Data Set

With sales transactions taking place every day, the volume of data keeps growing over time. These data do not serve only as an archive for the company; they can be used and processed into information to increase drug sales. A problem that often arises at Apotik Pusaka Arta is that the drugs customers want are frequently unavailable or out of stock, because the pharmacy does not monitor its stock and does not make use of the existing sales transaction data, which usually end up as an unused archive. To solve this problem, a data mining application was built using the Apriori algorithm. The method the author applies in this research is association rules. An association rule is a data mining technique for determining the relationships between items in a given data set. The technique searches for combinations that occur frequently in an itemset (a set of items). In this research, association rules are used to analyse how often drugs are sold together, based on the transaction data that have already been recorded. The Apriori algorithm in this application succeeded in finding the most frequent item combinations in the transaction data and then forming association patterns from those combinations. The results of the application show which drugs consumers often buy together, and thus reveal the drug sales pattern.
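
The level-wise search that Apriori performs can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the application described in the paper: the function name, the use of an absolute count as `min_support`, and the basket contents are all invented for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        prev = list(frequent)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Invented pharmacy baskets; item names are placeholders.
baskets = [
    {"paracetamol", "vitamin_c"},
    {"paracetamol", "vitamin_c", "cough_syrup"},
    {"paracetamol", "cough_syrup"},
    {"vitamin_c"},
]
frequent_itemsets = apriori(baskets, min_support=2)
```

The prune step is what keeps the search tractable: a three-item combination is only counted if all of its two-item subsets were already found to be frequent.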

Uncovering Actionable Knowledge in Corporate Data with Qualified Association Rules

International Journal of Business Intelligence Research ◽ 10.4018/jbir.2011040101 ◽ 2011 ◽ Vol 2 (2) ◽ pp. 1-21 ◽ Cited By ~ 1

Author(s): Nenad Jukic ◽ Svetlozar Nestorov ◽ Miguel Velasco ◽ Jami Eddington

Keyword(s): Data Mining ◽ Association Rules ◽ Mining Method ◽ Data Set ◽ Association Rules Mining ◽ Standard Data ◽ Additional Information ◽ Day Of The Week ◽ Mining Methods ◽ Corporate Actions

Association rules mining is one of the most successfully applied data mining methods in today’s business settings (e.g. Amazon or Netflix recommendations to customers). Qualified association rules mining is an extension of the association rules data mining method that uncovers previously unknown correlations which only manifest themselves under certain circumstances (e.g. on a particular day of the week), with the goal of improving action results, e.g. turning an underperforming campaign (spread too thin over the entire audience) into a highly targeted campaign that delivers results. Such correlations have not been easily reachable using standard data mining tools so far. This paper describes the method for straightforward discovery of qualified association rules and demonstrates the use of qualified association rules mining on an actual corporate data set. The data set is a subset of a corporate data warehouse for Sam’s Club, a division of Wal-Mart Stores, Inc. The experiments described in this paper illustrate how qualified association rules supplement standard association rules data mining methods and provide additional information which can be used to better target corporate actions.
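
The idea of a rule that only shows up under a particular circumstance can be illustrated by comparing a rule's confidence over all transactions with its confidence inside one qualifier value. This is a simplified sketch, not the discovery method the paper describes; the function, field names and transactions are invented for the example.

```python
def confidence(rows, antecedent, consequent, qualifier=None):
    """Confidence of antecedent -> consequent, optionally within a qualifier."""
    if qualifier is not None:
        key, value = qualifier
        rows = [r for r in rows if r[key] == value]
    with_antecedent = [r for r in rows if antecedent in r["items"]]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for r in with_antecedent if consequent in r["items"])
    return hits / len(with_antecedent)


# Invented transactions; "day" plays the role of the qualifying circumstance.
rows = [
    {"day": "Sat", "items": {"chips", "soda"}},
    {"day": "Sat", "items": {"chips", "soda"}},
    {"day": "Mon", "items": {"chips"}},
    {"day": "Mon", "items": {"chips", "bread"}},
    {"day": "Tue", "items": {"chips"}},
]
overall = confidence(rows, "chips", "soda")
saturday = confidence(rows, "chips", "soda", qualifier=("day", "Sat"))
```

In this toy data the rule looks weak across the whole week but holds for every Saturday transaction, which is the kind of pattern a standard, unqualified rule search would dilute.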