Effective Data Mining for a Transportation Information System

10.14311/931 ◽  
2008 ◽  
Vol 48 (1) ◽  
Author(s):  
P. Haluzová

This paper describes the application of data mining methods to the database of the DORIS transportation information system, currently used by the Prague Public Transit Company. The goal is to create knowledge about the behavior of objects within this information system. Data is analyzed partly with the help of descriptive statistical methods, and partly with the help of association rules, which may discover the combinations of attributes that occur most frequently within a given data set. Two types of quantifiers were used when creating the association rules, namely “founded implication” and “above average”. The results of the analysis are presented in the form of graphs and hypotheses.
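The two quantifiers named in the abstract can be illustrated with a small sketch. It assumes the definitions commonly given in the GUHA / 4ft-Miner literature, where a rule is evaluated on a four-fold contingency table; the abstract itself does not state the definitions or thresholds used, so the class name, function names, and the parameters `p` and `base` below are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourFoldTable:
    """Contingency table for a rule antecedent -> succedent (assumed layout)."""
    a: int  # rows satisfying both antecedent and succedent
    b: int  # rows satisfying the antecedent only
    c: int  # rows satisfying the succedent only
    d: int  # rows satisfying neither


def founded_implication(t: FourFoldTable, p: float, base: int) -> bool:
    """True if at least `base` rows support the rule and confidence >= p."""
    covered = t.a + t.b
    return t.a >= base and covered > 0 and t.a / covered >= p


def above_average(t: FourFoldTable, p: float, base: int) -> bool:
    """True if the succedent is at least (1 + p) times more frequent among
    rows satisfying the antecedent than in the whole data set."""
    n = t.a + t.b + t.c + t.d
    covered = t.a + t.b
    if t.a < base or covered == 0 or n == 0:
        return False
    return t.a / covered >= (1 + p) * (t.a + t.c) / n


# Hypothetical counts, not taken from the DORIS database.
table = FourFoldTable(a=90, b=10, c=110, d=790)
print(founded_implication(table, p=0.8, base=50))
print(above_average(table, p=1.0, base=50))
```

The first quantifier asks only whether the rule is reliable; the second asks whether the antecedent raises the succedent's relative frequency above its overall level, which is why the two can disagree on the same table.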

2011 ◽  
Vol 2 (2) ◽  
pp. 1-21 ◽  
Author(s):  
Nenad Jukic ◽  
Svetlozar Nestorov ◽  
Miguel Velasco ◽  
Jami Eddington

Association rules mining is one of the most successfully applied data mining methods in today’s business settings (e.g. Amazon or Netflix recommendations to customers). Qualified association rules mining is an extension of the association rules data mining method that uncovers previously unknown correlations that only manifest themselves under certain circumstances (e.g. on a particular day of the week), with the goal of improving action results, e.g. turning an underperforming campaign (spread too thin over the entire audience) into a highly targeted campaign that delivers results. Such correlations have not been easily reachable using standard data mining tools so far. This paper describes a method for straightforward discovery of qualified association rules and demonstrates the use of qualified association rules mining on an actual corporate data set. The data set is a subset of a corporate data warehouse for Sam’s Club, a division of Wal-Mart Stores, Inc. The experiments described in this paper illustrate how qualified association rules supplement standard association rules data mining methods and provide additional information that can be used to better target corporate actions.
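The idea of a correlation that only shows up under a qualifier can be sketched as follows. This is not the authors' discovery method, which the abstract does not detail; it only compares the support and confidence of one rule over all transactions with the same measures restricted to a single qualifier value. The function names and the toy records are hypothetical.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n = len(transactions)
    with_antecedent = [t for t in transactions if antecedent <= t]
    with_both = [t for t in with_antecedent if consequent <= t]
    support = len(with_both) / n if n else 0.0
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


def qualified_rule_stats(records, qualifier, antecedent, consequent):
    """Same measures, computed only on records matching the qualifier."""
    subset = [items for q, items in records if q == qualifier]
    return rule_stats(subset, antecedent, consequent)


# Hypothetical (day, basket) records, not taken from the Sam's Club data.
records = [
    ("Sat", {"chips", "salsa"}),
    ("Sat", {"chips", "salsa"}),
    ("Sat", {"chips"}),
    ("Mon", {"chips"}),
    ("Mon", {"chips"}),
    ("Mon", {"salsa"}),
    ("Tue", {"chips"}),
    ("Tue", {"bread"}),
]

overall = rule_stats([items for _, items in records], {"chips"}, {"salsa"})
saturday = qualified_rule_stats(records, "Sat", {"chips"}, {"salsa"})
print("overall:", overall)
print("Saturday only:", saturday)
```

In this toy data the rule is weak overall but considerably stronger on Saturdays, which is the kind of pattern a standard, unqualified rule search would tend to miss.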


Author(s):  
Anthony Scime ◽  
Karthik Rajasethupathy ◽  
Kulathur S. Rajasethupathy ◽  
Gregg R. Murray

Data mining is a collection of algorithms for finding interesting and unknown patterns or rules in data. However, different algorithms can result in different rules from the same data. The process presented here exploits these differences to find particularly robust, consistent, and noteworthy rules among much larger potential rule sets. More specifically, this research focuses on using association rules and classification mining to select the persistently strong association rules. Persistently strong association rules are association rules that are verifiable by classification mining the same data set. The process for finding persistent strong rules was executed against two data sets obtained from the American National Election Studies. Analysis of the first data set resulted in one persistent strong rule and one persistent rule, while analysis of the second data set resulted in 11 persistent strong rules and 10 persistent rules. The persistent strong rule discovery process suggests these rules are the most robust, consistent, and noteworthy among the much larger potential rule sets.


2017 ◽  
Vol 9 (1) ◽  
pp. 38-49
Author(s):  
Fatma Önay Koçoğlu ◽  
İlkim Ecem Emre ◽  
Çiğdem Selçukcan Erol

The aim of this study is to analyze success in e-learning with data mining methods and to find potential patterns. To this end, 374,073 records from the 2013-14 period, taken from an institution serving the e-learning field in Turkey, are used. The data set, which was collected from the information technology, banking and pharmaceutical industries, includes the success and industry of employees, the trainings they took, whether the trainings were completed, first login and last logout dates, training completion date and duration of experience in training. Using this data set, the success status of participants is examined with data mining methods (C5.0, Random Forest and Gini). Based on accuracy, error rate, specificity and F-score as performance evaluation criteria, C5.0 was chosen as the algorithm that gives the best performance results. According to the results of the study, the sectors of the employees are not important; the important factors are instead the completion status, the duration of experience and training.
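The four evaluation criteria named in the abstract can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions, with the F-score taken as F1; the abstract does not say which F-score variant or class encoding was used, so that choice and the counts in the example are assumptions.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, error rate, specificity and F1 from confusion counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for a "successful / not successful" classifier.
print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

Comparing several classifiers then amounts to computing this dictionary for each model on the same held-out data and ranking them on the criteria of interest.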


Author(s):  
Paolo Giudici

Several classes of computational and statistical methods for data mining are available. Each class can be parameterised so that models within the class differ in terms of such parameters (see, for instance, Giudici, 2003; Hastie et al., 2001; Han & Kamber, 2000; Hand et al., 2001; Witten & Frank, 1999): for example, the class of linear regression models, which differ in the number of explanatory variables; the class of Bayesian networks, which differ in the number of conditional dependencies (links in the graph); the class of tree models, which differ in the number of leaves; and the class of multi-layer perceptrons, which differ in terms of the number of hidden strata and nodes. Once a class of models has been established, the problem is to choose the “best” model from it.


2014 ◽  
Vol 998-999 ◽  
pp. 842-845 ◽  
Author(s):  
Jia Mei Guo ◽  
Yin Xiang Pei

Association rule extraction is one of the important goals of data mining and analysis. To address the information loss caused by crisp partitioning of numerical attributes, in this article we put forward a fuzzy association rule mining method based on fuzzy logic. First, we use c-means clustering to generate fuzzy partitions and eliminate redundant data; we then map the original data set onto fuzzy intervals; finally, we extract fuzzy association rules from the fuzzy data set to provide a basis for proper decision-making. Results show that this method can effectively improve the efficiency of data mining and the semantic visualization and credibility of association rules.
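The step of mapping a numerical attribute onto fuzzy partitions can be sketched with the standard fuzzy c-means membership formula. The abstract does not describe the authors' exact procedure, so this is only an illustration under stated assumptions: the cluster centers are taken as already computed and distinct, the attribute is one-dimensional, and the fuzzifier `m` defaults to 2.

```python
def fuzzy_memberships(value: float, centers: list, m: float = 2.0) -> list:
    """Membership of `value` in each cluster, using the fuzzy c-means formula
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))."""
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    distances = [abs(value - c) for c in centers]
    # A value sitting exactly on a center belongs fully to that cluster.
    if any(d == 0 for d in distances):
        return [1.0 if d == 0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Hypothetical centers for an "age" attribute: young, middle-aged, old.
centers = [20.0, 40.0, 60.0]
print(fuzzy_memberships(30.0, centers))
print(fuzzy_memberships(40.0, centers))
```

Unlike a crisp partition, a value halfway between two centers contributes to both fuzzy intervals, so records near a boundary are not lost to one side when rule support is accumulated.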


2018 ◽  
Vol 0 (7/2018) ◽  
pp. 45-52
Author(s):  
Bolesław Szafrański ◽  
Mirosław Zieja ◽  
Jarosław Wójcik ◽  
Krzysztof Murawski

The article is devoted to the analysis of data that come from the operation process and are collected in the TURAWA computer system, which supports the management of flight safety in the Polish Air Force. The Armed Forces are equipped with a system that collects and processes data concerning all air crew, all performed flights and all aircraft. The increasing opportunities for obtaining data and the continuous development of data mining methods make it possible to extract previously unknown information, which, together with conclusions obtained from the data analysis, will help to improve the level of flight safety.


2017 ◽  
Author(s):  
Andysah Putera Utama Siahaan ◽  
Mesran Mesran ◽  
Andre Hasudungan Lubis ◽  
Ali Ikhwan ◽  
Supiyandi

Sales transaction data in a company will continue to increase day by day. Large amounts of data can be problematic for a company if they are not managed properly. Data mining is a field of science that unifies techniques from machine learning, pattern processing, statistics, databases, and visualization to handle the problem of retrieving information from large databases. The relationship sought in data mining can be a relationship between two or more items in one dimension. Among the association rule algorithms in data mining, the Frequent Pattern Growth (FP-Growth) algorithm is one of the alternatives that can be used to determine the most frequent itemsets in a data set.
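What "determining the most frequent itemsets" means can be shown with a deliberately simple sketch. Note that this is a brute-force counter and not FP-Growth: it enumerates every item combination up to a small size instead of building an FP-tree, so it only suits tiny examples. The function name, the `max_size` cap, and the sample transactions are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_count, max_size=3):
    """Count every itemset of up to `max_size` items and keep those that
    appear in at least `min_count` transactions (brute force, not FP-Growth)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, duplicates dropped
        for size in range(1, min(max_size, len(items)) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_count}


# Hypothetical sales transactions.
sales = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
for itemset, n in sorted(frequent_itemsets(sales, min_count=2).items()):
    print(itemset, n)
```

FP-Growth is designed to reach the same kind of result without enumerating candidates, which is what makes it practical on large transaction databases.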

