A Data Warehouse Cleansing Approach Based on Mathematical Association Rules

Email is undeniably significant in present-day business communication. Every day, large volumes of email are sent from organizations to clients and suppliers, from employees to their managers, and from one colleague to another. As a result, a data warehouse holds a vast amount of email data. Data cleaning is an activity performed on the data sets of a data warehouse to improve and maintain the quality and consistency of the data. This paper highlights the issues associated with dirty data and the detection of duplicates in the email column. It approaches the strategy of data cleaning from a different point of view and provides an algorithm for discovering errors and duplicate entries in the data sets of an existing data warehouse. The paper defines alliance rules, based on the concept of mathematical association rules, to determine the duplicate entries in the email column of the data sets.
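
The abstract does not spell out the paper's algorithm, so the following is only a minimal sketch of the general idea of flagging duplicate entries in an email column. It is not the authors' rule-based method. The function names, the sample addresses, and the choice to treat addresses as equal after trimming whitespace and lowercasing are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize_email(raw: str) -> str:
    """Return a canonical form of an address: trimmed and lowercased.

    Assumption: case and surrounding whitespace are not significant.
    """
    return raw.strip().lower()


def find_duplicate_emails(rows):
    """Map each normalized email to the row indices where it occurs,
    keeping only emails that occur in more than one row."""
    groups = defaultdict(list)
    for index, raw in enumerate(rows):
        groups[normalize_email(raw)].append(index)
    return {email: idxs for email, idxs in groups.items() if len(idxs) > 1}


# Hypothetical email column from a warehouse table.
emails = ["Ann@Example.com", "bob@example.com", " ann@example.com ", "carol@example.com"]
duplicates = find_duplicate_emails(emails)
print(duplicates)
```

A real cleansing pass would likely need a stricter or looser notion of equality, depending on how the warehouse defines a duplicate.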

The Integral of Spatial Data Mining in the Era of Big Data

Advances in Business Information Systems and Analytics - Handbook of Research on Advanced Data Mining Techniques and Applications for Business Intelligence ◽

10.4018/978-1-5225-2031-3.ch006 ◽

2017 ◽

pp. 90-126

Author(s):

Gebeyehu Belay Gebremeskel ◽

Chai Yi ◽

Zhongshi He

Keyword(s):

Data Mining ◽

Data Warehouse ◽

Spatial Data ◽

High Volume ◽

Spatial Data Mining ◽

Research Field ◽

Data Sets ◽

Data Types ◽

Basic Principles ◽

GIS Data

Data Mining (DM) is a rapidly expanding field in many disciplines, and it has greatly inspired the analysis of massive data types, including geospatial, image, and other forms of data sets. This fast-growing data is characterized by high volume, velocity, variety, variability, value, and other properties; it is collected and generated from various sources and is too large and complex to capture, store, and analyze with traditional tools. Spatial Data Mining (SDM) is, therefore, the process of searching for and discovering valuable information and knowledge in large volumes of spatial data, drawing basic principles from concepts in databases, machine learning, statistics, pattern recognition, and 'soft' computing. Using DM techniques enables more efficient use of the data warehouse. SDM is thus becoming an emerging research field in the geosciences because of the increasing amount of data, which leads to promising new applications. The integral SDM on which this chapter focuses is inference over geospatial and GIS data.

Finding Persistent Strong Rules

Knowledge Discovery Practices and Emerging Applications of Data Mining - Advances in Data Mining and Database Management ◽

10.4018/978-1-60960-067-9.ch005 ◽

2010 ◽

pp. 85-107

Author(s):

Anthony Scime ◽

Karthik Rajasethupathy ◽

Kulathur S. Rajasethupathy ◽

Gregg R. Murray

Keyword(s):

Data Mining ◽

Association Rules ◽

Strong Association ◽

National Election ◽

Data Sets ◽

Rule Discovery ◽

Discovery Process ◽

Data Set ◽

Rule Sets ◽

Election Studies

Data mining is a collection of algorithms for finding interesting and unknown patterns or rules in data. However, different algorithms can result in different rules from the same data. The process presented here exploits these differences to find particularly robust, consistent, and noteworthy rules among much larger potential rule sets. More specifically, this research focuses on using association rules and classification mining to select the persistently strong association rules. Persistently strong association rules are association rules that are verifiable by classification mining the same data set. The process for finding persistent strong rules was executed against two data sets obtained from the American National Election Studies. Analysis of the first data set resulted in one persistent strong rule and one persistent rule, while analysis of the second data set resulted in 11 persistent strong rules and 10 persistent rules. The persistent strong rule discovery process suggests these rules are the most robust, consistent, and noteworthy among the much larger potential rule sets.

XML-Enabled Association Analysis

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch324 ◽

2011 ◽

pp. 2117-2122

Author(s):

Ling Feng

Keyword(s):

Data Mining ◽

Decision Support ◽

Marketing Strategy ◽

Association Rules ◽

Association Rule ◽

Rule Mining ◽

XML Data ◽

Market Basket ◽

Transactional Databases ◽

Mining Association Rules

The discovery of association rules from large amounts of structured or semi-structured data is an important data mining problem [Agrawal et al. 1993, Agrawal and Srikant 1994, Miyahara et al. 2001, Termier et al. 2002, Braga et al. 2002, Cong et al. 2002, Braga et al. 2003, Xiao et al. 2003, Maruyama and Uehara 2000, Wang and Liu 2000]. It has crucial applications in decision support and marketing strategy. The most prototypical application of association rules is market basket analysis using transaction databases from supermarkets. These databases contain sales transaction records, each of which details items bought by a customer in the transaction. Mining association rules is the process of discovering knowledge such as “80% of customers who bought diapers also bought beer, and 35% of customers bought both diapers and beer”, which can be expressed as “diaper ⇒ beer” (35%, 80%), where 80% is the confidence level of the rule, and 35% is the support level of the rule indicating how frequently the customers bought both diapers and beer. In general, an association rule takes the form X ⇒ Y (s, c), where X and Y are sets of items, and s and c are support and confidence, respectively. In the XML Era, mining association rules is confronted with more challenges than in the traditional well-structured world due to the inherent flexibilities of XML in both structure and semantics [Feng and Dillon 2005]. First, XML data has a more complex hierarchical structure than a database record. Second, elements in XML data have contextual positions, which thus carry the order notion. Third, XML data appears to be much bigger than traditional data. To address these challenges, the classic association rule mining framework originating with transactional databases needs to be re-examined.
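
To make the support and confidence figures in this abstract concrete, here is a small sketch that computes both for a rule over flat market-basket transactions. The function names and the toy transaction list are invented for illustration; the counts were chosen only so that the result reproduces the (35%, 80%) figures quoted above. The sketch does not address the XML-specific challenges the chapter discusses.

```python
def count_containing(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent.

    support    = share of all transactions containing both sides
    confidence = share of antecedent transactions that also contain the consequent
    """
    if not transactions:
        raise ValueError("transactions must not be empty")
    both = count_containing(transactions, set(antecedent) | set(consequent))
    ante = count_containing(transactions, antecedent)
    support = both / len(transactions)
    confidence = both / ante if ante else 0.0
    return support, confidence


# Hypothetical data: 80 baskets, 35 with diapers, 28 of those also with beer.
transactions = (
    [frozenset({"diapers", "beer"})] * 28
    + [frozenset({"diapers"})] * 7
    + [frozenset({"milk"})] * 45
)
s, c = rule_metrics(transactions, {"diapers"}, {"beer"})
print(f"support={s:.2f}, confidence={c:.2f}")  # support=0.35, confidence=0.80
```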

Visual Data Mining for Discovering Association Rules

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch125 ◽

2008 ◽

pp. 2105-2120

Author(s):

Kesaraporn Techapichetvanich ◽

Amitava Datta

Keyword(s):

Data Mining ◽

Association Rules ◽

Association Rule ◽

Large Data ◽

Data Sets ◽

Visual Data Mining ◽

Useful Knowledge ◽

Large Databases ◽

A New Technique ◽

Mining Association Rule

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association-rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the mining association rule process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, the users gain insight and deeper understanding of their data sets, as well as control over mining meaningful association rules.

Mining Association Rules in Data Warehouses

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch020 ◽

2008 ◽

pp. 303-335

Author(s):

Haorianto Cokrowijoyo Tjioe ◽

David Taniar

Keyword(s):

Data Mining ◽

Association Rules ◽

Strategic Decision ◽

Multidimensional Data ◽

Data Warehouses ◽

Multidimensional Databases ◽

Efficient Data ◽

Decision Making Processes ◽

Mining Association Rules ◽

Transactional Data

Data mining applications have enormously altered the strategic decision-making processes of organizations. The application of association rules algorithms is one of the well-known data mining techniques that have been developed to cope with multidimensional databases. However, most of these algorithms focus on multidimensional data models for transactional data. As data warehouses can be presented using a multidimensional model, in this paper we provide another perspective to mine association rules in data warehouses by focusing on a measurement of summarized data. We propose four algorithms — VAvg, HAvg, WMAvg, and ModusFilter — to provide efficient data initialization for mining association rules in data warehouses by concentrating on the measurement of aggregate data. Then we apply those algorithms both on a non-repeatable predicate, which is known as mining normal association rules, using GenNLI, and a repeatable predicate using ComDims and GenHLI, which is known as mining hybrid association rules.

Visual Data Mining for Discovering Association Rules

Business Applications and Computational Intelligence ◽

10.4018/978-1-59140-702-7.ch011 ◽

2011 ◽

pp. 209-226

Author(s):

Kesaraporn Techapichetvanich ◽

Amitava Datta

Keyword(s):

Data Mining ◽

Association Rules ◽

Association Rule ◽

Large Data ◽

Data Sets ◽

Visual Data Mining ◽

Useful Knowledge ◽

Large Databases ◽

A New Technique ◽

Mining Association Rule

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association-rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the mining association rule process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, the users gain insight and deeper understanding of their data sets, as well as control over mining meaningful association rules.

A Novel Efficient Mining Association Rules Algorithm for Distributed Databases

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.108-111.50 ◽

2010 ◽

Vol 108-111 ◽

pp. 50-56 ◽

Cited By ~ 2

Author(s):

Liang Zhong Shen

Keyword(s):

Data Mining ◽

Knowledge Discovery ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Distributed Databases ◽

New Method ◽

Rule Mining ◽

Mining Association Rules

Due to the popularity of knowledge discovery and data mining, in practice as well as among academic and corporate professionals, association rule mining is receiving increasing attention. Data mining technology is applied to analyzing the data held in databases. This paper puts forward a new method that is suited to the design of distributed databases.

Application of Association Rules Mining in Employment Guidance

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.479-481.129 ◽

2012 ◽

Vol 479-481 ◽

pp. 129-132

Author(s):

Lei Wang ◽

Cun Xiao Yi

Keyword(s):

Data Mining ◽

Association Rules ◽

Important Task ◽

Complex Data ◽

Employment Rate ◽

Association Rules Mining ◽

Mining Association Rules ◽

Vocational Colleges

Improving the employment rate of graduates is an important task for higher vocational colleges. To improve graduates' employment competitiveness effectively, students should be advised on which specific kinds of learning and abilities to strengthen. Association rules mining is a core technique in data mining; it helps find useful information hidden in complex data. By using association rules mining to find the knowledge and abilities that helped employed students get their jobs, the abilities needed for each kind of job can be identified, so that the advice offered to students targeting a career is more precise and appropriate.

The Application of Apriori Algorithm in Analysis on Admitted Students of Colleges and Universities

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.321-324.2578 ◽

2013 ◽

Vol 321-324 ◽

pp. 2578-2582

Author(s):

Qian Zhang

Keyword(s):

Data Mining ◽

Association Rules ◽

Colleges And Universities ◽

Apriori Algorithm ◽

Data Mining Techniques ◽

Minimum Support ◽

Sample Data ◽

Mining Association Rules

This paper examined the application of the Apriori algorithm to extracting association rules in data mining, using sample data on student enrollments. It studied the data mining techniques for extracting association rules, analyzed the correlation between specialties and characteristics of admitted students, and evaluated the algorithm for mining association rules, with a minimum support of 30% and a minimum confidence of 40%.
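
As a rough illustration of the kind of analysis this abstract describes, the sketch below runs a compact textbook Apriori pass with the thresholds quoted above (minimum support 30%, minimum confidence 40%). The admission records, attribute values, and function names are hypothetical; the paper's actual data and implementation are not given in the abstract.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates whose support reaches the threshold.
        level = {}
        for cand in candidates:
            sup = sum(1 for t in transactions if cand <= t) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        # Join step: combine frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Yield (antecedent, consequent, support, confidence) for qualifying rules."""
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[ante]
                if conf >= min_confidence:
                    yield ante, itemset - ante, sup, conf


# Hypothetical admission records: specialty plus two student attributes.
records = [
    frozenset({"engineering", "male", "urban"}),
    frozenset({"engineering", "male", "rural"}),
    frozenset({"nursing", "female", "urban"}),
    frozenset({"nursing", "female", "rural"}),
    frozenset({"engineering", "female", "urban"}),
]
frequent = apriori(records, min_support=0.30)
found = list(rules(frequent, min_confidence=0.40))
for ante, cons, sup, conf in found:
    print(sorted(ante), "=>", sorted(cons), f"support={sup:.2f} confidence={conf:.2f}")
```

With only five records, every itemset that appears at least twice clears the 30% support threshold, so a realistic data set would produce a very different rule list.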
