Association Mining for Super Market Sales using UP Growth and Top-K Algorithm

High utility itemset (HUI) mining is an evolving field in data mining that centers on finding all itemsets whose utility meets a user-specified minimum utility. Setting the minimum utility exactly is difficult for users. If it is set too low, too many itemsets are generated, which makes the mining process inefficient. If it is set too high, no itemsets are found. The research focuses on generating frequent itemsets by using the transaction weighted utility of each product. When the UP growth methodology is used to discover high utility items from large datasets, it takes more time and consumes more memory, which makes it less efficient. To overcome these drawbacks of UP growth, we use the Top-K algorithm, which does not require a minimum threshold and is more scalable and efficient.
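
To make the terms in this abstract concrete, the following is a minimal sketch of transaction weighted utility (TWU) and threshold-free top-k selection. It enumerates itemsets by brute force and is not UP growth or the paper's Top-K algorithm; the toy transactions, the `unit_profit` table, and all function names are assumptions made for illustration.

```python
from itertools import combinations
import heapq

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "butter": 3},
    {"milk": 2, "butter": 1, "jam": 1},
]
# Hypothetical unit profit (external utility) of each item.
unit_profit = {"bread": 1, "milk": 2, "butter": 3, "jam": 5}


def transaction_utility(tx):
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(qty * unit_profit[item] for item, qty in tx.items())


def twu(item):
    """Transaction weighted utility: sum of the utilities of all
    transactions that contain the item (an upper bound on the utility
    of any itemset containing that item)."""
    return sum(transaction_utility(tx) for tx in transactions if item in tx)


def itemset_utility(itemset):
    """Exact utility of an itemset over transactions containing all of it."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def top_k_itemsets(k):
    """Return the k itemsets with the highest utility; no minimum
    utility threshold is supplied by the user. Brute force: only
    feasible for a handful of items."""
    items = sorted(unit_profit)
    candidates = (
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    )
    scored = ((itemset_utility(combo), combo) for combo in candidates)
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    print({item: twu(item) for item in sorted(unit_profit)})
    for utility, itemset in top_k_itemsets(3):
        print(utility, itemset)
```

The user supplies only `k`; the cut-off utility is whatever the k-th best itemset happens to have, which is why no minimum threshold has to be guessed in advance.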

A Scalable Approach for Data Mining – AHUIM

Webology ◽

10.14704/web/v18i1/web18029 ◽

2021 ◽

Vol 18 (1) ◽

pp. 92-103

Author(s):

Vandna Dahiya ◽

Sandeep Dalal

Keyword(s):

Data Mining ◽

Research Paper ◽

Large Datasets ◽

Novel Technique ◽

Itemset Mining ◽

Essential Form ◽

High Utility

Utility itemset mining, which finds itemsets based on utility factors, has established itself as an essential form of data mining. The utility is defined in terms of quantity and some interest factor. Researchers have developed various methods to mine these itemsets, but most of them are not scalable. A scalable approach is now required that can meet the growing needs of data mining. This research paper proposes a novel Spark-based technique, called Absolute High Utility Itemset Mining (AHUIM), for mining the data in a distributed way. The technique is suitable for small as well as large datasets. Its performance is measured for parameters such as speed, scalability, and accuracy.

Optimized High-Utility Itemsets Mining for Effective Association Mining Paper

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v7i5.pp2911-2918 ◽

2017 ◽

Vol 7 (5) ◽

pp. 2911 ◽

Cited By ~ 1

Author(s):

K Rajendra Prasad

Keyword(s):

Association Rule ◽

Optimal Number ◽

Large Datasets ◽

Association Mining ◽

Rule Mining ◽

Utility Mining ◽

Utility Factor ◽

High Utility ◽

High Utility Itemsets ◽

Growth Methods

Association rule mining is widely used for determining the frequent itemsets of a transactional database; however, market behavior applications also need to consider the utility of itemsets. Apriori and FP-growth methods generate association rules without considering the utility factor of items. High-utility itemset mining (HUIM) is a well-known method that determines itemsets based on high utility value; the resulting itemsets are known as high-utility itemsets. Fastest high-utility mining method (FHM) is an enhanced version of HUIM. FHM reduces the number of join operations during itemset generation, so it is faster than HUIM. For large datasets, both methods are very expensive. The proposed method addresses this issue by building a pruning-based utility co-occurrence structure (PEUCS) to eliminate low-profit itemsets. Because it processes only an optimal number of high-utility itemsets, it is called optimal FHM (OFHM). Experimental results show that OFHM requires less runtime and is therefore more efficient than other existing methods on large benchmark datasets.
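
The abstract does not give the layout of PEUCS. As a rough illustration of how a utility co-occurrence structure can prune low-profit itemsets, the sketch below stores the transaction weighted utility of every item pair and rejects pairs that fall below a minimum utility. The toy data, `unit_profit`, and the function names are assumptions and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "butter": 3},
    {"milk": 2, "butter": 1, "jam": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"bread": 1, "milk": 2, "butter": 3, "jam": 5}


def build_cooccurrence(transactions, unit_profit):
    """Map each item pair (a, b), a < b, to the summed utility of all
    transactions that contain both items."""
    pair_twu = defaultdict(int)
    for tx in transactions:
        tx_utility = sum(qty * unit_profit[item] for item, qty in tx.items())
        for a, b in combinations(sorted(tx), 2):
            pair_twu[(a, b)] += tx_utility
    return dict(pair_twu)


def can_prune(pair_twu, a, b, min_util):
    """A pair whose weighted utility is below min_util cannot be part of
    any high-utility itemset, so joining a and b can be skipped."""
    key = tuple(sorted((a, b)))
    return pair_twu.get(key, 0) < min_util


if __name__ == "__main__":
    pairs = build_cooccurrence(transactions, unit_profit)
    for pair in sorted(pairs):
        print(pair, pairs[pair], "pruned" if pairs[pair] < 10 else "kept")
```

The pair value is an upper bound on the utility of every itemset that contains both items, so a single dictionary lookup can replace a costly join when the bound is too low.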

Parallel Mining for High Utility Itemsets Mining by Efficient Data Structure

Research and Development on Information and Communication Technology ◽

10.32913/rd-ict.vol3.no14.519 ◽

2017 ◽

Author(s):

Nguyen Manh Hung ◽

Dau Hai Phong

Keyword(s):

Data Mining ◽

Data Structure ◽

Actual Number ◽

Utility Value ◽

Weighted Utility ◽

Parallel Mining ◽

Efficient Data ◽

Transaction Database ◽

High Utility ◽

High Utility Itemsets

Mining high utility itemsets in a transaction database is an important task in data mining and is widely applied in many areas. Recently, many algorithms have been proposed, but most algorithms for identifying high utility itemsets need to generate candidate sets by overestimating their utility and then calculating their exact utility value. Therefore, the number of candidate itemsets is much larger than the actual number of high utility itemsets. In this paper, we introduce the Retail Transaction-Weighted Utility (RTWU) structure and propose two algorithms: the EAHUI-Miner algorithm and the PEAHUI-Miner parallel algorithm. They were evaluated experimentally and compared with the two most efficient algorithms, EFIM and FHM. Results show that our algorithm performs better on sparse datasets.

A Traditional Analysis for Efficient Data Mining with Integrated Association Mining into Regression Techniques

Lecture Notes in Electrical Engineering - ICCCE 2020 ◽

10.1007/978-981-15-7961-5_127 ◽

2020 ◽

pp. 1393-1404

Author(s):

G. SuryaNarayana ◽

Kamakshaiah Kolli ◽

Mohd Dilshad Ansari ◽

Vinit Kumar Gunjan

Keyword(s):

Data Mining ◽

Association Mining ◽

Efficient Data ◽

Regression Techniques

ANGEL Mining

Higher Education Institutions and Learning Management Systems ◽

10.4018/978-1-60960-884-2.ch005 ◽

2012 ◽

pp. 94-115 ◽

Cited By ~ 2

Author(s):

Tyler Swanger ◽

Kaitlyn Whitlock ◽

Anthony Scime ◽

Brendan P. Post

Keyword(s):

Data Mining ◽

Decision Making ◽

Feature Selection ◽

Decision Tree ◽

Management System ◽

Association Mining ◽

Data Mining Techniques ◽

Usage Patterns ◽

Design Selection ◽

Comprehensive College

This chapter mines the usage patterns of the ANGEL Learning Management System (LMS) at a comprehensive college. The data includes counts of all the features ANGEL offers its users for the Fall and Spring semesters of the academic years beginning in 2007 and 2008. Data mining techniques are applied to evaluate which LMS features are used most commonly and most effectively by instructors and students. Classification produces a decision tree which predicts the courses that will use the ANGEL system based on course-specific attributes. The dataset undergoes association mining to discover the effect of one feature's usage on the usage of another set of features. Finally, clustering the data identifies messages and files as the features most commonly used. These results can be used by this institution, as well as similar institutions, for decision making concerning feature selection and the overall usefulness of LMS design, selection, and implementation.

Security and Verification of Server Data Using Frequent Itemset Mining in Ecommerce

International Journal of Synthetic Emotions ◽

10.4018/ijse.2017010103 ◽

2017 ◽

Vol 8 (1) ◽

pp. 31-43

Author(s):

Zuber Shaikh ◽

Antara Mohadikar ◽

Rachana Nayak ◽

Rohith Padamadan

Keyword(s):

Data Mining ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Graphical Password ◽

Itemset Mining ◽

Frequent Item ◽

Data Mining Algorithms ◽

Shoulder Surfing ◽

Mining Algorithms ◽

Frequent Item Sets

Frequent itemsets refer to a set of data values (e.g., product items) whose number of co-occurrences exceeds a given threshold. The challenge is that the design of proofs and verification objects has to be customized for different data mining algorithms. The intended method implements a basic completeness verification and authentication approach in which the client uses a set of frequent itemsets as evidence and checks whether the server has missed any of them in its returned result. This helps the client detect an untrusted server, and the system becomes more efficient by reducing time. In the authentication process, CaRP is both a captcha and a graphical password scheme. CaRP addresses a number of security problems altogether, such as online guessing attacks, relay attacks, and, if combined with dual-view technologies, shoulder-surfing attacks.
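
The completeness check described above can be sketched as follows. This is only an illustration under stated assumptions: the toy transactions, the "at least `min_support`" convention, and the function names are hypothetical, and the abstract does not specify how the client obtains its evidence set.

```python
# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
]


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    items = frozenset(itemset)
    return sum(1 for tx in transactions if items <= tx)


def frequent_evidence(candidates, transactions, min_support):
    """Itemsets the client knows to be frequent (support >= min_support,
    an assumed convention) and will use as evidence."""
    return {
        frozenset(c) for c in candidates
        if support(c, transactions) >= min_support
    }


def missing_from_result(evidence, server_result):
    """Evidence itemsets that the server failed to return. A non-empty
    answer indicates an incomplete, possibly untrusted, result."""
    returned = {frozenset(s) for s in server_result}
    return sorted(tuple(sorted(e)) for e in evidence if e not in returned)


if __name__ == "__main__":
    candidates = [("bread",), ("bread", "milk"), ("butter", "milk")]
    evidence = frequent_evidence(candidates, transactions, min_support=2)
    server_result = [("bread",)]  # hypothetical incomplete server answer
    print(missing_from_result(evidence, server_result))
```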

Mining of top-k high utility itemsets with negative utility

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201357 ◽

2020 ◽

pp. 1-16

Author(s):

Rui Sun ◽

Meng Han ◽

Chunyan Zhang ◽

Mingyao Shen ◽

Shiyu Du

Keyword(s):

Data Mining ◽

Search Space ◽

Experimental Results ◽

Effective Algorithm ◽

Memory Usage ◽

Utility Value ◽

Itemset Mining ◽

High Utility ◽

High Utility Itemsets

High utility itemset mining (HUIM) with negative utility is an emerging data mining task. However, the setting of the minimum utility threshold is always a challenge when mining high utility itemsets (HUIs) with negative items. Although the top-k HUIM method is very common, this method can only mine itemsets with positive items, and the problem of missing itemsets occurs when mining itemsets with negative items. To solve this problem, we first propose an effective algorithm called THN (Top-k High Utility Itemset Mining with Negative Utility). It proposes a strategy for automatically increasing the minimum utility threshold. In order to solve the problem of multiple scans of the database, it uses transaction merging and dataset projection technology. It uses a redefined sub-tree utility value and a redefined local utility value to prune the search space. Experimental results on real datasets show that THN is efficient in terms of runtime and memory usage, and has excellent scalability. Moreover, experiments show that THN performs particularly well on dense datasets.
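
The idea of automatically increasing the minimum utility threshold can be illustrated with a generic top-k buffer. This is a sketch of the general technique only, not the THN strategy itself, and it ignores negative utilities; the class name `TopKBuffer` and the sample itemsets are hypothetical.

```python
import heapq


class TopKBuffer:
    """Keeps the k best (utility, itemset) pairs seen so far and raises
    an internal minimum utility threshold as better itemsets arrive."""

    def __init__(self, k):
        self.k = k
        self.heap = []      # min-heap; heap[0] is the weakest kept entry
        self.min_util = 0   # starts at 0, rises once k itemsets are held

    def offer(self, utility, itemset):
        """Try to add an itemset; return True if it was kept."""
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, (utility, itemset))
            kept = True
        elif utility > self.heap[0][0]:
            # Replace the weakest entry with the better newcomer.
            heapq.heapreplace(self.heap, (utility, itemset))
            kept = True
        else:
            kept = False
        if len(self.heap) == self.k:
            # The k-th best utility becomes the new pruning threshold.
            self.min_util = self.heap[0][0]
        return kept

    def results(self):
        """Kept itemsets, best first."""
        return sorted(self.heap, reverse=True)


if __name__ == "__main__":
    buffer = TopKBuffer(k=2)
    for utility, itemset in [(5, ("a",)), (9, ("b",)), (3, ("c",)), (7, ("a", "b"))]:
        buffer.offer(utility, itemset)
        print(itemset, "threshold is now", buffer.min_util)
    print(buffer.results())
```

A search procedure can consult `min_util` after every offer and skip any branch whose utility upper bound is below it, so the pruning becomes stricter as the search proceeds.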

Personalized web search on e-commerce using ontology based association mining

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i1.1.9487 ◽

2017 ◽

Vol 7 (1.1) ◽

pp. 286

Author(s):

B. Sekhar Babu ◽

P. Lakshmi Prasanna ◽

P. Vidyullatha

Keyword(s):

Data Mining ◽

Web Search ◽

Large Data ◽

Association Mining ◽

Data Sets ◽

Data Mining Algorithm ◽

Web Data ◽

Data Mining Technique ◽

Web Data Mining ◽

The Web

Today, the World Wide Web has become a familiar medium for exploring new information, business trends, trading strategies, and so on. Many organizations and companies also use the web to present their products or services across the world. E-commerce is a kind of business or commercial transaction that involves the transfer of data across the web or internet. In this setting, a huge amount of data is collected and stored by web services. This data overload makes it difficult to find accurate and valuable information, so web data mining is used as a tool to extract knowledge from the web. E-commerce organizations can apply web data mining technology to offer personalized e-commerce solutions and better meet the desires of customers. A data mining algorithm such as ontology-based association rule mining using the Apriori algorithm extracts useful information from large data sets. We implement this data mining technique in Java; data sets are generated dynamically while transactions are processed, and various patterns are extracted.

An Approach to Generate Software Agents for Health Data Mining

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194017400125 ◽

2017 ◽

Vol 27 (09n10) ◽

pp. 1579-1589 ◽

Cited By ~ 1

Author(s):

Reinier Morejón ◽

Marx Viana ◽

Carlos Lucena

Keyword(s):

Machine Learning ◽

Data Mining ◽

Software Engineering ◽

Multiagent Systems ◽

Significant Role ◽

Software Agents ◽

Health Data ◽

Large Datasets ◽

Agent Oriented Software Engineering ◽

Data Volume

Data mining is a hot topic that attracts researchers of different areas, such as database, machine learning, and agent-oriented software engineering. As a consequence of the growth of data volume, there is an increasing need to obtain knowledge from these large datasets that are very difficult to handle and process with traditional methods. Software agents can play a significant role performing data mining processes in ways that are more efficient. For instance, they can work to perform selection, extraction, preprocessing, and integration of data as well as parallel, distributed, or multisource mining. This paper proposes a framework based on multiagent systems to apply data mining techniques to health datasets. Last but not least, the usage scenarios that we use are datasets for hypothyroidism and diabetes and we run two different mining processes in parallel in each database.

Efficiently Finding High Utility-Frequent Itemsets Using Cutoff and Suffix Utility

Advances in Knowledge Discovery and Data Mining - Lecture Notes in Computer Science ◽

10.1007/978-3-030-16145-3_15 ◽

2019 ◽

pp. 191-203 ◽

Cited By ~ 5

Author(s):

R. Uday Kiran ◽

T. Yashwanth Reddy ◽

Philippe Fournier-Viger ◽

Masashi Toyoda ◽

P. Krishna Reddy ◽

...

Keyword(s):

Frequent Itemsets ◽

High Utility
