A Weighted Frequent Item-Set Mining using WD-FIM Algorithm

Smart systems are among the most significant inventions of our time. These systems rely on powerful data mining techniques to make intelligent decisions. Frequent itemset mining (FIM) has become one of the most significant research areas in data mining. The information present in databases is generally ambiguous and uncertain. In such databases, weighted FIM should be considered in order to discover itemsets that are significant from the end user's perspective. However, once a weight factor is introduced into FIM, the weighted frequent itemsets may no longer satisfy the downward closure property. Consequently, the search space of frequent itemsets cannot be narrowed by the downward closure property, which leads to poor time efficiency. In this paper, we introduce two properties for FIM: the first is the weight judgment downward closure property for weighted FIM, and the second is the existence property of its subsets. Based on these two properties, the WD-FIM algorithm is proposed to narrow the search space of the weighted frequent itemsets and improve time efficiency. In addition, the completeness and time efficiency of the WD-FIM algorithm are analyzed theoretically. Finally, the performance of the proposed WD-FIM algorithm is verified on both synthetic and real data sets.
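
Why weights can break the downward closure property is easiest to see on a toy example. The sketch below is not the WD-FIM algorithm. It assumes one common definition of weighted support (the average item weight of an itemset multiplied by its ordinary support), and the transactions, weights, and threshold are made up for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def weighted_support(itemset, transactions, weights):
    """Assumed definition: average item weight times ordinary support."""
    avg_weight = sum(weights[i] for i in itemset) / len(itemset)
    return avg_weight * support(itemset, transactions)


# Hypothetical data: item "a" is common but has a low weight.
transactions = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"a"}]
weights = {"a": 0.1, "b": 0.9}
min_weighted_support = 0.3

ws_a = weighted_support(("a",), transactions, weights)
ws_ab = weighted_support(("a", "b"), transactions, weights)

print(f"{{a}}:    {ws_a:.3f}")
print(f"{{a, b}}: {ws_ab:.3f}")
```

Under these assumptions, `{a}` falls below the threshold while its superset `{a, b}` exceeds it, so pruning every superset of an infrequent itemset would discard a valid result.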

The comparative study of text documents clustering algorithms

Environment Conservation Journal ◽

10.36953/ecj.2015.se1614 ◽

2015 ◽

Vol 16 (SE) ◽

pp. 133-138

Author(s):

Mohammad Eiman Jamnezhad ◽

Reza Fattahi

Keyword(s):

Data Mining ◽

Dna Analysis ◽

Clustering Algorithms ◽

Research Area ◽

Large Set ◽

Text Documents ◽

Web Documents ◽

Significant Research ◽

The Comparative Study ◽

F Measure

Clustering is one of the most significant research areas in the field of data mining and is considered an important tool in the fast-developing era of information explosion. Clustering systems are used more and more often in text mining, especially for analyzing texts and extracting the knowledge they contain. Data are grouped into clusters in such a way that the data in the same group are similar and those in other groups are dissimilar. Clustering aims to maximize intra-class similarity and inter-class dissimilarity. It is useful for obtaining interesting patterns and structures from a large set of data, and it can be applied in many areas, such as DNA analysis, marketing studies, web documents, and classification. This paper aims to study and compare three text document clustering algorithms, namely k-means, k-medoids, and SOM, using the F-measure.
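
The abstract does not say which variant of the F-measure is used. As a rough sketch, the code below assumes the class-weighted clustering F-measure that is common in document clustering: for every true class, take the best F1 score over all clusters, then average those scores weighted by class size. The function name and the toy labels are illustrative only.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_ids):
    """Class-size-weighted average of the best per-class F1 over clusters."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_ids)
    overlap = Counter(zip(true_labels, cluster_ids))

    total = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            n_both = overlap[(cls, clu)]
            if n_both == 0:
                continue
            precision = n_both / n_clu  # share of the cluster that is cls
            recall = n_both / n_cls     # share of cls found in the cluster
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_cls / n) * best
    return total


# Hypothetical labels for four documents in two topics.
topics = ["sport", "sport", "news", "news"]
perfect = clustering_f_measure(topics, [0, 0, 1, 1])
merged = clustering_f_measure(topics, [0, 0, 0, 0])
print(perfect, merged)
```

A clustering that matches the topics exactly scores 1.0, while putting every document into one cluster scores lower because precision drops.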

Video Data Mining

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch223 ◽

2011 ◽

pp. 1185-1189 ◽

Cited By ~ 2

Author(s):

Jung Hwan Oh ◽

Jeong Kyu Lee ◽

Sae Hwang

Keyword(s):

Data Mining ◽

Research Area ◽

Multimedia Databases ◽

Video Data ◽

Multimedia Data ◽

Data Sets ◽

Data Set ◽

Useful Knowledge ◽

Active Research ◽

Diverse Data

Data mining, which is defined as the process of extracting previously unknown knowledge and detecting interesting patterns from a massive set of data, has been an active research area. As a result, several commercial products and research prototypes are available nowadays. However, most of these studies have focused on corporate data, typically in an alphanumeric database, and relatively little work has been pursued on the mining of multimedia data (Zaïane, Han, & Zhu, 2000). Digital multimedia differs from previous forms of combined media in that the bits representing text, images, audio, and video can be treated as data by computer programs (Simoff, Djeraba, & Zaïane, 2002). One facet of these diverse data, in terms of underlying models and formats, is that they are synchronized and integrated and hence can be treated as integrated data records. The collection of such integrated data records constitutes a multimedia data set. The challenge of extracting meaningful patterns from such data sets has led to research and development in the area of multimedia data mining. This is a challenging field due to the unstructured nature of multimedia data. Such ubiquitous data is required in many application areas, such as finance, medicine, advertising, and Command, Control, Communications and Intelligence (C3I) (Thuraisingham, Clifton, Maurer, & Ceruti, 2001). Multimedia databases are widespread and multimedia data sets are extremely large. There are tools for managing and searching within such collections, but the need for tools to extract hidden and useful knowledge embedded within multimedia data is becoming critical for many decision-making applications.

AN OPTIMIZED ARM SCHEME FOR DISTINCT NETWORK DATA SET

International Journal of Computer and Communication Technology ◽

10.47893/ijcct.2015.1302 ◽

2015 ◽

pp. 191-195

Author(s):

K.GANESH KUMAR ◽

H.VIGNESH RAMAMOORTHY ◽

M.PREM KUMAR ◽

S. SUDHA

Keyword(s):

Data Mining ◽

Association Rule ◽

Association Rule Mining ◽

Distributed Databases ◽

Research Area ◽

Sequential Algorithm ◽

Data Sets ◽

Rule Mining ◽

Data Set ◽

Communication Costs

Association rule mining (ARM) discovers correlations between different item sets in a transaction database. It provides important knowledge for business decision makers. Association rule mining is an active data mining research area, and most ARM algorithms cater to a centralized environment. Centralized data mining to discover useful patterns in distributed databases is not always feasible, because merging data sets from different sites incurs huge network communication costs. In this paper, an improved algorithm with good performance is proposed. Each local site runs the application based on the improved LMatrix algorithm, which is used to calculate local support counts. Each local site also finds a center site to manage every message exchanged in order to obtain all globally frequent item sets. Using LMatrix also reduces the time needed to scan the partitioned database, which increases the performance of the algorithm. The aim of the research is therefore to develop a distributed algorithm for geographically distributed data sets that offers lower communication costs, superior running efficiency, and stronger scalability than direct application of a sequential algorithm in distributed databases.
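
The division of work between local sites and a center site can be sketched as follows. This is not the LMatrix algorithm, whose details the abstract does not give. It is a naive stand-in that assumes each site counts its own itemsets and sends only those counts, not the raw transactions, to a center that sums them. All names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_support_counts(transactions, max_size=2):
    """Count every itemset of up to max_size items at one site."""
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def globally_frequent(site_counts, total_transactions, min_support):
    """Center site: sum the local counts and keep the frequent itemsets."""
    merged = Counter()
    for counts in site_counts:
        merged.update(counts)
    return {
        itemset: n
        for itemset, n in merged.items()
        if n / total_transactions >= min_support
    }


# Hypothetical transactions held at two separate sites.
site_a = [{"bread", "milk"}, {"bread"}]
site_b = [{"bread", "milk"}, {"milk"}]

frequent = globally_frequent(
    [local_support_counts(site_a), local_support_counts(site_b)],
    total_transactions=len(site_a) + len(site_b),
    min_support=0.6,
)
print(frequent)
```

Only the count tables cross the network in this sketch, which is the source of the communication savings compared with merging the data sets at one site.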

Data Privacy Preservation and Security Approaches for Sensitive Data in Big Data

10.3233/apc210221 ◽

2021 ◽

Author(s):

Rohit Ravindra Nikam ◽

Rekha Shahapurkar

Keyword(s):

Data Mining ◽

Data Analytics ◽

Data Privacy ◽

Privacy Preservation ◽

Large Data ◽

Research Area ◽

Data Sets ◽

Sensitive Information ◽

Sensitive Data ◽

Data Mining Techniques

Data mining is a technique for extracting the necessary data from large data sets. Privacy protection in data mining is about hiding sensitive information or identities without breaching security or losing data usability. Sensitive data contains confidential information about individuals, businesses, and governments that must not be shared or published without their agreement. Preserving privacy in data mining has become a critical research area. Various evaluation metrics, such as performance in terms of time efficiency, data utility, and degree of complexity or resistance to data mining techniques, are used to assess privacy-preserving data mining techniques. Social media and smartphones produce tons of data every minute. The voluminous data produced by these different sources can be processed and analyzed to support decision making, but data analytics is vulnerable to breaches of privacy. One data analytics framework is the recommendation system, commonly used by e-commerce sites such as Amazon and Flipkart to recommend items to customers based on their purchasing habits, which leads to customers being characterized. This paper presents various privacy preservation techniques that existing researchers use, such as data anonymization, data randomization, generalization, and data permutation. We also analyze the gap between various processes and privacy preservation methods and illustrate how to overcome such issues with new, innovative methods. Finally, our research summarizes the outcome of the entire literature.
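
Two of the techniques named above, generalization and data permutation, can be illustrated in a few lines. This is a minimal sketch on made-up records, not a method from the paper: the field names, the 10-year age bins, and the fixed random seed are all assumptions chosen for the example.

```python
import random


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a coarser range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def permute_column(records, column, seed=0):
    """Permutation: shuffle one column so values detach from their rows."""
    rng = random.Random(seed)
    values = [r[column] for r in records]
    rng.shuffle(values)
    return [{**r, column: v} for r, v in zip(records, values)]


# Hypothetical records with a quasi-identifier and a sensitive field.
records = [
    {"age": 37, "purchase": "insulin"},
    {"age": 42, "purchase": "books"},
    {"age": 58, "purchase": "shoes"},
]

generalized = [{**r, "age": generalize_age(r["age"])} for r in records]
permuted = permute_column(generalized, "purchase")
print(permuted)
```

Generalization keeps coarse information usable for analysis, while permutation preserves the overall distribution of the shuffled column but hides which row each value came from.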

Integration of Data Mining and Operations Research

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch162 ◽

2011 ◽

pp. 1046-1052

Author(s):

Stephan Meisel

Keyword(s):

Machine Learning ◽

Data Mining ◽

Decision Making ◽

Operations Research ◽

Secondary Analysis ◽

The Other ◽

Optimization Models ◽

Optimal Solutions ◽

Number Of Publications ◽

The One

Basically, Data Mining (DM) and Operations Research (OR) are two paradigms independent of each other. OR aims at optimal solutions of decision problems with respect to a given goal. DM is concerned with secondary analysis of large amounts of data (Hand et al., 2001). However, there are some commonalities. Both paradigms are application focused (Wu et al., 2003; White, 1991). Many Data Mining approaches are within traditional OR domains like logistics, manufacturing, health care or finance. Further, both DM and OR are multidisciplinary. Since its origins, OR has been relying on fields such as mathematics, statistics, economics and computer science. In DM, most of the current textbooks show a strong bias towards one of its founding disciplines, like database management, machine learning or statistics. Being multidisciplinary and application focused, it seems to be a natural step for both paradigms to gain synergies from integration. Thus, recently an increasing number of publications of successful approaches at the intersection of DM and OR can be observed. On the one hand, efficiency of the DM process is increased by use of advanced optimization models and methods originating from OR. On the other hand, effectiveness of decision making is increased by augmentation of traditional OR approaches with DM results. Meisel and Mattfeld (in press) provide a detailed discussion of the synergies of DM and OR.

Passive Copy-Move Tamper Detection Methods for Digital Images

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c4730.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 5882-5887

Keyword(s):

Digital Images ◽

Research Area ◽

Tamper Detection ◽

Detection Methods ◽

Zernike Moment ◽

Detection Accuracy ◽

Data Sets ◽

Research Application ◽

Challenges And Opportunities ◽

Significant Research

Copy-move tamper detection in digital images is a significant research application in forensic investigation. Developing an effective and reliable means of detecting copy-move tampering, in order to guarantee the authenticity of digital images, is a distinct research area in image processing. This category of tampering is performed to hide unwanted information or to duplicate a certain region of an image. This article introduces a generic algorithmic skeleton for copy-move tamper detection. Experiments are carried out on three state-of-the-art copy-move tamper detection methods with respect to three different data sets. Empirical results indicate that the PCA-based method is better than the frequency-transformation-based and Zernike-moment-based methods in terms of detection accuracy on all three data sets. This article also identifies the key challenges and opportunities in copy-move tamper detection.
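
The article's generic skeleton is not reproduced in the abstract. As a loose illustration of the block-matching idea that copy-move detectors commonly share, the sketch below slides a small window over a grayscale image, uses the raw pixel values as the block feature, and reports blocks that occur more than once. Methods such as the PCA or Zernike moment approaches would substitute a more robust feature at that step. The image and block size are made up.

```python
from collections import defaultdict


def find_duplicate_blocks(image, block=2):
    """Group top-left positions of blocks whose pixels match exactly."""
    rows, cols = len(image), len(image[0])
    seen = defaultdict(list)
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            # Feature step: raw pixels here; real detectors use robust features.
            feature = tuple(
                image[r + dr][c + dc]
                for dr in range(block)
                for dc in range(block)
            )
            seen[feature].append((r, c))
    return [positions for positions in seen.values() if len(positions) > 1]


# Hypothetical 4x6 image with unique pixel values.
image = [[r * 6 + c for c in range(6)] for r in range(4)]

# Simulate tampering: copy the 2x2 block at (0, 0) onto (2, 3).
for dr in range(2):
    for dc in range(2):
        image[2 + dr][3 + dc] = image[dr][dc]

matches = find_duplicate_blocks(image)
print(matches)
```

Exact matching only catches verbatim copies. Tolerating compression, noise, or rotation is the reason published methods rely on transformed features rather than raw pixels.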

USE OF DATA MINING TECHNIQUES IN ADVANCE DECISION MAKING PROCESSES IN A LOCAL FIRM

European Journal of Business and Economics ◽

10.12955/ejbe.v10i2.682 ◽

2015 ◽

Vol 10 (2) ◽

Cited By ~ 1

Author(s):

Onur Doğan ◽

Hakan Aşan ◽

Ejder Ayç

Keyword(s):

Data Mining ◽

Decision Making ◽

Big Data ◽

Data Cleaning ◽

Decision Makers ◽

Data Sets ◽

Data Set ◽

Scientific Methods ◽

Local Firm ◽

Data Mining Techniques

In today’s competitive world, organizations need to make the right decisions to prolong their existence. Non-scientific methods and emotional decisions have given way to scientific methods in the decision making process in this competitive arena. Within this scope, many decision support models are still being developed in order to assist the decision makers and owners of organizations. It is easy for organizations to collect massive amounts of data, but the problem is generally how to use this data to achieve economic advances. There is a critical need for specialization and automation to transform the data in big data sets into knowledge. Data mining techniques are capable of providing description, estimation, prediction, classification, clustering, and association. Recently, many data mining techniques have been developed in order to find hidden patterns and relations in big data sets. It is important to obtain new correlations, patterns, and trends that are understandable and useful to the decision makers. There have been many studies and applications focusing on different data mining techniques and methodologies. In this study, we aim to obtain understandable and applicable results from a large set of records belonging to a firm that is active in the meat processing industry, by using data mining techniques. In the application part, data cleaning and data integration, which are the first steps of the data mining process, are first performed on the data in the database. With the aid of data cleaning and data integration, a data set suitable for data mining is obtained. Then, various association rule algorithms are applied to this data set. This analysis reveals that finding unexplored patterns in the data set would be beneficial for the decision makers of the firm. Finally, many association rules are obtained that are useful for the decision makers of the local firm.
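
What an association rule tells a decision maker can be made concrete with the two standard measures, support and confidence. The sketch below is a generic illustration, not the study's analysis: the product names and baskets are invented, and the study's actual algorithms and data are not given in the abstract.

```python
def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)
    n_both = sum(1 for t in transactions if (a | c) <= t)
    support = n_both / n                        # how often both occur together
    confidence = n_both / n_a if n_a else 0.0   # how often c follows a
    return support, confidence


# Hypothetical sales baskets from a meat processing firm.
baskets = [
    {"sausage", "salami"},
    {"sausage", "salami"},
    {"sausage"},
    {"ham"},
]

support, confidence = rule_metrics({"sausage"}, {"salami"}, baskets)
print(f"support={support:.2f} confidence={confidence:.2f}")
```

A rule with high confidence but very low support applies to few baskets, so both measures are usually thresholded together when selecting rules to present.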

Creating Competitive Advantage by Using Data Mining Technique as an Innovative Method for Decision Making Process in Business

Transdisciplinary Marketing Concepts and Emergent Methods for Virtual Environments ◽

10.4018/978-1-4666-1861-9.ch014 ◽

2013 ◽

pp. 205-213

Author(s):

Mert Bal ◽

Yasemin Bal ◽

Ayse Demirhan

Keyword(s):

Data Mining ◽

Decision Making ◽

Data Analysis ◽

Competitive Advantage ◽

Decision Making Process ◽

Success Factor ◽

Data Sets ◽

Data Mining Technique ◽

Critical Success ◽

Using Data

Competitive advantage is at the heart of a firm’s performance in today’s challenging and rapidly changing environment. One of the central bases for achieving competitive advantage is the organizational capability to create new knowledge and transfer it across various levels of the organization. Traditional methods of data analysis, based mainly on humans dealing directly with the data, simply do not scale to large data sets. The explosive growth in data and databases has generated an urgent need for new techniques and tools that can intelligently and automatically transform the processed data into useful information and knowledge. Consequently, data mining has become a research area of increasing importance. Organizations of all sizes have started to develop and deploy data mining technologies to leverage data resources and enhance their decision making capabilities. Business information obtained from data analysis and data mining is a critical success factor for companies wishing to maximize competitive advantage. In this study, the importance of gaining knowledge for organizations in today’s competitive environment is discussed, and the data mining method is analyzed as an innovative technique for the decision making process in organizations.

Association Rule Hiding Methods

Database Technologies ◽

10.4018/978-1-60566-058-5.ch138 ◽

2009 ◽

pp. 2268-2274

Author(s):

Vassilios S. Verykios

Keyword(s):

Data Mining ◽

Data Analysis ◽

Research Area ◽

Distributed Data ◽

Intelligent Algorithms ◽

Processing Power ◽

Risk Sensitive ◽

The One ◽

New Research ◽

And Storage

The enormous expansion of data collection and storage facilities has created an unprecedented increase in the need for data analysis and processing power. Data mining has long been the catalyst for automated and sophisticated data analysis and interrogation. Recent advances in data mining and knowledge discovery have had a controversial impact in both scientific and technological arenas. On the one hand, data mining is capable of analyzing vast amounts of information within a minimum amount of time, an analysis that has exceeded the expectations of even the most imaginative scientists of the last decade. On the other hand, the excessive processing power of the intelligent algorithms that this new research area brings with it puts at risk sensitive and confidential information that resides in large and distributed data stores.
