Data Mining | ScienceGate

Every day, enormous amounts of information are generated from all sectors, whether it be business, education, the scientific community, the World Wide Web (WWW), or one of many readily available off-line and online data sources. From all of this, which represents a sizable repository of data and information, it is possible to generate worthwhile and usable knowledge. As a result, the field of Data Mining (DM) and knowledge discovery in databases (KDD) has grown in leaps and bounds and has shown great potential for the future (Han & Kamber, 2001). The purpose of this chapter is to survey many of the critical and future trends in the field of DM, with a focus on those which are thought to have the most promise and applicability to future DM applications.

Data Mining in Health Care Applications

Data Mining ◽

10.4018/978-1-59140-051-6.ch015 ◽

2011 ◽

pp. 350-365 ◽

Cited By ~ 4

Author(s):

Fay Cobb Payton

Keyword(s):

Data Mining ◽

Health Care ◽

Care Delivery ◽

Patient Data ◽

Organizational Level ◽

Privacy And Security ◽

Political Issues ◽

Health Care Applications ◽

Community Data ◽

The Individual

Recent attention has turned to the healthcare industry and its use of voluntary community health information network (CHIN) models for e-health and care delivery. This chapter suggests that competition, economic dimensions, political issues, and a group of enablers are the primary determinants of implementation success. Most critical to these implementations is the issue of data management and utilization. Thus, health care organizations are finding value in, as well as strategic applications for, mining patient data in general and community data in particular. While significant gains can be obtained and have been noted at the organizational level of analysis, much attention has been given to the individual, where the focus has been on the privacy and security of patient data. While the privacy debate is a salient issue, data mining (DM) offers broader community-based gains that enable and improve healthcare forecasting, analyses, and visualization.

A Survey of Bayesian Data Mining

Data Mining ◽

10.4018/978-1-59140-051-6.ch001 ◽

2011 ◽

pp. 1-26 ◽

Cited By ~ 1

Author(s):

Stefan Arnborg

Keyword(s):

Data Mining ◽

Bayesian Analysis ◽

Graphical Models ◽

Bayesian Approach ◽

Data Sets ◽

Media Distribution ◽

Dirichlet Prior ◽

Approach Problem ◽

The Bayesian Approach ◽

Selection Of

This chapter reviews the fundamentals of inference and motivates Bayesian analysis. The method is illustrated with dependency tests in data sets with categorical variables and with Dirichlet prior distributions. Principles and problems for deriving causality conclusions are reviewed and illustrated with Simpson’s paradox. The selection of decomposable and directed graphical models illustrates the Bayesian approach. Bayesian and EM classification is briefly described. The material is illustrated on two cases, one in personalization of media distribution and one in schizophrenia research. These cases illustrate how to approach problem types that exist in many other application areas.
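As a brief illustration of the Simpson's paradox mentioned in this abstract, the sketch below shows how one treatment can have the higher success rate in every subgroup and still have the lower rate once the subgroups are pooled. The counts, the treatment labels `A` and `B`, and the subgroup names are illustrative assumptions and are not taken from the chapter.

```python
from fractions import Fraction

# Illustrative (successes, trials) counts per treatment and subgroup.
# These numbers are an assumption for demonstration, not data from the chapter.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def subgroup_rates(counts):
    """Success rate of each treatment within each subgroup."""
    return {
        treatment: {group: Fraction(s, n) for group, (s, n) in groups.items()}
        for treatment, groups in counts.items()
    }

def pooled_rates(counts):
    """Success rate of each treatment after summing over all subgroups."""
    pooled = {}
    for treatment, groups in counts.items():
        successes = sum(s for s, _ in groups.values())
        trials = sum(n for _, n in groups.values())
        pooled[treatment] = Fraction(successes, trials)
    return pooled

by_group = subgroup_rates(counts)
pooled = pooled_rates(counts)

# A is better within every subgroup, yet B is better in the pooled table.
a_wins_each_group = all(
    by_group["A"][g] > by_group["B"][g] for g in ("small", "large")
)
b_wins_pooled = pooled["B"] > pooled["A"]
print(a_wins_each_group, b_wins_pooled)
```

The reversal arises because the subgroup sizes differ sharply between the two treatments, which is why conclusions about causality drawn from pooled tables alone can mislead.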

Data Mining for Human Resource Information Systems

Data Mining ◽

10.4018/978-1-59140-051-6.ch016 ◽

2011 ◽

pp. 366-381 ◽

Cited By ~ 2

Author(s):

Lori K. Long ◽

Marvin D. Troutt

Keyword(s):

Data Mining ◽

Decision Making ◽

Information Systems ◽

Human Resource ◽

Data Warehousing ◽

Organizational Decision Making ◽

Resource Information ◽

Potential Applications ◽

Statistical Studies ◽

Hr Function

This chapter focuses on the potential contributions that Data Mining (DM) could make within the Human Resource (HR) function in organizations. We first provide a basic introduction to DM techniques and processes and a survey of the literature on the steps involved in successfully mining this information. We also discuss the importance of data warehousing and datamart considerations. An examination of the contrast between DM and more routine statistical studies is given, and the value of HR information to support a firm’s competitive position and organizational decision-making is considered. Examples of potential applications are outlined in terms of data that is ordinarily captured in HR information systems.

Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps

Data Mining ◽

10.4018/978-1-59140-051-6.ch008 ◽

2011 ◽

pp. 199-219 ◽

Cited By ~ 2

Author(s):

Hsin-Chang Yang ◽

Chung-Hong Lee

Keyword(s):

Text Categorization ◽

Hierarchical Structures ◽

Self Organizing Map ◽

Feature Maps ◽

Text Documents ◽

Self Organizing Maps ◽

Test Corpus ◽

The Hierarchical Structure ◽

Document Collection ◽

Self Organizing

Recently, many approaches have been devised for mining various kinds of knowledge from texts. One important application of text mining is to identify themes and the semantic relations among these themes for text categorization. Traditionally, these themes were arranged hierarchically to support effective searching and indexing and to make them easy for people to comprehend. The determination of category themes and their hierarchical structures was mostly done by human experts. In this work, we developed an approach to automatically generate category themes and reveal the hierarchical structure among them. We also used the generated structure to categorize text documents. A self-organizing map was trained on the document collection to form two feature maps. We then analyzed these maps and obtained the category themes and their structure. Although the test corpus contains documents written in Chinese, the proposed approach can be applied to documents written in any language, provided such documents can be transformed into a list of separated terms.

Data Mining Based on Rough Sets

Data Mining ◽

10.4018/978-1-59140-051-6.ch006 ◽

2011 ◽

pp. 142-173 ◽

Cited By ~ 47

Author(s):

Jerzy W. Grzymala-Busse ◽

Wojciech Ziarko

Keyword(s):

Data Mining ◽

Set Theory ◽

Rough Set ◽

Rough Sets ◽

Model Building ◽

Rough Set Theory ◽

Mining System ◽

Data Mining System ◽

Similarities And Differences ◽

Variable Precision Rough Set

The chapter is focused on the data mining aspect of the applications of rough set theory. Consequently, the theoretical part is minimized to emphasize the practical application side of the rough set approach in the context of data analysis and model-building applications. Initially, the original rough set approach is presented and illustrated with detailed examples showing how data can be analyzed with this approach. The next section illustrates the Variable Precision Rough Set Model (VPRSM) to expose similarities and differences between these two approaches. Then, the data mining system LERS, based on a different generalization of the original rough set theory than VPRSM, is presented. Brief descriptions of algorithms are also cited. Finally, some applications of the LERS data mining system are listed.
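The original rough set approach named in this abstract rests on lower and upper approximations of a target set of objects. The sketch below is a minimal illustration of that idea only; the toy decision table, the attribute names, and the function names are hypothetical, and it does not reproduce VPRSM or the LERS system.

```python
from collections import defaultdict

def indiscernibility_classes(table, attributes):
    """Group objects that have identical values on the chosen attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj)
    return list(classes.values())

def approximations(table, attributes, target):
    """Return (lower, upper) approximations of the set `target`."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper

# Hypothetical toy decision table: two condition attributes and one decision.
table = {
    "p1": {"temp": "high", "cough": "yes", "flu": "yes"},
    "p2": {"temp": "high", "cough": "yes", "flu": "no"},
    "p3": {"temp": "high", "cough": "no", "flu": "yes"},
    "p4": {"temp": "normal", "cough": "no", "flu": "no"},
}

target = {obj for obj, row in table.items() if row["flu"] == "yes"}
lower, upper = approximations(table, ["temp", "cough"], target)
print(sorted(lower), sorted(upper))
```

Objects in the lower approximation certainly belong to the concept given the chosen attributes, while objects in the upper approximation only possibly belong to it. Here `p1` and `p2` are indiscernible but have different decisions, so they appear only in the upper approximation.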

Parallel and Distributed Data Mining through Parallel Skeletons and Distributed Objects

Data Mining ◽

10.4018/978-1-59140-051-6.ch005 ◽

2011 ◽

pp. 106-141

Author(s):

Massimo Coppola ◽

Marco Vanneschi

Keyword(s):

Data Mining ◽

High Performance ◽

Distributed Databases ◽

Sequential Algorithm ◽

Distributed Data ◽

Programming Environment ◽

Programming Environments ◽

Geographically Distributed ◽

High Level ◽

Performance Results

We consider the application of parallel programming environments to develop portable and efficient high-performance data mining (DM) tools. We first assess the need for parallel and distributed DM applications by pointing out the scalability problems of some mining techniques and the need to mine large, possibly geographically distributed databases. We discuss the main issues of exploiting parallel and distributed computation for DM algorithms. A high-level programming language enhances the software engineering aspects of parallel DM, and it simplifies the problems of integration with existing sequential and parallel data management systems, thus leading to programming-efficient and high-performance implementations of applications. We describe a programming environment we have implemented that is based on the parallel skeleton model, and we examine the addition of object-like interfaces toward external libraries and system software layers. Abstractions of this kind will be included in the forthcoming programming environment ASSIST. In the main part of the chapter, as a proof of concept, we describe three well-known DM algorithms: Apriori, C4.5, and DBSCAN. For each problem, we explain the sequential algorithm and a structured parallel version, which is discussed and compared to parallel solutions found in the literature. We also discuss the potential gain in performance and expressiveness from the addition of external objects on the basis of the experiments we have performed so far. We evaluate the approach with respect to performance results, design, and implementation considerations.
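For readers unfamiliar with Apriori, one of the three algorithms named in this abstract, the following is a minimal sequential sketch of level-wise frequent itemset mining. It is not the chapter's skeleton-based parallel version; the function name `apriori`, the absolute `min_support` count, and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    current = {
        frozenset([item]) for item in items
        if support(frozenset([item])) >= min_support
    }
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

# Hypothetical toy transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(transactions, min_support=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Support counting scans the whole transaction list for every candidate. That repeated scan dominates the cost on large databases, which is the kind of scalability problem the abstract refers to.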

Social, Ethical and Legal Issues of Data Mining

Data Mining ◽

10.4018/978-1-59140-051-6.ch018 ◽

2011 ◽

pp. 395-420 ◽

Cited By ~ 9

Author(s):

Jack S. Cook ◽

Laura L. Cook

Keyword(s):

Data Mining ◽

Legal Issues ◽

The Road ◽

Use Of Data ◽

Legal Implications ◽

The Social ◽

Improper Use ◽

Small Industry ◽

Ethical And Legal Issues ◽

Insight Into

This chapter highlights both the positive and negative aspects of Data Mining (DM). Specifically, the social, ethical, and legal implications of DM are examined through recent case law, current public opinion, and small industry-specific examples. There are many issues concerning this topic. Therefore, the purpose of this chapter is to expose the reader to some of the more interesting ones and provide insight into how information systems (IS) professionals and businesses may protect themselves from the negative ramifications associated with improper use of data. The more experience with and exposure to social, ethical, and legal concerns with respect to DM, the better prepared you will be to prevent trouble down the road.

Financial Benchmarking Using Self-Organizing Maps - Studying the International Pulp and Paper Industry

Data Mining ◽

10.4018/978-1-59140-051-6.ch014 ◽

2011 ◽

pp. 323-349 ◽

Cited By ~ 3

Author(s):

Tomas Eklund ◽

Barbro Back ◽

Hannu Vanharanta ◽

Ari Visa

Keyword(s):

Paper Industry ◽

Primary Source ◽

Time Frame ◽

Pulp And Paper ◽

Financial Data ◽

Annual Reports ◽

The Internet ◽

Self Organizing Map ◽

Self Organizing Maps ◽

Self Organizing

Performing financial benchmarks in today’s information-rich society can be a daunting task. With the evolution of the Internet, access to massive amounts of financial data, typically in the form of financial statements, is widespread. Managers and stakeholders are in need of a tool that allows them to quickly and accurately analyze these data. An emerging technique that may be suited for this application is the self-organizing map. The purpose of this study was to evaluate the performance of self-organizing maps for the purpose of financial benchmarking of international pulp and paper companies. For the study, financial data in the form of seven financial ratios were collected, using the Internet as the primary source of information. A total of 77 companies and six regional averages were included in the study. The time frame of the study was the period 1995-2000. A number of benchmarks were performed, and the results were analyzed based on information contained in the annual reports. The results of the study indicate that self-organizing maps can be feasible tools for the financial benchmarking of large amounts of financial data.

Mining Free Text for Structure

Data Mining ◽

10.4018/978-1-59140-051-6.ch012 ◽

2011 ◽

pp. 278-300

Author(s):

Vladimir A. Kulyukin ◽

Robin Burke

Keyword(s):

Structural Organization ◽

Knowledge Engineering ◽

Knowledge Bases ◽

Free Text ◽

Learning Approaches ◽

Structural Components ◽

Text Documents ◽

Retrieval Systems ◽

Information Retrieval Systems ◽

Ultimate Objective

Knowledge of the structural organization of information in documents can be of significant assistance to information systems that use documents as their knowledge bases. In particular, such knowledge is of use to information retrieval systems that retrieve documents in response to user queries. This chapter presents an approach to mining free-text documents for structure that is qualitative in nature. It complements the statistical and machine-learning approaches, insomuch as the structural organization of information in documents is discovered through mining free text for content markers left behind by document writers. The ultimate objective is to find scalable data mining (DM) solutions for free-text documents in exchange for modest knowledge-engineering requirements. The problem of mining free text for structure is addressed in the context of finding structural components of files of frequently asked questions (FAQs) associated with many USENET newsgroups. The chapter describes a system that mines FAQs for structural components. The chapter concludes with an outline of possible future trends in the structural mining of free text.

Data Mining
Latest Publications

Published By IGI Global

Critical and Future Trends in Data Mining

Data Mining in Health Care Applications

A Survey of Bayesian Data Mining

Data Mining for Human Resource Information Systems

Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps

Data Mining Based on Rough Sets

Parallel and Distributed Data Mining through Parallel Skeletons and Distributed Objects

Social, Ethical and Legal Issues of Data Mining

Financial Benchmarking Using Self-Organizing Maps - Studying the International Pulp and Paper Industry

Mining Free Text for Structure
