Social Implications of Data Mining and Information Privacy
Latest Publications

Total documents: 15 (five years: 0)
H-index: 2 (five years: 0)

Published by IGI Global
ISBN: 9781605661964, 9781605661971

Author(s):  
Karl-Ernst Erich Biebler

This chapter gives a summary of data types, mathematical structures, and associated methods of data mining. Topological, order-theoretical, algebraic, and probability-theoretical mathematical structures are introduced. The n-dimensional Euclidean space, the model used most often for data, is defined. It is briefly shown that the treatment of higher-dimensional random variables and related data is problematic. Since topological concepts are less well known than statistical concepts, many examples of metrics are given. Related classification concepts are defined and explained, and possibilities for assessing their quality are discussed. One example each is given of topological cluster analysis and topological discriminant analysis.
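As an illustration of the metrics mentioned above, the following sketch (ours, not the chapter's) compares three standard metrics on points of the n-dimensional Euclidean space; each induces its own notion of closeness and hence its own behavior in metric-based classification.

```python
import math

# Three common metrics on R^n. All three satisfy the metric axioms
# (non-negativity, identity, symmetry, triangle inequality) but measure
# "closeness" differently.

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7.0
print(chebyshev(p, q))  # 4.0
```

For any pair of points the three values are ordered Chebyshev ≤ Euclidean ≤ Manhattan, which is one reason the choice of metric changes which neighbors a classifier considers "near."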


Author(s):  
Robert K. McCormack

This chapter highlights a case study involving research into the science of building teams. Accomplishing mission goals requires team members to possess not only the required technical skills but also the ability to collaborate effectively. The authors describe a research project that aims to develop an automated staffing system. Any such system requires a large amount of personal information about the potential team members under consideration. Gathering, storing, and applying this data raises a spectrum of concerns, from social and ethical implications to technical hurdles. The authors hope to highlight these concerns by focusing on their research efforts, which include obtaining and using employee data within a small business.


Author(s):  
E. Arlin Torbett ◽  
Tanya M. Candia

Data on the production, sale, repackaging, and transportation of fresh produce is scarce, yet with recent threats to national safety and security, forward and backward traceability of produce is mandatory. Recent advances in online marketing of fresh produce, a new international codification system and use of advanced technologies such as Radio Frequency Identification (RFID) and bar coding are working together to fill the gap, building a solid database of rich information that can be mined. While agricultural data mining holds much promise for farmers, with better indications of what and when to plant, and for buyers, giving them access to improved food quality and availability information, it is the world’s health organizations and governments who stand to be the biggest beneficiaries. This chapter describes the current state of fresh produce data collection and access, new trends that fill important gaps, and emerging methods of mining fresh produce data for improved production, product safety and public health through traceability.


Author(s):  
K. Selvakuberan ◽  
M. Indra Devi ◽  
R. Rajaram

The World Wide Web serves as a huge, widely distributed, global information service center for news, advertisements, customer information, financial management, education, government, e-commerce, and many other domains. The Web contains a rich and dynamic collection of hyperlink information, and Web page access and usage information provide rich sources for data mining. Web pages are classified based on the content and/or contextual information embedded in them. Because Web pages contain many irrelevant, infrequent, and stop words that reduce the performance of the classifier, selecting relevant representative features from the Web page is an essential preprocessing step. This also supports secure access to the required information: Web access and usage information can be mined to authenticate the user accessing the Web page, to personalize the information presented to users, and to preserve user privacy by hiding personal details. The challenge lies in selecting the features that best represent the Web pages and in processing only those user details that are actually needed. In this chapter we focus on feature selection, the issues it raises, and the most important feature selection techniques described and used by researchers.
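As a minimal illustration of the preprocessing step described above, the following sketch (the mini-corpus, stop-word list, and threshold are hypothetical) keeps only terms that are neither stop words nor too infrequent to help a classifier:

```python
from collections import Counter

# A toy stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "is", "of", "and"}

def select_features(docs, min_df=2):
    """Keep terms appearing in at least min_df documents, minus stop words."""
    df = Counter()                      # document frequency per term
    for doc in docs:
        df.update(set(doc.lower().split()))
    return sorted(t for t, c in df.items()
                  if c >= min_df and t not in STOP_WORDS)

docs = ["the price of the laptop",
        "laptop price is low",
        "the news of the day"]
print(select_features(docs))  # ['laptop', 'price']
```

Stop words are dropped outright, while the document-frequency cutoff removes rare terms; both shrink the feature space the classifier must handle.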


Author(s):  
Kenneth J. Knapp ◽  
Thomas E. Marshall ◽  
R. Kelly Rainer ◽  
F. Nelson Ford

Taking a sequential qualitative-quantitative methodological approach, we propose and test a theoretical model that includes four variables through which top management can positively influence security effectiveness: user training, security culture, policy relevance, and policy enforcement. During the qualitative phase of the study, we generated the model based on textual responses to a series of questions given to a sample of 220 information security practitioners. During the quantitative phase, we analyzed survey data collected from a sample of 740 information security practitioners. After data collection, we analyzed the survey responses using structural equation modeling and found evidence to support the hypothesized model. We also tested an alternative, higher-order factor version of the original model that demonstrated an improved overall fit and general applicability across the various demographics of the sampled data. We then linked the findings of this study to existing top management support literature, general deterrence theory research, and the theoretical notion of the dilemma of the supervisor.


Author(s):  
Robert Sprague

This chapter explores the foundations of the legal right to privacy in the United States, juxtaposed against the accumulation and mining of data in today’s society. Businesses and government agencies have the capacity to accumulate massive amounts of information, tracking the behavior of ordinary citizens carrying out ordinary routines. Data mining techniques also provide the opportunity to analyze vast amounts of data to compile comprehensive profiles of behavior. Within this context, this chapter addresses the legal frameworks for data mining and privacy. Historically, privacy laws in the United States have adapted to changing technologies, but have done so slowly; arguably not keeping pace with current technology. This chapter makes clear that the legal right to privacy in the United States is not keeping pace with the accumulation, analysis, and use of data about each and every one of us.


Author(s):  
Aris Gkoulalas-Divanis

In this era of significant advances in telecommunications and GPS sensor technology, a person can be located to within less than 5 meters. This remarkable progress enabled the offering of services that depend on user location (the so-called location-based services, or LBSs), as well as applications that analyze movement data for various purposes. However, without strict safeguards, both the deployment of LBSs and the mining of movement data come at a privacy cost for the users whose movements are recorded. This chapter studies privacy in both online and offline movement data. After introducing the reader to this field of study, we review state-of-the-art work for location and trajectory privacy both in LBSs and in trajectory databases. Then, we present a qualitative evaluation of these works, pointing out their strengths and weaknesses. We conclude the chapter by providing our point of view regarding the future trends in trajectory data privacy.
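Two simple techniques of the kind surveyed in this area can be sketched as follows (the function names, grid size, and noise bound are our illustrative choices, not the chapter's):

```python
import math
import random

def cloak(lat, lon, cell_deg=0.01):
    """Grid-based spatial cloaking: report only the center of a coarse
    grid cell (roughly 1 km at this latitude), so every user in the same
    cell reports the same point."""
    snap = lambda v: math.floor(v / cell_deg) * cell_deg + cell_deg / 2
    return snap(lat), snap(lon)

def perturb(lat, lon, max_noise_deg=0.005, rng=random):
    """Bounded random perturbation: add uniform noise to each coordinate."""
    return (lat + rng.uniform(-max_noise_deg, max_noise_deg),
            lon + rng.uniform(-max_noise_deg, max_noise_deg))

print(cloak(48.85837, 2.29448))    # a cell center, not the exact position
print(perturb(48.85837, 2.29448))  # a nearby but inexact position
```

Cloaking trades location precision for anonymity within a cell, while perturbation keeps per-user responses distinct but inexact; the trajectory-privacy literature analyzes when such single-report defenses fail against an adversary who links many reports over time.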


Author(s):  
K. Selvakuberan ◽  
M. Indra Devi ◽  
R. Rajaram

The explosive growth of the Web makes it a very useful information resource for all types of users. Today, everyone accesses the Internet for various purposes, and retrieving the required information within the stipulated time is the major demand from users. Moreover, the Internet returns millions of Web pages for each and every search term, so obtaining interesting and relevant results from the Web becomes very difficult, and classifying Web pages into relevant categories has become an active research topic. Web page classification focuses on assigning documents to different categories, which search engines use to produce their results. In this chapter we focus on different machine learning techniques and how Web pages can be classified using them. The automatic classification of Web pages using machine learning techniques is the most efficient way for search engines to provide accurate results to users. Machine learning classifiers may also be trained to protect personal details from unauthenticated users and to support privacy-preserving data mining.
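One such machine learning technique can be sketched as a minimal multinomial Naive Bayes classifier with Laplace smoothing; the toy training texts and category names below are ours, purely illustrative of how a page's words vote for a category:

```python
import math
from collections import Counter, defaultdict

def train(labeled_docs):
    """Count class priors and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def predict(text, model):
    """Pick the class with the highest smoothed log-probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best, best_lp = None, float("-inf")
    for label in class_counts:
        lp = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.lower().split():
            lp += math.log((word_counts[label][w] + 1) / denom)  # Laplace smoothing
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train([("stock market shares rise", "finance"),
               ("match score goal team", "sports")])
print(predict("market shares fall", model))  # finance
```

A real Web page classifier would train on thousands of crawled pages and apply feature selection first, but the decision rule is the same: multiply (in log space) the prior by each word's smoothed class-conditional probability.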


Author(s):  
Shahid M. Shahidullah

This chapter examines the issues and concerns raised in the context of the recent growth of federal data mining programs. The chapter argues that in the context of the war on terror, intelligence gathering on terrorist activities both within and outside the United States has emerged as one of the core strategies for homeland security. The major national security related federal agencies such as the Department of Justice, Department of Homeland Security, and the Department of Defense have developed a number of data mining programs to improve terrorism intelligence gathering and analysis in the wake of the events of September 11, 2001. Some data mining programs have, however, raised a number of issues related to privacy protections and civil liberties. These issues have given birth to a wider debate in the nation and raised new tensions about how to strike a balance between the need to protect privacy and civil liberties and the need to protect national security. The authors believe that the future of this debate is intimately connected to the future of the war on terror. Currently, Congress and the federal courts seem to be more in favor of supporting the preeminent needs of protecting national security. Through a number of enactments, Congress has broadened the federal power for collecting terrorism intelligence both at home and abroad. In a number of cases, the federal courts have ruled in favor of the doctrines of the “state secret privilege” and the “inherent power of the President” to emphasize the overriding need for protecting national security in the context of the war on terror. As America has embarked on a long and protracted ideological war against radical militant Islam, issues of national security and the need for data mining for detecting and analyzing terrorist activities are likely to remain dominant for a long time.


Author(s):  
Stanley R.M. Oliveira ◽  
Osmar R. Zaïane

The sharing of data is beneficial in data mining applications and widely acknowledged as advantageous in business. However, information sharing can become controversial and thwarted by privacy regulations and other privacy concerns. Rather than simply hindering data owners from sharing information for data analysis, a solution could be designed to meet privacy requirements and guarantee valid data clustering results. To achieve this dual goal, this chapter introduces a method for privacy-preserving clustering, called Dimensionality Reduction-Based Transformation (DRBT). This method relies on the intuition behind random projection to protect the underlying attribute values subjected to cluster analysis. It is shown analytically and empirically that by transforming a dataset using DRBT, a data owner can achieve privacy preservation and obtain accurate clustering with little communication-cost overhead. The advantages of such a method are: it is independent of distance-based clustering algorithms; it has a sound mathematical foundation; and it does not require CPU-intensive operations.
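The intuition behind random projection can be sketched as follows (a toy setup of our own, not the chapter's exact DRBT algorithm): multiply the confidential records by a random matrix so the original attribute values are disguised, while pairwise distances, which distance-based clustering relies on, are approximately preserved.

```python
import math
import random

random.seed(0)
n_attrs, k = 40, 20   # original and reduced dimensionality (illustrative)

# 4 confidential records with n_attrs numeric attributes.
X = [[random.gauss(0, 1) for _ in range(n_attrs)] for _ in range(4)]

# Random projection matrix; it stays with the data owner.
R = [[random.gauss(0, 1) / math.sqrt(k) for _ in range(k)]
     for _ in range(n_attrs)]

def project(x):
    """Release only the k-dimensional projected record."""
    return [sum(x[i] * R[i][j] for i in range(n_attrs)) for j in range(k)]

def dist(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

Xp = [project(x) for x in X]
for i in range(len(X)):
    for j in range(i + 1, len(X)):
        print(f"pair ({i},{j}): original {dist(X[i], X[j]):.2f}, "
              f"projected {dist(Xp[i], Xp[j]):.2f}")
```

Because the projected distances stay close to the originals (a Johnson–Lindenstrauss-style guarantee), a third party can run any distance-based clustering algorithm on the released data without ever seeing the underlying attribute values.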

