Data Confidentiality and Chase-Based Knowledge Discovery

Author(s):  
Seunghyun Im ◽  
Zbigniew W. Ras

This article discusses data security in Knowledge Discovery Systems (KDS). In particular, we present the problem of confidential data reconstruction by Chase (Dardzinska and Ras, 2003c) in KDS and discuss protection methods. In conventional database systems, data confidentiality is achieved by hiding sensitive data from unauthorized users (e.g., through data encryption or access control). However, hiding is not sufficient in KDS because of Chase. Chase is a generalized null value imputation algorithm designed to predict null or missing values, and it has many application areas. For example, Chase can be used in a medical decision support system to handle difficult medical situations (e.g., a dangerous invasive medical test that some patients cannot undergo). The results derived from the decision support system can help doctors diagnose and treat patients. The data approximated by Chase are particularly reliable because they reflect the actual characteristics of the data set in the information system. Chase, however, can create data security problems if an information system contains confidential data (Im and Ras, 2005; Im, 2006). Suppose that an attribute in an information system S contains medical information about patients, and that some portions of the data are not confidential while others must remain confidential. In this case, part or all of the confidential data in the attribute can be revealed by Chase using knowledge extracted from S. In other words, self-generated rules extracted from the non-confidential portions of the data can be used to find secret data. Knowledge is often extracted from remote sites in a Distributed Knowledge Discovery System (DKDS) (Ras, 1994). The key concept of DKDS is to generate global knowledge through knowledge sharing. Each site in a DKDS develops knowledge independently, and the knowledge from all sites is used jointly to produce global knowledge without complex data integration.
Assume that two sites S1 and S2 in a DKDS accept the same ontology for their attributes and share their knowledge in order to obtain global knowledge, and that an attribute of site S1 is confidential. The confidential data in S1 can be hidden by replacing them with null values. However, users at S1 may treat them as missing data and reconstruct them with Chase using the knowledge extracted from S2. A distributed medical information system is an example in which an attribute is confidential at one site while the same attribute may not be considered secret at another. These examples show that hiding confidential data in an information system does not guarantee data confidentiality because of Chase, and that methods protecting against these problems are essential for building a security-aware KDS.
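
The reconstruction risk described above can be illustrated with a small sketch. It is not the Chase algorithm of Dardzinska and Ras, which is more general; it is a simplified, single-pass stand-in that assumes single-condition rules with full confidence. The function names (`extract_rules`, `chase_like_fill`), the thresholds, and the toy patient table are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def extract_rules(rows, target, min_support=2, min_confidence=1.0):
    """Learn rules (attr = value) -> (target = label) from the rows
    whose target value is visible (not None)."""
    counts = defaultdict(Counter)  # (attr, value) -> counts of target labels
    for row in rows:
        if row[target] is None:
            continue  # hidden rows contribute no knowledge
        for attr, value in row.items():
            if attr != target and value is not None:
                counts[(attr, value)][row[target]] += 1
    rules = []
    for (attr, value), labels in counts.items():
        label, support = labels.most_common(1)[0]
        confidence = support / sum(labels.values())
        if support >= min_support and confidence >= min_confidence:
            rules.append((attr, value, label))
    return rules


def chase_like_fill(rows, target, rules):
    """Return copies of the rows; a hidden target value is filled in only
    when every rule that matches the row predicts the same label."""
    filled = []
    for row in rows:
        row = dict(row)
        if row[target] is None:
            predicted = {label for attr, value, label in rules
                         if row.get(attr) == value}
            if len(predicted) == 1:
                row[target] = predicted.pop()
        filled.append(row)
    return filled


# Hypothetical information system S: "diagnosis" is hidden (None) for two patients.
rows = [
    {"age": "old", "test": "pos", "diagnosis": "d1"},
    {"age": "old", "test": "pos", "diagnosis": "d1"},
    {"age": "young", "test": "neg", "diagnosis": "d2"},
    {"age": "young", "test": "neg", "diagnosis": "d2"},
    {"age": "old", "test": "pos", "diagnosis": None},
    {"age": "young", "test": "pos", "diagnosis": None},
]

rules = extract_rules(rows, "diagnosis")
filled = chase_like_fill(rows, "diagnosis", rules)
for before, after in zip(rows, filled):
    print(before["diagnosis"], "->", after["diagnosis"])
```

In this toy table the first hidden diagnosis is recovered because all matching rules agree, while the second stays hidden because the matching rules conflict. In the distributed setting, the same filling step could be run at S1 with rules learned at S2, which is why replacing confidential values with nulls does not by itself protect them.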

1993 ◽  
Vol 23 (6) ◽  
pp. 1078-1095 ◽  
Author(s):  
Robert G. Davis ◽  
David L. Martell

This paper describes a decision support system that forest managers can use to help evaluate short-term, site-specific silvicultural operating plans in terms of their potential impact on long-term, forest-level strategic objectives. The system is based upon strategic and tactical forest-level silvicultural planning models that are linked with each other and with a geographical information system. Managers can first use the strategic mathematical programming model to develop broad silvicultural strategies based on aggregate timber strata. These strategies help them use a geographical information system to subjectively delineate specific candidate sites that might be treated during the first 10 years of a much longer planning horizon, and to describe potential silvicultural prescriptions for each candidate site. The tactical model identifies an annual silvicultural schedule for these candidate sites in the first 10 years, and a harvesting and regeneration schedule by 10-year periods for aggregate timber strata for the remainder of the planning horizon, that will maximize the sustainable yield of one or more timber species in the whole forest, given the candidate sites and treatments specified by the managers. The system is demonstrated on a 90 000 ha area in northeastern Ontario.


Author(s):  
Iman Barazandeh ◽  
Mohammad Reza Gholamian

The healthcare industry is one of the most attractive domains in which to realize the objectives of actionable knowledge discovery. This chapter reviews recent research on knowledge discovery and data mining applications in the healthcare industry and proposes a new classification of these applications. The review shows that these applications can be classified into three major classes: patient view, market view, and system view. The patient view includes papers that performed pure data mining on healthcare industry data. The market view includes papers that treated patients as customers. The system view includes papers that developed a decision support system. The goal of this classification is to identify research opportunities and gaps for researchers interested in this area.


Author(s):  
Jean-Fabrice Lebraty ◽  
Cécile Godé

This article explores the ability of a decision support system (DSS) to improve the quality of decision making in extreme environments. The DSS studied is based on a networked information system. The academic literature commonly uses models of fit to explore the relationship between technology and performance, treating users' evaluations as a relevant measure of Information System (IS) success. Although effective contributions have been made to the measurement and exploration of fit, there have been few attempts to investigate the triangulation of fit between “Task-DSS-Decision Maker” under stressful and uncertain circumstances. This article provides new insights into the advantages that networked IS offer for making relevant decisions. An original case study was conducted, focused on a networked decision support system called Link 16 that is used during aerial missions. The case study shows that the system improves decision making on an individual basis. Our results suggest the importance of three main fit criteria (Compliance, Complementarity, and Conformity) for measuring DSS performance in extreme environments, and they yield a preliminary decisional fit model.


2008 ◽  
Vol 14 (3) ◽  
pp. 260-278 ◽  
Author(s):  
Dalė Dzemydienė ◽  
Saulius Maskeliūnas ◽  
Ignas Dzemyda

The interoperability problems of distributed databases are important in developing operational web services for all sectors of public administration. The web services described here are designed for solving tasks in the water resource management and contamination evaluation sector, with due attention to the international environmental protection context. The paper is devoted to the problems of developing a component‐based architecture for an integrated decision‐support system that provides a basis for the monitoring and intelligent analysis of water management. The investigations follow the requirements of the European Union (EU) Water Framework Directive, the sustainable development directives, and the EIONET ReportNet infrastructure. The main components of the decision‐support system are analyzed using different knowledge modelling and web service development techniques. The structure of the water resource management information system (WRMIS) becomes the core of the decision‐support system in which the web services are implemented. The main components for evaluating contamination processes and water monitoring are represented by data warehouse structures. The solutions that satisfy the interoperability requirements are demonstrated through the architectural design decisions of the system, which integrates the distributed data warehouses and geographical information system tools. The web services are based on common portal technology. The organizational and political arrangements require deeper and stronger participation by all EU member states in reporting and in understanding the importance of sustainable development problems and the possibilities of risk evaluation.
Summary. Water resource management and wastewater quality assessment are among the most important problems related to environmental protection and human health. Water is one of the essential resources for sustaining all life cycles of biodiversity. Water quality requirements influence many sustainable development requirements. Environmental protection principles consist of many interacting components. The activities of large enterprises, institutions, and organizations should be grounded in multifaceted responsibility for the consequences of their activities and of the damage they cause to the environment. The paper examines how to ensure the interoperability of information systems that monitor and analyse water contamination. Methods of conveying information are important in developing advisory systems that would help solve many decision-making problems in assessing complex environmental pollution processes. This article analyses the main components for developing a decision support system in the environmental assessment sector, which allow these problems to be solved more effectively using E-nets (evaluation nets, i.e., an extension of Petri nets). Models that represent and evaluate decision-making processes are designed at several levels of detail, using tools for semantic structuring of information and simulation modelling. Pollution processes are observed through monitoring, with primary data recorded in multidimensional data warehouses and delivered to users through the analysis tools of the decision support system. Models for analysing water resource and wastewater control data, and the results obtained, are described. The article analyses the main components of the decision support system and the water and environmental assessment results that are important for decision making.

