Mediated Data Integration Systems Using Functional Dependencies Embedded in Ontologies

Author(s):  
Abdelghani Bakhtouchi ◽  
Chedlia Chakroun ◽  
Ladjel Bellatreche ◽  
Yamine Aït-Ameur

Author(s):  
Tadeusz Pankowski

This chapter addresses the problem of data integration in a P2P environment, where each peer stores the schema of its local data, mappings between the schemas, and some schema constraints. The goal of the integration is to answer queries formulated against a chosen peer. The answer must consist of data stored in the queried peer as well as data from its direct and indirect partners. The chapter focuses on defining and using mappings, schema constraints, query propagation across the P2P system, and query answering in such a scenario. Schemas, mappings, constraints (functional dependencies), and queries are all expressed using a unified approach based on tree-pattern formulas. The chapter discusses how functional dependencies can be exploited to increase the information content of answers (by discovering missing values) and to control merging operations and propagation strategies. The chapter proposes algorithms for translating high-level specifications of mappings and queries into XQuery programs, and it shows how the discussed method has been implemented in the SixP2P (or 6P2P) system.
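The idea of using a functional dependency to discover missing values while merging answers can be illustrated with a minimal Python sketch. It is not the chapter's tree-pattern or XQuery machinery: records are assumed to be flat dictionaries, and the function name `merge_with_fd` and the attribute names are hypothetical.

```python
from typing import Optional

Record = dict[str, Optional[str]]


def merge_with_fd(records: list[Record], lhs: tuple[str, ...], rhs: str) -> list[Record]:
    """Fill missing `rhs` values using the functional dependency lhs -> rhs.

    Assumption: records collected from several peers share attribute names,
    and a missing value is represented by None.
    """
    # First pass: learn the rhs value determined by each complete lhs key.
    known: dict[tuple, str] = {}
    for rec in records:
        key = tuple(rec.get(a) for a in lhs)
        value = rec.get(rhs)
        if None in key or value is None:
            continue
        if key in known and known[key] != value:
            # Two peers disagree, so the dependency is violated.
            raise ValueError(f"FD {lhs} -> {rhs} violated for key {key}")
        known[key] = value

    # Second pass: copy each record and fill the gaps the FD can resolve.
    merged = []
    for rec in records:
        filled = dict(rec)
        key = tuple(filled.get(a) for a in lhs)
        if filled.get(rhs) is None and key in known:
            filled[rhs] = known[key]
        merged.append(filled)
    return merged


# Hypothetical answers from two peers; peer B does not know the university.
peer_a = [{"author": "Smith", "university": "NYU"}]
peer_b = [{"author": "Smith", "university": None}]
answer = merge_with_fd(peer_a + peer_b, lhs=("author",), rhs="university")
```

In this sketch a conflict raises an error; a real system could instead use the dependency to decide which answers may be merged at all.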



2019 ◽  
Author(s):  
Rafael Pereira ◽  
Fabio Porto

Missing data is a common problem in data analysis. Missing values appear in datasets for many reasons, from data integration to poor data input. When faced with the problem, the analyst must decide what to do with the missing data, since it is not always advisable to discard these values from the analysis. In this paper we discuss a method that uses information theory and functional dependencies to impute missing values.
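One way to combine information theory with functional dependencies for imputation is sketched below. This is an illustration under stated assumptions, not the authors' algorithm: conditional entropy H(Y | X) is used as a score for how close X -> Y is to an exact functional dependency (0 bits means exact), and the missing value is filled with the most frequent Y seen for the same X. The names `conditional_entropy` and `impute` and the sample attributes are hypothetical.

```python
import math
from collections import Counter, defaultdict


def _group(rows, x, y):
    """Count y values per x value over rows where both are present."""
    groups = defaultdict(Counter)
    for r in rows:
        if r.get(x) is not None and r.get(y) is not None:
            groups[r[x]][r[y]] += 1
    return groups


def conditional_entropy(rows, x, y):
    """H(y | x) in bits; 0.0 means x -> y holds exactly on the observed rows."""
    groups = _group(rows, x, y)
    total = sum(sum(c.values()) for c in groups.values())
    if total == 0:
        return math.inf  # no evidence at all for this attribute pair
    h = 0.0
    for counts in groups.values():
        n = sum(counts.values())
        for k in counts.values():
            h -= (k / total) * math.log2(k / n)
    return h


def impute(rows, target, candidates):
    """Impute `target` from the candidate attribute with the lowest H(target | candidate)."""
    best = min(candidates, key=lambda x: conditional_entropy(rows, x, target))
    groups = _group(rows, best, target)
    out = []
    for r in rows:
        r = dict(r)
        if r.get(target) is None and r.get(best) in groups:
            r[target] = groups[r[best]].most_common(1)[0][0]
        out.append(r)
    return best, out


# Hypothetical data: zip determines city exactly, plan does not.
rows = [
    {"zip": "10001", "city": "NYC", "plan": "a"},
    {"zip": "10001", "city": "NYC", "plan": "b"},
    {"zip": "94105", "city": "SF", "plan": "a"},
    {"zip": "94105", "city": "SF", "plan": "b"},
    {"zip": "10001", "city": None, "plan": "a"},
]
best, completed = impute(rows, target="city", candidates=["zip", "plan"])
```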



2019 ◽  
Author(s):  
Rafael S. Pereira ◽  
Fabio Porto

Missing data is a common problem in data analysis. Missing values appear in datasets for many reasons, from data integration to poor data input. When faced with the problem, the analyst must decide what to do with the missing data, since it is not always advisable to discard these values from the analysis. In this paper we discuss a method that uses information theory and functional dependencies to impute missing values.



Author(s):  
Naser Ayat ◽  
Hamideh Afsarmanesh ◽  
Reza Akbarinia ◽  
Patrick Valduriez


2020 ◽  
pp. 9-13
Author(s):  
A. V. Lapko ◽  
V. A. Lapko

An original technique is justified for fast bandwidth selection for the kernel functions in a nonparametric Rosenblatt–Parzen estimate of a multidimensional probability density. Compared with traditional approaches, the proposed method significantly increases the computational efficiency of optimizing kernel probability density estimates on large volumes of statistical data. The approach is based on an analysis of the formula for the optimal bandwidths of a multidimensional kernel probability density estimate. Dependencies are found between the nonlinear functional of the probability density and its derivatives up to and including the second order, and the antikurtosis coefficients of the random variables. The bandwidth for each random variable is represented as the product of an undefined parameter and that variable's standard deviation. The influence of the error in restoring the established functional dependencies on the approximation properties of the kernel probability density estimate is determined. The results are implemented as a method for the synthesis and analysis of fast bandwidth selection for the kernel estimate of the two-dimensional probability density of independent random variables. This method uses data on the quantitative characteristics of a family of lognormal distribution laws.
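The bandwidth representation described above, each bandwidth being a common parameter times the variable's standard deviation, can be sketched as follows. The sketch assumes a product Gaussian kernel and the population standard deviation, neither of which is stated in the abstract, and it takes the parameter `c` from the caller rather than deriving it from antikurtosis coefficients. The function name `parzen_density` is hypothetical.

```python
import math
import statistics


def parzen_density(x, sample, c):
    """Rosenblatt-Parzen density estimate at point x.

    Assumptions: product Gaussian kernel; bandwidth for variable v is
    h_v = c * sigma_v, with sigma_v the population standard deviation.
    """
    n = len(sample)
    dims = len(x)
    sigmas = [statistics.pstdev([row[v] for row in sample]) for v in range(dims)]
    bandwidths = [c * s for s in sigmas]

    total = 0.0
    for row in sample:
        prod = 1.0
        for v in range(dims):
            u = (x[v] - row[v]) / bandwidths[v]
            prod *= math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
        total += prod
    # Normalize by the sample size and the product of the bandwidths.
    return total / (n * math.prod(bandwidths))


# Two-dimensional toy sample; each variable has standard deviation 1.
sample_2d = [(0.0, 0.0), (2.0, 2.0)]
density_mid = parzen_density((1.0, 1.0), sample_2d, c=1.0)
density_far = parzen_density((10.0, 10.0), sample_2d, c=1.0)
```

With this representation only the single parameter `c` remains to be chosen, which is what makes a fast selection procedure possible.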


