On the Connections between Relational and XML Probabilistic Data Models

2017
Author(s): Antoine Amarilli, Pierre Senellart

A number of uncertain data models have been proposed, based on the notion of compact representations of probability distributions over possible worlds. In probabilistic relational models, tuples are annotated with probabilities or formulae over Boolean random variables. In probabilistic XML models, XML trees are augmented with nodes that specify probability distributions over their children. Both kinds of models have been extensively studied, with respect to their expressive power, compactness, and query efficiency, among other things. Probabilistic database systems have also been implemented, in both relational and XML settings. However, these studies have mostly been carried out independently, and the translations between relational and XML models, as well as the impact for probabilistic relational databases of results about query complexity in probabilistic XML and vice versa, have not been made explicit: we detail such translations in this article, in both directions, study their impact in terms of complexity results, and present interesting open issues about the connections between relational and XML probabilistic data models.
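To make the tuple-level annotations concrete, the following is a minimal Python sketch of a tuple-independent probabilistic relation and its possible-worlds semantics; the relation, its probabilities, and the example query are illustrative placeholders, not taken from the article.

    # Each tuple is annotated with the probability that it is present; every subset of
    # tuples is one possible world, with probability given by the product of the
    # (independent) per-tuple probabilities.
    from itertools import product

    # Hypothetical tuple-independent relation: tuple -> probability of presence.
    relation = {("alice",): 0.9, ("bob",): 0.4, ("carol",): 0.7}

    def possible_worlds(rel):
        """Enumerate (world, probability) pairs over all subsets of tuples."""
        tuples = list(rel)
        for mask in product([False, True], repeat=len(tuples)):
            world = {t for t, keep in zip(tuples, mask) if keep}
            prob = 1.0
            for t, keep in zip(tuples, mask):
                prob *= rel[t] if keep else 1.0 - rel[t]
            yield world, prob

    def query_probability(rel, query):
        """Probability of a Boolean query: sum over the worlds where it holds."""
        return sum(p for world, p in possible_worlds(rel) if query(world))

    # Example query: does the relation contain at least two tuples?
    print(query_probability(relation, lambda w: len(w) >= 2))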

Author(s): Patrick Bosc, Olivier Pivert

In this paper, we give an overview of the most representative approaches aimed at querying databases containing ill-known data, starting from the pioneering works by Codd and Lipski and extending to very recent proposals. This study focuses on approaches with a clear and sound semantics, based on the notion of possible worlds. Three types of queries are considered: (i) those about attribute values (in an algebraic or SQL-like framework), (ii) those about the properties satisfied by a given set of worlds (i.e., a set of instances of an imprecise database), and (iii) those about the representation of uncertain data. For the first two types, it is emphasized that a trade-off has to be found between expressivity (of the model) and tractability (of the queries in the context of a given model).
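As a concrete illustration of the possible-worlds semantics underlying these approaches, the following Python sketch computes possible and certain answers over a handful of hypothetical instances of an imprecise database; the data and query are invented for illustration.

    # An answer is "possible" if it holds in some world (some instance of the imprecise
    # database) and "certain" if it holds in every world.
    worlds = [
        {("alice", 30), ("bob", 25)},   # hypothetical instance 1
        {("alice", 30), ("bob", 26)},   # hypothetical instance 2
        {("alice", 31)},                # hypothetical instance 3
    ]

    def answers(world):
        """A toy selection/projection query: names of people present in the world."""
        return {name for name, _age in world}

    possible = set().union(*(answers(w) for w in worlds))
    certain = set.intersection(*(answers(w) for w in worlds))
    print(possible)  # contains 'alice' and 'bob'
    print(certain)   # contains only 'alice'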


Sensors · 2021 · Vol 21 (3) · pp. 875
Author(s): Chao Li, Zhenjiang Zhang, Wei Wei, Han-Chieh Chao, Xuejun Liu

In data clustering, the measured data are usually regarded as uncertain data. As a probability-based clustering technique, the possible-worlds approach can naturally cluster uncertain data. However, a possible-worlds method must satisfy two conditions: the data of the different possible worlds must be determined, as well as the corresponding probabilities of occurrence. Existing methods mostly perform multiple measurements and treat each measurement as the deterministic data of one possible world. In this paper, a possible world-based fusion estimation model is proposed, which converts the deterministic data into probability distributions according to the estimation algorithm, so that the corresponding probabilities can be determined naturally. Further, in the clustering stage, the Kullback–Leibler divergence is introduced to describe the relationships between the probability distributions of different possible worlds. Then, an application in wearable body networks (WBNs) is given, and some interesting conclusions are drawn. Finally, simulations show better performance when the relationships between features in the measured data are more complex.
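The following Python sketch illustrates the use of the Kullback–Leibler divergence to compare the probability distributions associated with two possible worlds; the distributions are hypothetical placeholders, and the symmetrized variant shown is one common choice of clustering dissimilarity, not necessarily the exact one used in the paper.

    import numpy as np

    # Represent each possible world by a (discretized) probability distribution and
    # compare worlds with the Kullback-Leibler divergence.
    def kl_divergence(p, q, eps=1e-12):
        """D_KL(p || q) for discrete distributions given as arrays summing to 1."""
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p /= p.sum()
        q /= q.sum()
        return float(np.sum(p * np.log(p / q)))

    world_a = [0.10, 0.20, 0.40, 0.30]   # estimated distribution for one possible world
    world_b = [0.15, 0.25, 0.35, 0.25]   # estimated distribution for another world

    # A symmetrized version is often convenient as a clustering (dis)similarity.
    dissimilarity = 0.5 * (kl_divergence(world_a, world_b) + kl_divergence(world_b, world_a))
    print(dissimilarity)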


Data Mining · 2013 · pp. 669-691
Author(s): Evgeny Kharlamov, Pierre Senellart

This chapter deals with data mining in uncertain XML data models, whose uncertainty typically comes from imprecise automatic processes. We first review the literature on modeling uncertain data, starting with well-studied relational models and then moving to their semistructured counterparts. We focus on a specific probabilistic XML model, which allows representing arbitrary finite distributions of XML documents and has been extended to also allow continuous distributions of data values. We summarize previous work on querying this uncertain data model and show how to apply the corresponding techniques to several data mining tasks, exemplified through use cases on two running examples.
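As a rough illustration of a probabilistic XML document with distributional nodes, the following Python sketch samples an ordinary XML tree (one possible world) from a tree containing a "mux"-style choice node; the document structure and probabilities are made up for illustration and do not reproduce the chapter's model in detail.

    import random

    # A "mux" node keeps exactly one of its children according to the given probabilities;
    # all other nodes are ordinary XML elements with string or element children.
    tree = ("person", [
        ("name", ["Alice"]),
        ("mux", [(0.7, ("city", ["Paris"])), (0.3, ("city", ["Lyon"]))]),
    ])

    def sample(node):
        """Sample one ordinary XML tree (a possible world) from the probabilistic tree."""
        label, children = node
        if label == "mux":
            r, acc = random.random(), 0.0
            for p, child in children:
                acc += p
                if r <= acc:
                    return sample(child)
            return None  # if probabilities sum to less than 1, the node may produce nothing
        sampled = [c if isinstance(c, str) else sample(c) for c in children]
        return (label, [c for c in sampled if c is not None])

    print(sample(tree))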


1997 · Vol 161 · pp. 197-201
Author(s): Duncan Steel

Whilst lithopanspermia depends upon massive impacts occurring at a speed above some limit, the intact delivery of organic chemicals or other volatiles to a planet requires the impact speed to be below some other limit such that a significant fraction of that material escapes destruction. Thus the two opposite ends of the impact speed distributions are the regions of interest in the bioastronomical context, whereas much modelling work on impacts delivers, or makes use of, only the mean speed. Here the probability distributions of impact speeds upon Mars are calculated for (i) the orbital distribution of known asteroids; and (ii) the expected distribution of near-parabolic cometary orbits. It is found that cometary impacts are far more likely to eject rocks from Mars (over 99 percent of the cometary impacts are at speeds above 20 km/sec, but at most 5 percent of the asteroidal impacts); paradoxically, the objects impacting at speeds low enough to make organic/volatile survival possible (the asteroids) are those which are depleted in such species.


Atmosphere · 2021 · Vol 12 (6) · pp. 679
Author(s): Sara Cornejo-Bueno, David Casillas-Pérez, Laura Cornejo-Bueno, Mihaela I. Chidean, Antonio J. Caamaño, ...

This work presents a full statistical analysis and accurate prediction of low-visibility events due to fog at the A-8 motor-road in Mondoñedo (Galicia, Spain). The analysis covers two years of study, considering visibility time series and exogenous variables collected in the zone most affected by extreme low-visibility events. The paper thus has a two-fold objective: first, we carry out a statistical analysis to estimate the probability distributions that best fit the fog event duration, using the Maximum Likelihood method and an alternative method known as the L-moments method. This statistical study allows the low-visibility depth to be associated with the event duration, showing a clear relationship that can be modeled with distributions for extremes such as the Generalized Extreme Value and Generalized Pareto distributions. Second, we apply a neural network approach, trained by means of the ELM (Extreme Learning Machine) algorithm, to predict the occurrence of low-visibility events due to fog from atmospheric predictive variables. This study provides a full characterization of fog events at this motor-road, where orographic fog is predominant and causes significant traffic problems throughout the year. We also show that the ELM approach is able to obtain highly accurate predictions of low-visibility events, with a Pearson correlation coefficient of 0.8, within a half-hour time horizon, enough to initiate protocols aimed at reducing the impact of these extreme events on the traffic of the A-8 motor-road.
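As an illustration of the first objective, the following Python sketch fits a Generalized Extreme Value distribution to event durations by Maximum Likelihood using scipy; the data are synthetic placeholders, not the A-8 visibility measurements.

    import numpy as np
    from scipy import stats

    # Synthetic, hypothetical fog-event durations (hours); scipy's fit() performs the
    # Maximum Likelihood estimation of the GEV parameters.
    rng = np.random.default_rng(0)
    durations_h = rng.gumbel(loc=2.0, scale=1.5, size=500)
    durations_h = durations_h[durations_h > 0]

    shape, loc, scale = stats.genextreme.fit(durations_h)
    print(f"GEV fit: shape={shape:.3f}, loc={loc:.3f}, scale={scale:.3f}")

    # Probability that an event lasts longer than 6 hours under the fitted model.
    print(stats.genextreme.sf(6.0, shape, loc=loc, scale=scale))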


2012 · Vol 22 (4-5) · pp. 477-528
Author(s): Derek Dreyer, Georg Neis, Lars Birkedal

Reasoning about program equivalence is one of the oldest problems in semantics. In recent years, useful techniques have been developed, based on bisimulations and logical relations, for reasoning about equivalence in the setting of increasingly realistic languages—languages nearly as complex as ML or Haskell. Much of the recent work in this direction has considered the interesting representation independence principles enabled by the use of local state, but it is also important to understand the principles that powerful features like higher-order state and control effects disable. This latter topic has been broached extensively within the framework of game semantics, resulting in what Abramsky dubbed the “semantic cube”: fully abstract game-semantic characterizations of various axes in the design space of ML-like languages. But when it comes to reasoning about many actual examples, game semantics does not yet supply a useful technique for proving equivalences. In this paper, we marry the aspirations of the semantic cube to the powerful proof method of step-indexed Kripke logical relations. Building on recent work of Ahmed et al. (2009), we define the first fully abstract logical relation for an ML-like language with recursive types, abstract types, general references and call/cc. We then show how, under orthogonal restrictions to the expressive power of our language—namely, the restriction to first-order state and/or the removal of call/cc—we can enhance the proving power of our possible-worlds model in correspondingly orthogonal ways, and we demonstrate this proving power on a range of interesting examples. Central to our story is the use of state transition systems to model the way in which properties of local state evolve over time.


2021 · Vol 6
Author(s): Suzanne W. Dietrich, Don Goelman, Jennifer Broatch, Sharon Crook, Becky Ball, ...

The goal of the Databases for Many Majors project is to engage a broad audience in understanding fundamental database concepts using visualizations with color and visual cues to present these topics to students across many disciplines. There are three visualizations: introducing relational databases, querying, and design. A unique feature of these learning tools is the ability for instructors in diverse disciplines to customize the content of the visualization’s example data, supporting text, and formative assessment questions to promote relevance to their students. This paper presents a study on the impact of the customized introduction to relational databases visualization on both conceptual learning and attitudes towards databases. The assessment was performed in three different courses across two universities. The evaluation shows that learning outcomes are met with any visualization, which appears to be counter to expectations. However, students using a visualization customized to the course context had more positive attitudes and beliefs towards the usefulness of databases than the control group.


2020
Author(s): Giacomo Albi, Lorenzo Pareschi, Mattia Zanella

After an initial phase characterized by the introduction of timely and drastic containment measures aimed at stopping the epidemic contagion from SARS-CoV2, many governments are preparing to relax such measures in the face of a severe economic crisis caused by lockdowns. Assessing the impact of such openings in relation to the risk of a resumption of the spread of the disease is an extremely difficult problem, due to the many unknowns concerning the actual number of people infected, the actual reproduction number, and the infection fatality rate of the disease. In this work, starting from a compartmental model with a social structure, we derive models with multiple feedback controls depending on the social activities that make it possible to assess the impact of a selective relaxation of the containment measures in the presence of uncertain data. Specific contact patterns in the home, work, school, and other locations for all the countries considered have been used. Results from different scenarios in some of the major countries where the epidemic is ongoing, including Germany, France, Italy, Spain, the United Kingdom, and the United States, are presented and discussed.
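As a simplified illustration of the feedback-control idea, the following Python sketch runs a basic SIR model in which the contact rate is reduced as infections grow; it is not the socially structured model of the paper, and all parameter values are illustrative placeholders.

    # Basic SIR dynamics with a feedback control on contacts: the more infections,
    # the lower the effective contact rate. Forward Euler integration for simplicity.
    beta0, gamma, kappa = 0.3, 0.1, 20.0   # baseline contact rate, recovery rate, control strength
    S, I, R = 0.99, 0.01, 0.0              # initial fractions of the population
    dt, days = 0.1, 180

    for _ in range(int(days / dt)):
        control = 1.0 / (1.0 + kappa * I)   # feedback: more infections -> fewer contacts
        new_inf = beta0 * control * S * I * dt
        new_rec = gamma * I * dt
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec

    print(f"final susceptible fraction: {S:.3f}, recovered: {R:.3f}")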

