Schema Design: Latest Research Papers

Integrity Constraints Revisited: From Exact to Approximate Implication

Integrity constraints such as functional dependencies (FD) and multi-valued dependencies (MVD) are fundamental in database schema design. Likewise, probabilistic conditional independences (CI) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Then, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Finally, we show how some of the results in the paper can be derived using the I-measure theory, which relates between information theoretic measures and set theory. Our results recover, and sometimes extend, previously known results about the implication problem: the implication of MVDs and FDs can be checked by considering only 2-tuple relations.
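
The abstract above measures the degree of satisfaction of a constraint with information theory. A minimal sketch of that idea follows, and it is not the paper's code: it assumes the uniform distribution over the distinct tuples of a relation, scores an FD X -> Y by the conditional entropy H(Y|X), and scores an MVD X ->> Y | Z by the conditional mutual information I(Y;Z|X). Both scores are zero exactly when the constraint holds. The function names and the sample relation are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(rows, attrs):
    """Shannon entropy (bits) of the projection of rows onto attrs,
    assuming the uniform distribution over the given rows."""
    n = len(rows)
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def fd_violation(rows, lhs, rhs):
    """H(rhs | lhs) = H(lhs, rhs) - H(lhs); zero iff the FD lhs -> rhs holds."""
    return entropy(rows, lhs + rhs) - entropy(rows, lhs)

def mvd_violation(rows, x, y, z):
    """I(y; z | x) = H(x,y) + H(x,z) - H(x,y,z) - H(x);
    zero iff the MVD x ->> y | z holds."""
    return (entropy(rows, x + y) + entropy(rows, x + z)
            - entropy(rows, x + y + z) - entropy(rows, x))

# Hypothetical relation: the FD dept -> city holds on every tuple.
employees = [
    {"emp": "ann", "dept": "db", "city": "oslo"},
    {"emp": "bob", "dept": "db", "city": "oslo"},
    {"emp": "cat", "dept": "ml", "city": "rome"},
    {"emp": "dan", "dept": "ml", "city": "rome"},
]
print(fd_violation(employees, ["dept"], ["city"]))

# One inconsistent tuple makes the FD hold only approximately.
noisy = employees[:3] + [{"emp": "dan", "dept": "ml", "city": "oslo"}]
print(fd_violation(noisy, ["dept"], ["city"]))
```

The rows are assumed to be distinct, as in a relation; duplicate rows would change the empirical distribution and therefore the scores.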

Choosing the Correct Event Schema Design in Event-Driven Microservices

2021, pp. 323-355
DOI: 10.1007/978-1-4842-7468-2_8
Author(s): Hugo Filipe Oliveira Rocha
Keyword(s): Schema Design, Event Driven

Data Management Schema Design for Effective Nanoparticle Formulation for Neurotherapeutics

AIChE Journal, 2021
DOI: 10.1002/aic.17459
Author(s): Hawley Helmbrecht, Nuo Xu, Rick Liao, Elizabeth Nance
Keyword(s): Data Management, Schema Design, Nanoparticle Formulation

COVID term: a bilingual terminology for COVID-19

BMC Medical Informatics and Decision Making, 2021, Vol 21 (1)
DOI: 10.1186/s12911-021-01593-9
Author(s): Hetong Ma, Liu Shen, Haixia Sun, Zidu Xu, Li Hou, ...
Keyword(s): Model Building, Scientific Discovery, Living Organism, Treatment Technique, Health Providers, Anatomic Site, Source Selection, Term Extraction, Schema Design, Psychological Assistance

Background: Coronavirus disease (COVID-19), a pneumonia caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is highly contagious and still spreading globally, with more than one million confirmed cases and tens of thousands of deaths. Studies worldwide aim to understand the mechanism, transmission, and clinical features of COVID-19. A cross-language terminology of COVID-19 is essential for improving knowledge sharing and the dissemination of scientific discoveries.

Methods: We developed a bilingual terminology of COVID-19, named COVID Term, that maps Chinese and English terms. The terminology was constructed in six steps: (1) classification schema design; (2) concept representation model building; (3) term source selection and term extraction; (4) hierarchical structure construction; (5) quality control; (6) web service. The terminology is open access, with search, browse, and download services.

Results: COVID Term includes 10 categories: disease; anatomic site; clinical manifestation; demographic and socioeconomic characteristics; living organism; qualifiers; psychological assistance; medical equipment, instruments and materials; epidemic prevention and control; and diagnosis and treatment technique. In total, COVID Term covers 464 concepts with 724 Chinese terms and 887 English terms. All terms are openly available online (COVID Term URL: http://covidterm.imicams.ac.cn).

Conclusions: COVID Term is a bilingual terminology focused on COVID-19, the epidemic pneumonia with a high risk of infection around the world. It will provide updated bilingual terms of the disease to help health providers and medical professionals retrieve and exchange information and knowledge in multiple languages. COVID Term was released in machine-readable formats (e.g., XML and JSON), which will support information retrieval, machine translation, and applications of advanced intelligent techniques.

Element order is always important in XML, except when it isn't

Proceedings of Balisage: The Markup Conference 2021, 2021
DOI: 10.4242/balisagevol26.lafontaine01
Author(s): Robin La Fontaine
Keyword(s): Common Ground, Element Order, Schema Design, The One

"Which came first," begins an old joke. But the more interesting question might be, "does it even matter?" There are many obvious and several not-so-obvious ways in which the order of items (be they XML elements or attributes, or JSON maps or arrays) can be understood as significant or insignificant. These are not new questions, and how they’re answered plays out across vocabulary design, schema design, and individual documents. They are important questions when it comes to deciding whether two documents are “the same” or “different”, and to what extent. This paper challenges the one-size-fits-all decree in XML that order needs to be preserved and reviews the implications of 'order'. When ordered elements can be moved, we have something that shares some common ground with orderless information. This paper establishes a continuum between ordered information and orderless information and proposes that these are not as far apart as they might at first appear.

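The question raised in the abstract above, whether two documents are "the same" when only the order of items differs, can be made concrete with a small comparison routine. This is an illustrative sketch using Python's standard xml.etree.ElementTree module, not the method proposed in the paper. The helper names canonical and same_document are hypothetical, attributes are always compared without regard to order, and text that follows a child element (tail text) is ignored for brevity.

```python
import xml.etree.ElementTree as ET

def canonical(elem, ordered=True):
    """Turn an element into a nested tuple that can be compared with ==.
    Attributes are sorted because their order is never significant.
    Child order is kept when ordered=True and normalised away otherwise."""
    children = [canonical(child, ordered) for child in elem]
    if not ordered:
        children.sort()  # treat the children as an unordered collection
    text = (elem.text or "").strip()
    return (elem.tag, tuple(sorted(elem.attrib.items())), text, tuple(children))

def same_document(xml_a, xml_b, ordered=True):
    """Compare two XML strings under the chosen notion of order."""
    return canonical(ET.fromstring(xml_a), ordered) == canonical(ET.fromstring(xml_b), ordered)

a = "<person><name>Ada</name><born>1815</born></person>"
b = "<person><born>1815</born><name>Ada</name></person>"
print(same_document(a, b, ordered=True))   # element order treated as significant
print(same_document(a, b, ordered=False))  # element order treated as insignificant
```

Switching the ordered flag per element type, rather than for the whole document, would be one way to sit between the two extremes.
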
Influence of Schema Design in NoSQL Document Stores

Mobile Computing and Sustainable Informatics - Lecture Notes on Data Engineering and Communications Technologies, 2021, pp. 435-452
DOI: 10.1007/978-981-16-1866-6_32
Author(s): Monika Shah, Amit Kothari, Samir Patel
Keyword(s): Schema Design, Document Stores

Logical Schema Design that Quantifies Update Inefficiency and Join Efficiency

Proceedings of the 2021 International Conference on Management of Data, 2021
DOI: 10.1145/3448016.3459238
Author(s): Sebastian Link, Ziheng Wei
Keyword(s): Schema Design

Embedded Functional Dependencies and Data-completeness Tailored Database Design

ACM Transactions on Database Systems, 2021, Vol 46 (2), pp. 1-46
DOI: 10.1145/3450518
Author(s): Ziheng Wei, Sebastian Link
Keyword(s): Missing Values, Normal Forms, Functional Dependencies, Redundant Data, Processing Data, Data Value, Schema Design, Join Queries, Application Data, Fit For Purpose

We establish a principled schema design framework for data with missing values. The framework is based on the new notion of an embedded functional dependency, which is independent of the interpretation of missing values, able to express completeness and integrity requirements on application data, and capable of capturing redundant data value occurrences that may cause problems with processing data that meets the requirements. We establish axiomatic, algorithmic, and logical foundations for reasoning about embedded functional dependencies. These foundations enable us to introduce generalizations of Boyce-Codd and Third normal forms that avoid processing difficulties of any application data, or minimize these difficulties across dependency-preserving decompositions, respectively. We show how to transform any given schema into application schemata that meet given completeness and integrity requirements, and the conditions of the generalized normal forms. Data over those application schemata are therefore fit for purpose by design. Extensive experiments with benchmark schemata and data illustrate the effectiveness of our framework for the acquisition of the constraints, the schema design process, and the performance of the schema designs in terms of updates and join queries.

Fuzzy database for medical diagnosis.

2021
DOI: 10.32920/ryerson.14664432.v1
Author(s): Rehana Parvin
Keyword(s): Decision Making, Fuzzy Inference, Database Systems, Database System, Inference System, Related Information, Data Value, Schema Design, Health Related, Fuzzy Database

A challenge of working with traditional database systems that hold large amounts of data is that decision making requires numerous comparisons. Health-related database systems are one example: they contain millions of data entries and require fast data processing to examine related information and make complex decisions. In this thesis, a fuzzy database system is developed by integrating a fuzzy inference system (FIS) with fuzzy schema design, and it is implemented in SQL in three health-care contexts: the assessment of heart disease, diabetes mellitus, and liver disorders. The fuzzy database system is designed to accept any form of data and is tested with different types of data value, including crisp, linguistic, and null (i.e., missing) data. The developed system can explore crisp and linguistic data with loosely defined boundary conditions for decision making. FIS and neural-network-based solutions are implemented in MATLAB for the same contexts, for comparison and validation against the datasets used in published works.
