RDF graph validation using rule-based reasoning

Semantic Web ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 117-142
Author(s):  
Ben De Meester ◽  
Pieter Heyvaert ◽  
Dörthe Arndt ◽  
Anastasia Dimou ◽  
Ruben Verborgh

The correct functioning of Semantic Web applications requires that given RDF graphs adhere to an expected shape. This shape depends on the RDF graph and on the entailments of that graph that the application supports. During validation, RDF graphs are assessed against sets of constraints, and the violations that are found help refine the RDF graphs. However, existing validation approaches cannot always explain the root causes of violations (inhibiting refinement), and cannot fully match the entailments supported during validation with those supported by the application. As a result, these approaches either validate RDF graphs inaccurately or must combine multiple systems, which degrades the validator’s performance. In this paper, we present an alternative validation approach using rule-based reasoning that is capable of fully customizing the inferencing steps used. We compare it to existing approaches, and present a formal grounding and a practical implementation, “Validatrr”, based on N3Logic and the EYE reasoner. Our approach – supporting a number of constraint types equivalent to the state of the art – better explains the root cause of violations thanks to the reasoner’s generated logical proof, and returns an accurate number of violations thanks to the customizable inferencing rule set. Performance evaluation shows that Validatrr is performant for smaller datasets and scales linearly with the size of the RDF graph. The detailed root-cause explanations can guide future specifications for validation report descriptions, and the fine-grained level of configuration can be employed to support different constraint languages. This foundation enables further research into handling recursion, validating RDF graphs based on their generation description, and providing automatic refinement suggestions.
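The abstract's two key points – inference is a customizable rule applied before checking, and each violation names its root cause – can be sketched in plain Python over a toy triple set. The data, rule, and shape below are hypothetical illustrations; Validatrr itself expresses such rules in N3Logic and runs them with the EYE reasoner.

```python
# Toy rule-based validation over an RDF-like set of triples.
triples = {
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    # ex:alice has no ex:name, violating the Person shape below
}

def infer_types(graph):
    """One entailment rule: x type C, C subClassOf D => x type D."""
    derived = set()
    for s, p, o in graph:
        if p == "rdf:type":
            for s2, p2, o2 in graph:
                if p2 == "rdfs:subClassOf" and s2 == o:
                    derived.add((s, "rdf:type", o2))
    return graph | derived

def validate(graph):
    """Shape: every ex:Person needs an ex:name; report the root cause."""
    return [
        (s, "missing ex:name")
        for s, p, o in sorted(graph)
        if p == "rdf:type" and o == "ex:Person"
        and not any(s2 == s and p2 == "ex:name" for s2, p2, _ in graph)
    ]

print(validate(triples))               # [] -- violation invisible without inference
print(validate(infer_types(triples)))  # [('ex:alice', 'missing ex:name')]
```

Without the entailment rule the violation is invisible, mirroring the paper's argument that the entailments used during validation must match those the application supports.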

Author(s):  
Katherine Darveau ◽  
Daniel Hannon ◽  
Chad Foster

There is growing interest in the study and practice of applying data science (DS) and machine learning (ML) to automate decision making in safety-critical industries. As an alternative or augmentation to human review, there are opportunities to explore these methods for classifying aviation operational events by root cause. This study seeks to apply a thoughtful approach to design, compare, and combine rule-based and ML techniques to classify events caused by human error in aircraft/engine assembly, maintenance or operation. Event reports contain a combination of continuous parameters, unstructured text entries, and categorical selections. A Human Factors approach to classifier development prioritizes the evaluation of distinct data features and entry methods to improve modeling. Findings, including the performance of tested models, led to recommendations for the design of textual data collection systems and classification approaches.
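A hybrid of the two techniques the study compares can be sketched minimally: high-precision keyword rules fire first, and a simple bag-of-words score over labelled examples stands in for the ML fallback. The keywords, labels, and training texts below are illustrative, not taken from the study's data.

```python
# Hybrid rule-based / ML-style classifier for free-text event reports.
TRAIN = [
    ("torque wrench not calibrated before assembly", "assembly"),
    ("bolt installed incorrectly during engine build", "assembly"),
    ("inspection interval exceeded on landing gear", "maintenance"),
    ("scheduled check skipped by maintenance crew", "maintenance"),
]

RULES = [("calibrat", "assembly"), ("inspection", "maintenance")]

def classify(text):
    low = text.lower()
    for keyword, label in RULES:          # rule-based pass first
        if keyword in low:
            return label
    words = set(low.split())              # fallback: word-overlap score
    scores = {}                           # accumulated per class
    for doc, label in TRAIN:
        scores[label] = scores.get(label, 0) + len(words & set(doc.split()))
    return max(scores, key=scores.get)

print(classify("wrench was not calibrated"))         # rule fires: assembly
print(classify("crew skipped the scheduled check"))  # fallback: maintenance
```

In practice the fallback would be a trained text classifier; the point of the sketch is the ordering, letting reliable rules short-circuit the statistical model.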


2019 ◽  
Vol 56 (11) ◽  
pp. 1596-1608
Author(s):  
Priyesh Verma ◽  
Ainur Seidalinova ◽  
Dharma Wijewickreme

In current geotechnical seismic design practice, the empirical correlation between the equivalent number of uniform cycles (Neq) of shaking and earthquake magnitude (Mw) forms an integral part of liquefaction potential evaluation. This relationship, in turn, is used to derive the magnitude scaling factors commonly used in field-based liquefaction evaluation procedures. The Neq versus Mw relationship for liquefaction assessment was examined for fine-grained soils using time-histories in the range 5 < Mw ≤ 9, including, in particular, strong ground motion time-histories from the latest subduction zone earthquakes with Mw > 8.0. The experimental database available from cyclic direct simple shear tests conducted on natural fine-grained soils retrieved by undisturbed soil sampling was used to obtain the cyclic shear resistance weighting curves for the study. The work presented herein has contributed to further improving the current models used to represent magnitude scaling factor (MSF) values for large earthquake magnitudes and the functional dependency of this parameter on soil type. The MSF–Mw curve derived for low-plastic Fraser River Delta silt lies between the MSF curves derived for clean sand and clay, consistent with earlier inferences that silt behavior can be considered neither sand-like nor clay-like.
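The magnitude scaling factor itself is a simple multiplicative correction. One widely cited functional form (a power law anchored at the reference magnitude Mw = 7.5, not the soil-specific curves derived in this study) can be checked directly:

```python
def msf_power_law(mw, exponent=2.56):
    """One common MSF form: (Mw / 7.5) ** -n, equal to 1.0 at Mw = 7.5.
    The exponent value is illustrative; studies fit soil-specific values."""
    return (mw / 7.5) ** -exponent

print(round(msf_power_law(7.5), 2))                   # 1.0 at the reference magnitude
print(msf_power_law(9.0) < 1.0 < msf_power_law(6.0))  # True: larger Mw, lower MSF
```

MSF < 1 at large magnitudes reflects the greater number of equivalent loading cycles, which is why extending the curve beyond Mw 8.0 with subduction-zone records matters.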


2021 ◽  
Vol 10 ◽  
pp. 9-12
Author(s):  
Kris Inwood ◽  
Hamish Maxwell-Stewart

Kees Mandemakers has enriched historical databases in the Netherlands and internationally through the development of the Historical Sample of the Netherlands, the Intermediate Data Structure, a practical implementation of rule-based record linking (LINKS) and personal encouragement of high quality longitudinal data in a number of countries.


SQL injection vulnerabilities have been prevalent in database-driven web applications for almost a decade. Exploiting such vulnerabilities enables attackers to gain unauthorized access to back-end databases by altering the original SQL statements through manipulated user input. Testing web applications for SQL injection vulnerabilities before deployment is essential to eliminate them. However, checking for such vulnerabilities by hand is tedious, difficult, and time-consuming. Web vulnerability static analysis tools are software tools that automatically identify the root cause of SQL injection vulnerabilities in web application source code. In this paper, we test and evaluate three free/open-source static analysis tools on eight web applications with numerous known vulnerabilities, focusing primarily on false negative rates. The evaluation results were compared and analysed, and they indicate a need to improve the tools.
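The class of defect these tools look for can be illustrated with a deliberately naive line-level scan: flag SQL statements built by concatenating raw request input. The PHP snippet is hypothetical; real static analyzers perform proper data-flow (taint) analysis rather than regex matching, which is exactly why false negatives are worth measuring.

```python
import re

# Hypothetical PHP source: line 2 taints $id, line 3 concatenates it
# into SQL (vulnerable), line 4 uses a parameterized query (safe).
PHP_SOURCE = '''
$id = $_GET["id"];
$q1 = "SELECT * FROM users WHERE id = " . $id;
$q2 = $pdo->prepare("SELECT * FROM users WHERE id = ?");
'''

TAINT = re.compile(r'\$_(GET|POST|REQUEST|COOKIE)\b')
SQL = re.compile(r'\b(SELECT|INSERT|UPDATE|DELETE)\b.*"\s*\.', re.I)

def scan(src):
    findings, tainted = [], set()
    for lineno, line in enumerate(src.splitlines(), 1):
        m = re.match(r'\s*(\$\w+)\s*=', line)
        if m and TAINT.search(line):
            tainted.add(m.group(1))          # variable holds raw request input
        if SQL.search(line) and any(v in line for v in tainted):
            findings.append(lineno)          # SQL concatenated with tainted data
    return findings

print(scan(PHP_SOURCE))  # [3]
```

A scanner this crude misses injections routed through helper functions or string interpolation, the kind of gap that shows up as a false negative in the evaluation.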


2011 ◽  
pp. 456-477 ◽  
Author(s):  
Vassilis Papataxiarhis ◽  
Vassileios Tsetsos ◽  
Isambo Karali ◽  
Panagiotis Stamatopoulos

Embedding rules into Web applications, and distributed applications in general, is a significant task for accommodating the desired expressivity features in such environments. Various methodologies and reasoning modules have been proposed to manage rules and knowledge on the Web. The main objective of the chapter is to survey related work in this area and to discuss relevant theories, methodologies and tools that can be used to develop rule-based applications for the Web. The chapter deals with both ways that have been formally defined for modeling a domain of interest: the first based on standard logics, the second stemming from the logic programming perspective. Furthermore, a comparative study is presented that evaluates the reasoning engines and the various knowledge representation methodologies, focusing on rules.


Symmetry ◽  
2019 ◽  
Vol 11 (7) ◽  
pp. 926
Author(s):  
Kyoungsoo Bok ◽  
Junwon Kim ◽  
Jaesoo Yoo

Various resource description framework (RDF) partitioning methods have been studied for the efficient distributed processing of large RDF graphs. An RDF graph has symmetrical characteristics because subject and object can be used interchangeably if the predicate is changed. This paper proposes a dynamic partitioning method for RDF graphs to support load balancing in distributed environments where data insertions and changes continue to occur. The proposed method generates clusters and subclusters using the frequency with which parts of the RDF graph are used by queries as the criterion for graph partitioning: it creates clusters by grouping RDF subgraphs with higher usage frequency, and subclusters from those with lower usage frequency. These clusters and subclusters are load-balanced using the mean query frequency per distributed server, and the graph data are partitioned considering the size of the data stored in each server. The method also minimizes the number of edge-cuts connected to clusters and subclusters, minimizing communication costs between servers. This solves the problem of data concentrating on specific servers due to ongoing data changes and additions, and allows efficient load balancing among servers. The performance results show that the proposed method significantly outperforms existing partitioning methods in terms of query processing time on a distributed server.
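The core balancing idea can be sketched greedily: assign each cluster, highest query frequency first, to the server with the lowest accumulated frequency. Cluster names and frequencies below are hypothetical, and the sketch ignores the paper's other criteria (edge-cut minimization and stored data size).

```python
# Frequency-aware greedy assignment of RDF subgraph clusters to servers.
clusters = {"c1": 50, "c2": 30, "c3": 20, "c4": 20, "c5": 10}

def balance(clusters, n_servers):
    load = [0] * n_servers                # accumulated query frequency
    assignment = {}
    for name, freq in sorted(clusters.items(), key=lambda kv: -kv[1]):
        target = load.index(min(load))    # least-loaded server so far
        assignment[name] = target
        load[target] += freq
    return assignment, load

assignment, load = balance(clusters, 2)
print(load)  # total frequency split across the two servers
```

Processing clusters in descending frequency order keeps the final loads close; re-running the assignment as frequencies drift is what makes the partitioning dynamic.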


Literator ◽  
2008 ◽  
Vol 29 (1) ◽  
pp. 21-42 ◽  
Author(s):  
S. Pilon ◽  
M.J. Puttkammer ◽  
G.B. Van Huyssteen

The development of a hyphenator and compound analyser for Afrikaans

This article describes the development of two core technologies for Afrikaans, viz. a hyphenator and a compound analyser. As no annotated Afrikaans data existed prior to this project to serve as training data for a machine learning classifier, the core technologies in question were first developed using a rule-based approach. The rule-based hyphenator and compound analyser were evaluated; the hyphenator obtains an f-score of 90,84%, while the compound analyser only reaches an f-score of 78,20%. Since these results are somewhat disappointing and/or insufficient for practical implementation, it was decided that a machine learning technique (memory-based learning) would be used instead. Training data for each of the two core technologies was then developed using “TurboAnnotate”, an interface designed to improve the accuracy and speed of manual annotation. The hyphenator developed using machine learning was trained with 39 943 words and reaches an f-score of 98,11%, while the f-score of the compound analyser is 90,57% after being trained with 77 589 annotated words. It is concluded that machine learning (specifically memory-based learning) is an appropriate approach for developing core technologies for Afrikaans.
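The f-scores quoted above combine precision and recall as F1 = 2PR / (P + R); a quick check of the formula with illustrative counts (not the article's actual confusion matrices):

```python
def f_score(tp, fp, fn):
    """F1 from raw counts: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

print(round(f_score(tp=90, fp=10, fn=10), 4))  # 0.9
```

Because F1 is a harmonic mean, it is pulled toward the weaker of precision and recall, which is why the compound analyser's 78,20% signalled a real weakness rather than an averaging artefact.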

