Computing how-provenance for SPARQL queries via query rewriting

2021 ◽  
Vol 14 (13) ◽  
pp. 3389-3401
Author(s):  
Daniel Hernández ◽  
Luis Galárraga ◽  
Katja Hose

Over the past few years, we have witnessed the emergence of large knowledge graphs built by extracting and combining information from multiple sources. This has propelled many advances in query processing over knowledge graphs; however, the aspect of providing provenance explanations for query results has so far been mostly neglected. We therefore propose a novel method, SPARQLprov, based on query rewriting, to compute how-provenance polynomials for SPARQL queries over knowledge graphs. Contrary to existing works, SPARQLprov is system-agnostic and can be applied to standard and already deployed SPARQL engines without the need for customized extensions. We rely on spm-semirings to compute polynomial annotations that respect the property of commutation with homomorphisms on monotonic and non-monotonic SPARQL queries without aggregate functions. Our evaluation on real and synthetic data shows that SPARQLprov over standard engines incurs an acceptable runtime overhead w.r.t. the original query, competing with state-of-the-art solutions for how-provenance computation.
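The following toy sketch (our illustration, not the SPARQLprov rewriting itself) shows the semiring bookkeeping behind how-provenance polynomials: joins multiply annotations and unions add them, so every query answer ends up annotated with a polynomial over the identifiers of the source triples. The triple ids and triples are invented for the example.

    from itertools import product

    def times(p, q):
        """Join: multiply two polynomials (each a set of monomials over triple ids)."""
        return {m1 | m2 for m1, m2 in product(p, q)}

    def plus(p, q):
        """Union: add two polynomials."""
        return p | q

    # Each source triple is annotated with its own id as a singleton monomial.
    t1 = {frozenset({"t1"})}   # hypothetical triple (:alice :knows :bob)
    t2 = {frozenset({"t2"})}   # hypothetical triple (:bob :knows :carol)
    t3 = {frozenset({"t3"})}   # alternative derivation (:alice :knows :carol)

    # An answer derived either by joining t1 with t2, or directly from t3:
    answer = plus(times(t1, t2), t3)
    print(answer)   # {frozenset({'t1', 't2'}), frozenset({'t3'})}, i.e. t1*t2 + t3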

Information ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 366
Author(s):  
Giovanni Garifo ◽  
Giuseppe Futia ◽  
Antonio Vetrò ◽  
Juan Carlos De Martin

Knowledge Graphs (KGs) have emerged as a core technology for incorporating human knowledge because of their capability to capture the relational dimension of information and its semantic properties. The nature of KGs meets one of the vocational pursuits of academic institutions, which is sharing their intellectual output, especially publications. In this paper, we describe and make available the Polito Knowledge Graph (PKG), which semantically connects information on more than 23,000 publications and 34,000 authors, and Geranium, a semantic platform that leverages the properties of the PKG to offer advanced services for search and exploration. In particular, we describe the Geranium recommendation system, which exploits Graph Neural Networks (GNNs) to suggest collaboration opportunities between researchers of different disciplines. This work extends the state of the art by using data from a real application in the scholarly domain, whereas the current literature still explores the combination of KGs and GNNs in prototypal contexts using synthetic data. The results show that the fusion of these technologies represents a promising approach to recommendation and metadata inference in the scholarly domain.
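As a rough illustration of the recommendation step (the author names and vectors below are invented; the actual Geranium system learns its embeddings with GNNs over the PKG), collaboration candidates can be ranked by the similarity of node embeddings:

    import numpy as np

    # Hypothetical authors and embeddings; in Geranium these would come from a GNN.
    authors = ["a_physics", "b_cs", "c_bio"]
    emb = np.array([[0.9, 0.1, 0.2],
                    [0.8, 0.3, 0.1],
                    [0.1, 0.9, 0.7]])

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    query = 0  # recommend collaborators for authors[0]
    scores = [(authors[j], cosine(emb[query], emb[j]))
              for j in range(len(authors)) if j != query]
    for name, s in sorted(scores, key=lambda x: -x[1]):
        print(name, round(s, 3))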


Author(s):  
Carl E. Henderson

Over the past few years it has become apparent in our multi-user facility that the computer system and software supplied in 1985 with our CAMECA CAMEBAX-MICRO electron microprobe analyzer has the greatest potential for improvement and updating of any component of the instrument. While the standard CAMECA software running on a DEC PDP-11/23+ computer under the RSX-11M operating system can perform almost any task required of the instrument, the commands are not always intuitive and can be difficult to remember for the casual user (of which our laboratory has many). Given the widespread and growing use of other microcomputers (such as PCs and Macintoshes) by users of the microprobe, the PDP has become the “oddball” and has also fallen behind the state of the art in terms of processing speed and disk storage capabilities. Upgrade paths within products available from DEC are considered too expensive for the benefits received. After using a Macintosh for other tasks in the laboratory, such as instrument use and billing records, word processing, and graphics display, its unique and “friendly” user interface suggested an easier-to-use system for computer control of the electron microprobe automation. Specifically, a Macintosh IIx was chosen for its capacity for third-party add-on cards used in instrument control.


2018 ◽  
Vol 5 (2) ◽  
pp. 207-211
Author(s):  
Nazila Zarghi ◽  
Soheil Dastmalchian Khorasani

Abstract Evidence-based social science is one of the state-of-the-art areas in this field. It means making decisions on the basis of conscientious, explicit, and judicious use of the best available evidence from multiple sources. It can also be conducive to evidence-based social work, i.e., a kind of evidence-based practice. In this emerging field, research findings help social workers at different levels of social science, such as policy making, management, academia, education, and social settings. When using research in real settings, critical appraisal is necessary, not only to establish the internal validity and methodological rigor of a paper, but also to judge to what extent its findings can be applied in practice. Undoubtedly, the latter is a kind of subjective judgment. As social science findings are highly context-bound, this area deserves closer attention. The present paper first introduces evidence-based social science and its importance, and then proposes criteria for the critical appraisal of research findings for application in society.


2019 ◽  
Vol 13 (2) ◽  
pp. 14-31
Author(s):  
Mamdouh Alenezi ◽  
Muhammad Usama ◽  
Khaled Almustafa ◽  
Waheed Iqbal ◽  
Muhammad Ali Raza ◽  
...  

NoSQL-based databases are attractive for storing and managing big data, mainly due to their high scalability and data-modeling flexibility. However, security in NoSQL-based databases is weak, which raises concerns for users. Specifically, security of data at rest is a major concern for users who deploy their NoSQL-based solutions on the cloud, because unauthorized access to the servers would easily expose the data. There have been some efforts to enable encryption of data at rest for NoSQL databases. However, existing solutions neither support secure query processing nor secure data communication over the Internet, and their performance is also poor. In this article, the authors address the NoSQL data-at-rest security concern by introducing a system that can dynamically encrypt/decrypt data, support secure query processing, and seamlessly integrate with any NoSQL-based database. The proposed solution is based on a combination of chaotic encryption and Order-Preserving Encryption (OPE). The experimental evaluation showed excellent results when the solution was integrated with MongoDB and compared with the state-of-the-art existing work.
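The article combines chaotic encryption with OPE; the toy sketch below illustrates only the order-preserving property that lets range queries run over encrypted values. It is deliberately simplistic and insecure, and is not the authors' scheme:

    import random

    def build_ope_table(domain_size, seed=42):
        """Map each plaintext 0..domain_size-1 to a strictly increasing ciphertext."""
        rng = random.Random(seed)
        table, cipher = [], 0
        for _ in range(domain_size):
            cipher += rng.randint(1, 100)  # a random positive gap keeps the order
            table.append(cipher)
        return table

    enc = build_ope_table(256)
    assert all(enc[i] < enc[i + 1] for i in range(255))  # order is preserved
    # A range query "20 <= x <= 30" can be evaluated directly on ciphertexts:
    lo, hi = enc[20], enc[30]
    print(lo, hi)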


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
João Lobo ◽  
Rui Henriques ◽  
Sara C. Madeira

Abstract Background Three-way data have started to gain popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions along time, urban dynamics, or complex geophysical phenomena. Triclustering, subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with the state of the art is paramount. These comparisons are usually performed on real data without a known ground truth, thus limiting the assessments. In this context, we propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social data domains, with the additional advantage of providing the ground truth (the triclustering solution) as output. Results G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions Triclustering evaluation using G-Tric makes it possible to combine intrinsic and extrinsic metrics when comparing solutions, producing more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric's potential to advance the triclustering state of the art by easing the evaluation of new triclustering approaches.
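As a minimal illustration of planting a tricluster (a sketch in the spirit of G-Tric, not its actual API), one can draw a background distribution for a three-way array and overwrite a random subspace with a coherent pattern, keeping the planted indices as ground truth:

    import numpy as np

    rng = np.random.default_rng(0)
    # Background: observations x features x contexts drawn from a normal distribution.
    data = rng.normal(0.0, 1.0, size=(50, 40, 10))

    # Pick a random subspace and plant a (nearly) constant pattern in it.
    rows = rng.choice(50, size=8, replace=False)
    cols = rng.choice(40, size=6, replace=False)
    ctxs = rng.choice(10, size=3, replace=False)
    data[np.ix_(rows, cols, ctxs)] = 5.0 + rng.normal(0, 0.1, (8, 6, 3))

    # The planted indices serve as the triclustering ground truth for evaluation.
    ground_truth = {"rows": sorted(rows), "cols": sorted(cols), "contexts": sorted(ctxs)}
    print(ground_truth)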


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Aysen Degerli ◽  
Mete Ahishali ◽  
Mehmet Yamac ◽  
Serkan Kiranyaz ◽  
Muhammad E. H. Chowdhury ◽  
...  

Abstract Computer-aided diagnosis has become a necessity for accurate and immediate coronavirus disease 2019 (COVID-19) detection to aid treatment and prevent the spread of the virus. Numerous studies have proposed to use Deep Learning techniques for COVID-19 diagnosis. However, they have used very limited chest X-ray (CXR) image repositories for evaluation, with only a small number (a few hundred) of COVID-19 samples. Moreover, these methods can neither localize nor grade the severity of COVID-19 infection. For this purpose, recent studies have proposed to explore the activation maps of deep networks. However, they remain inaccurate at localizing the actual infection, making them unreliable for clinical use. This study proposes a novel method for the joint localization, severity grading, and detection of COVID-19 from CXR images by generating so-called infection maps. To accomplish this, we have compiled the largest dataset, with 119,316 CXR images including 2951 COVID-19 samples, where the annotation of the ground-truth segmentation masks is performed on CXRs by a novel collaborative human–machine approach. Furthermore, we publicly release the first CXR dataset with ground-truth segmentation masks of the COVID-19 infected regions. A detailed set of experiments shows that state-of-the-art segmentation networks can learn to localize COVID-19 infection with an F1-score of 83.20%, which is significantly superior to the activation maps created by previous methods. Finally, the proposed approach achieved a COVID-19 detection performance of 94.96% sensitivity and 99.88% specificity.
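For concreteness, the reported metrics can be computed on toy data as follows (a sketch of standard pixel-wise F1 and case-level sensitivity/specificity, not the authors' pipeline; the masks and labels are invented):

    import numpy as np

    def f1_score(pred, truth):
        """Pixel-wise F1 between two boolean segmentation masks."""
        tp = np.logical_and(pred, truth).sum()
        fp = np.logical_and(pred, ~truth).sum()
        fn = np.logical_and(~pred, truth).sum()
        return 2 * tp / (2 * tp + fp + fn)

    pred = np.zeros((8, 8), dtype=bool); pred[2:6, 2:6] = True
    truth = np.zeros((8, 8), dtype=bool); truth[3:7, 3:7] = True
    print(f"F1 = {f1_score(pred, truth):.3f}")

    def sensitivity_specificity(pred_labels, true_labels):
        """Case-level detection metrics from binary labels."""
        pred, true = np.asarray(pred_labels, bool), np.asarray(true_labels, bool)
        tp = (pred & true).sum(); fn = (~pred & true).sum()
        tn = (~pred & ~true).sum(); fp = (pred & ~true).sum()
        return tp / (tp + fn), tn / (tn + fp)

    print(sensitivity_specificity([1, 1, 0, 0, 1], [1, 1, 0, 0, 0]))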


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1407
Author(s):  
Peng Wang ◽  
Jing Zhou ◽  
Yuzhang Liu ◽  
Xingchen Zhou

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods focus only on the triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circular convolution over the embeddings of entities and entity types is used to map the head and tail entities to type-specific representations, and a translation-based score function is then used to learn the representations of triples. We evaluated our model on real-world datasets with two benchmark tasks: link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
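A minimal numerical sketch of the scoring idea described above (random vectors, not the trained TransET model): circular convolution maps the head and tail entities to type-specific representations, which are then scored with a translation-based function.

    import numpy as np

    def circular_conv(a, b):
        """Circular convolution computed through the FFT."""
        return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

    rng = np.random.default_rng(1)
    d = 8
    h, t, r = rng.normal(size=d), rng.normal(size=d), rng.normal(size=d)
    h_type, t_type = rng.normal(size=d), rng.normal(size=d)

    h_proj = circular_conv(h, h_type)   # type-specific head representation
    t_proj = circular_conv(t, t_type)   # type-specific tail representation

    score = -np.linalg.norm(h_proj + r - t_proj)  # higher means more plausible
    print(round(score, 3))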


Author(s):  
Mingliang Xu ◽  
Qingfeng Li ◽  
Jianwei Niu ◽  
Hao Su ◽  
Xiting Liu ◽  
...  

Quick response (QR) codes are usually scanned in different environments, so they must be robust to variations in illumination, scale, coverage, and camera angle. Aesthetic QR codes improve visual quality, but subtle changes in their appearance may cause scanning failure. In this article, a new method to generate scanning-robust aesthetic QR codes is proposed, based on a module-based scanning probability estimation model that can effectively balance the tradeoff between visual quality and scanning robustness. Our method locally adjusts the luminance of each module by estimating the probability of successful sampling. The approach adopts a hierarchical, coarse-to-fine strategy to enhance the visual quality of aesthetic QR codes, sequentially generating three codes: a binary aesthetic QR code, a grayscale aesthetic QR code, and the final color aesthetic QR code. Our approach can also be used to create QR codes with different visual styles by adjusting some initialization parameters. User surveys and decoding experiments were used to evaluate our method against state-of-the-art algorithms; the results indicate that the proposed approach achieves excellent performance in terms of both visual quality and scanning robustness.
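A minimal sketch of the module-wise adjustment idea (the probability model below is an invented stand-in for the authors' estimation model): if a module's estimated probability of being sampled as its required bit is too low, its luminance is pushed toward the reliable black or white value.

    import numpy as np

    def sample_probability(luminance, target_bit, threshold=128):
        """Toy estimate: logistic in the luminance margin from the binarization threshold."""
        margin = (threshold - luminance) if target_bit == 0 else (luminance - threshold)
        return 1 / (1 + np.exp(-margin / 16))

    def adjust_module(luminance, target_bit, p_min=0.9, step=10):
        """Darken (bit 0) or lighten (bit 1) until the estimated probability is acceptable."""
        while sample_probability(luminance, target_bit) < p_min:
            luminance += -step if target_bit == 0 else step
            luminance = int(np.clip(luminance, 0, 255))
            if luminance in (0, 255):
                break
        return luminance

    print(adjust_module(120, target_bit=0))  # darken a module that must read as black
    print(adjust_module(140, target_bit=1))  # lighten a module that must read as white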


Semantic Web ◽  
2021 ◽  
pp. 1-16
Author(s):  
Esko Ikkala ◽  
Eero Hyvönen ◽  
Heikki Rantala ◽  
Mikko Koho

This paper presents a new software framework, Sampo-UI, for developing user interfaces for semantic portals. The goal is to provide the end-user with multiple application perspectives on Linked Data knowledge graphs, and a two-step usage cycle based on faceted search combined with ready-to-use tooling for data analysis. For the software developer, the Sampo-UI framework makes it possible to create highly customizable, user-friendly, and responsive user interfaces using current state-of-the-art JavaScript libraries and data from SPARQL endpoints, while saving substantial coding effort. Sampo-UI is published on GitHub under the open MIT License and has been utilized in several internal and external projects. The framework has so far been used in creating six published and five forthcoming portals, mostly related to the Cultural Heritage domain, that have had tens of thousands of end-users on the Web.
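Sampo-UI itself is a JavaScript framework; the Python sketch below only illustrates the underlying data-access pattern it builds on, namely driving an application directly from a SPARQL endpoint. The endpoint URL and query are placeholders, not part of Sampo-UI.

    # Requires: pip install SPARQLWrapper
    from SPARQLWrapper import SPARQLWrapper, JSON

    endpoint = "https://example.org/sparql"  # hypothetical endpoint
    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery("""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?item ?label WHERE {
            ?item rdfs:label ?label .
        } LIMIT 10
    """)
    sparql.setReturnFormat(JSON)

    results = sparql.query().convert()
    for binding in results["results"]["bindings"]:
        print(binding["item"]["value"], binding["label"]["value"])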

