iHDT++: improving HDT for SPARQL triple pattern resolution

2020 ◽  
Vol 39 (2) ◽  
pp. 2249-2261
Author(s):  
Antonio Hernández-Illera ◽  
Miguel A. Martínez-Prieto ◽  
Javier D. Fernández ◽  
Antonio Fariña

RDF self-indexes compress the RDF collection and provide efficient access to the data without prior decompression (via the so-called SPARQL triple patterns). HDT is one of the reference solutions in this scenario, with several applications to lower the barrier of both publication and consumption of Big Semantic Data. However, the simple design of HDT takes a compromise position between compression effectiveness and retrieval speed. In particular, it supports scan and subject-based queries, but it requires additional indexes to resolve predicate and object-based SPARQL triple patterns. A recent variant, HDT++, improves HDT compression ratios, but it does not retain the original HDT retrieval capabilities. In this article, we extend HDT++ with additional indexes to support full SPARQL triple pattern resolution with a lower memory footprint than the original indexed HDT (called HDT-FoQ). Our evaluation shows that the resultant structure, iHDT++, requires 70-85% of the original HDT-FoQ space (and 48-72% for an HDT Community variant). In addition, iHDT++ shows significant performance improvements (up to one order of magnitude) for most triple pattern queries, being competitive with state-of-the-art RDF self-indexes.
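To make the term "triple pattern" concrete, the following is a minimal sketch, not the HDT or iHDT++ algorithm: it resolves patterns by a linear scan over an in-memory list, with `None` standing for an unbound variable. The function name `match` and the sample triples are assumptions made for illustration only.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]
# A pattern binds any subset of (subject, predicate, object); None = variable.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def match(triples: Iterable[Triple], pattern: Pattern) -> Iterator[Triple]:
    """Yield every triple compatible with the pattern (naive linear scan)."""
    s, p, o = pattern
    for t in triples:
        if ((s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)):
            yield t


# Hypothetical sample data, not taken from the paper.
TRIPLES = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:bob", "foaf:knows", "ex:carol"),
]

subject_bound = list(match(TRIPLES, ("ex:alice", None, None)))      # (S, ?, ?)
predicate_bound = list(match(TRIPLES, (None, "foaf:knows", None)))  # (?, P, ?)
object_bound = list(match(TRIPLES, (None, None, "ex:carol")))       # (?, ?, O)
```

A self-index answers the same patterns without scanning every triple. The abstract's point is that plain HDT does so directly only for scans and subject-bound patterns, and needs additional indexes for the predicate-bound and object-bound cases.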

2010 ◽  
Vol 20 (04) ◽  
pp. 341-357 ◽  
Author(s):  
JOSE CARLOS SANCHO ◽  
DARREN J. KERBYSON ◽  
MICHAEL LANG

The increasing core count of current and future processors poses critical challenges for the memory subsystem, which must handle concurrent memory requests efficiently. The current trend is to increase the number of memory channels available to the processor's memory controller. In this paper we investigate the advantages and disadvantages of this approach from both a technological and an application performance viewpoint. In particular, we explore the trade-off between employing multiple memory channels per memory controller and the use of multiple memory controllers with fewer memory channels. Experiments were conducted on two current state-of-the-art multi-core processors, a 6-core AMD Istanbul and a 4-core Intel Nehalem-EP, using the STREAM benchmark and a wide range of production applications. An analytical model of the STREAM performance is used to illustrate the diminishing return obtained when increasing the number of memory channels per memory controller, an effect that is also seen in the application performance. In addition, we show that this performance degradation can be efficiently addressed by increasing the ratio of memory controllers to channels while keeping the number of memory channels constant. Significant performance improvements, up to 28%, can be achieved with this scheme when using two memory controllers with one channel each, compared with one controller with two memory channels.


2015 ◽  
Vol 52 ◽  
pp. 445-475 ◽  
Author(s):  
Marie-Catherine De Marneffe ◽  
Marta Recasens ◽  
Christopher Potts

A discourse typically involves numerous entities, but few are mentioned more than once. Distinguishing those that die out after just one mention (singleton) from those that lead longer lives (coreferent) would dramatically simplify the hypothesis space for coreference resolution models, leading to increased performance. To realize these gains, we build a classifier for predicting the singleton/coreferent distinction. The model’s feature representations synthesize linguistic insights about the factors affecting discourse entity lifespans (especially negation, modality, and attitude predication) with existing results about the benefits of “surface” (part-of-speech and n-gram-based) features for coreference resolution. The model is effective in its own right, and the feature representations help to identify the anchor phrases in bridging anaphora as well. Furthermore, incorporating the model into two very different state-of-the-art coreference resolution systems, one rule-based and the other learning-based, yields significant performance improvements.


2021 ◽  
Author(s):  
Yao-zhong Zhang ◽  
Sera Hatakeyama ◽  
Kiyoshi Yamaguchi ◽  
Yoichi Furukawa ◽  
Satoru Miyano ◽  
...  

Abstract. Motivation: DNA methylation is a common epigenetic modification, which is widely associated with various biological processes, such as gene expression, aging, and disease. Nanopore sequencing provides a promising methylation detection approach through monitoring abnormal signal shifts for detecting modified bases in target motif regions. Recently, model-based approaches, especially those with deep learning models, have achieved significant performance improvements on nanopore methylation detection. In this work, we explore using bidirectional encoder representations from transformers (BERT) for the task, which can provide non-recurrent neural structures for fast parallel computation. Results: We find that the original BERT architecture does not work as well as the bidirectional recurrent neural network (biRNN) on the nanopore methylation prediction task. Through further analysis, we observe recurrent patterns of positional signal shift in the context window surrounding target 5-methylcytosine (5mC) and N6-methyladenine (6mA) motifs. We propose a refined BERT with relative position representation and center hidden units concatenation, which takes task-specific characteristics into account in the modeling. We perform systematic evaluations in-sample and cross-sample. The experimental results show that the refined BERT model can achieve competitive or even better results than the state-of-the-art biRNN model, while the model inference speed is about 6x faster. In addition, on the cross-sample evaluation of datasets from different research groups, BERT models demonstrate good generalization performance. Availability: The source code and data are available at https://github.com/yaozhong/[email protected]


2009 ◽  
Vol 36 ◽  
pp. 165-228 ◽  
Author(s):  
B. Motik ◽  
R. Shearer ◽  
I. Horrocks

We present a novel reasoning calculus for the description logic SHOIQ^+---a knowledge representation formalism with applications in areas such as the Semantic Web. Unnecessary nondeterminism and the construction of large models are two primary sources of inefficiency in the tableau-based reasoning calculi used in state-of-the-art reasoners. In order to reduce nondeterminism, we base our calculus on hypertableau and hyperresolution calculi, which we extend with a blocking condition to ensure termination. In order to reduce the size of the constructed models, we introduce anywhere pairwise blocking. We also present an improved nominal introduction rule that ensures termination in the presence of nominals, inverse roles, and number restrictions---a combination of DL constructs that has proven notoriously difficult to handle. Our implementation shows significant performance improvements over state-of-the-art reasoners on several well-known ontologies.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1639
Author(s):  
Seungmin Jung ◽  
Jihoon Moon ◽  
Sungwoo Park ◽  
Eenjun Hwang

Recently, multistep-ahead prediction has attracted much attention in electric load forecasting because it can deal with sudden changes in power consumption caused by various events such as fire and heat wave for a day from the present time. On the other hand, recurrent neural networks (RNNs), including long short-term memory and gated recurrent unit (GRU) networks, can reflect the previous point well to predict the current point. Due to this property, they have been widely used for multistep-ahead prediction. The GRU model is simple and easy to implement; however, its prediction performance is limited because it considers all input variables equally. In this paper, we propose a short-term load forecasting model using an attention based GRU to focus more on the crucial variables and demonstrate that this can achieve significant performance improvements, especially when the input sequence of RNN is long. Through extensive experiments, we show that the proposed model outperforms other recent multistep-ahead prediction models in the building-level power consumption forecasting.


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5748
Author(s):  
Zhibo Zhang ◽  
Qing Chang ◽  
Na Zhao ◽  
Chen Li ◽  
Tianrun Li

The future development of communication systems will create a great demand for the internet of things (IOT), where the overall control of all IOT nodes will become an important problem. Considering the essential issues of miniaturization and energy conservation, in this study, a new data downlink system is designed in which all IOT nodes harvest energy first and then receive data. To avoid the unsolvable problem of pre-locating all positions of vast IOT nodes, a device called the power and data beacon (PDB) is proposed. This acts as a relay station for energy and data. In addition, we model future scenes in which a communication system is assisted by unmanned aerial vehicles (UAVs), large intelligent surfaces (LISs), and PDBs. In this paper, we propose and solve the problem of determining the optimal flight trajectory to reach the minimum energy consumption or minimum time consumption. Four future feasible scenes are analyzed and then the optimization problems are solved based on numerical algorithms. Simulation results show that there are significant performance improvements in energy/time with the deployment of LISs and reasonable UAV trajectory planning.


2011 ◽  
Vol 44 (6) ◽  
pp. 1272-1276 ◽  
Author(s):  
Koichi Momma ◽  
Fujio Izumi

VESTA is a three-dimensional visualization system for crystallographic studies and electronic state calculations. It has been upgraded to the latest version, VESTA 3, implementing new features including drawing the external morphology of crystals; superimposing multiple structural models, volumetric data and crystal faces; calculation of electron and nuclear densities from structure parameters; calculation of Patterson functions from structure parameters or volumetric data; integration of electron and nuclear densities by Voronoi tessellation; visualization of isosurfaces with multiple levels; determination of the best plane for selected atoms; an extended bond-search algorithm to enable more sophisticated searches in complex molecules and cage-like structures; undo and redo in graphical user interface operations; and significant performance improvements in rendering isosurfaces and calculating slices.


2017 ◽  
Vol 107 (04) ◽  
pp. 301-305
Author(s):  
E. Prof. Uhlmann ◽  
F. Kaulfersch

Particle-reinforced titanium matrix composites enable significant performance improvements in structural and functional high-temperature components. However, the high wear resistance, toughness and hardness due to particle reinforcement pose a major challenge in machining these high-performance materials. By conducting milling experiments with a variation of tool geometry, cutting material and process strategy, parameter ranges could be identified that enable reliable machining of particle-reinforced titanium matrix composites.


2020 ◽  
Vol 70 (1) ◽  
pp. 60-65 ◽  
Author(s):  
Goran Marković ◽  
Vlada Sokolović

Networks with distributed sensors, e.g. cognitive radio networks or wireless sensor networks, enable large-scale deployments of cooperative automatic modulation classification (AMC). Existing cooperative AMC schemes with centralised fusion offer a considerable performance increase in comparison to single-sensor reception. Previous studies generally focused on AMC scenarios in which the multipath channel is assumed to be static during signal reception. However, in practical mobile environments, time-correlated multipath channels occur, which have a large negative influence on the existing cooperative AMC solutions. In this paper, we propose two novel cooperative AMC schemes with additional intra-sensor fusion, and show that these offer significant performance improvements over the existing ones under given conditions.


2021 ◽  
Author(s):  
Dilshad Hassan Sallo ◽  
Gabor Kecskemeti

Discrete Event Simulation (DES) frameworks have gained significant popularity for supporting and evaluating cloud computing environments. They support decision-making for complex scenarios, saving time and effort. The majority of these frameworks lack parallel execution. Despite being a sequential framework, DISSECT-CF introduced significant performance improvements when simulating Infrastructure as a Service (IaaS) clouds. Even with these improvements over state-of-the-art sequential simulators, there are several scenarios (e.g., large-scale Internet of Things or serverless computing systems) which DISSECT-CF would not simulate in a timely fashion. To remedy such scenarios, this paper introduces parallel execution to its most abstract subsystem: the event system. The new event subsystem detects when multiple events occur at a specific time instance of the simulation and decides to execute them in either a parallel or a sequential fashion. This decision is mainly based on the number of independent events and the expected workload of a particular event. In our evaluation, we focused exclusively on time management scenarios. While doing so, we ensured that the behaviour of the events was equivalent to realistic, larger-scale simulation scenarios. This allowed us to understand the effects of parallelism on the whole framework, while we also showed the gains of the new system compared to the old sequential one. With regard to scaling, we observed it to be proportional to the number of cores in the utilised SMP host.
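The parallel-or-sequential decision described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not DISSECT-CF code: the thresholds `MIN_EVENTS` and `MIN_COST`, the function `dispatch`, and the `(time, expected_cost, action)` event tuple are all hypothetical, and events sharing a timestamp are assumed to be independent of each other.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

MIN_EVENTS = 4   # hypothetical: minimum simultaneous events worth parallelising
MIN_COST = 1.0   # hypothetical: minimum total expected workload for a batch


def dispatch(events):
    """Run events in time order; each event is (time, expected_cost, action).

    Events sharing a timestamp form a batch. A batch runs in parallel only if
    it is both large enough and expensive enough; otherwise it runs in order.
    Returns a list of (time, mode, results) records.
    """
    by_time = defaultdict(list)
    for time, cost, action in events:
        by_time[time].append((cost, action))

    log = []
    with ThreadPoolExecutor() as pool:
        for time in sorted(by_time):
            batch = by_time[time]
            total_cost = sum(cost for cost, _ in batch)
            parallel = len(batch) >= MIN_EVENTS and total_cost >= MIN_COST
            if parallel:
                # pool.map returns results in submission order.
                results = list(pool.map(lambda item: item[1](), batch))
            else:
                results = [action() for _, action in batch]
            log.append((time, "parallel" if parallel else "sequential", results))
    return log


# Hypothetical workload: one cheap event at t=0, four costlier ones at t=5.
events = [(0, 0.1, lambda: 1)]
events += [(5, 0.5, lambda i=i: i * i) for i in range(4)]
log = dispatch(events)
```

A thread pool is used here only to keep the sketch short. In CPython, threads do not speed up CPU-bound work, so a real simulator would need a different execution mechanism to obtain the core-proportional scaling the abstract reports.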

