Cache-efficient sweeping-based interval joins for extended Allen relation predicates

The VLDB Journal ◽

10.1007/s00778-020-00650-5 ◽

2021 ◽

Author(s):

Danila Piatov ◽

Sven Helmer ◽

Anton Dignös ◽

Fabio Persia

Keyword(s):

Data Structure ◽

Experimental Evaluation ◽

State Of The Art ◽

Temporal Databases ◽

Access Method ◽

Wide Range ◽

Interval Relation ◽

Cache Efficient ◽

Join Algorithms ◽

Better Than

We develop a family of efficient plane-sweeping interval join algorithms for evaluating a wide range of interval predicates such as Allen’s relationships and parameterized relationships. Our technique is based on a framework, components of which can be flexibly combined in different manners to support the required interval relation. In temporal databases, our algorithms can exploit a well-known and flexible access method, the Timeline Index, thus expanding the set of operations it supports even further. Additionally, employing a compact data structure, the gapless hash map, we utilize the CPU cache efficiently. In an experimental evaluation, we show that our approach is several times faster and scales better than state-of-the-art techniques, while being much better suited for real-time event processing.
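To make the plane-sweeping idea concrete, here is a minimal sketch of an endpoint-based sweep for a single predicate (plain overlap of closed intervals). It is an illustration under stated assumptions, not the authors' framework: the function name `sweep_overlap_join` and the tuple representation are hypothetical, and ordinary Python dicts stand in for the gapless hash map mentioned in the abstract.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # closed interval [start, end]; start <= end is assumed


def sweep_overlap_join(r: List[Interval], s: List[Interval]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) such that r[i] and s[j] share at least one point."""
    # One event per endpoint: (time, kind, relation, index).
    # kind 0 = start, 1 = end, so at equal timestamps starts are processed
    # before ends and touching closed intervals are still reported.
    events = []
    for rel, intervals in ((0, r), (1, s)):
        for idx, (start, end) in enumerate(intervals):
            events.append((start, 0, rel, idx))
            events.append((end, 1, rel, idx))
    events.sort()

    # Active sets per relation (plain dicts here, purely for illustration).
    active: Tuple[Dict[int, None], Dict[int, None]] = ({}, {})
    result = []
    for _, kind, rel, idx in events:
        if kind == 0:
            # A starting interval overlaps every active interval of the other relation.
            for other in active[1 - rel]:
                result.append((idx, other) if rel == 0 else (other, idx))
            active[rel][idx] = None
        else:
            del active[rel][idx]
    return result
```

Each overlapping pair is emitted once, at the moment the later-starting interval enters the sweep. Other Allen predicates would need different start/end handling, which this sketch does not attempt.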

Prime Implicate Generation in Equational Logic

Journal of Artificial Intelligence Research ◽

10.1613/jair.5481 ◽

2017 ◽

Vol 60 ◽

pp. 827-880 ◽

Cited By ~ 1

Author(s):

Mnacho Echenim ◽

Nicolas Peltier ◽

Sophie Tourret

Keyword(s):

Data Structure ◽

Experimental Evaluation ◽

State Of The Art ◽

Equational Logic ◽

First Order ◽

Correctness Proofs ◽

Tree Data ◽

Tree Data Structure

We present an algorithm for the generation of prime implicates in equational logic, that is, of the most general consequences of formulæ containing equations and disequations between first-order terms. This algorithm is defined by a calculus that is proved to be correct and complete. We then focus on the case where the considered clause set is ground, i.e., contains no variables, and devise a specialized tree data structure that is designed to efficiently detect and delete redundant implicates. The corresponding algorithms are presented along with their termination and correctness proofs. Finally, an experimental evaluation of this prime implicate generation method is conducted in the ground case, including a comparison with state-of-the-art propositional and first-order prime implicate generation tools.

GEMME: A Simple and Fast Global Epistatic Model Predicting Mutational Effects

Molecular Biology and Evolution ◽

10.1093/molbev/msz179 ◽

2019 ◽

Vol 36 (11) ◽

pp. 2604-2619 ◽

Cited By ~ 2

Author(s):

Elodie Laine ◽

Yasaman Karami ◽

Alessandra Carbone

Keyword(s):

Evolutionary History ◽

Recent Progress ◽

State Of The Art ◽

Fast Method ◽

Biological Sequences ◽

Epistatic Model ◽

Input Alignment ◽

Wide Range ◽

History Of ◽

Better Than

The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering, and medicine. Recent progress has been achieved by leveraging the increasing wealth of genomic data and by modeling intersite dependencies within biological sequences. However, state-of-the-art methods remain time consuming. Here, we present Global Epistatic Model for predicting Mutational Effects (GEMME) (www.lcqb.upmc.fr/GEMME), an original and fast method that predicts mutational outcomes by explicitly modeling the evolutionary history of natural sequences. This allows accounting for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it overall performs similarly to or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, highly conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at www.lcqb.upmc.fr/GEMME/.

SetSketch

Proceedings of the VLDB Endowment ◽

10.14778/3476249.3476276 ◽

2021 ◽

Vol 14 (11) ◽

pp. 2244-2257

Author(s):

Otmar Ertl

Keyword(s):

Big Data ◽

Data Structure ◽

Data Structures ◽

Similarity Search ◽

State Of The Art ◽

Use Cases ◽

Distributed Environments ◽

Jaccard Similarity ◽

Big Data Applications ◽

Better Than

MinHash and HyperLogLog are sketching algorithms that have become indispensable for set summaries in big data applications. While HyperLogLog allows counting different elements with very little space, MinHash is suitable for the fast comparison of sets as it allows estimating the Jaccard similarity and other joint quantities. This work presents a new data structure called SetSketch that is able to continuously fill the gap between both use cases. Its commutative and idempotent insert operation and its mergeable state make it suitable for distributed environments. Fast, robust, and easy-to-implement estimators for cardinality and joint quantities, as well as the ability to use SetSketch for similarity search, enable versatile applications. The presented joint estimator can also be applied to other data structures such as MinHash, HyperLogLog, or Hyper-MinHash, where it even performs better than the corresponding state-of-the-art estimators in many cases.
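As background for the properties named above (commutative and idempotent insert, mergeable state, Jaccard estimation), the following is a minimal sketch of classic MinHash, one of the baseline structures the abstract mentions. It is not SetSketch itself. The register count `K`, the helper names, and the use of salted BLAKE2b as the hash family are assumptions made for illustration.

```python
import hashlib
from typing import List

K = 128  # number of registers; an arbitrary choice for this sketch


def _hash(item: str, seed: int) -> int:
    """64-bit hash of item under the seed-th hash function (salted BLAKE2b)."""
    digest = hashlib.blake2b(
        item.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def new_sketch() -> List[int]:
    # Initial register value is larger than any 64-bit hash.
    return [2**64] * K


def insert(sketch: List[int], item: str) -> None:
    # Taking a minimum is commutative and idempotent, so insertion order
    # and repeated insertions do not change the resulting state.
    for i in range(K):
        sketch[i] = min(sketch[i], _hash(item, i))


def merge(a: List[int], b: List[int]) -> List[int]:
    # Element-wise minimum yields the sketch of the union of both sets.
    return [min(x, y) for x, y in zip(a, b)]


def jaccard_estimate(a: List[int], b: List[int]) -> float:
    # Fraction of registers that agree estimates |A ∩ B| / |A ∪ B|.
    return sum(x == y for x, y in zip(a, b)) / K
```

Because the state is just `K` minima, sketches built on different machines can be combined with `merge`, which is the property that makes this family of structures attractive in distributed settings.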

Succinct Encoding of Binary Strings Representing Triangulations

Algorithmica ◽

10.1007/s00453-021-00861-4 ◽

2021 ◽

Author(s):

José Fuentes-Sepúlveda ◽

Diego Seco ◽

Raquel Viaña

Keyword(s):

Information Theory ◽

Data Structure ◽

Experimental Evaluation ◽

Special Class ◽

Spanning Trees ◽

State Of The Art ◽

Succinct Data Structure ◽

Planar Embeddings ◽

Specific Sequences ◽

Binary Strings

We consider the problem of designing a succinct data structure for representing the connectivity of planar triangulations. The main result is a new succinct encoding achieving the information-theory optimal bound of 3.24 bits per vertex, while allowing efficient navigation. Our representation is based on the bijection of Poulalhon and Schaeffer (Algorithmica, 46(3):505–527, 2006) that defines a mapping between planar triangulations and a special class of spanning trees, called PS-trees. The proposed solution differs from previous approaches in that operations in planar triangulations are reduced to operations in particular parentheses sequences encoding PS-trees. Existing methods to handle balanced parentheses sequences have to be combined and extended to operate on such specific sequences, essentially for retrieving matching elements. The new encoding supports extracting the d neighbors of a query vertex in O(d) time and testing adjacency between two vertices in O(1) time. Additionally, we provide an implementation of our proposed data structure. In the experimental evaluation, our representation reaches up to 7.35 bits per vertex, improving the space usage of state-of-the-art implementations for planar embeddings.
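The "retrieving matching elements" primitive can be illustrated on an ordinary balanced parentheses encoding of a tree. The sketch below is a naive linear scan, assuming a plain string of `(` and `)`; it is neither the PS-tree-specific sequences nor the constant-time succinct machinery of the paper, and the function names `find_close` and `children` are chosen here for illustration.

```python
from typing import List


def find_close(seq: str, i: int) -> int:
    """Return the position of the ')' matching the '(' at position i."""
    if seq[i] != "(":
        raise ValueError("position i must hold an opening parenthesis")
    depth = 0
    for j in range(i, len(seq)):
        # Track the excess of opening over closing parentheses.
        depth += 1 if seq[j] == "(" else -1
        if depth == 0:
            return j
    raise ValueError("sequence is not balanced")


def children(seq: str, i: int) -> List[int]:
    """Return the opening positions of the children of the node opened at i."""
    result = []
    j = i + 1
    while j < len(seq) and seq[j] == "(":
        result.append(j)
        # Jump over the whole subtree of this child.
        j = find_close(seq, j) + 1
    return result
```

For example, in `"(()(()))"` the root opens at position 0 and has two children, opening at positions 1 and 3. A succinct structure answers the same queries without scanning, using small auxiliary indexes over the bit sequence.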

Push and Rotate: a Complete Multi-agent Pathfinding Algorithm

Journal of Artificial Intelligence Research ◽

10.1613/jair.4447 ◽

2014 ◽

Vol 51 ◽

pp. 443-492 ◽

Cited By ~ 20

Author(s):

B. de Wilde ◽

A. W. Ter Mors ◽

C. Witteveen

Keyword(s):

State Of The Art ◽

Hard Problem ◽

Post Processing ◽

Wide Range ◽

Np Hard Problem ◽

Multi Agent ◽

Strong Performance ◽

Minimal Sequence ◽

Better Than ◽

Start Location

Multi-agent Pathfinding is a relevant problem in a wide range of domains, for example in robotics and video games research. Formally, the problem considers a graph consisting of vertices and edges, and a set of agents occupying vertices. An agent can only move to an unoccupied, neighbouring vertex, and the problem of finding the minimal sequence of moves to transfer each agent from its start location to its destination is an NP-hard problem. We present Push and Rotate, a new algorithm that is complete for Multi-agent Pathfinding problems in which there are at least two empty vertices. Push and Rotate first divides the graph into subgraphs within which it is possible for agents to reach any position of the subgraph, and then uses the simple push, swap, and rotate operations to find a solution; a post-processing algorithm is also presented that eliminates redundant moves. Push and Rotate can be seen as extending Luna and Bekris's Push and Swap algorithm, which we showed to be incomplete in a previous publication. In our experiments we compare our approach with the Push and Swap, MAPP, and Bibox algorithms. The latter algorithm is restricted to a smaller class of instances as it requires biconnected graphs, but can nevertheless be considered state of the art due to its strong performance. Our experiments show that Push and Swap suffers from incompleteness, MAPP is generally not competitive with Push and Rotate, and Bibox is better than Push and Rotate on randomly generated biconnected instances, while Push and Rotate performs better on grids.

Research of the activity and viability of spermatozoa at different concentrations and proportions of diluents after incubation

Pig breeding the interdepartmental subject scientific digest ◽

10.37143/0371-4365-2020-74-11 ◽

2020 ◽

pp. 88-95

Author(s):

Svitlana Lobchenko ◽

Tetiana Husar ◽

Viktor Lobchenko

Keyword(s):

Plasma Concentration ◽

Incubation Time ◽

Coefficient Of Variation ◽

Incubation Medium ◽

Wide Range ◽

Series Of Experiments ◽

The Subject ◽

Boar Spermatozoa ◽

Native Plasma ◽

Better Than

The article reports on the viability of spermatozoa after different incubation times, at different concentrations, and with different diluents. (Un)concentrated spermatozoa were diluted with 1) their native plasma; 2) medium 199; or 3) a mixture of equal volumes of plasma and medium 199. Experimental samples were prepared according to the method at spermatozoa concentrations of 0.2, 0.1, 0.05, and 0.025 billion/ml. The sperm was evaluated after 2, 4, 6, and 8 hours. Such a study makes it possible to investigate various aspects of the subject over a wide range, and a series of experiments was conducted for this purpose. The data obtained were statistically processed and allow us to highlight the results for each stage of the study. In particular, some regularities were found between the viability of sperm, the type of diluent, and the rate of rarefaction, as evidenced by the data presented in the tables. After incubation, the viability of spermatozoa tends to remain highest when sperm are diluted to a concentration of 0.1 billion/ml, regardless of the type of diluent used. At this concentration, medium 199 maintains sperm viability no better than native plasma or its mixture with an equal volume of plasma, for any incubation time. Most often it is at this concentration that viability shows the lowest coefficient of variation, regardless of the type of diluent used, which may indicate the greatest stability of the result under these conditions. The viability of spermatozoa at a concentration of 0.1 billion/ml is statistically significantly reduced only after 6 or even 8 hours of incubation. If the sperm are incubated for only 2 hours, the tested sperm concentrations do not affect viability, regardless of the type of diluent used. Key words: boar, spermatozoa, sperm plasma, concentration, incubation, medium 199, activity, viability, rarefaction.

CLINICAL AND FORENSIC ASPECTS OF PHARMACOBEZOARS

Current Drug Research Reviews ◽

10.2174/2589977512666200217094018 ◽

2020 ◽

Vol 12 ◽

Author(s):

Francisco Basílio ◽

Ricardo Jorge Dinis-Oliveira

Keyword(s):

State Of The Art ◽

Signs And Symptoms ◽

Immediate Release ◽

Diagnosis And Treatment ◽

Blood Concentrations ◽

Forensic Practice ◽

Wide Range ◽

Therapeutic Doses ◽

Complete History ◽

Lethal Blood

Background: Pharmacobezoars are specific types of bezoars formed when medicines, such as tablets, suspensions, and/or drug delivery systems, aggregate and may cause death by occluding airways with tenacious material or by eluting drugs resulting in toxic or lethal blood concentrations. Objective: This work aims to fully review the state of the art regarding pathophysiology, diagnosis, treatment and other relevant clinical and forensic features of pharmacobezoars. Results: Patients of a wide range of ages and of both sexes present with signs and symptoms of intoxication or, more commonly, gastrointestinal obstruction. The exact mechanisms of pharmacobezoar formation are unknown but are likely multifactorial. The diagnosis and treatment depend on the gastrointestinal segment affected and should be personalized to the medication and the underlying factor. A good and complete history, physical examination, imaging tests, upper endoscopy and surgery through laparotomy of the lower tract are useful for diagnosis and treatment. Conclusion: Pharmacobezoars are rarely seen in clinical and forensic practice. They are related to controlled or immediate-release formulations, liquid or non-digestible substances, in normal or altered digestive motility/anatomy tract, and in overdoses or therapeutic doses, and should be suspected in the presence of risk factors or patients taking drugs which may form pharmacobezoars.

Computers in Geology - 25 Years of Progress

10.1093/oso/9780195085938.001.0001 ◽

1994 ◽

Keyword(s):

Computer Modeling ◽

Quantitative Methods ◽

State Of The Art ◽

International Association ◽

Mathematical Geology ◽

25Th Anniversary ◽

The Earth ◽

Wide Range ◽

History Of ◽

Mapping Techniques

This volume vividly demonstrates the importance and increasing breadth of quantitative methods in the earth sciences. With contributions from an international cast of leading practitioners, chapters cover a wide range of state-of-the-art methods and applications, including computer modeling and mapping techniques. Many chapters also contain reviews and extensive bibliographies which serve to make this an invaluable introduction to the entire field. In addition to its detailed presentations, the book includes chapters on the history of geomathematics and on R.G.V. Eigen, the "father" of mathematical geology. Written to commemorate the 25th anniversary of the International Association for Mathematical Geology, the book will be sought after by both practitioners and researchers in all branches of geology.

Effect of Contact Ratio on Spur Gear Dynamic Load With No Tooth Profile Modifications

Journal of Mechanical Design ◽

10.1115/1.2826905 ◽

1996 ◽

Vol 118 (3) ◽

pp. 439-443 ◽

Cited By ~ 34

Author(s):

Chuen-Huei Liou ◽

Hsiang Hsi Lin ◽

F. B. Oswald ◽

D. P. Townsend

Keyword(s):

Dynamic Load ◽

Spur Gear ◽

Tooth Size ◽

Contact Ratio ◽

Gear Dynamics ◽

Wide Range ◽

Gear Contact ◽

Center Distance ◽

Selection Of ◽

Better Than

This paper presents a computer simulation showing how the gear contact ratio affects the dynamic load on a spur gear transmission. The contact ratio can be affected by the tooth addendum, the pressure angle, the tooth size (diametral pitch), and the center distance. The analysis presented in this paper was performed by using the NASA gear dynamics code DANST. In the analysis, the contact ratio was varied over the range 1.20 to 2.40 by changing the length of the tooth addendum. In order to simplify the analysis, other parameters related to contact ratio were held constant. The contact ratio was found to have a significant influence on gear dynamics. Over a wide range of operating speeds, a contact ratio close to 2.0 minimized dynamic load. For low-contact-ratio gears (contact ratio less than two), increasing the contact ratio reduced gear dynamic load. For high-contact-ratio gears (contact ratio equal to or greater than 2.0), the selection of contact ratio should take into consideration the intended operating speeds. In general, high-contact-ratio gears minimized dynamic load better than low-contact-ratio gears.
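To show how the tooth addendum and pressure angle enter the contact ratio, the sketch below evaluates the standard textbook expression for a pair of external involute spur gears mounted at the standard center distance. It is a geometric illustration only and has no connection to the DANST code. The function name, the parameter names, and the use of the metric module (the reciprocal of diametral pitch, up to units) are assumptions made here, and interference, undercut, and tip-thickness limits are ignored.

```python
import math


def contact_ratio(
    teeth_1: int,
    teeth_2: int,
    module: float,
    pressure_angle_deg: float = 20.0,
    addendum_coeff: float = 1.0,
) -> float:
    """Transverse contact ratio of two external involute spur gears."""
    phi = math.radians(pressure_angle_deg)
    # Pitch, base, and addendum (tip) radii of both gears.
    r1, r2 = module * teeth_1 / 2.0, module * teeth_2 / 2.0
    rb1, rb2 = r1 * math.cos(phi), r2 * math.cos(phi)
    ra1, ra2 = r1 + addendum_coeff * module, r2 + addendum_coeff * module
    center_distance = r1 + r2  # standard (non-extended) center distance
    # Length of the path of contact along the line of action.
    length_of_action = (
        math.sqrt(ra1**2 - rb1**2)
        + math.sqrt(ra2**2 - rb2**2)
        - center_distance * math.sin(phi)
    )
    base_pitch = math.pi * module * math.cos(phi)
    return length_of_action / base_pitch
```

Calling `contact_ratio(28, 28, 1.0, addendum_coeff=c)` for increasing values of `c` shows the contact ratio growing with addendum length, which is the parameter the study varies while holding the others constant.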

In-Memory Interval Joins

The VLDB Journal ◽

10.1007/s00778-020-00639-0 ◽

2021 ◽

Author(s):

Panagiotis Bouros ◽

Nikos Mamoulis ◽

Dimitrios Tsitsigkos ◽

Manolis Terrovitis

Keyword(s):

Parallel Computation ◽

State Of The Art ◽

Complex Data ◽

Plane Sweep ◽

Join Algorithm ◽

Sweep Algorithm ◽

Join Algorithms ◽

Domain Partitioning ◽

Complex Data Structure ◽

Independent Tasks

The interval join is a popular operation in temporal, spatial, and uncertain databases. The majority of interval join algorithms assume that input data reside on disk, and so their focus is to minimize the I/O accesses. Recently, an in-memory approach based on plane sweep (PS) for modern hardware was proposed which greatly outperforms previous work. However, this approach relies on a complex data structure and its parallelization has not been adequately studied. In this article, we investigate in-memory interval joins in two directions. First, we explore the applicability of a largely ignored forward scan (FS)-based plane sweep algorithm for single-threaded join evaluation. We propose four optimizations for FS that greatly reduce its cost, making it competitive with or even faster than the state of the art. Second, we study in depth the parallel computation of interval joins. We design a non-partitioning-based approach that determines independent tasks of the join algorithm to run in parallel. Then, we address the drawbacks of the previously proposed hash-based partitioning and suggest a domain-based partitioning approach that does not produce duplicate results. Within our approach, we propose a novel breakdown of the partition-joins into mini-joins to be scheduled in the available CPU threads and propose an adaptive domain partitioning, aiming at load balancing. We also investigate how the partitioning phase can benefit from modern parallel hardware. Our thorough experimental analysis demonstrates the advantage of our novel partitioning-based approach for parallel computation.
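For readers unfamiliar with the forward scan variant of plane sweep, a minimal unoptimized sketch follows. It assumes closed intervals given as `(start, end)` tuples and reports index pairs; the function name `forward_scan_join` is hypothetical, and none of the four optimizations or the parallel schemes described in the abstract are included.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval [start, end]; start <= end is assumed


def forward_scan_join(r: List[Interval], s: List[Interval]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) such that r[i] and s[j] overlap."""
    # Both inputs are visited in order of increasing start point.
    r_order = sorted(range(len(r)), key=lambda i: r[i][0])
    s_order = sorted(range(len(s)), key=lambda j: s[j][0])
    result = []
    i = j = 0
    while i < len(r_order) and j < len(s_order):
        ri, sj = r_order[i], s_order[j]
        if r[ri][0] <= s[sj][0]:
            # r[ri] starts first: scan forward in S while intervals
            # start no later than the end of r[ri].
            k = j
            while k < len(s_order) and s[s_order[k]][0] <= r[ri][1]:
                result.append((ri, s_order[k]))
                k += 1
            i += 1
        else:
            # s[sj] starts first: symmetric forward scan over R.
            k = i
            while k < len(r_order) and r[r_order[k]][0] <= s[sj][1]:
                result.append((r_order[k], sj))
                k += 1
            j += 1
    return result
```

Unlike an endpoint-based sweep, this variant keeps no set of active intervals: the only state is the two cursors, at the cost of comparisons during each forward scan.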
