distributed environments
Recently Published Documents



Author(s):  
Akira Kawaguchi ◽  
Nguyen Viet Ha ◽  
Masato Tsuru ◽  
Abbe Mowshowitz ◽  
Masahiro Shibata

2021 ◽  
Author(s):  
HamidReza Arkian ◽  
Guillaume Pierre ◽  
Johan Tordsson ◽  
Erik Elmroth

2021 ◽  
Vol 14 (11) ◽  
pp. 2244-2257
Author(s):  
Otmar Ertl

MinHash and HyperLogLog are sketching algorithms that have become indispensable for set summaries in big data applications. While HyperLogLog allows counting distinct elements with very little space, MinHash is suitable for the fast comparison of sets, as it allows estimating the Jaccard similarity and other joint quantities. This work presents a new data structure called SetSketch that is able to continuously fill the gap between both use cases. Its commutative and idempotent insert operation and its mergeable state make it suitable for distributed environments. Fast, robust, and easy-to-implement estimators for cardinality and joint quantities, as well as the ability to use SetSketch for similarity search, enable versatile applications. The presented joint estimator can also be applied to other data structures such as MinHash, HyperLogLog, or HyperMinHash, where in many cases it even performs better than the corresponding state-of-the-art estimators.
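
The properties named in the abstract (a commutative and idempotent insert, and a mergeable state) can be illustrated with a plain MinHash sketch. The following is a minimal sketch of classic MinHash, not of SetSketch itself; the class name `MinHashSketch`, the helper `build`, the register count, and the choice of BLAKE2b as the hash function are all assumptions made for illustration.

```python
import hashlib


class MinHashSketch:
    """Illustrative MinHash sketch (hypothetical class, not SetSketch)."""

    def __init__(self, num_registers=128):
        self.m = num_registers
        # Sentinel larger than any 64-bit hash value, meaning "empty register".
        self.registers = [2**64] * num_registers

    @staticmethod
    def _hash(item, seed):
        # Seeded 64-bit hash; BLAKE2b is an assumed choice for this example.
        data = f"{seed}:{item}".encode("utf-8")
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

    def insert(self, item):
        # Each register keeps the minimum hash seen so far. Taking a minimum
        # is commutative and idempotent, so insertion order and duplicates
        # do not change the state.
        for i in range(self.m):
            h = self._hash(item, i)
            if h < self.registers[i]:
                self.registers[i] = h

    def merge(self, other):
        # The register-wise minimum yields the sketch of the union of both sets.
        if self.m != other.m:
            raise ValueError("sketches must have the same number of registers")
        merged = MinHashSketch(self.m)
        merged.registers = [min(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def jaccard(self, other):
        # The fraction of equal registers estimates the Jaccard similarity.
        equal = sum(1 for a, b in zip(self.registers, other.registers) if a == b)
        return equal / self.m


def build(items, num_registers=128):
    """Hypothetical helper: build a sketch from an iterable of items."""
    sketch = MinHashSketch(num_registers)
    for item in items:
        sketch.insert(item)
    return sketch
```

In a distributed setting, each node could build a sketch over its local partition and send only the registers to a coordinator, which merges them. Because the merge is a register-wise minimum, the result is the same as if a single node had processed all elements.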


2021 ◽  
Author(s):  
S.A. Gorsky ◽  
A.G. Feoktistov

The paper addresses the problem of scheduling computations in scientific applications (distributed applied software packages) executed in distributed environments. Forming an optimal schedule of jobs for executing applied software modules is an NP-hard problem, so heuristic scheduling methods are often used in practice. We therefore propose a new static-dynamic algorithm for managing computations in heterogeneous distributed environments. The proposed algorithm is compared in simulation with other computation-management scenarios. The results show that it achieves a rational balance between scheduling time and computation makespan.
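
To make the notions of heuristic scheduling and makespan concrete, the following is a minimal sketch of a generic greedy heuristic for heterogeneous machines. It is not the static-dynamic algorithm proposed in the paper, whose details the abstract does not give. The function name `greedy_schedule`, the job sizes, and the machine speeds are hypothetical and chosen only for illustration.

```python
def greedy_schedule(job_sizes, machine_speeds):
    """Assign jobs to machines with a greedy earliest-finish-time rule.

    job_sizes: amount of work per job (arbitrary units).
    machine_speeds: work units each machine processes per unit of time.
    Returns (assignment, makespan), where assignment maps job index to
    machine index and makespan is the finish time of the busiest machine.
    """
    if not machine_speeds:
        raise ValueError("at least one machine is required")

    finish = [0.0] * len(machine_speeds)  # current finish time per machine
    assignment = {}

    # Consider the largest jobs first (ties keep their original order).
    order = sorted(range(len(job_sizes)), key=lambda j: -job_sizes[j])
    for job in order:
        # Pick the machine on which this job would finish earliest.
        best = min(
            range(len(machine_speeds)),
            key=lambda m: finish[m] + job_sizes[job] / machine_speeds[m],
        )
        finish[best] += job_sizes[job] / machine_speeds[best]
        assignment[job] = best

    return assignment, max(finish)


# Three jobs on two machines, the first twice as fast as the second.
assignment, makespan = greedy_schedule([4, 2, 2], [2.0, 1.0])
print(assignment, makespan)  # {0: 0, 1: 1, 2: 0} 3.0
```

A heuristic of this kind runs in polynomial time but gives no optimality guarantee, which is the trade-off between scheduling time and makespan that the abstract refers to.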

