Communication Analysis of Distributed Programs

2006 ◽  
Vol 14 (2) ◽  
pp. 151-170 ◽  
Author(s):  
Sharon Simmons ◽  
Dennis Edwards ◽  
Phil Kearns

Capturing and examining the causal and concurrent relationships of a distributed system is essential to a wide range of distributed systems applications. Many approaches to gathering this information rely on trace files of executions, so the information obtained is limited to the executions actually observed. We present a methodology that analyzes the source code of the distributed system. Our analysis considers each process's source code and produces a single comprehensive graph of the system's possible behaviors. The graph, termed the partial order graph (POG), uniquely represents each possible partial order of the system. Causal and concurrent relationships can be extracted relative either to a particular partial order, which corresponds to a single execution, or to a collection of partial orders. The graph provides a means of reasoning about the system in terms of relationships that will definitely occur, may possibly occur, and will never occur. Distributed assert statements provide a means to monitor distributed system executions. By constructing the POG prior to system execution, the causality information provided by the POG enables run-time evaluation of the assert statement without relying on traces or additional messages.
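The causal and concurrent relations the abstract refers to are commonly formalized as the happens-before partial order, which at run time is often checked with vector clocks. A minimal sketch of those checks (function and process names are illustrative, not from the paper):

```python
from typing import Dict

def happened_before(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if the event stamped with vector clock `a` causally
    precedes the event stamped with `b` (a <= b componentwise, a != b)."""
    keys = set(a) | set(b)
    le = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    lt = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return le and lt

def concurrent(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """Events are concurrent when neither causally precedes the other."""
    return not happened_before(a, b) and not happened_before(b, a)
```

A POG-based approach precomputes which of these relations can ever hold, so a distributed assert need not exchange clock values at run time.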

Author(s):  
Ciprian Dobre

The use of discrete-event simulators in the design and development of Large Scale Distributed Systems (LSDSs) is appealing due to their efficiency and scalability. Their core abstractions of process and event map neatly to the components and interactions of modern-day distributed systems and allow designing realistic simulation scenarios. MONARC 2, a multithreaded, process-oriented simulation framework designed for modeling LSDSs, allows the realistic simulation of a wide range of distributed system technologies, with respect to their specific components and characteristics. This chapter presents the design characteristics of the simulation model proposed in MONARC 2. It starts by analyzing existing work, outlining the key decision points taken in the design of MONARC's simulation model. The model includes the necessary components to describe various actual distributed system technologies and provides the mechanisms to describe concurrent network traffic, evaluate different strategies in data replication, and analyze job scheduling procedures.
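The process/event abstraction at the core of such simulators reduces to a time-ordered event queue. A toy kernel illustrating the idea (this is an illustrative sketch, not MONARC's API):

```python
import heapq

class Simulator:
    """Minimal discrete-event kernel: (time, seq, action) tuples on a heap.
    `seq` breaks ties so simultaneous events fire in scheduling order."""

    def __init__(self):
        self.now = 0.0
        self._seq = 0
        self._queue = []

    def schedule(self, delay, action):
        """Schedule `action(sim)` to run `delay` time units from now."""
        self._seq += 1
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))

    def run(self):
        """Pop events in timestamp order until the queue drains."""
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action(self)
```

Actions may schedule further events, which is how component interactions (e.g. a job arrival triggering a network transfer) are chained.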


Author(s):  
Ahmad Shukri Mohd Noor ◽  
Nur Farhah Mat Zian ◽  
Fatin Nurhanani M. Shaiful Bahri

Distributed systems mainly provide access to large amounts of data and computational resources through a wide range of interfaces. Beyond their dynamic nature, which means that resources may enter and leave the environment at any time, many distributed applications run in environments where faults are increasingly likely due to ever-growing scale and complexity. Given these diverse fault and failure conditions, fault tolerance has become a critical element of distributed computing, allowing a system to perform its function correctly even in the presence of faults. Replication techniques concentrate on two fault tolerance approaches: masking failures and reconfiguring the system in response to them. This paper presents a brief survey of replication techniques, including Read One Write All (ROWA), Quorum Consensus (QC), the Tree Quorum (TQ) Protocol, the Grid Configuration (GC) Protocol, Two-Replica Distribution Techniques (TRDT), Neighbour Replica Triangular Grid (NRTG) and Neighbour Replication Distributed Techniques (NRDT). Each technique has its own redeeming features and shortcomings, which form the subject matter of this survey.
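The quorum-based protocols in the survey all rest on two intersection conditions over n replicas: read and write quorums must overlap (r + w > n) and two write quorums must overlap (2w > n). A minimal illustration (function names are ours, not from the survey):

```python
def quorums_consistent(n: int, r: int, w: int) -> bool:
    """Check the classic quorum conditions for n replicas:
    every read quorum intersects every write quorum (r + w > n),
    and any two write quorums intersect (2 * w > n)."""
    return r + w > n and 2 * w > n

def rowa(n: int) -> bool:
    """ROWA is the extreme point: read any one replica, write all n."""
    return quorums_consistent(n, 1, n)
```

ROWA makes reads cheap at the cost of writes; majority quorums (r = w = n//2 + 1) balance the two, which is the trade-off the surveyed tree, grid, and neighbour schemes refine further.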


2021 ◽  
Vol 40 (2) ◽  
pp. 65-69
Author(s):  
Richard Wai

Modern-day cloud-native applications have become broadly representative of distributed systems in the wild. However, unlike traditional distributed system models with conceptually static designs, cloud-native systems emphasize dynamic scaling and on-line iteration (CI/CD). Cloud-native systems tend to be architected around a networked collection of distinct programs ("microservices") that can be added, removed, and updated in real time. Typically, distinct containerized programs constitute individual microservices that then communicate across the larger distributed application through heavyweight protocols. Common communication stacks exchange JSON or XML objects over HTTP, via TCP/TLS, and incur significant overhead, particularly with small message sizes. Additionally, interpreted/JIT/VM-based languages such as JavaScript (NodeJS/Deno), Java, and Python are dominant in modern microservice programs. These language technologies, along with the high-overhead messaging, can impose superlinear cost increases (hardware demands) on scale-out, particularly towards hyperscale and/or with latency-sensitive workloads.
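The overhead claim for small messages is easy to see even before counting HTTP headers: the JSON encoding of a tiny record is several times larger than a fixed binary layout. A hedged sketch (the record fields are invented for illustration):

```python
import json
import struct

# A tiny hypothetical telemetry reading: (sensor id, timestamp, value).
reading = (17, 1700000000, 21.5)

# JSON-over-HTTP style payload (before any HTTP/TLS framing is added).
json_bytes = json.dumps(
    {"sensor": 17, "ts": 1700000000, "value": 21.5}).encode()

# Fixed little-endian binary layout: u32 + u64 + f64 = 20 bytes.
packed = struct.pack("<IQd", *reading)
```

Here `len(json_bytes)` is more than double `len(packed)`, and the gap widens once per-request HTTP headers (typically hundreds of bytes) are included, which is why the ratio is worst for small messages.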


2021 ◽  
Author(s):  
Johannes Keller ◽  
Johanna Fink ◽  
Norbert Klitzsch

We present SHEMAT-Suite, a numerical code for simulating flow, heat, and mass transport in porous media that has recently been published as open source. The functionality of SHEMAT-Suite comprises pure forward computation, deterministic Bayesian inversion, and stochastic Monte Carlo simulation and data assimilation. Additionally, SHEMAT-Suite features a multi-level OpenMP parallelization. Along with the source code of the software, extensive documentation and a suite of test models are provided.

SHEMAT-Suite has a modular structure that makes it easy for users to adapt the code to their needs. Most importantly, there is an interface for defining the functional relationship between dynamic variables and subsurface parameters. Additionally, user-defined input and output can be implemented without interfering with the core of the code. Finally, at a deeper level, linear solvers and preconditioners can be added to the code.

We present studies that have made use of the code's HPC capabilities. SHEMAT-Suite has been applied to large-scale groundwater models for a wide range of purposes, including studying the formation of convection cells, assessing the geothermal potential below an office building, and modeling submarine groundwater discharge since the last ice age. The modular structure of SHEMAT-Suite has also led to diverse applications, such as glacier modeling, simulation of borehole heat exchangers, and Optimal Experimental Design applied to the placement of geothermal boreholes.

Further, we present ongoing developments for improving the performance of SHEMAT-Suite, both by refactoring the source code and by interfacing SHEMAT-Suite with up-to-date HPC software. Examples include interfacing SHEMAT-Suite with the Portable Data Interface (PDI) for improved data management, with PETSc for MPI-parallel solvers, and with PDAF for parallel EnKF algorithms.

The goal for the open-source SHEMAT-Suite is to provide a rigorously tested core code for flow, heat, and transport simulation, as well as Bayesian and stochastic inversion, while at the same time enabling a wide range of scientific research through straightforward user interaction.


2019 ◽  
Vol 214 ◽  
pp. 05001 ◽  
Author(s):  
Stefan-Gabriel Chitic ◽  
Ben Couturier ◽  
Marco Clemencic ◽  
Joel Closier

A continuous integration system is crucial to maintain the quality of the 6 million lines of C++ and Python source code of the LHCb software, in order to ensure consistent builds of the software as well as to run the unit and integration tests. The Jenkins automation server is used for this purpose. It builds and tests around 100 configurations and produces on the order of 1500 build artifacts per day, which are installed on the CVMFS file system or potentially on the developers' machines. Faced with a large and growing number of configurations built every day, and in order to ease inter-operation between the continuous integration system and the developers, we decided to put in place a flexible messaging system. As soon as the build artifacts have been produced, the distributed system allows their deployment based on the priority of the configurations. We describe the architecture of the new system, which is based on the RabbitMQ messaging system (and the pika Python client library) and uses priority queues to start the LHCb software integration tests and to drive the installation of the nightly builds on the CVMFS file system. We also show how the introduction of an event-based system can help with the communication of results to developers.
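The priority-driven deployment described here can be illustrated with an in-process priority queue; in the real system the ordering lives in RabbitMQ (queues declared with the `x-max-priority` argument and messages published with a `priority` property), but the draining logic is the same idea. Names below are illustrative:

```python
import heapq

def deployment_order(artifacts):
    """Given (name, priority) pairs for built configurations, return
    the names in deployment order, highest priority first.
    heapq is a min-heap, so priorities are negated."""
    heap = [(-priority, name) for name, priority in artifacts]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

With such an ordering, a high-priority release configuration jumps ahead of routine nightly builds even if it was produced later.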


2020 ◽  
Vol 53 (4) ◽  
pp. 1060-1072 ◽  
Author(s):  
Edward L. Pang ◽  
Peter M. Larsen ◽  
Christopher A. Schuh

Resolving pseudosymmetry has long presented a challenge for electron backscatter diffraction, and has been notoriously difficult in the case of tetragonal ZrO2 in particular. In this work, a method is proposed to resolve pseudosymmetry by building upon the dictionary indexing method and augmenting it with the application of global optimization to fit accurate pattern centers, clustering of the Hough-indexed orientations to focus the dictionary in orientation space, and interpolation to improve the accuracy of the indexed solution. The proposed method is demonstrated to resolve pseudosymmetry with 100% accuracy in simulated patterns of tetragonal ZrO2, even with high degrees of binning and noise. The method is then used to index an experimental data set, which confirms its ability to efficiently and accurately resolve pseudosymmetry in these materials. The present method can be applied to resolve pseudosymmetry in a wide range of materials, possibly even some more challenging than tetragonal ZrO2. Source code for this implementation is available online.


SIMULATION ◽  
1970 ◽  
Vol 15 (6) ◽  
pp. 255-263 ◽  
Author(s):  
R.E. Goodson

Infinite product expansions for the transcendental terms in the transfer functions for linear distributed systems are developed. Simulation of the dynamic response of such systems is indicated, using the product expansion. Comparisons are made between the classical eigenvalue and product expansion approximations. It is concluded that the product expansion is an extremum transient-amplitude-preserving approximation based on the correct characteristic roots.
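As a concrete instance of such an expansion, the transcendental term sinh(x) has the classical infinite product sinh(x) = x · ∏ₙ (1 + x²/(n²π²)), whose truncations converge to the exact function. A small numerical check (illustrative; not the paper's code, which predates such tooling):

```python
import math

def sinh_product(x: float, terms: int = 1000) -> float:
    """Truncated infinite-product expansion of sinh:
    sinh(x) = x * prod_{n=1..inf} (1 + x^2 / (n^2 * pi^2)).
    Each factor contributes one pair of the characteristic roots
    x = +/- i*n*pi, mirroring how product expansions keep the
    correct roots of a distributed system's transfer function."""
    prod = x
    for n in range(1, terms + 1):
        prod *= 1.0 + (x * x) / (n * n * math.pi * math.pi)
    return prod
```

Because each truncated factor carries an exact pair of characteristic roots, the approximation error shows up as a slowly varying gain rather than misplaced poles.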


2014 ◽  
Vol 25 (02) ◽  
pp. 125-139 ◽  
Author(s):  
JHENG-CHENG CHEN ◽  
CHIA-JUI LAI ◽  
CHANG-HSIUNG TSAI

Problem diagnosis in large distributed computer systems and networks is a challenging task that requires fast and accurate inferences from huge volumes of data. In this paper, the PMC diagnostic model is considered, based on the diagnostic approach of end-to-end probing technology. A probe is a test transaction whose outcome depends on some of the system's components; diagnosis is performed by selecting appropriate probes and analyzing the results. In the PMC model, every computer can execute a probe to test dedicated system components. Furthermore, any test result reported by a faulty probe station is unreliable, while a test result reported by a fault-free probe station is always correct. The aim of the diagnosis is to locate all faulty components in the system based on the collection of test results. A dual-cube DC(n) is an (n + 1)-regular spanning subgraph of a (2n + 1)-dimensional hypercube. It uses n-dimensional hypercubes as building blocks and retains the main desirable properties of the hypercube, so that it is suitable as a topology for distributed systems. In this paper, we first show that the diagnosability of DC(n) is n + 1 and then show that adaptive diagnosis is possible using at most 2^(2n+1) + n tests for a 2^(2n+1)-node distributed system modeled by dual-cubes DC(n) in which at most n + 1 processes are faulty. Furthermore, we propose an adaptive diagnostic algorithm for the DC(n) and show that it diagnoses the DC(n) in three testing rounds and at most 2^(2n+1) + O(n^3) tests, where each node is scheduled for at most one test in each round.
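The PMC test semantics stated above (fault-free testers report truthfully, faulty testers report arbitrarily) can be sketched directly; this is an illustrative model of a single test, not the paper's diagnosis algorithm:

```python
import random

def pmc_test(tester_faulty: bool, testee_faulty: bool) -> int:
    """Outcome of one PMC-model test: 0 = 'testee fault-free',
    1 = 'testee faulty'. A fault-free tester reports the truth;
    a faulty tester's report is arbitrary and carries no information."""
    if not tester_faulty:
        return 1 if testee_faulty else 0
    return random.randint(0, 1)  # unreliable: either answer is possible
```

It is this asymmetry, where only fault-free stations yield usable results, that forces diagnosis algorithms to choose testers adaptively, as the paper does over the dual-cube topology.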


2010 ◽  
Vol 164 ◽  
pp. 183-188
Author(s):  
Cezary Orlikowski ◽  
Rafał Hein

This paper presents a uniform, port-based approach for modeling both lumped and distributed parameter systems. A port-based model of the distributed system has been defined by application of bond graph methodology and the distributed transfer function method (DTFM). The proposed approach combines the versatility of port-based modeling with the accuracy of the distributed transfer function method. A concise representation of lumped-distributed systems has been obtained. The proposed modeling method enables the formulation of input data for computer analysis by application of the DTFM.

