The cost of fault tolerance in multi-party communication complexity

2014 ◽  
Vol 61 (3) ◽  
pp. 1-64 ◽  
Author(s):  
Binbin Chen ◽  
Haifeng Yu ◽  
Yuda Zhao ◽  
Phillip B. Gibbons

Author(s):  
Zahid Raza ◽  
Deo P. Vidyarthi

A Grid is a parallel and distributed computing system comprising heterogeneous computing resources spread over multiple administrative domains, offering high-throughput computing. Since the Grid operates at a large scale, there is always a possibility of failure, ranging from hardware to software, and the penalty paid for these failures can be very large. The system therefore needs to tolerate the various failures which, in spite of many precautions, are bound to happen. Replication is a strategy often used to introduce fault tolerance and to ensure successful execution of a job even when some of the computational resources fail. Though replication incurs a heavy cost, a selective degree of replication can offer a good compromise between performance and cost. This chapter proposes a co-scheduler that can be integrated with the main scheduler for the execution of jobs submitted to a computational Grid. The main scheduler may have any performance-optimization criterion; integrating the co-scheduler adds fault tolerance on top of it. The chapter evaluates the performance of the co-scheduler together with a main scheduler designed to minimize the turnaround time of a modular job, introducing module replication to counter the effects of node failures in the Grid. A simulation study reveals that the model works well under various conditions, resulting in graceful degradation of the scheduler's performance while improving the overall reliability offered to the job.
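As a rough illustration of selective replication, the Python sketch below (the function names, the node-reliability model, and the fixed replication degree are our own assumptions, not the chapter's co-scheduler) assigns each module of a job to a small number of nodes and estimates the resulting job reliability by simulation.

import random

def replicate_modules(modules, nodes, degree=2):
    """Assign each job module to `degree` nodes (selective replication).
    `nodes` maps node id -> probability that the node survives the run."""
    ranked = sorted(nodes, key=nodes.get, reverse=True)  # most reliable first
    return {m: ranked[:degree] for m in modules}

def job_succeeds(plan, nodes):
    """One simulated run: each node survives with its given probability;
    the job finishes if every module keeps at least one live replica."""
    up = {n: random.random() < p for n, p in nodes.items()}
    return all(any(up[n] for n in replicas) for replicas in plan.values())

if __name__ == "__main__":
    nodes = {"n1": 0.9, "n2": 0.8, "n3": 0.7, "n4": 0.6}
    plan = replicate_modules(["m1", "m2", "m3"], nodes, degree=2)
    trials = 10_000
    ok = sum(job_succeeds(plan, nodes) for _ in range(trials))
    print(f"estimated job reliability with 2-way replication: {ok / trials:.3f}")

A real co-scheduler would also spread replicas to balance load and respect the main scheduler's turnaround-time objective; the point here is only that a small replication degree already masks most single-node failures.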


Author(s):  
Ghalem Belalem ◽  
Said Limam

Cloud computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the datacenters that provide those services. Failures of all types are common in current datacenters, partly due to the sheer number of nodes. Fault tolerance has become a major task for computer engineers and software developers because the occurrence of faults increases the cost of using resources; the most fundamental user expectation is, of course, that an application finishes correctly regardless of faults in the nodes. This paper proposes a fault-tolerant architecture for cloud computing that uses an adaptive checkpoint mechanism to ensure that a running task can finish correctly in spite of faults in the nodes on which it is running. The proposed fault-tolerant architecture is both transparent and scalable.
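A minimal sketch of what an adaptive checkpoint policy can look like is given below in Python (the interval-halving/expansion rule and the class name are illustrative assumptions, not the paper's architecture): the checkpoint interval shrinks when failures are observed and grows again while the node stays stable, trading checkpoint overhead against recomputation time.

import time

class AdaptiveCheckpointer:
    """Toy adaptive checkpointing policy: checkpoint more often after
    failures, less often during long stable periods."""

    def __init__(self, interval=60.0, min_interval=5.0, max_interval=600.0):
        self.interval = interval
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.last_checkpoint = time.monotonic()

    def on_failure(self):
        # Recent failure: recomputation is likely, so checkpoint more often.
        self.interval = max(self.min_interval, self.interval / 2)

    def on_stable_period(self):
        # No failure for a while: reduce checkpoint overhead.
        self.interval = min(self.max_interval, self.interval * 1.5)

    def maybe_checkpoint(self, save_state):
        """Call periodically from the task loop; `save_state` persists the
        task state to stable storage."""
        now = time.monotonic()
        if now - self.last_checkpoint >= self.interval:
            save_state()
            self.last_checkpoint = now
            return True
        return False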


2017 ◽  
Vol 17 (1&2) ◽  
pp. 106-116
Author(s):  
Jop Briët ◽  
Jeroen Zuiddam

After Bob sends Alice a bit, she responds with a lengthy reply. At the cost of a factor of two in the total communication, Alice could just as well have given Bob her two possible replies at once without listening to him at all, and have him select which one applies. Motivated by a conjecture stating that this form of “round elimination” is impossible in exact quantum communication complexity, we study the orthogonal rank and a symmetric variant thereof for a certain family of Cayley graphs. The orthogonal rank of a graph is the smallest number d for which one can label each vertex with a nonzero d-dimensional complex vector such that adjacent vertices receive orthogonal vectors. We show an exp(n) lower bound on the orthogonal rank of the graph on {0,1}^n in which two strings are adjacent if they have Hamming distance at least n/2. In combination with previous work, this implies an affirmative answer to the above conjecture.
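For reference, the quantities discussed in the abstract can be written out formally (the symbols \xi and H_n below are our notation, not the paper's):

\xi(G) \;=\; \min\bigl\{\, d \in \mathbb{N} \;:\; \exists\, f \colon V(G) \to \mathbb{C}^{d}\setminus\{0\} \ \text{with}\ \langle f(u), f(v)\rangle = 0 \ \text{for every edge}\ \{u,v\} \in E(G) \,\bigr\},

H_n \;=\; \Bigl( \{0,1\}^{n},\ \bigl\{\, \{x,y\} : d_H(x,y) \ge n/2 \,\bigr\} \Bigr), \qquad \xi(H_n) \;\ge\; \exp(\Omega(n)),

where d_H denotes Hamming distance; the last inequality is the exp(n) lower bound stated above.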


2013 ◽  
Vol 280 (1765) ◽  
pp. 20131151 ◽  
Author(s):  
T. Dávid-Barrett ◽  
R. I. M. Dunbar

Sociality is primarily a coordination problem. However, the social (or communication) complexity hypothesis suggests that the kinds of information that can be acquired and processed may limit the size and/or complexity of social groups that a species can maintain. We use an agent-based model to test the hypothesis that the complexity of information processed influences the computational demands involved. We show that successive increases in the kinds of information processed allow organisms to break through the glass ceilings that otherwise limit the size of social groups: larger groups can only be achieved at the cost of more sophisticated kinds of information processing that are disadvantageous when optimal group size is small. These results simultaneously support both the social brain and the social complexity hypotheses.


Author(s):  
Qihao Shan ◽  
Sanaz Mostaghim

Multi-option collective decision-making is a challenging task in the context of swarm intelligence. In this paper, we extend the problem of collective perception from simple binary decision-making of choosing the majority color to estimating the most likely fill ratio from a series of discrete fill ratio hypotheses. We have applied direct comparison (DC) and direct modulation of voter-based decisions (DMVD) to this scenario to observe their performances in a discrete collective estimation problem. We have also compared their performances against an Individual Exploration baseline. Additionally, we propose a novel collective decision-making strategy called distributed Bayesian belief sharing (DBBS) and apply it to the above discrete collective estimation problem. In the experiments, we explore the performances of the considered collective decision-making algorithms in various parameter settings to determine the trade-off among accuracy, speed, message transfer, and reliability in the decision-making process. Our results show that both DC and DMVD outperform the Individual Exploration baseline, but the two algorithms exhibit different trade-offs with respect to accuracy and decision speed. On the other hand, DBBS exceeds the performance of all other considered algorithms in all four metrics, at the cost of higher communication complexity.
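The estimation step underlying this task can be illustrated with a minimal Bayesian update over the discrete fill-ratio hypotheses (Python; this is a single-robot sketch of the estimation problem only, not the DBBS message-passing rules, and the observation sequence is invented for the demo):

def bayes_update(belief, hypotheses, observed_black):
    """Update a categorical belief over fill-ratio hypotheses after one
    binary observation (True = a black cell was sampled)."""
    posterior = [(p if observed_black else 1.0 - p) * w
                 for p, w in zip(hypotheses, belief)]
    total = sum(posterior)
    return [w / total for w in posterior]

hypotheses = [0.1 * k for k in range(1, 10)]          # candidate fill ratios
belief = [1.0 / len(hypotheses)] * len(hypotheses)    # uniform prior
for obs in [True, False, True, True, False, True]:    # sample observations
    belief = bayes_update(belief, hypotheses, obs)
best = hypotheses[belief.index(max(belief))]
print(f"most likely fill ratio: {best:.1f}")

The name of DBBS suggests that belief distributions of this kind, rather than single opinions, are what gets exchanged between robots, which is consistent with the higher communication cost reported in the abstract.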


Quantum ◽  
2020 ◽  
Vol 4 ◽  
pp. 286
Author(s):  
Shima Bab Hadiashar ◽  
Ashwin Nayak

We revisit the task of visible compression of an ensemble of quantum states with entanglement assistance in the one-shot setting. The protocols achieving the best compression use many more qubits of shared entanglement than the number of qubits in the states in the ensemble. Other compression protocols, with potentially larger communication cost, have entanglement cost bounded by the number of qubits in the given states. This motivates the question as to whether entanglement is truly necessary for compression, and if so, how much of it is needed. Motivated by questions in communication complexity, we lift certain restrictions that are placed on compression protocols in tasks such as state-splitting and channel simulation. We show that an ensemble of the form designed by Jain, Radhakrishnan, and Sen (ICALP'03) saturates the known bounds on the sum of communication and entanglement costs, even with the relaxed compression protocols we study. The ensemble and the associated one-way communication protocol have several remarkable properties. The ensemble is incompressible by more than a constant number of qubits without shared entanglement, even when constant error is allowed. Moreover, in the presence of shared entanglement, the communication cost of compression can be arbitrarily smaller than the entanglement cost. The quantum information cost of the protocol can thus be arbitrarily smaller than the cost of compression without shared entanglement. The ensemble can also be used to show the impossibility of reducing, via compression, the shared entanglement used in two-party protocols for computing Boolean functions.


Consistent global state (CGS) accumulation, i.e., checkpointing, is one of the most commonly used methods for providing fault tolerance in distributed systems, allowing the system to keep operating even if one or more components fail. However, mobile computing systems are constrained by low bandwidth, mobility, lack of stable storage, frequent disconnections, and limited battery life, so checkpointing protocols that take fewer checkpoints are preferred in mobile environments. In this paper, we propose a minimum-process coordinated checkpointing protocol for deterministic distributed applications on mobile computing systems. We eliminate useless checkpoints as well as the blocking of processes during checkpointing, at the cost of logging anti-messages for only a very small number of messages. We also try to minimize the loss of checkpointing effort when any process fails to take its checkpoint in an initiation, thereby coping with repeated failures during checkpointing.
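A toy sketch of the coordinated-checkpointing pattern referred to above is given below (Python; the two-phase structure, class names, and the way a failed initiation is discarded are generic illustrations, not this paper's protocol, and anti-message logging is omitted):

class Process:
    """A process that can take tentative checkpoints and commit them."""
    def __init__(self, pid):
        self.pid = pid
        self.state = {"work_done": 0}
        self.tentative = None
        self.committed = None

    def take_tentative_checkpoint(self):
        self.tentative = dict(self.state)   # snapshot local state
        return True                         # may fail on a real mobile host

    def commit(self):
        self.committed = self.tentative

    def discard(self):
        self.tentative = None


def coordinated_checkpoint(initiator, dependents):
    """Two-phase checkpoint over a minimal set of processes: only those the
    initiator depends on participate, keeping checkpointing effort low."""
    participants = [initiator] + dependents
    # Phase 1: tentative checkpoints.
    if all(p.take_tentative_checkpoint() for p in participants):
        # Phase 2: commit, yielding a new consistent global state.
        for p in participants:
            p.commit()
        return True
    # Any failure aborts the round so partial state is never used.
    for p in participants:
        p.discard()
    return False

Keeping the participant set minimal and the rounds non-blocking is what such a protocol aims for; in the paper this is achieved by logging anti-messages for a small number of messages instead of blocking processes.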


Author(s):  
Said Limam ◽  
Ghalem Belalem

Cloud computing has become a significant technology and a great solution for providing a flexible, on-demand, and dynamically scalable computing infrastructure for many applications; it also represents a significant technology trend. With cloud computing, users employ a variety of devices to access programs, storage, and application-development platforms over the Internet, via services offered by cloud computing providers. The probability that a failure occurs during execution grows as the number of nodes increases; since it is impossible to fully prevent failures, one solution is to implement fault tolerance mechanisms. Fault tolerance has become a major task for computer engineers and software developers because the occurrence of faults increases the cost of using resources. In this paper, the authors propose an approach that combines migration and checkpoint mechanisms. The checkpoint mechanism minimizes the time lost and reduces the effect of failures on application execution, while the migration mechanism guarantees the continuity of application execution and avoids any loss due to hardware failure in a transparent and efficient way. The results obtained by simulation show the effectiveness of the approach to fault tolerance in terms of execution time and masking the effects of failures.
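A minimal sketch of combining the two mechanisms is shown below (Python; the Node model, step-based execution, and interval choice are illustrative assumptions, not the paper's implementation): the task checkpoints its progress periodically, and when its node fails the last checkpoint is migrated to a healthy node and execution resumes there, so the work lost is bounded by the checkpoint interval.

import random

class Node:
    """A compute node that may fail while executing a step (toy model)."""
    def __init__(self, name, failure_prob=0.005):
        self.name = name
        self.failure_prob = failure_prob
        self.alive = True

    def execute(self, step):
        if random.random() < self.failure_prob:
            self.alive = False          # hardware failure during execution
            raise RuntimeError(f"{self.name} failed at step {step}")


def run_with_checkpoint_and_migration(task_steps, nodes, checkpoint_every=10):
    """Run `task_steps` units of work with periodic checkpointing; on node
    failure, migrate the latest checkpoint to a healthy node and resume."""
    checkpoint = 0
    progress = 0
    node = next(n for n in nodes if n.alive)
    while progress < task_steps:
        try:
            node.execute(progress)
            progress += 1
            if progress % checkpoint_every == 0:
                checkpoint = progress   # persist to shared/stable storage
        except RuntimeError:
            healthy = [n for n in nodes if n.alive]
            if not healthy:
                raise RuntimeError("no healthy node left to migrate to")
            node = healthy[0]           # migration target
            progress = checkpoint       # roll back to the last checkpoint
    return progress


if __name__ == "__main__":
    cluster = [Node(f"vm{i}") for i in range(6)]
    done = run_with_checkpoint_and_migration(200, cluster, checkpoint_every=20)
    print(f"task finished: {done} steps completed despite node failures")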

