cluster architecture
Recently Published Documents


Total documents: 145 (five years: 29)
H-index: 13 (five years: 2)

2021 ◽  
Vol 2131 (3) ◽  
pp. 032114
Author(s):  
M Reznikov ◽  
Y Fedosenko

For a computationally complex canonical scheduling problem, formulated as an optimization model for single-processor servicing of a finite deterministic flow of objects, the computational scheme of a discrete dynamic programming algorithm in a cluster implementation is considered. Variants of balancing the computational subtasks across the cluster's network nodes are investigated, with the aim of reducing the volume and intensity of intra-network interaction. It is established that, to improve the efficiency of the cluster algorithm in practice, what matters is not a more uniform distribution of subtasks among the cluster nodes but minimal network traffic between them. Balancing options are proposed that significantly increase data locality in network computing. The experimental results are confirmed analytically and show the scaling limits of discrete dynamic programming algorithms implemented on a cluster architecture. A method is shown for choosing the number of computational nodes and the problem dimension so as to achieve a threefold reduction in network exchange overhead. The results make it possible to objectively justify the choice of methodological and algorithmic approaches when selecting computing tools and developing architectural and technological solutions to support dispatching systems in inland water transport.


2021 ◽  
Vol 23 (08) ◽  
pp. 931-935
Author(s):  
Ajay Kumar Bansal ◽  
Manmohan Sharma ◽  
Ashu Gupta ◽  
...  

Modern computing systems are generally enormous in scale, consisting of hundreds to thousands of heterogeneous machine nodes, to meet rising demand for Cloud services. MapReduce and other parallel computing frameworks are frequently used on such cluster architectures to offer consumers dependable and timely services. However, the complex features of Cloud workloads, such as multi-dimensional resource requirements, and dynamically changing system conditions, such as varying node performance, pose new difficulties for providers in terms of both customer experience and system efficiency. The straggler problem occurs when a small subset of parallelized tasks takes an excessively long time to execute compared with their siblings, resulting in a delayed job response and the possibility of late-timing failure. Speculative execution is the state-of-the-art method for straggler mitigation. It has been used in numerous real-world systems with a variety of implementation improvements, but the research in this thesis demonstrates that it is typically wasteful. The failure rate of speculative execution can be as high as 71 percent, according to production trace logs from different data centers. Straggler mitigation is a difficult task in and of itself: 1) stragglers may have varying degrees of severity in parallel job execution; 2) whether a task should be considered a straggler is highly subjective, depending on various application and system conditions; 3) the efficiency of speculative execution would be improved if dynamic node quality could be adequately modeled and predicted; 4) other sorts of stragglers, such as those generated by data skew, are beyond speculative execution’s capabilities.
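To make the second point concrete, here is a minimal sketch of one common way to label stragglers: flag any task whose duration exceeds a multiple of the median duration of its sibling tasks. The function name `find_stragglers`, the median baseline, and the 1.5x default threshold are assumptions made for illustration; the abstract does not specify a detection rule and stresses that the right cutoff is subjective.

```python
from statistics import median


def find_stragglers(durations, threshold=1.5):
    """Return indices of tasks whose duration exceeds threshold * median.

    The median baseline and the default threshold are illustrative
    assumptions, not values taken from the abstract.
    """
    if not durations:
        return []
    cutoff = threshold * median(durations)
    return [i for i, d in enumerate(durations) if d > cutoff]


# Made-up task durations in seconds for one parallel job.
task_durations = [10.0, 11.0, 9.5, 10.5, 42.0]
print(find_stragglers(task_durations))
```

Changing `threshold` changes which tasks count as stragglers, which is one simple way to see why the classification depends on application and system conditions.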


2021 ◽  
Vol 11 (4) ◽  
pp. 281-285
Author(s):  
Mahyar Shahsavari ◽  
Jonathan Beaumont ◽  
David Thomas ◽  
Andrew D. Brown

Spiking Neural Networks (SNNs) are a branch of neuromorphic computing and are currently used in neuroscience applications to understand and model the biological brain. SNNs could also potentially be used in many other application domains such as classification, pattern recognition, and autonomous control. This work presents a highly scalable hardware platform called POETS and uses it to implement SNNs on a very large number of parallel and reconfigurable FPGA-based processors. The current system consists of 48 FPGAs, providing 3072 processing cores and 49152 threads. We use this hardware to implement up to four million neurons with one thousand synapses. Comparison with other similar platforms shows that the current POETS system is twenty times faster than the Brian simulator, and at least two times faster than SpiNNaker.
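The hardware figures quoted above imply a regular layout, which a few lines of arithmetic make explicit. The per-FPGA and per-core ratios follow directly from the stated totals; the neurons-per-thread figure additionally assumes that the four million neurons are spread evenly over all threads, which the abstract does not state.

```python
# Totals as quoted in the abstract.
fpgas = 48
cores = 3072
threads = 49152
neurons = 4_000_000

cores_per_fpga = cores // fpgas        # integer division; divides evenly
threads_per_core = threads // cores    # integer division; divides evenly

# Assumption: neurons are distributed evenly across every hardware thread.
neurons_per_thread = neurons / threads

print(cores_per_fpga, threads_per_core, round(neurons_per_thread, 1))
```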


2021 ◽  
Author(s):  
Mahyar Shahsavari ◽  
David Thomas ◽  
Andrew Brown ◽  
Wayne Luk

2021 ◽  
Vol 18 (4) ◽  
pp. 1-22
Author(s):  
Jerzy Proficz

Two novel algorithms for the all-gather operation resilient to imbalanced process arrival patterns (PAPs) are presented. The first one, Background Disseminated Ring (BDR), is based on the regular parallel ring algorithm often supplied in MPI implementations and exploits an auxiliary background thread for early data exchange from faster processes to accelerate the all-gather operation. The other algorithm, Background Sorted Linear synchronized tree with Broadcast (BSLB), is built upon an existing PAP-aware gather algorithm, Background Sorted Linear Synchronized tree (BSLS), followed by a regular broadcast that distributes the gathered data to all participating processes. The background of the imbalanced PAP subject is described, along with PAP monitoring and evaluation. An experimental evaluation of the algorithms based on a proposed mini-benchmark is presented. The mini-benchmark was run over 2,000 times in a typical HPC cluster architecture with homogeneous compute nodes. The results are analyzed for different PAPs, data sizes, and process counts, showing that the proposed optimization works well for various configurations, is scalable, and can significantly reduce all-gather elapsed times, in our case by up to a factor of 1.9, or 47%, in comparison with the best state-of-the-art solution.
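For readers unfamiliar with the baseline, the following is a minimal single-process simulation of the regular ring all-gather that BDR is described as building on. It is only a sketch of the textbook ring pattern: it does not model the background thread, imbalanced process arrival, or any MPI call, and the function name `ring_allgather` is an assumption made for illustration.

```python
def ring_allgather(local_data):
    """Simulate a regular ring all-gather over len(local_data) processes.

    local_data[i] is the block initially owned by process i. The result
    holds, for each process, all blocks ordered by owner rank.
    """
    p = len(local_data)
    # buffers[i][j] is the block owned by process j once process i holds it.
    buffers = [[None] * p for _ in range(p)]
    for rank, block in enumerate(local_data):
        buffers[rank][rank] = block

    for step in range(p - 1):
        # Every process forwards the block it received most recently
        # (its own block in step 0) to its right-hand neighbour.
        sends = []
        for rank in range(p):
            owner = (rank - step) % p
            sends.append(((rank + 1) % p, owner, buffers[rank][owner]))
        # Apply all sends only after collecting them, so that the step
        # behaves like a simultaneous exchange.
        for dest, owner, block in sends:
            buffers[dest][owner] = block
    return buffers


result = ring_allgather(["a", "b", "c", "d"])
print(result[0])
```

The ring needs p - 1 steps for p processes, so a single late process delays everyone, which is presumably what the background exchange in BDR is meant to soften. The two reported figures are consistent with each other: a speedup factor of 1.9 corresponds to an elapsed-time reduction of 1 - 1/1.9, or roughly 47%.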


Author(s):  
Hai

In this paper, a new Raspberry Pi supercomputer cluster architecture is proposed. To reach petaflops and exaflops speeds, typical modern supercomputers based on 2009-2018 computing technologies must consume between 6 MW and 20 MW of electrical power, almost all of which is converted into heat, which makes cooling technology and cooling towers costly. The management of heat density has remained a key issue for most centralized supercomputers. In our proposed architecture, supercomputers built from highly energy-efficient mobile ARM processors are a new option that addresses performance, power, and cost issues. With ARM’s recent introduction of energy-efficient 64-bit CPUs targeting servers, supercomputing based on Raspberry Pi cluster modules is now within reach. But how well do supercomputers based on mobile multicore processors perform? The experimental results obtained for the proposed approach indicate lower electrical power consumption and higher performance in comparison with previous approaches.


2021 ◽  
Vol 40 (5) ◽  
pp. 8727-8740
Author(s):  
Rajvir Singh ◽  
C. Rama Krishna ◽  
Rajnish Sharma ◽  
Renu Vig

Dynamic and frequent re-clustering of nodes, along with data aggregation, is used to achieve energy-efficient operation in wireless sensor networks. But dynamic cluster formation supports data aggregation only when clusters can be formed from any set of nodes that lie in close proximity to each other. Frequent re-clustering makes network management difficult and adversely affects the use of energy-efficient TDMA-based scheduling for data collection within the clusters. To circumvent these issues, a centralized Fixed-Cluster Architecture (FCA) is proposed in this paper. The proposed scheme leads to a simplified network implementation for smart spaces, where it makes more sense to aggregate data from a cluster of sensors located within the confines of a designated area. A comparative study is done with dynamic clusters formed with the distributed Low Energy Adaptive Clustering Hierarchy (LEACH) and a centralized Harmonic Search Algorithm (HSA). Using a uniform cluster size for FCA, the results show that it uses the available energy efficiently, providing stability periods that are 56% and 41% longer than those of LEACH and HSA, respectively.
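The idea of fixed clusters tied to a designated area can be sketched as follows. This is an illustration under stated assumptions, not the paper's FCA scheme: the abstract does not say how areas are defined, so the sketch assumes a square grid, and the helper names `assign_fixed_clusters` and `aggregate_by_cluster` as well as the averaging rule are hypothetical.

```python
from collections import defaultdict


def assign_fixed_clusters(nodes, cell_size):
    """Group nodes into fixed clusters by the grid cell that contains them.

    nodes maps a node id to its (x, y) position. The square grid is an
    assumed stand-in for the "designated areas" of a smart space.
    """
    clusters = defaultdict(list)
    for node_id, (x, y) in nodes.items():
        cell = (int(x // cell_size), int(y // cell_size))
        clusters[cell].append(node_id)
    return dict(clusters)


def aggregate_by_cluster(clusters, readings):
    """Average the sensor readings of each cluster (assumed aggregation)."""
    return {
        cell: sum(readings[n] for n in members) / len(members)
        for cell, members in clusters.items()
    }


# Made-up node positions (metres) and temperature readings.
nodes = {"a": (1, 1), "b": (2, 3), "c": (12, 1)}
readings = {"a": 20.0, "b": 22.0, "c": 30.0}
clusters = assign_fixed_clusters(nodes, cell_size=10)
print(aggregate_by_cluster(clusters, readings))
```

Because membership depends only on position, the assignment is computed once and never changes, so a per-cluster TDMA schedule would not need to be rebuilt, in contrast to schemes that re-cluster every round.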

