Designing High-End Computing Systems with InfiniBand and High-Speed Ethernet

Author(s):  
D.K. Panda ◽  
S. Sur ◽  
P. Balaji
Science ◽  
2017 ◽  
Vol 358 (6369) ◽  
pp. 1423-1427
Author(s):  
Feng Rao ◽  
Keyuan Ding ◽  
Yuxing Zhou ◽  
Yonghui Zheng ◽  
Mengjiao Xia ◽  
...  

Operation speed is a key challenge in phase-change random-access memory (PCRAM) technology, especially for achieving subnanosecond high-speed cache memory. Commercialized PCRAM products are limited to writing speeds of tens of nanoseconds, a limit that originates from stochastic crystal nucleation during the crystallization of amorphous germanium antimony telluride (Ge2Sb2Te5). Here, we demonstrate an alloying strategy to speed up the crystallization kinetics. The scandium antimony telluride (Sc0.2Sb2Te3) compound that we designed allows a writing speed of only 700 picoseconds without preprogramming in a large conventional PCRAM device. This ultrafast crystallization stems from the reduced stochasticity of nucleation through geometrically matched and robust scandium telluride (ScTe) chemical bonds that stabilize crystal precursors in the amorphous state. Controlling nucleation through alloy design paves the way for the development of cache-type PCRAM technology to boost the working efficiency of computing systems.


2021 ◽  
Vol 19 (4) ◽  
pp. 111-117
Author(s):  
N. E. Sapozhnikov ◽  
V. G. Zolotykh ◽  
A. S. Zakharov

At present, digital signal processing involves enormous volumes of real-time computation on large bit arrays. As the problems to be solved grow more complex, ever stricter requirements are imposed on the main parameters of digital processors (speed, reliability, power consumption, etc.), which determine the computing capabilities of systems with digital signal processing. In turn, the rapid development of microelectronics makes it possible to build increasingly high-performance computing systems and thus to solve increasingly complex problems, including in the military sphere. The production of the latest information technology is a technological task that only economically developed countries can solve. Bringing domestic microelectronics to the current world level requires significant investment. Therefore, the study of discrete nodes and devices is of direct practical importance. When developing promising computers, new technological approaches should be applied: minimizing power consumption, maintaining modularity and high computational density within a single node, creating high-speed data transmission with the lowest delays, creating an efficient storage system, and choosing the best types of memory. One such approach is the use of a non-positional form of information representation in computing systems for national and military purposes. This gives a number of advantages, chief among them a decrease (by orders of magnitude) in the hardware volume of computing devices, an increase in the speed of calculations, and an increase in the noise immunity of communication channels.
To use this method, it is proposed to include in the information processing device a probabilistic arithmetic device that performs the basic arithmetic operations (addition, multiplication, exponentiation, subtraction, division) without additional algorithms and mechanisms, in contrast to the "classical" digital representation of binary information, where all operations are built on the addition operation.
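As a rough illustration of why a probabilistic, non-positional representation can shrink arithmetic hardware, the sketch below uses the textbook unipolar stochastic encoding, in which a value p in [0, 1] is carried by a bitstream whose bits are 1 with probability p. The abstract does not state which encoding the authors use, so the encoding, the function names, and the stream length are all assumptions made for this example. Under this encoding, multiplication reduces to a bitwise AND of two independent streams, and a scaled addition reduces to a multiplexer.

```python
import random


def encode(p, length, rng):
    """Encode a probability p in [0, 1] as a random bitstream (assumed unipolar encoding)."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def decode(stream):
    """Estimate the encoded value as the fraction of ones in the stream."""
    return sum(stream) / len(stream)


def stochastic_multiply(a, b):
    """Multiply two independent streams: one AND gate per bit pair."""
    return [x & y for x, y in zip(a, b)]


def stochastic_scaled_add(a, b, select):
    """Compute (a + b) / 2 with a multiplexer driven by a p = 0.5 select stream."""
    return [x if s else y for x, y, s in zip(a, b, select)]


if __name__ == "__main__":
    rng = random.Random(0)      # fixed seed so the run is repeatable
    n = 100_000                 # longer streams give lower estimation error
    a = encode(0.5, n, rng)
    b = encode(0.4, n, rng)
    sel = encode(0.5, n, rng)
    print("product estimate:", decode(stochastic_multiply(a, b)))            # close to 0.20
    print("scaled sum estimate:", decode(stochastic_scaled_add(a, b, sel)))  # close to 0.45
```

The results are estimates whose error falls as the stream length grows, which is the usual trade-off of this representation: very small logic per operation in exchange for longer streams and statistical rather than exact results.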


2020 ◽  
Vol 18 (03) ◽  
pp. 2050002
Author(s):  
Meysam Rashno ◽  
Majid Haghparast ◽  
Mohammad Mosleh

In recent years, there has been an increasing tendency towards designing circuits based on reversible logic, which has received much attention because it prevents internal power dissipation. In digital computing systems, multiplier circuits are among the most fundamental and practical circuits, used in a wide range of hardware such as arithmetic circuits and the Arithmetic Logic Unit (ALU). The Vedic multiplier, which is based on the Urdhva Tiryakbhayam (UT) algorithm, has many applications in circuit design because of its high multiplication speed compared to other multipliers. In Vedic multipliers, partial products are obtained through vertical and cross multiplication. In this paper, we propose four [Formula: see text] reversible Vedic multiplier blocks and use each one of them in its right place. Then, we propose a [Formula: see text] reversible Vedic multiplier using the four aforementioned multipliers. We prove that our design leads to better results in terms of quantum cost, number of constant inputs and number of garbage outputs, compared to previous designs. We also extend our proposed design to [Formula: see text] multipliers, which enables us to scale the design to any dimension. Moreover, we propose a formula for the quantum cost of our proposed [Formula: see text] reversible Vedic multiplier, which allows us to calculate the quantum cost even before designing the multiplier.
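The vertical and cross multiplication pattern can be sketched in ordinary (irreversible) Boolean logic. The block sizes in the abstract are elided, so the sketch below assumes, purely for illustration, a 2-bit block and a 4-bit multiplier assembled from four such blocks; the function names are hypothetical, and none of the paper's reversible gates, quantum cost, or garbage outputs are modelled.

```python
def vedic_2x2(a, b):
    """Multiply two 2-bit numbers with the vertical-and-crosswise pattern."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1

    p0 = a0 & b0                 # vertical: low bits
    x, y = a1 & b0, a0 & b1      # crosswise products
    p1 = x ^ y                   # half adder: sum of the cross terms
    c1 = x & y                   # half adder: carry of the cross terms
    v = a1 & b1                  # vertical: high bits
    p2 = v ^ c1                  # half adder: high product plus carry
    p3 = v & c1
    return p0 | (p1 << 1) | (p2 << 2) | (p3 << 3)


def vedic_4x4(a, b):
    """Multiply two 4-bit numbers from four 2x2 blocks (assumed decomposition)."""
    a_lo, a_hi = a & 0b11, (a >> 2) & 0b11
    b_lo, b_hi = b & 0b11, (b >> 2) & 0b11

    ll = vedic_2x2(a_lo, b_lo)   # vertical, low halves
    lh = vedic_2x2(a_lo, b_hi)   # crosswise
    hl = vedic_2x2(a_hi, b_lo)   # crosswise
    hh = vedic_2x2(a_hi, b_hi)   # vertical, high halves

    # In hardware these sums would be adder stages; plain integer
    # addition stands in for them here.
    return ll + ((lh + hl) << 2) + (hh << 4)


if __name__ == "__main__":
    print(vedic_4x4(13, 11))     # 13 * 11
```

Because the four block products are independent, they can be computed in parallel, which is the usual explanation for the speed advantage attributed to this scheme.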


2013 ◽  
Vol 22 (06) ◽  
pp. 1350043
Author(s):  
MINGDA ZHANG ◽  
SHUGANG WEI

Modular multiplication is a very important arithmetic operation in residue-based real-time computing systems. In this paper, we present multipliers using a modified binary tree of modulo m signed-digit (SD) number adders, where m = 2^n + μ (μ = ±1, 0). To simplify the residue SD adder, new addition rules are used for generating the intermediate sum and carry with a 1-bit binary encoded number representation. With the new encoding method, the proposed residue addition requires less hardware and a shorter delay time than the previous one. A modulo m multiplier can be implemented by a binary modulo m adder tree which has a depth of log2 n. In order to introduce a binary SD adder tree with the new addition rules, two novel modulo m adders are proposed in this paper. Finally, the evaluation shows that the two proposed modulo m adders perform more efficiently than the modulo SD adder described in our previous work, and a new binary SD adder tree structure is proposed.
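The adder-tree organization can be sketched with ordinary integers. The example below is only a structural illustration under stated assumptions: it replaces the paper's signed-digit adders and encoding with a plain conditional-subtract modular adder, and the function names are hypothetical. It forms one partial product per multiplier bit and then sums them pairwise, so the number of adder levels grows logarithmically with the number of partial products.

```python
def mod_add(x, y, m):
    """Add two residues in [0, m) and reduce with one conditional subtraction."""
    s = x + y
    return s - m if s >= m else s


def mod_mul_tree(a, b, n, mu):
    """Compute a * b mod m for m = 2**n + mu (mu in {-1, 0, 1}) with a binary adder tree."""
    m = (1 << n) + mu
    a %= m
    b %= m

    # Partial products: (a * 2**i mod m) when bit i of b is set, else 0.
    # n + 1 bits cover every residue, including 2**n when mu = +1.
    terms = []
    shifted = a
    for i in range(n + 1):
        terms.append(shifted if (b >> i) & 1 else 0)
        shifted = mod_add(shifted, shifted, m)   # modular doubling

    # Pairwise reduction: each pass is one level of the adder tree.
    while len(terms) > 1:
        nxt = [mod_add(terms[k], terms[k + 1], m)
               for k in range(0, len(terms) - 1, 2)]
        if len(terms) % 2:
            nxt.append(terms[-1])                # odd term passes through
        terms = nxt
    return terms[0]


if __name__ == "__main__":
    # m = 2**4 + 1 = 17
    print(mod_mul_tree(12, 15, n=4, mu=1), (12 * 15) % 17)
```

The two printed values should agree; the second is computed directly as a cross-check.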


Photonics ◽  
2021 ◽  
Vol 8 (2) ◽  
pp. 31
Author(s):  
Nikolaos-Panteleimon (Pandelis) Diamantopoulos ◽  
Suguru Yamaoka ◽  
Takuro Fujii ◽  
Hidetaka Nishi ◽  
Koji Takeda ◽  
...  

Near-future upgrades of intra-data-center networks and high-performance computing systems will require optical interconnects capable of operating beyond 100 Gbps/lane. For this evolution to be sustainable, high-speed yet energy-efficient transceivers are needed. To this end, we have previously demonstrated directly-modulated lasers (DMLs) capable of operating at 50 Gbps/lane with sub-pJ/bit efficiencies based on our novel membrane-III-V-on-Si technology. However, there is an inherent tradeoff between modulation speed and power consumption due to the carrier-photon dynamics in DMLs. In this work, we alleviate this tradeoff by introducing photon–photon resonance dynamics in our energy-efficient membrane DMLs-on-Si design and demonstrate a device with a maximum 3-dB bandwidth of 47.5 GHz. This is a bandwidth increase of more than 2x compared to our previous membrane DMLs-on-Si. Moreover, the DML is capable of delivering 60-GBaud PAM-4 signals under Ethernet's KP4-FEC threshold (net data rate of 113.42 Gbps) over 2 km of standard single-mode fiber. DC energy efficiencies of 0.17 pJ/bit at 25 °C and 0.34 pJ/bit at 50 °C have been achieved for the > 100-Gbps signals. Deploying such DMLs in an integrated multichannel transceiver should ensure a smooth evolution towards Terabit-class Ethernet links and on-board optics subsystems.


2019 ◽  
Vol 29 (3) ◽  
pp. 33-40
Author(s):  
A. E. Ometov ◽  
A. A. Vinogradov ◽  
A. S. Vorobiev

The article describes the experiments carried out during the post-silicon verification of the Elbrus-8CB microprocessor, one of the important stages of the verification process, which largely determines whether high-performance computing systems consisting of several microprocessors of this series can be built. The interprocessor communication channels of the Elbrus-8CB microprocessor were investigated, and several hypotheses were put forward about the reasons for their low operating speed. The experiments conducted to validate these hypotheses are described, with intermediate conclusions drawn from their results. The built-in testing mechanism of the CEI-6G and PCIe 2.0 physical levels is described along with its operating modes and testing algorithm. Several studies were carried out to ensure the correctness of the testing mechanism, which led to modifications of the initial testing method. Final conclusions were made about the reasons for the incorrect operation of interprocessor communications, and recommendations were given for improving the attenuation parameters of the high-speed communication signals and their level of interference immunity. The relevance of this study for the production of modern high-performance computing systems is evident not only in the growing interest of designers in this problem, but also in the tightening of the requirements of physical-layer manufacturers.


2021 ◽  
Vol 1 (1) ◽  
pp. 194-207
Author(s):  
S. S. Shevelev

Context. Modern general-purpose computers can implement any algorithm, but for certain problems they cannot compete in processing speed with specialized computing modules. Specialized devices offer high performance, efficiently handle array processing and artificial intelligence tasks, and are used as control devices. Specialized microprocessor modules that process character strings and logical and numerical values, represented as integers and real numbers, make it possible to speed up arithmetic operations by exploiting parallelism in data processing. Objective. To develop principles for constructing microprocessor modules for a modular computing system with a reconfigurable structure, an arithmetic-symbolic processor, specialized computing devices, and switching systems capable of configuring microprocessors and specialized computing modules into a multi-pipeline structure to increase the speed of arithmetic and logical operations, together with high-speed algorithms for specialized symbol-processing accelerator processors. To develop algorithms and structural and functional diagrams of specialized mathematical modules that perform arithmetic operations in direct codes on neural-like elements, and systems for decentralized control of the operation of blocks. Method. An information graph of the computational process of a modular system with a reconfigurable structure has been built. Structural and functional diagrams and algorithms have been developed for constructing specialized modules that perform arithmetic and logical operations, search operations, and functions for replacing occurrences in processed words. Software has been developed for simulating the operation of the arithmetic-symbolic processor, specialized computing modules, and switching systems. Results. A block diagram of a reconfigurable modular computing system has been developed. The system consists of compatible functional modules, supports static and dynamic reconfiguration, and has a parallel structure in which the processor and computing modules are connected through interface channels. It comprises an arithmetic-symbolic processor, specialized computing modules, and switching systems, and performs specific tasks of symbolic information processing and arithmetic and logical operations. Conclusions. The architecture of reconfigurable computing systems can change dynamically during operation. This makes it possible to adapt the architecture of a computing system to the structure of the problem being solved and to create problem-oriented computers whose structure corresponds to that of the problem. The main computing elements in reconfigurable computing systems are not universal microprocessors but programmable logic integrated circuits, which are combined through high-speed interfaces into a single computing field. Reconfigurable multi-pipeline computing systems based on such fields are an effective tool for solving streaming information processing and control problems.


2021 ◽  
Vol 12 (7) ◽  
pp. 350-357
Author(s):  
S. S. Shevelev

A device has been developed that performs logical and arithmetic operations and can be used to create high-performance, high-speed computing systems. Specialized blocks perform the logical operations AND, OR, and NOT and the arithmetic operations of addition and subtraction of binary numbers. Arithmetic operations are performed in direct fixed-point codes. The device is presented as a structural scheme, together with structural and functional schemes of its blocks and an algorithm for the operation of the device.


Author(s):  
L. D. Lopez ◽  
D. K. McElfresh ◽  
R. Melanson ◽  
D. Vacar

The need for high-bandwidth, high-speed interconnects with optimum routing through computer backplanes has led to the use of optical interconnects in multiprocessor computing systems [1]. Most of the current commercially available optical interfaces are based upon 850 nm vertical-cavity surface-emitting lasers (VCSELs). Extensive studies conducted by the VCSEL manufacturers show that the reliability of these devices continues to improve [2-4]. In order to understand the risks and implications of using VCSEL-based modules in computer systems, we have conducted an experiment designed to provide insight into the emission degradation and failure of VCSEL devices. In this paper we briefly describe the experiment and review the results of the subsequent failure analysis on degraded VCSEL arrays.

