Towards High-performance X25519/448 Key Agreement in General Purpose GPUs

Author(s):  
Jiankuo Dong ◽  
Fangyu Zheng ◽  
Juanjuan Cheng ◽  
Jingqiang Lin ◽  
Wuqiong Pan ◽  
...  
2014 ◽  
Vol 36 (4) ◽  
pp. 790-798
Author(s):  
Kai ZHANG ◽  
Shu-Ming CHEN ◽  
Yao-Hua WANG ◽  
Xi NING

2011 ◽  
Vol 28 (1) ◽  
pp. 1-14 ◽  
Author(s):  
W. van Straten ◽  
M. Bailes

Abstractdspsr is a high-performance, open-source, object-oriented, digital signal processing software library and application suite for use in radio pulsar astronomy. Written primarily in C++, the library implements an extensive range of modular algorithms that can optionally exploit both multiple-core processors and general-purpose graphics processing units. After over a decade of research and development, dspsr is now stable and in widespread use in the community. This paper presents a detailed description of its functionality, justification of major design decisions, analysis of phase-coherent dispersion removal algorithms, and demonstration of performance on some contemporary microprocessor architectures.


Author(s):  
Sheng Kang ◽  
Guofeng Chen ◽  
Chun Wang ◽  
Ruiquan Ding ◽  
Jiajun Zhang ◽  
...  

With the advent of big data and cloud computing solutions, enterprise demand for servers is increasing. There is especially high growth for Intel based x86 server platforms. Today’s datacenters are in constant pursuit of high performance/high availability computing solutions coupled with low power consumption and low heat generation and the ability to manage all of this through advanced telemetry data gathering. This paper showcases one such solution of an updated rack and server architecture that promises such improvements. The ability to manage server and data center power consumption and cooling more completely is critical in effectively managing datacenter costs and reducing the PUE in the data center. Traditional Intel based 1U and 2U form factor servers have existed in the data center for decades. These general purpose x86 server designs by the major OEM’s are, for all practical purposes, very similar in their power consumption and thermal output. Power supplies and thermal designs for server in the past have not been optimized for high efficiency. In addition, IT managers need to know more information about servers in order to optimize data center cooling and power use, an improved server/rack design needs to be built to take advantage of more efficient power supplies or PDU’s and more efficient means of cooling server compute resources than from traditional internal server fans. This is the constant pursuit of corporations looking at new ways to improving efficiency and gaining a competitive advantage. A new way to optimize power consumption and improve cooling is a complete redesign of the traditional server rack. Extracting internal server power supplies and server fans and centralizing these within the rack aims to achieve this goal. This type of design achieves an entirely new low power target by utilizing centralized, high efficiency PDU’s that power all servers within the rack. Cooling is improved by also utilizing large efficient rack based fans for airflow to all servers. Also, opening up the server design is to allow greater airflow across server components for improved cooling. This centralized power supply breaks through the traditional server power limits. Rack based PDU’s can adjust the power efficiency to a more optimum point. Combine this with the use of online + offline modes within one single power supply. Cold backup makes data center power to achieve optimal power efficiency. In addition, unifying the mechanical structure and thermal definitions within the rack solution for server cooling and PSU information allows IT to collect all server power and thermal information centrally for improved ease in analyzing and processing.


2014 ◽  
Vol 2014 ◽  
pp. 1-13 ◽  
Author(s):  
Mouna Baklouti ◽  
Mohamed Abid

To meet the high performance demands of embedded multimedia applications, embedded systems are integrating multiple processing units. However, they are mostly based on custom-logic design methodology. Designing parallel multicore systems using available standards intellectual properties yet maintaining high performance is also a challenging issue. Softcore processors and field programmable gate arrays (FPGAs) are a cheap and fast option to develop and test such systems. This paper describes a FPGA-based design methodology to implement a rapid prototype of parametric multicore systems. A study of the viability of making the SoC using the NIOS II soft-processor core from Altera is also presented. The NIOS II features a general-purpose RISC CPU architecture designed to address a wide range of applications. The performance of the implemented architecture is discussed, and also some parallel applications are used for testing speedup and efficiency of the system. Experimental results demonstrate the performance of the proposed multicore system, which achieves better speedup than the GPU (29.5% faster for the FIR filter and 23.6% faster for the matrix-matrix multiplication).


2014 ◽  
Vol 596 ◽  
pp. 276-279
Author(s):  
Xiao Hui Pan

Graph component labeling, which is a subset of the general graph coloring problem, is a computationally expensive operation in many important applications and simulations. A number of data-parallel algorithmic variations to the component labeling problem are possible and we explore their use with general purpose graphical processing units (GPGPUs) and with the CUDA GPU programming language. We discuss implementation issues and performance results on CPUs and GPUs using CUDA. We evaluated our system with real-world graphs. We show how to consider different architectural features of the GPU and the host CPUs and achieve high performance.


Author(s):  
Mário Pereira Vestias

High-performance reconfigurable computing systems integrate reconfigurable technology in the computing architecture to improve performance. Besides performance, reconfigurable hardware devices also achieve lower power consumption compared to general-purpose processors. Better performance and lower power consumption could be achieved using application-specific integrated circuit (ASIC) technology. However, ASICs are not reconfigurable, turning them application specific. Reconfigurable logic becomes a major advantage when hardware flexibility permits to speed up whatever the application with the same hardware module. The first and most common devices utilized for reconfigurable computing are fine-grained FPGAs with a large hardware flexibility. To reduce the performance and area overhead associated with the reconfigurability, coarse-grained reconfigurable solutions has been proposed as a way to achieve better performance and lower power consumption. In this chapter, the authors provide a description of reconfigurable hardware for high-performance computing.


Author(s):  
Mário Pereira Vestias

High-Performance Reconfigurable Computing systems integrate reconfigurable technology in the computing architecture to improve performance. Besides performance, reconfigurable hardware devices also achieve lower power consumption compared to General-Purpose Processors. Better performance and lower power consumption could be achieved using Application Specific Integrated Circuit (ASIC) technology. However, ASICs are not reconfigurable, turning them application specific. Reconfigurable logic becomes a major advantage when hardware flexibility permits to speed up whatever the application with the same hardware module. The first and most common devices utilized for reconfigurable computing are fine-grained FPGAs with a large hardware flexibility. To reduce the performance and area overhead associated with the reconfigurability, coarse-grained reconfigurable solutions has been proposed as a way to achieve better performance and lower power consumption. In this chapter we will provide a description of reconfigurable hardware for high performance computing.


Author(s):  
Domingo Benitez

Many accelerator-based computers have demonstrated that they can be faster and more energy-efficient than traditional high-performance multi-core computers. Two types of programmable accelerators are available in high-performance computing: general-purpose accelerators such as GPUs, and customizable accelerators such as FPGAs, although general-purpose accelerators have received more attention. This chapter reviews the state-of-the-art and current trends of high-performance customizable computers (HPCC) and their use in Computational Science and Engineering (CSE). A top-down approach is used to be more accessible to the non-specialists. The “top view” is provided by a taxonomy of customizable computers. This abstract view is accompanied with a performance comparison of common CSE applications on HPCC systems and high-performance microprocessor-based computers. The “down view” examines software development, describing how CSE applications are programmed on HPCC computers. Additionally, a cost analysis and an example illustrate the origin of the benefits. Finally, the future of the high-performance customizable computing is analyzed.


Author(s):  
Hantao Zhang

The theory of combinatorial designs has always been a rich source of structured, parametrized families of SAT instances. On one hand, design theory provides interesting problems for testing various SAT solvers; on the other hand, high-performance SAT solvers provide an alternative tool for attacking open problems in design theory, simply by encoding problems as propositional formulas, and then searching for their models using off-the-shelf general purpose SAT solvers. This chapter presents several case studies of using SAT solvers to solve hard design theory problems, including quasigroup problems, Ramsey numbers, Van der Waerden numbers, covering arrays, Steiner systems, and Mendelsohn designs. It is shown that over a hundred of previously-open design theory problems were solved by SAT solvers, thus demonstrating the significant power of modern SAT solvers. Moreover, the chapter provides a list of 30 open design theory problems for the developers of SAT solvers to test their new ideas and weapons.


Sign in / Sign up

Export Citation Format

Share Document