A high performance block compression algorithm for small systems-software and hardware implementations

Author(s):  
A. de la Cruz Nogueiras ◽  
M. Gamez Lau ◽  
A. Cerdeira Altuzarra ◽  
M. Estrada del Cueto ◽  
P. Goga
Author(s):  
Nikolay Kondratyuk ◽  
Vsevolod Nikolskiy ◽  
Daniil Pavlov ◽  
Vladimir Stegailov

Classical molecular dynamics (MD) calculations represent a significant part of the utilization time of high-performance computing systems. The efficiency of such calculations depends on an interplay of software and hardware, both of which are now moving to hybrid GPU-based technologies. Several well-developed open-source MD codes focused on GPUs differ both in their data management capabilities and in performance. In this work, we analyze the performance of the LAMMPS, GROMACS and OpenMM MD packages with different GPU backends on Nvidia Volta and AMD Vega20 GPUs. We consider the efficiency of solving two identical MD models (generic for material science and biomolecular studies) using different software and hardware combinations. We describe our experience in porting the CUDA backend of LAMMPS to ROCm HIP, which shows considerable benefits for AMD GPUs compared with the OpenCL backend.


2011 ◽  
Vol 20 (03) ◽  
pp. 349-373 ◽  
Author(s):  
NADIA NEDJAH ◽  
RODRIGO MARTINS DA SILVA ◽  
LUIZA DE MACEDO MOURELLE

There are several possible implementations of artificial neural networks (ANNs), based on either software or hardware systems. Software implementations are rather inefficient because the intrinsic parallelism of the underlying computation is usually not exploited on a mono-processor computing system. Existing hardware implementations of ANNs are efficient, as the dedicated datapath is optimized and the hardware is usually parallel. Hardware implementations of ANNs may be digital, analog, or even hybrid. Digital implementations of ANNs tend to be of high complexity, and thus of high cost, and are somewhat imprecise due to the use of a lookup table for the activation function. On the other hand, analog implementations of ANNs are generally very simple and much more precise. In this paper, we focus on possible analog implementations of ANNs. The neuron is based on a simple operational amplifier. The reviewed implementations allow for the use of both negative and positive synaptic weights. An alternative implementation permits the realization of the training process.


2021 ◽  
Vol 27 (12) ◽  
pp. 625-633
Author(s):  
N. N. Levchenko ◽  
D. N. Zmejev

When developing high-performance multiprocessor computing systems, much attention is paid to ensuring uninterrupted operation, both in terms of hardware and software. In traditional computing systems, software is the main focus in addressing these issues. The article discusses a solution for ensuring uninterrupted operation of the parallel dataflow computing system (PDCS), which implements the dataflow computational model with a dynamically formed context. Due to the features of the PDCS, it is proposed to implement this type of control in hardware, which will increase its efficiency, since the computational process will be controlled dynamically rather than only statically.


2018 ◽  
Vol 14 (11) ◽  
pp. 117
Author(s):  
Bo Qiu

To realize the design of a mobile 4G gateway for a ZigBee wireless sensor network (WSN), a wireless remote monitoring scheme based on a ZigBee and general packet radio service (GPRS) WSN gateway system is proposed. The scheme combines the short range, low power consumption, and low cost of ZigBee technology with long-distance, widely available communication, and uses a ZigBee + GPRS + Android system architecture. On this hardware platform, the porting of the Android system and the development of the related hardware device drivers are designed and implemented, so as to build the software platform of the system. Based on the software and hardware platform, the related applications are designed and implemented according to the functional requirements of the system, and the software and hardware platform and the application program are tested and analyzed. The test results show that the system runs steadily and performs well. In summary, the hardware platform has the advantages of low energy consumption, high performance and scalability.


2012 ◽  
Vol 433-440 ◽  
pp. 629-634
Author(s):  
Xiu Feng Wang ◽  
Gang Cui

Based on ARM, a multifunction GPS/GPRS automobile security system was designed, focused on security and detection. The system combines detection technology, sensor technology, GPS technology, GPRS technology and digital filtering technology. The system is discussed in terms of both software and hardware. A related mathematical model was established. Corresponding tests were put forward to improve system reliability. Average filtering and median filtering algorithms are used in software to suppress interference signals of various frequencies. In this paper, the advantages of this system are discussed. The embedded system achieves low cost, high performance, real-time operation and reliability, and has high practical value.
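
As an illustration of the two software filters named in this abstract, the following is a minimal Python sketch of average (moving-mean) filtering and median filtering. It is not the authors' implementation: the function names, the window size of 3 and the sample values are assumptions chosen only for the example.

```python
from statistics import median


def moving_average(samples, window=3):
    """Average filter: mean of the current sample and up to window-1 previous ones."""
    out = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def median_filter(samples, window=3):
    """Median filter: median of a window centered on each sample (shorter at the edges)."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - half): i + half + 1]
        out.append(median(chunk))
    return out


# Hypothetical sensor readings with one impulse-noise spike (50).
readings = [10, 10, 50, 10, 10]
smoothed = moving_average(readings)   # spreads the spike over later samples
despiked = median_filter(readings)    # removes the isolated spike
```

The median filter discards an isolated spike entirely, while the average filter only attenuates it, which is one common reason for using both in combination.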


Author(s):  
Moez Ben HajHmida ◽  
Antonio Congiusta

Knowledge discovery has become a necessary task in scientific, life sciences, and business fields, both because of the growing amount of data being collected and because of the complexity of the analyses that need to be performed on it. Classic data mining techniques, developed for centralized sites, often prove inadequate due to some unique characteristics of today’s data sources. In such cases, sequential approaches to data mining cannot provide scalability in terms of data dimensionality, size, and runtime performance. Moreover, the increasing trend towards decentralized business organizations and the distribution of users, software, and hardware systems magnifies the need for more advanced and flexible approaches and solutions. Life science is one of the application areas that best resembles such a scenario. This chapter presents the state of the art of the major data mining techniques, systems and approaches. A detailed taxonomy is drawn by analyzing and comparing parallel, distributed and Grid-based data mining methods, with a particular focus on the exploitation of large and remotely dispersed datasets and/or high-performance computers.

