On combining GUI desktop GIS with computer clusters & cloud resources, the role of programming skills and the state of the art in GUI driven GIS HPC applications

Author(s):  
Sebastian M. Ernst

The Free and Open Source Software (FOSS) ecosystem around Geographic Information Systems (GIS) is currently seeing rapid growth, similar to FOSS ecosystems in other scientific disciplines. At the same time, the need for broad programming and software development skills appears to be becoming a common theme for potential (scientific) users. There is a rather clear boundary between what can be done with Graphical User Interface (GUI) applications such as QGIS alone on the one hand and contemporary software libraries on the other, if one actually has the required skill set to use the latter. Practical experience shows that more and more types of research require far more than rudimentary software development skills. Those can be hard to acquire and distract from the actual scientific work at hand. For instance, the installation, integration and deployment of much-desired software libraries from the field of high-performance computing (HPC), e.g. for general-purpose computing on graphics processing units (GPGPU) or for computations on clusters or cloud resources, very often becomes an obstacle of its own. Recent advances in packaging and deployment systems around popular programming language ecosystems such as Python, however, enable a new kind of thinking. Desktop GUI applications can now be combined much more easily with the mentioned types of libraries, which drastically lowers the entry barrier to HPC applications and to the handling of large quantities of data. This work aims at providing an overview of the state of the art in this field and at showcasing possible techniques.
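The combination the abstract describes can be sketched in a few lines: a per-tile raster computation, written as an ordinary Python function, fanned out to parallel workers from within a GUI application's Python console. This is an illustrative sketch only, not code from the work itself; the `ndvi` function and the tile layout are hypothetical, and a cluster scheduler (e.g. `dask.distributed`) could stand in for the local process pool.

```python
# Hedged sketch: parallel raster-tile processing from a GIS Python console.
# Only numpy and the standard library are assumed.
from concurrent.futures import ProcessPoolExecutor
import numpy as np

def ndvi(tile):
    """Hypothetical per-tile computation: NDVI from a (red, nir) band pair."""
    red, nir = tile
    return (nir - red) / (nir + red + 1e-9)

def process_raster(tiles, workers=2):
    """Fan tiles out to a local process pool; on a cluster, a distributed
    executor could replace ProcessPoolExecutor with the same interface."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ndvi, tiles))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tiles = [(rng.random((64, 64)), rng.random((64, 64))) for _ in range(4)]
    results = process_raster(tiles)
    print(len(results), results[0].shape)
```

The point of the sketch is that nothing in it is HPC-specific at the source level: the same `ndvi` function runs unchanged whether the executor is local, a cluster, or a cloud backend, which is exactly the kind of decoupling that modern Python packaging makes deployable alongside a GUI application.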

1992 ◽  
Vol 36 (5) ◽  
pp. 821-828 ◽  
Author(s):  
K. H. Brown ◽  
D. A. Grose ◽  
R. C. Lange ◽  
T. H. Ning ◽  
P. A. Totta

2021 ◽  
Vol 14 (4) ◽  
pp. 1-28
Author(s):  
Tao Yang ◽  
Zhezhi He ◽  
Tengchuan Kou ◽  
Qingzheng Li ◽  
Qi Han ◽  
...  

Field-programmable Gate Arrays (FPGAs) are a high-performance computing platform for Convolutional Neural Network (CNN) inference. The Winograd algorithm, weight pruning, and quantization are widely adopted to reduce the storage and arithmetic overhead of CNNs on FPGAs. Recent studies strive to prune the weights in the Winograd domain; however, this results in irregular sparse patterns, leading to low parallelism and reduced utilization of resources. Moreover, few works discuss a suitable quantization scheme for Winograd. In this article, we propose a regular sparse pruning pattern for Winograd-based CNNs, namely Sub-row-balanced Sparsity (SRBS), to overcome the challenge of irregular sparse patterns. We then develop a two-step hardware co-optimization approach to improve the model accuracy under the SRBS pattern. Based on the pruned model, we implement mixed-precision quantization to further reduce the computational complexity of bit operations. Finally, we design an FPGA accelerator that takes advantage both of the SRBS pattern, to eliminate low-parallelism computation and irregular memory accesses, and of the mixed-precision quantization, to obtain a layer-wise bit width. Experimental results on VGG16/VGG-nagadomi with CIFAR-10 and ResNet-18/34/50 with ImageNet show up to 11.8×/8.67× and 8.17×/8.31×/10.6× speedup, and 12.74×/9.19× and 8.75×/8.81×/11.1× energy efficiency improvement, respectively, compared with the state-of-the-art dense Winograd accelerator [20], with negligible loss of model accuracy. We also show that our design has a 4.11× speedup compared with the state-of-the-art sparse Winograd accelerator [19] on VGG16.
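The core idea of a sub-row-balanced sparsity pattern can be illustrated in software: split every weight row into fixed-length sub-rows and keep the same number of largest-magnitude weights in each, so all sub-rows end up with identical sparsity and the hardware can schedule them uniformly. This is a minimal magnitude-based sketch under assumed parameters, not the paper's actual pruning/retraining flow or its Winograd-domain transform.

```python
import numpy as np

def prune_sub_row_balanced(w, sub_len=4, keep=2):
    """Zero all but the `keep` largest-magnitude weights in every
    length-`sub_len` sub-row, so each sub-row has identical sparsity
    (a regular pattern, unlike unstructured pruning)."""
    rows, cols = w.shape
    assert cols % sub_len == 0, "row length must divide into sub-rows"
    out = np.zeros_like(w)
    for r in range(rows):
        for s in range(0, cols, sub_len):
            block = w[r, s:s + sub_len]
            idx = np.argsort(np.abs(block))[-keep:]  # indices kept in this sub-row
            out[r, s + idx] = block[idx]
    return out
```

Because every sub-row carries exactly `keep` nonzeros, the accelerator can allocate a fixed number of multipliers per sub-row and never stalls on an empty or overfull block, which is the parallelism benefit the abstract claims over irregular sparsity.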


2010 ◽  
Vol 18 (1) ◽  
pp. 1-33 ◽  
Author(s):  
Andre R. Brodtkorb ◽  
Christopher Dyken ◽  
Trond R. Hagen ◽  
Jon M. Hjelmervik ◽  
Olaf O. Storaasli

Node-level heterogeneous architectures have become attractive during the last decade for several reasons: compared to traditional symmetric CPUs, they offer high peak performance and are energy- and/or cost-efficient. With the increase of fine-grained parallelism in high-performance computing, as well as the introduction of parallelism in workstations, there is an acute need for a good overview and understanding of these architectures. We give an overview of the state of the art in heterogeneous computing, focusing on three commonly found architectures: the Cell Broadband Engine Architecture, graphics processing units (GPUs), and field-programmable gate arrays (FPGAs). We present a review of hardware and available software tools, together with an overview of state-of-the-art techniques and algorithms. Furthermore, we present a qualitative and quantitative comparison of the architectures and give our view on the future of heterogeneous computing.


Author(s):  
Marc Casas ◽  
Wilfried N Gansterer ◽  
Elias Wimmer

We investigate the usefulness of gossip-based reduction algorithms in a high-performance computing (HPC) context. We compare them to state-of-the-art deterministic parallel reduction algorithms in terms of fault tolerance and resilience against silent data corruption (SDC), as well as in terms of performance and scalability. New gossip-based reduction algorithms are proposed, which significantly improve on the state of the art in terms of resilience against SDC. Moreover, a new gossip-inspired reduction algorithm is proposed, which promises much more competitive runtime performance in an HPC context than classical gossip-based algorithms, in particular for low accuracy requirements.
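For readers unfamiliar with the gossip family, the classical push-sum protocol gives the flavor of how a reduction (here, an average) can emerge from randomized pairwise exchanges rather than a fixed reduction tree. This is a textbook sketch of push-sum, not one of the paper's proposed algorithms.

```python
import random

def push_sum(values, rounds=200, seed=0):
    """Synchronous push-sum gossip: each round, every node halves its
    (sum, weight) pair, keeps one half, and pushes the other half to a
    uniformly random node. The ratio sum/weight at every node converges
    to the global average; the total sum and weight are conserved."""
    rng = random.Random(seed)
    n = len(values)
    s = list(values)      # running sums, one per node
    w = [1.0] * n         # running weights, one per node
    for _ in range(rounds):
        inbox_s = [0.0] * n
        inbox_w = [0.0] * n
        for i in range(n):
            target = rng.randrange(n)
            inbox_s[i] += s[i] / 2          # keep half locally
            inbox_w[i] += w[i] / 2
            inbox_s[target] += s[i] / 2     # push half to a random node
            inbox_w[target] += w[i] / 2
        s, w = inbox_s, inbox_w
    return [si / wi for si, wi in zip(s, w)]
```

Because no single node or tree edge is structurally critical, a lost message perturbs the estimate rather than corrupting the reduction outright, which is the intuition behind the resilience-versus-performance trade-off the abstract studies.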


2021 ◽  
Author(s):  
Paul F. Baumeister ◽  
Lars Hoffmann

Remote sensing observations in the mid-infrared spectral region (4–15 μm) play a key role in monitoring the composition of the Earth's atmosphere. Mid-infrared spectral measurements from satellite, aircraft, balloon and ground-based instruments provide information on pressure and temperature, trace gases, as well as aerosols and clouds. As state-of-the-art instruments deliver a vast amount of data on a global scale, their analysis, however, may require advanced methods and high-performance computing capacities for data processing. A large amount of computing time is usually spent on evaluating the radiative transfer equation. Line-by-line calculations of infrared radiative transfer are considered to be most accurate, but they are also most time-consuming. Here, we discuss the emissivity growth approximation (EGA), which can accelerate infrared radiative transfer calculations by several orders of magnitude compared with line-by-line calculations. As future satellite missions will likely depend on exascale computing systems to process their observational data in due time, we think that the utilization of graphics processing units (GPUs) for the radiative transfer calculations and satellite retrievals is a logical next step in further accelerating and improving the efficiency of data processing. Focusing on the EGA method, we first discuss the implementation of infrared radiative transfer calculations on GPU-based computing systems in detail. Second, we discuss distinct features of our implementation of the EGA method, in particular regarding the memory needs, performance, and scalability on state-of-the-art GPU systems. As we found our implementation to be about an order of magnitude more energy-efficient on GPU-accelerated architectures than on CPUs, we conclude that our approach provides various future opportunities for this high-throughput problem.
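The structure of an emissivity-growth integration can be shown schematically: the emissivity of the growing optical path increases monotonically along the line of sight, and each layer contributes its Planck radiance weighted by the emissivity increment. This sketch computes the emissivity analytically from layer optical depths; the actual EGA method replaces that step with precomputed emissivity look-up tables, which is where its speed over line-by-line calculations comes from.

```python
import math

def radiance_ega(layer_tau, layer_planck):
    """Schematic emissivity-growth integration along a line of sight.
    eps_k = 1 - exp(-(tau_1 + ... + tau_k)) grows monotonically, and
    layer k contributes its Planck radiance B_k times (eps_k - eps_{k-1})."""
    rad, tau_sum, eps_prev = 0.0, 0.0, 0.0
    for tau, b in zip(layer_tau, layer_planck):
        tau_sum += tau
        eps = 1.0 - math.exp(-tau_sum)       # path emissivity so far
        rad += b * (eps - eps_prev)          # this layer's contribution
        eps_prev = eps
    return rad
```

The per-layer loop is embarrassingly parallel across spectral channels and observation geometries, which is why the authors target GPUs: each thread can integrate one independent ray.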


2007 ◽  
Author(s):  
Thomas Sure ◽  
Lambert Danner ◽  
Peter Euteneuer ◽  
Gerhard Hoppen ◽  
Armin Pausch ◽  
...  

Author(s):  
Yang Song ◽  
Haoliang Wang ◽  
Tolga Soyata

To allow mobile devices to support resource-intensive applications beyond their capabilities, mobile-cloud offloading is introduced to extend the resources of mobile devices by leveraging cloud resources. In this chapter, we will survey the state of the art in VM-based mobile-cloud offloading techniques, including their software and architectural aspects in detail. For the software aspects, we will cover current improvements to different layers of various virtualization systems, particularly focusing on mobile-cloud offloading. Approaches at different offloading granularities will be reviewed and their advantages and disadvantages discussed. For the architectural support aspects of virtualization, three platforms, namely Intel x86, ARM and NVIDIA GPUs, will be reviewed in terms of their special architectural designs to accommodate virtualization and VM-based offloading.
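Underneath all offloading granularities sits the same basic cost model: offload when remote execution time plus data transfer time beats local execution time. The sketch below is the classical textbook formulation, not a decision engine from any specific system in the chapter; the parameter names are illustrative.

```python
def should_offload(local_cycles, cpu_speed, cloud_speed, data_bytes, bandwidth):
    """Classical offloading decision: offload iff
    local_cycles/cloud_speed + data_bytes/bandwidth < local_cycles/cpu_speed.
    Speeds are in cycles/s, bandwidth in bytes/s."""
    t_local = local_cycles / cpu_speed
    t_remote = local_cycles / cloud_speed + data_bytes / bandwidth
    return t_remote < t_local
```

The model makes the key trade-off visible: a faster cloud only helps if the network is fast enough (or the transferred state small enough) that the transfer term does not dominate, which is precisely why VM-based techniques work hard to minimize the state that must cross the network.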


2014 ◽  
Vol 24 (02) ◽  
pp. 1550019
Author(s):  
Osama Al-Khaleel ◽  
Zakaria Al-Qudah ◽  
Mohammad Al-Khaleel ◽  
Raed Bani-Hani ◽  
Christos Papachristou ◽  
...  

This paper proposes two high-performance binary to binary-coded decimal (BCD) conversion algorithms for use in BCD multiplication. These algorithms are based on splitting the 7-bit binary partial product of two BCD digits into two groups, computing the contribution of each group to the equivalent BCD partial product, and adding these contributions to compute the final BCD partial product. Designs for the proposed architectures and their implementations targeting both ASIC and FPGA are compared with others. Implementations of BCD array multipliers using both our conversion circuits and existing conversion circuits have been performed. The synthesis results for both ASIC and FPGA show that the proposed designs are faster and occupy less area than the state-of-the-art conversion circuits. Furthermore, the results obtained from comparing BCD multipliers of various sizes show that the improvement in the area of the conversion circuit grows into a sizable area improvement in the multiplier circuit.
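The split-based conversion the abstract describes can be mirrored in software: the 7-bit partial product of two BCD digits (at most 9 × 9 = 81) is split into a high group (bits 6..4, weight 16) and a low group (bits 3..0), each group's two-digit BCD contribution is formed, and the contributions are BCD-added. This is a software analogue under those assumptions, not the paper's circuit.

```python
def to_bcd_digits(n):
    """Two decimal digits of n (< 100), least-significant first."""
    return [n % 10, n // 10]

def bcd_add(a, b):
    """Digit-wise BCD addition with the usual carry on digit overflow.
    (No final carry can occur here, since the product is at most 81.)"""
    out, carry = [], 0
    for da, db in zip(a, b):
        s = da + db + carry
        carry, s = (1, s - 10) if s > 9 else (0, s)
        out.append(s)
    return out

def binary_to_bcd_split(v):
    """Split-based conversion for the 7-bit partial product v of two BCD
    digits: high group = bits 6..4 (weight 16), low group = bits 3..0."""
    hi, lo = v >> 4, v & 0xF
    return bcd_add(to_bcd_digits(hi * 16), to_bcd_digits(lo))
```

The hardware appeal of the split is that each group's BCD contribution is a small function of at most four bits, so it can be realized as shallow combinational logic, leaving only one BCD addition on the critical path.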

