Performance Evaluation of Scientific Applications on Modern Parallel Vector Systems

Author(s):  
Jonathan Carter ◽  
Leonid Oliker ◽  
John Shalf
Author(s):  
Leonid Oliker ◽  
Rupak Biswas ◽  
Julian Borrill ◽  
Andrew Canning ◽  
Jonathan Carter ◽  
...  

2008 ◽  
Vol 18 (04) ◽  
pp. 453-469 ◽  
Author(s):  
KEVIN J. BARKER ◽  
KEI DAVIS ◽  
ADOLFY HOISIE ◽  
DARREN J. KERBYSON ◽  
MIKE LANG ◽  
...  

In this work we present an initial performance evaluation of Intel's latest, second-generation quad-core processor, Nehalem, and provide a comparison to first-generation AMD and Intel quad-core processors Barcelona and Tigerton. Nehalem is the first Intel processor to implement a NUMA architecture incorporating QuickPath Interconnect for interconnecting processors within a node, and the first to incorporate an integrated memory controller. We evaluate the suitability of these processors in quad-socket compute nodes as building blocks for large-scale scientific computing clusters. Our analysis of intra-processor and intra-node scalability of microbenchmarks, and a range of large-scale scientific applications, indicates that quad-core processors can deliver an improvement in performance of up to 4x over a single core depending on the workload being processed. However, scalability can be less when considering a full node. We show that Nehalem outperforms Barcelona on memory-intensive codes by a factor of two for a Nehalem node with 8 cores and a Barcelona node containing 16 cores. Further optimizations are possible with Nehalem, including the use of Simultaneous Multithreading, which improves the performance of some applications by up to 50%.


Author(s):  
Andrew V. Adinetz ◽  
Paul F. Baumeister ◽  
Hans Böttiger ◽  
Thorsten Hater ◽  
Thilo Maurer ◽  
...  

Author(s):  
Sameer Shende ◽  
Allen D. Malony ◽  
Alan Morris ◽  
Steven Parker ◽  
J. Davison de St. Germain

2006 ◽  
Vol 9 (2) ◽  
Author(s):  
Carlos Figueira ◽  
Emilio Hernandez ◽  
Eduardo Blanco

Performance evaluation of applications running on a Grid is a challenging task. Grid resources are heterogeneous in nature, often shared, and dynamic, all of which have important implications for the performance of an application executing on the Grid. For instance, application performance will suffer from perturbation induced by external load on the network or computational nodes. Also, resources allocated to applications may vary between different executions. In this paper, we propose a simple framework that takes these factors into account to allow users to gain knowledge of fundamental performance characteristics of their parallel applications. This framework was incorporated in SUMA, a Grid-enabled platform for the execution of scientific applications in Java. We show some results of the utilization of this framework, which was tested by analyzing and tuning a parallel application.

