A Programming Model Performance Study Using the NAS Parallel Benchmarks

2010 ◽  
Vol 18 (3-4) ◽  
pp. 153-167 ◽  
Author(s):  
Hongzhang Shan ◽  
Filip Blagojević ◽  
Seung-Jai Min ◽  
Paul Hargrove ◽  
Haoqiang Jin ◽  
...  

Harnessing the power of multicore platforms is challenging due to the additional levels of parallelism present. In this paper we use the NAS Parallel Benchmarks to study three programming models (MPI, OpenMP, and PGAS) and understand their performance and memory-usage characteristics on current multicore architectures. To do so, we use the Integrated Performance Monitoring (IPM) tool and other measurement techniques to separate communication time from computation time, as well as to determine the fraction of the run time spent in OpenMP. The benchmarks are run on two different Cray XT5 systems and an InfiniBand cluster. Our results show that, in general, the three programming models exhibit very similar performance characteristics. In a few cases, OpenMP is significantly faster because it explicitly avoids communication. For those cases, we were able to rewrite the UPC versions to match OpenMP's performance. OpenMP was also the most advantageous in terms of memory usage. We also compare performance between the two Cray systems, which have quad-core and hex-core processors, and show that at scale the hex-core system is almost always slower because of increased contention for network resources.
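The IPM-style communication-versus-computation breakdown described above reduces to simple per-rank ratios. A minimal sketch, with entirely hypothetical timing values standing in for measured NAS benchmark runs:

```python
# Minimal sketch of an IPM-style summary: given per-rank wall-clock time
# and time spent inside MPI (communication), report the communication
# fraction per rank and overall. The timing values are hypothetical.

def comm_fraction(wall_times, mpi_times):
    """Return per-rank and aggregate communication fractions."""
    per_rank = [m / w for m, w in zip(mpi_times, wall_times)]
    overall = sum(mpi_times) / sum(wall_times)
    return per_rank, overall

# Hypothetical timings (seconds) for a 4-rank run of a NAS-style kernel.
wall = [120.0, 118.0, 121.0, 119.0]
mpi = [30.0, 28.5, 31.2, 29.3]

per_rank, overall = comm_fraction(wall, mpi)
print([round(f, 3) for f in per_rank])
print(round(overall, 3))  # fraction of total run time spent communicating
```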

2014 ◽  
Vol 984-985 ◽  
pp. 1357-1363
Author(s):  
M. Vinothini ◽  
M. Manikandan

Transmitting video directly to a client in real time raises several problems. Chief among them is that an intermediate intelligent proxy can easily intercept the data, because the transmitter neither authenticates the receiver nor provides security guarantees. We therefore combine steganography and cryptography mechanisms: a secure-code, the IP address, and a checksum for authentication, and the AES algorithm with a secret key for confidentiality. Even if an attacker captures the video during transmission, the information cannot be viewed. Based on the IP address and secure-code, only an authenticated user can connect to the transmitter and view the information. To further improve security, the video is converted into frames, the frames are split into groups, and a separate shared key is applied to each group for encryption and decryption. This secured communication process is applied to image-processing modules such as face detection, edge detection, and color-object detection. To reduce computation time, multi-core CPU processing is used so that the tasks run in parallel.
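The per-group keying idea above can be sketched as follows. Note the hedges: AES is replaced here by a toy XOR keystream derived with SHA-256 purely so the example stays dependency-free (a real system would use actual AES), and the frame data and keys are hypothetical:

```python
# Illustrative sketch of the per-group keying scheme: frames are split
# into groups and each group is encrypted with its own shared key.
# AES is replaced by a toy XOR keystream (SHA-256-based) for brevity;
# do NOT use this cipher in practice. Frames and keys are hypothetical.
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive a pseudo-random keystream of the given length from a key."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for AES: XOR data with a key-derived stream.
    Applying it twice with the same key restores the plaintext."""
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

def split_into_groups(frames, group_size):
    """Split a list of frames into fixed-size groups."""
    return [frames[i:i + group_size] for i in range(0, len(frames), group_size)]

# Hypothetical video: 6 small "frames", grouped in pairs, one key per group.
frames = [b"frame%d" % i for i in range(6)]
group_keys = [b"key-A", b"key-B", b"key-C"]

encrypted = [[xor_cipher(f, k) for f in grp]
             for grp, k in zip(split_into_groups(frames, 2), group_keys)]
decrypted = [[xor_cipher(f, k) for f in grp]
             for grp, k in zip(encrypted, group_keys)]
assert [f for grp in decrypted for f in grp] == frames  # round trip succeeds
```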


2010 ◽  
Vol 14 (10) ◽  
pp. 2153-2165 ◽  
Author(s):  
S. Uhlenbrook ◽  
Y. Mohamed ◽  
A. S. Gragne

Abstract. Understanding catchment hydrological processes is essential for water resources management, in particular in data scarce regions. The Gilgel Abay catchment (a major tributary into Lake Tana, source of the Blue Nile) is undergoing intensive plans for water management, which is part of larger development plans in the Blue Nile basin in Ethiopia. To obtain a better understanding of the water balance dynamics and runoff generation mechanisms and to evaluate model transferability, catchment modeling has been conducted using the conceptual hydrological model HBV. Accordingly, the catchment of the Gilgel Abay has been divided into two gauged sub-catchments (Upper Gilgel Abay and Koga) and the ungauged part of the catchment. All available data sets were tested for stationarity, consistency and homogeneity, and the data limitations (quality and quantity) are discussed. Manual calibration of the daily models for three different catchment representations, i.e. (i) lumped, (ii) lumped with multiple vegetation zones, and (iii) semi-distributed with multiple vegetation and elevation zones, showed good to satisfactory model performances with Nash-Sutcliffe efficiencies Reff > 0.75 and > 0.6 for the Upper Gilgel Abay and Koga sub-catchments, respectively. Better model results could not be obtained with manual calibration, very likely due to the limited data quality and model insufficiencies. Increasing the computation time step to 15 and 30 days improved the model performance in both sub-catchments to Reff > 0.8. Model parameter transferability tests have been conducted by interchanging parameter sets between the two gauged sub-catchments. Results showed poor performances for the daily models (0.30 < Reff < 0.67), but better performances for the 15 and 30 days models, Reff > 0.80.
The transferability tests together with a sensitivity analysis using Monte Carlo simulations (more than 1 million model runs per catchment representation) explained the different hydrologic responses of the two sub-catchments, which seem to be mainly caused by the presence of dambos in the Koga sub-catchment. It is concluded that daily model transferability is not feasible, while it can produce acceptable results for the 15 and 30 days models. This is very useful for water resources planning and management, but not sufficient to capture detailed hydrological processes in an ungauged area.
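The Nash-Sutcliffe efficiency (Reff) used throughout the calibration above is a standard goodness-of-fit measure. A minimal computation, with hypothetical discharge series standing in for the Gilgel Abay data:

```python
# Nash-Sutcliffe efficiency: Reff = 1 - SSE / var(observed).
# Reff = 1 is a perfect fit; Reff <= 0 means the model is no better
# than predicting the observed mean. Discharge values are hypothetical.

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe model efficiency coefficient."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    var = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / var

obs = [3.1, 4.5, 6.2, 9.8, 7.4, 5.0, 3.8]  # observed discharge (m3/s)
sim = [3.0, 4.8, 6.0, 9.1, 7.9, 5.2, 3.5]  # simulated discharge (m3/s)
print(round(nash_sutcliffe(obs, sim), 3))
```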


2018 ◽  
Vol 2018 ◽  
pp. 1-7 ◽  
Author(s):  
Syaripuddin ◽  
Herry Suprajitno ◽  
Fatmawati

Quadratic programming with interval variables is developed from quadratic programming with interval coefficients to obtain an optimum solution in interval form, covering both the optimum point and the optimum value. In this paper, a two-level programming approach is used to solve quadratic programming with interval variables. The two-level procedure transforms the quadratic programming model with interval variables into a pair of classical quadratic programming models, namely the best-optimum and worst-optimum problems. A procedure to solve the best and worst optimum problems is also constructed, yielding the optimum solution in interval form.
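The best/worst-optimum pairing can be illustrated with a deliberately tiny 1-D example. This sketch is not the paper's general construction: the objective, interval, and bounds are hypothetical, and endpoint checking suffices only because this particular optimum value is monotone in the interval coefficient:

```python
# Illustrative 1-D sketch of the two-level idea: a quadratic objective
# f(x) = x^2 + b*x whose coefficient b lies in an interval [b_lo, b_hi]
# is replaced by two classical problems, giving the best (smallest) and
# worst (largest) optimum values. Numbers are hypothetical.

def minimize_quadratic(b, lo, hi):
    """Minimize x^2 + b*x over [lo, hi]: vertex at -b/2, clamped to bounds."""
    x = max(lo, min(hi, -b / 2.0))
    return x * x + b * x

def interval_optimum(b_lo, b_hi, lo, hi):
    """Return (best, worst) optimum values over all admissible b.
    For this objective the optimum value is monotone in b, so it is
    enough to evaluate the interval endpoints."""
    values = [minimize_quadratic(b, lo, hi) for b in (b_lo, b_hi)]
    return min(values), max(values)

best, worst = interval_optimum(-4.0, -2.0, 0.0, 10.0)
print(best, worst)  # the true optimum value lies in [best, worst]
```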


2021 ◽  
Vol 15 (4) ◽  
pp. 518-523
Author(s):  
Ratko Stanković ◽  
Diana Božić

Improvements achieved by applying linear programming models to optimization problems in logistics cannot always be expressed in physically measurable (dimensional) values, but only in non-dimensional ones. It may therefore be difficult to present the actual benefits of the improvements to the stakeholders of the system being optimized. This article outlines how simulation modelling can quantify the results of optimizing gate allocation at a cross-dock terminal. The optimal solution is obtained from a linear programming model using the MS Excel spreadsheet optimizer, while the results are quantified on a simulation model built with Rockwell Automation simulation software. Input data were collected from a freight-forwarding company in Zagreb specialized in groupage transport (less-than-truckload, LTL).
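The underlying gate-allocation decision has the structure of an assignment problem. The article solves the full model with the MS Excel optimizer; the brute-force sketch below, over a hypothetical 3x3 cost matrix, only illustrates that structure:

```python
# Toy gate-allocation: assign inbound trucks to cross-dock gates so the
# total handling cost is minimized. Exhaustive search is fine at toy
# sizes; real models use an LP/MIP solver. Costs are hypothetical.
from itertools import permutations

# cost[t][g]: hypothetical cost of serving truck t at gate g
cost = [
    [4, 2, 8],
    [4, 3, 7],
    [3, 1, 6],
]

def best_assignment(cost):
    """Try every truck-to-gate permutation and keep the cheapest."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[t][g] for t, g in enumerate(p)))
    return list(best), sum(cost[t][g] for t, g in enumerate(best))

assignment, total = best_assignment(cost)
print(assignment, total)  # gate chosen for each truck, and total cost
```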


2010 ◽  
Vol 13 (03) ◽  
pp. 383-390 ◽  
Author(s):  
R. P. Batycky ◽  
M. Förster ◽  
M. R. Thiele ◽  
K. Stüben

Summary We present the parallelization of a commercial streamline simulator to multicore architectures based on the OpenMP programming model and its performance on various field examples. This work continues recent work by Gerritsen et al. (2009), in which a research streamline simulator was extended to parallel execution. We identified that the streamline-transport step represents approximately 40-80% of the total run time. It is exactly this step that is straightforward to parallelize, owing to the independent solution of each streamline that is at the heart of streamline simulation. Because we were working with an existing large serial code, we used specialty software to quickly and easily identify variables that required particular handling in the parallel extension. Minimal rewriting of existing code was required to move the streamline-transport step to OpenMP. As part of this work, we also parallelized additional run-time code, including the gravity-line solver and some simple routines required for constructing the pressure matrix. Overall, the run-time fraction of code parallelized ranged from 0.50 to 0.83, depending on the transport physics being considered. We tested our parallel simulator on a variety of large models, including SPE 10; Forties, a UK oil/water model; Judy Creek, a Canadian waterflood/water-alternating-gas (WAG) model; and a South American black-oil model. We observed overall speedup factors of 1.8x to 3.3x for eight threads. In terms of real time, this implies that large-scale streamline simulation models such as those tested here can be simulated in less than 4 hours. We found the speedup results to be reasonable when compared with Amdahl's ideal scaling law. Beyond eight threads, we observed minimal additional speedup because of memory-bandwidth limits on our test machine.
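Amdahl's law, the comparison baseline mentioned above, makes the reported numbers easy to check: for the stated parallel fractions of 0.50 and 0.83, the ideal eight-thread speedups bracket the observed 1.8x-3.3x:

```python
# Amdahl's ideal speedup: S(n) = 1 / ((1 - p) + p / n),
# where p is the parallelized fraction and n the thread count.

def amdahl_speedup(p, n):
    """Ideal speedup with parallel fraction p on n threads."""
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.50, 0.83):  # the parallel-fraction range reported in the paper
    print(p, round(amdahl_speedup(p, 8), 2))
```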


Author(s):  
Javier Conejero ◽  
Sandra Corella ◽  
Rosa M Badia ◽  
Jesus Labarta

Task-based programming has proven to be a suitable model for high-performance computing (HPC) applications. Different implementations have been good demonstrators of this fact and have promoted the acceptance of task-based programming in the OpenMP standard. Furthermore, in recent years, Apache Spark has gained wide popularity in business and research environments as a programming model for addressing emerging big data problems. COMP Superscalar (COMPSs) is a task-based environment that tackles distributed computing (including Clouds) and is a good alternative as a task-based programming model for big data applications. This article describes why we consider task-based programming models a good approach for big data applications. The article includes a comparison of Spark and COMPSs in terms of architecture, programming model, and performance. It focuses on the structural differences between the two frameworks, their programmability interfaces, and their efficiency, by means of three widely known benchmarking kernels (Wordcount, Kmeans, and Terasort) that exercise the most important functionalities of both programming models under different workflows and conditions. The main results of this comparison are that (1) COMPSs is able to extract the inherent parallelism from the user code with minimal coding effort, as opposed to Spark, which requires existing algorithms to be adapted and rewritten by explicitly using its predefined functions, (2) COMPSs performs better than Spark, and (3) COMPSs has been shown to scale better than Spark in most cases. Finally, we discuss the advantages and disadvantages of both frameworks, highlighting the differences that make them unique, thereby helping to choose the right framework for each particular objective.
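Wordcount, the simplest of the three kernels named above, reduces to a map step over text chunks plus a merge. In COMPSs or Spark the per-chunk counting would run as parallel tasks; this sequential sketch (with hypothetical input text) shows only the kernel's logic:

```python
# Wordcount kernel: count words per chunk (the parallelizable task),
# then merge the partial counts (the framework's reduction step).
from collections import Counter

def count_chunk(chunk: str) -> Counter:
    """Per-chunk word count (would be one task in COMPSs/Spark)."""
    return Counter(chunk.split())

def merge(counts):
    """Reduction: merge per-chunk counters into a global count."""
    total = Counter()
    for c in counts:
        total.update(c)
    return total

chunks = ["to be or not to be", "that is the question to ask"]
result = merge(count_chunk(c) for c in chunks)
print(result["to"], result["be"])  # prints "3 2"
```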


2020 ◽  
Vol 10 (7) ◽  
pp. 2359
Author(s):  
Sajad Mohammadi ◽  
Hamidreza Karami ◽  
Mohammad Azadifar ◽  
Farhad Rachidi

An open accelerator (OpenACC)-aided graphics processing unit (GPU)-based finite difference time domain (FDTD) method is presented for the first time for the 3D evaluation of lightning radiated electromagnetic fields along a complex terrain with arbitrary topography. The OpenACC directive-based programming model is used to enhance the computational performance, and the results are compared with those obtained by using a CPU-based model. It is shown that OpenACC GPUs can provide very accurate results, and they are more than 20 times faster than CPUs. The presented results support the use of OpenACC not only in relation to lightning electromagnetics problems, but also to large-scale realistic electromagnetic compatibility (EMC) applications in which computation time efficiency is a critical factor.
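The FDTD method at the heart of the solver above advances electric and magnetic fields on a staggered grid; each grid point updates independently, which is what makes the OpenACC offload effective. A minimal 1-D vacuum sketch of the Yee update loop (the paper's solver is 3D and GPU-accelerated; the grid size, step count, and source below are hypothetical):

```python
# 1-D FDTD in normalized units with Courant number 1, so the update
# coefficients reduce to 1. A Gaussian soft source stands in for the
# lightning channel excitation. Grid and step counts are hypothetical.
import math

def fdtd_1d(nz=200, steps=100, source_pos=100):
    """March E and H fields forward in time on a 1-D staggered grid."""
    ez = [0.0] * nz
    hy = [0.0] * nz
    for t in range(steps):
        # Update magnetic field from the spatial difference of E
        for k in range(nz - 1):
            hy[k] += ez[k + 1] - ez[k]
        # Update electric field from the spatial difference of H
        for k in range(1, nz):
            ez[k] += hy[k] - hy[k - 1]
        # Gaussian soft source injected at a single grid point
        ez[source_pos] += math.exp(-((t - 30) ** 2) / 100.0)
    return ez

fields = fdtd_1d()
print(max(abs(v) for v in fields))  # peak field after propagation
```

The two inner loops are exactly the kind of independent-iteration loops that an OpenACC `parallel loop` directive would offload in the C/Fortran original.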


2019 ◽  
Vol 12 (2) ◽  
pp. 356
Author(s):  
Jingjing Hu ◽  
Youfang Huang

Purpose: The overstocked goods flow in the hubs of a hub-and-spoke logistics network should be disposed of in time, to reduce delay losses and improve the utilization of logistics network resources. The problem we solve is to let logistics networks cooperate, by sharing network resources, to shunt goods from one hub-and-spoke network to another.
Design/methodology/approach: This paper proposes hub shunting cooperation between two hub-and-spoke networks. First, a mixed integer programming model is established to describe the problem; then a multi-layer genetic algorithm is designed to solve it, with the two hub-and-spoke networks expressed by different gene segments in the encoding. Network data from two third-party logistics companies in southern and northern China are used for an example analysis in the final step.
Findings: The hub-and-spoke networks of the two companies were constructed simultaneously. The transfer cost coefficient between the two networks and the volume of cargo flow in the network affect which hubs need to be shunted and the corresponding cooperation hubs in the other network.
Originality/value: Previous research on hub-and-spoke logistics networks focuses on a single network, while we study the cooperation and interaction between two hub-and-spoke networks. It shows that two hub-and-spoke networks can cooperate across networks to shunt the goods in a hub and improve the operating efficiency of the logistics network.
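The gene-segment encoding described above can be illustrated with a stripped-down genetic algorithm for hub selection. This sketch handles a single toy network with hypothetical costs; the paper's multi-layer algorithm encodes two networks as separate gene segments and a richer cost model:

```python
# Minimal GA for hub selection: a chromosome is a bit string marking
# which candidate nodes act as hubs. All costs/demands are hypothetical.
import random

random.seed(7)

N_NODES = 6
HUB_COST = [10, 4, 9, 3, 8, 5]   # hypothetical fixed cost of opening each hub
DEMAND = [2, 3, 1, 4, 2, 3]      # hypothetical demand at each spoke
NO_HUB_PENALTY = 900             # infeasible: no hub open at all

def fitness(chromosome):
    """Total cost: opening the chosen hubs plus routing each spoke to
    its nearest open hub (unit cost per demand per hop)."""
    hubs = [i for i, g in enumerate(chromosome) if g]
    if not hubs:
        return NO_HUB_PENALTY
    open_cost = sum(HUB_COST[i] for i in hubs)
    route_cost = sum(DEMAND[i] * min(abs(i - h) for h in hubs)
                     for i in range(N_NODES))
    return open_cost + route_cost

def evolve(pop_size=30, generations=60, mutation=0.1):
    """Truncation selection + one-point crossover + bit-flip mutation."""
    pop = [[random.randint(0, 1) for _ in range(N_NODES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]          # keep the cheaper half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N_NODES)
            child = a[:cut] + b[cut:]             # one-point crossover
            if random.random() < mutation:
                k = random.randrange(N_NODES)
                child[k] = 1 - child[k]           # bit-flip mutation
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve()
print(best, fitness(best))  # cheapest hub configuration found
```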

