Platform Agnostic Streaming Data Application Performance Models

Author(s):  
Clayton J. Faber ◽  
Tom Plano ◽  
Samatha Kodali ◽  
Zhili Xiao ◽  
Abhishek Dwaraki ◽  
...  

Mission Performance Models (MPMs) are important to the design of modern digital avionic systems because flight deck information is no longer obvious. In large-scale dynamic systems, the necessary responses should correspond directly to the incoming information model. A Mission Performance Model is an abstract representation of the activity clusters necessary to achieve mission success. The three core activity clusters are trajectory management, energy management, and attitude control, and each is covered in detail. Their combined performance characteristics highlight the vehicle's kinematic attributes, which in turn anticipate unstable conditions. Six MPMs are necessary for the effective design and employment of a modern mission-ready flight deck. We describe MPMs and their structure, purpose, and operational application. Performance models have many important uses, including training system definition and design, avionic system design, and safety programs.


2013 ◽  
Vol 23 (04) ◽  
pp. 1340008 ◽  
Author(s):  
LAURA CARRINGTON ◽  
MICHAEL LAURENZANO ◽  
ANANTA TIWARI

The analysis and understanding of large-scale application behavior is critical for effectively utilizing existing HPC resources and making design decisions for upcoming systems. In this work we use information about the behavior of an MPI application at a series of smaller core counts to characterize its behavior at a much larger core count. Our methodology first captures the application's behavior via a set of features that are important for both performance and energy (cache hit rates, floating point intensity, ILP, etc.). We then find the best statistical fit from among a set of canonical functions in terms of how these features change across a series of small core counts. The models for a given feature can then be used to generate an extrapolated trace of the application at scale. The accuracy of the extrapolated traces is evaluated by calculating the error of the extrapolated trace relative to an actual trace for two large-scale applications, UH3D and SPECFEM3D. The accuracy of the fully extrapolated traces is further evaluated by comparing performance models built from the extrapolated trace with models built from an actual trace when predicting application performance. For these two full-scale HPC applications, performance models built using the extrapolated traces predicted the runtime with absolute relative errors of less than 5%.
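
The fitting step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not list the canonical functions, so the three forms below (linear, logarithmic, inverse), the least-squares criterion, and the names `CANONICAL`, `fit_feature`, and `extrapolate` are all assumptions. It also assumes at least two distinct core counts.

```python
import math

# Assumed canonical forms: each models a feature as a + b * g(cores).
CANONICAL = {
    "linear": lambda p: p,
    "logarithmic": lambda p: math.log(p),
    "inverse": lambda p: 1.0 / p,
}

def _least_squares(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((a + b * x - y) ** 2 for x, y in zip(xs, ys))
    return a, b, sse

def fit_feature(cores, values):
    """Fit every canonical form and return (name, a, b, sse) for the lowest error."""
    best = None
    for name, g in CANONICAL.items():
        a, b, sse = _least_squares([g(c) for c in cores], values)
        if best is None or sse < best[3]:
            best = (name, a, b, sse)
    return best

def extrapolate(fit, target_cores):
    """Evaluate the chosen form at a larger core count."""
    name, a, b, _ = fit
    return a + b * CANONICAL[name](target_cores)

if __name__ == "__main__":
    # Synthetic cache hit rates measured at small core counts (made-up data).
    small_counts = [64, 128, 256, 512]
    hit_rates = [0.9 - 0.05 * math.log(c) for c in small_counts]
    fit = fit_feature(small_counts, hit_rates)
    print(fit[0], extrapolate(fit, 4096))
```

Repeating the fit for each feature would give one extrapolated value per feature at the target core count, which is the raw material for an extrapolated trace.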


2021 ◽  
Author(s):  
Jeanne Alcantara

Apache Spark enables a big data application (one that takes massive data as input and may produce massive data during its execution) to run in parallel on multiple nodes. Hence, for a big data application, performance is a vital issue. This project analyzes a WordCount application using Apache Spark and assesses the impact on execution time and average node utilization. To facilitate this assessment, the number of executor cores and the size of executor memory are varied across different data sizes and different numbers of nodes in the cluster that the application runs on. It is concluded that different pairs (data size, number of nodes in the cluster) require different numbers of executor cores and different sizes of executor memory to obtain optimum results for execution time and average node utilization.
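
A parameter sweep of the kind described here could be driven by a small script such as the one below. It is only a sketch: `--executor-cores` and `--executor-memory` are standard `spark-submit` options, but the script name `wordcount.py`, the input path `input.txt`, the chosen values, and the helper `build_commands` are hypothetical and do not come from the project.

```python
from itertools import product

def build_commands(cores_options, memory_options,
                   script="wordcount.py", data="input.txt"):
    """Build one spark-submit command per (executor cores, executor memory) pair.

    The script and data names are placeholders for illustration only.
    """
    commands = []
    for cores, memory in product(cores_options, memory_options):
        commands.append([
            "spark-submit",
            "--executor-cores", str(cores),
            "--executor-memory", memory,
            script, data,
        ])
    return commands

if __name__ == "__main__":
    # Print the commands instead of running them; timing each run is left out.
    for cmd in build_commands([1, 2, 4], ["2g", "4g"]):
        print(" ".join(cmd))
```

Running the same sweep for each data size and cluster size would produce the grid of measurements the abstract compares.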


2001 ◽  
Vol 12 (03) ◽  
pp. 341-363 ◽  
Author(s):  
JENNIFER M. SCHOPF ◽  
FRANCINE BERMAN

Prediction is a critical component in the achievement of application execution performance. The development of adequate and accurate prediction models is especially difficult in local-area clustered environments where resources are distributed and performance varies due to the presence of other users in the system. This paper discusses the use of stochastic values to parameterize cluster application performance models. Stochastic values represent a range of likely behavior and can be used effectively as model parameters. We describe two representations for stochastic model parameters and demonstrate their effectiveness in predicting the behavior of several applications under different workloads on a contended network of workstations.
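
To make the idea of a stochastic model parameter concrete, here is a minimal sketch. The abstract does not say which two representations the paper uses, so this example assumes one plausible choice: a value summarized by a mean and a standard deviation, treated as normally distributed and independent of other parameters. The class `Stochastic` and the numbers are illustrative only.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Stochastic:
    """A model parameter given as mean and standard deviation (assumed normal)."""
    mean: float
    std: float

    def __add__(self, other):
        # Sum of independent normal values: means add, variances add.
        return Stochastic(self.mean + other.mean,
                          math.hypot(self.std, other.std))

    def scale(self, k):
        # Multiplying by a constant scales the mean and the spread.
        return Stochastic(k * self.mean, abs(k) * self.std)

    def interval(self, z=2.0):
        # Range of likely behavior: mean plus or minus z standard deviations.
        return (self.mean - z * self.std, self.mean + z * self.std)

if __name__ == "__main__":
    compute = Stochastic(10.0, 3.0)   # seconds of computation (made-up)
    network = Stochastic(4.0, 4.0)    # seconds of communication (made-up)
    total = compute + network
    print(total, total.interval())
```

The prediction is then a range of likely runtimes instead of a single number, which is the property that suits contended, shared clusters.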


2014 ◽  
Vol 22 (2) ◽  
pp. 75-91 ◽  
Author(s):  
Robert Gerstenberger ◽  
Maciej Besta ◽  
Torsten Hoefler

Modern interconnects offer remote direct memory access (RDMA) features. Yet most applications rely on explicit message passing for communication despite its unwanted overheads. The MPI-3.0 standard defines a programming interface for exploiting RDMA networks directly; however, its scalability and practicability have to be demonstrated in practice. In this work, we develop scalable bufferless protocols that implement the MPI-3.0 specification. Our protocols support scaling to millions of cores with negligible memory consumption while providing the highest performance and minimal overheads. To arm programmers, we provide a spectrum of performance models for all critical functions and demonstrate the usability of our library and models with several application studies with up to half a million processes. We show that our design is comparable to, or better than, UPC and Fortran Coarrays in terms of latency, bandwidth, and message rate. We also demonstrate application performance improvements with comparable programming complexity.

