The Predictable Execution Model in Practice

2021 ◽ Vol 20 (5) ◽ pp. 1-25
Author(s): Björn Forsberg, Marco Solieri, Marko Bertogna, Luca Benini, Andrea Marongiu

Adoption of multi- and many-core processors in real-time systems has so far been slowed down, if not totally barred, due to the difficulty of providing analytical real-time guarantees on worst-case execution times. The Predictable Execution Model (PREM) has been proposed to solve this problem, but its practical support requires significant code refactoring, a task better suited to a compilation tool chain than to human programmers. Implementing a PREM compiler presents significant challenges in conforming to PREM requirements, such as guaranteed upper bounds on memory footprint and the generation of efficient, schedulable non-preemptive regions. This article presents a comprehensive description of how a PREM compiler can be implemented, based on several years of experience from the community. We provide accumulated insights on how best to balance conformance to real-time requirements against performance, and present novel techniques that extend applicability from simple benchmark suites to real-world applications. We show that code transformed by the PREM compiler enables timing-predictable execution on modern commercial off-the-shelf hardware, providing novel insights into how PREM can protect 99.4% of memory accesses on random-replacement-policy caches at only a 16% performance loss on benchmarks from the PolyBench benchmark suite. Finally, we show that the requirements imposed on the programming model are well aligned with current coding guidelines for timing-critical software, promoting easy adoption.
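As a non-authoritative illustration of the phase split PREM prescribes (the function and names here are hypothetical, not the article's compiler output), a predictable interval can be sketched as a memory phase that prefetches everything the compute phase will touch, followed by a compute phase free of main-memory traffic:

```python
def prem_interval(main_memory, indices):
    """Hypothetical sketch of a PREM predictable interval: a memory
    phase that prefetches every location the interval will touch,
    then a compute phase that runs without main-memory accesses."""
    # Memory phase: copy all needed data into local storage
    # (conceptually a cache or scratchpad partition).
    local = [main_memory[i] for i in indices]
    # Compute phase: touches only the local copy, so its duration no
    # longer depends on contended main-memory accesses.
    return sum(x * x for x in local)
```

The point of the split is that only the memory phase competes for shared memory bandwidth, so it alone needs to be scheduled against other cores' memory phases.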

2000 ◽ Vol 8 (3) ◽ pp. 143-162
Author(s): Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé

This paper makes two important contributions. First, it investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and minimization of the rate of remote memory accesses are critical for sustaining high performance on these systems. We show that, due to the low remote-to-local memory access latency ratio of contemporary NUMA architectures, reasonably balanced page placement schemes, such as round-robin or random distribution, incur only modest performance losses. Second, the paper presents a transparent, user-level page migration engine able to win back any performance lost to suboptimal placement of pages in iterative OpenMP programs. The main body of the paper describes how our OpenMP runtime environment uses page migration to implement implicit data distribution and redistribution schemes without programmer intervention. Our experimental results verify the effectiveness of the proposed framework and provide a proof of concept that data distribution directives need not be introduced in OpenMP, preserving the simplicity and portability of the programming model.
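A hedged sketch of the kind of decision such a page-migration engine makes (the data structures and majority policy here are hypothetical simplifications, not the paper's algorithm): track per-node access counts for each page and move a page when a remote node dominates its accesses.

```python
def choose_migrations(access_counts, home_node):
    """Hypothetical sketch of a user-level page-migration decision:
    migrate a page to whichever NUMA node issues the majority of its
    accesses, turning remote accesses into local ones.

    access_counts: {page: {node: access_count}}
    home_node:     {page: node currently holding the page}
    """
    moves = {}
    for page, per_node in access_counts.items():
        hottest = max(per_node, key=per_node.get)
        # Migrate only if a remote node strictly out-accesses the home.
        if hottest != home_node[page] and per_node[hottest] > per_node.get(home_node[page], 0):
            moves[page] = hottest
    return moves
```

In an iterative program the access pattern repeats across outer iterations, which is what lets a runtime engine like this converge on a good placement without any data distribution directives.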


2019 ◽ Vol 14 (3) ◽ pp. 738-753
Author(s): Mina Mikhail, Mohammed El-Beheiry, Nahid Afia

Purpose: The purpose of this paper is to develop a decision tool that enables supply chain (SC) architects to design resilient SC networks (SCNs). Two resilience design determinants are considered: SC density and node criticality. The effect of considering these determinants on network structures is highlighted in terms of the ability to resist disruptions and the impact on SC performance.
Design/methodology/approach: A mixed-integer non-linear programming model is proposed as a proactive strategy for developing resilient structures; the design determinants are formulated and imposed as constraints. An upper limit is set for each determinant, and the resistance capacity and performance of the resulting structures are evaluated. These upper limits are then varied until SC performance stabilizes in the no-disruption case.
Findings: Resilient SCN structures are achieved at relatively low determinant levels, at the expense of profit but without shortages in the no-disruption case. The reduction in profit can be minimized by setting counterbalancing values for the two determinants: relatively higher SC density with lower node criticality, or vice versa. At very low SC density levels, the design model sharply reduces the number of open facilities, down to a single open facility per echelon; shortages then occur and vulnerability to disruption increases. At high determinant levels, SC vulnerability also increases, because the structures become more geographically clustered, with higher inbound and outbound flows at each facility.
Originality/value: This paper adopts a novel proactive decision tool for designing resilient SCNs. Previous literature used SC density and node criticality as metrics to assess resilience; in this research, the determinants are incorporated directly as constraints in the design model. The results give SC architects insight into how to set determinant values to reach resilient structures with minimum performance loss in the no-disruption case.
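A minimal sketch of how the two design determinants might be quantified (these formulations are hypothetical illustrations, not the paper's exact constraint expressions):

```python
def sc_density(open_facilities, service_area):
    """Hypothetical formulation of SC density: open facilities per
    unit of the geographical area they serve (higher = more
    geographically clustered network)."""
    return open_facilities / service_area


def node_criticality(node_flow, total_flow):
    """Hypothetical formulation of node criticality: the share of
    total network flow routed through a single facility (higher =
    losing that node hurts more)."""
    return node_flow / total_flow
```

In the design model such quantities would appear as constraints, e.g. requiring each to stay below its chosen upper limit for every candidate structure.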


2020
Author(s): Prachi Sharma, Arkid Bera, Anu Gupta

To curb redundant power consumption in portable embedded and real-time applications, processors are equipped with various Dynamic Voltage and Frequency Scaling (DVFS) techniques. How accurately such a technique predicts the operating frequency determines how power-efficient it makes a processor across a variety of programs and users. Recent techniques, however, focus so heavily on saving power that they neglect the key user-satisfaction metric: performance. The DVFS algorithms used to save power introduce unwanted latency through their own complexity, and many modern techniques rely on feedback manually triggered by the user to change the frequency, further increasing the reaction time. In this paper, we implement a novel Artificial Neural Network-driven frequency scaling methodology that saves power and boosts performance at the same time, implicitly, i.e. without any feedback from the user. To make the system more inclusive of the kinds of processes run on it, we trained the ANN not only on CPU-intensive programs but also on memory-bound ones, i.e. those with frequent memory accesses per average CPU cycle. The proposed technique has been evaluated on an Intel i7-4720HQ (Haswell) processor and shows a performance boost of up to 20%, SoC power savings of up to 16%, and a Performance-per-Watt improvement of up to 30% compared to the existing DVFS technique. The open-source memory-intensive benchmark suite MiBench was used to verify the utility of the suggested technique.
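A drastically simplified stand-in for what such a predictor computes (a linear heuristic, not the paper's trained ANN; all names and frequency ranges here are hypothetical): a memory-bound phase gains little from a higher clock, so the governor can lower frequency there at small performance cost.

```python
def predict_frequency(ipc, mem_bound_ratio, f_min=800, f_max=3600):
    """Hypothetical linear stand-in for an ANN frequency predictor.

    ipc:             instructions per cycle of the current phase
    mem_bound_ratio: fraction of cycles stalled on memory (0..1)
    Returns a target frequency in MHz between f_min and f_max.
    """
    # The more CPU-bound the phase (high IPC, few memory stalls),
    # the more it benefits from a higher clock.
    cpu_bound = max(0.0, min(1.0, ipc * (1.0 - mem_bound_ratio)))
    return f_min + cpu_bound * (f_max - f_min)
```

A real implementation would feed hardware performance counters into the trained network each governor tick; the heuristic above only shows the shape of the input-to-frequency mapping being learned.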


2012 ◽ pp. 637-668
Author(s): Alan Grigg, Lin Guan

This chapter describes a real-time system performance analysis approach known as reservation-based analysis (RBA). The scalability of RBA derives from an abstract (target-independent) representation of system software components, their timing and resource requirements, and run-time scheduling policies. The RBA timing analysis framework provides an evolvable modeling solution that can be instigated in the early stages of system design, long before the software and hardware components have been developed, and continually refined through successive stages of detailed design, implementation, and testing. At each stage of refinement, the abstract model provides a set of best-case and worst-case timing ‘guarantees’ that will be delivered subject to a set of scheduling ‘obligations’ being met by the target system implementation. An abstract scheduling model, known as the rate-based execution model, then provides an implementation reference model; compliance with it ensures that the imposed timing obligations will be met by the target system.



2008 ◽ Vol 17 (3) ◽ pp. 87-92
Author(s): Leonard L. LaPointe

Loss of implicit linguistic competence assumes a loss of linguistic rules, necessary linguistic computations, or representations. In aphasia, the inherent neurological damage is frequently assumed by some to be a loss of implicit linguistic competence that has damaged or wiped out the neural centers or pathways necessary for maintaining the language rules and representations needed to communicate. Not everyone agrees with this view of language use in aphasia. The measurement of implicit language competence, although apparently necessary and satisfying for theoretical linguistics, is complexly interwoven with performance factors. Transience, stimulability, and variability in aphasic language use provide evidence for an access-deficit model that supports performance loss. Advances in understanding linguistic competence and performance may be informed by careful study of bilingual language acquisition and loss, the language of savants, the language of feral children, and advances in neuroimaging. Social models of aphasia treatment, coupled with an access-deficit view of aphasia, can salve our restless minds and allow pursuit of maximal interactive communication goals even without a comfortable explanation of implicit linguistic competence in aphasia.


Author(s): Afef Hfaiedh, Ahmed Chemori, Afef Abdelkrim

In this paper, the control problem of a class I of underactuated mechanical systems (UMSs) is addressed. The considered class includes nonlinear UMSs with two degrees of freedom and one control input. First, we propose the design of a robust integral of the sign of the error (RISE) control law adequate for this special class. Based on a change of coordinates, the dynamics is transformed into strict-feedback (SF) form. A Lyapunov-based technique is then employed to prove the asymptotic stability of the resulting closed-loop system. Numerical simulation results show the robustness and performance of the original RISE controller toward parametric uncertainties and disturbance rejection. A comparative study with a conventional sliding mode controller reveals a significant robustness improvement with the proposed original RISE controller. In real-time experiments, however, amplification of the measurement noise is a major problem: it affects the behaviour of the motor and reduces the performance of the system. To deal with this issue, we propose to estimate the velocity using the robust Levant differentiator instead of the numerical derivative. Real-time experiments were performed on the inertia wheel inverted pendulum testbed to demonstrate the relevance of the proposed observer-based RISE control scheme. The obtained experimental results and evaluation indices clearly show better performance of the proposed observer-based RISE approach compared with the sliding mode and original RISE controllers.
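A minimal discrete-time sketch of a RISE-type control law of the common form u(t) = (k_s+1)(e(t) − e(0)) + ∫[(k_s+1)·α·e + β·sgn(e)] dτ, with the integral accumulated by forward Euler (the gains, step size, and code structure here are hypothetical illustrations, not the paper's controller):

```python
import math

def make_rise_controller(ks, alpha, beta, dt):
    """Hypothetical discrete sketch of a RISE control law:
    u = (ks+1)*(e - e0) + integral of [(ks+1)*alpha*e + beta*sgn(e)].
    The sgn(e) term inside the integral yields a continuous control
    signal, unlike the switching term of sliding mode control."""
    state = {"integral": 0.0, "e0": None}

    def step(e):
        if state["e0"] is None:
            state["e0"] = e  # capture the initial tracking error
        sgn = math.copysign(1.0, e) if e != 0 else 0.0
        state["integral"] += ((ks + 1) * alpha * e + beta * sgn) * dt
        return (ks + 1) * (e - state["e0"]) + state["integral"]

    return step
```

Because the discontinuity sits under the integral, the resulting input is continuous, which is one reason noise amplification enters through the differentiated measurements rather than the law itself; replacing the numerical derivative with a robust differentiator, as the paper does, targets exactly that path.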

