New Optimal Solutions for Real-Time Reconfigurable Periodic Asynchronous Operating System Tasks with Minimizations of Response Time

2012 ◽  
Vol 1 (4) ◽  
pp. 88-131 ◽  
Author(s):  
Hamza Gharsellaoui ◽  
Mohamed Khalgui ◽  
Samir Ben Ahmed

Task scheduling is an essential requirement in most real-time and embedded systems, but it introduces unwanted central processing unit (CPU) overhead. The authors present a real-time schedulability algorithm for preemptable, asynchronous, periodic, reconfigurable task systems with arbitrary relative deadlines, scheduled on a uniprocessor by an optimal scheduling algorithm based on earliest deadline first (EDF) principles and on dynamic reconfiguration. A reconfiguration scenario is assumed to be a dynamic, automatic operation that adds, removes, or updates the operating system’s (OS) functional asynchronous tasks. When such a scenario is applied to save the system after hardware-software faults, or to improve its performance, some real-time properties can be violated. The authors propose an intelligent agent-based architecture in which a software agent is used to satisfy user requirements and to respect time constraints. When these constraints are not satisfied, the agent dynamically provides users with technical solutions: removing tasks according to a predefined heuristic, or modifying the worst-case execution times (WCETs), periods, and deadlines of tasks in order to meet deadlines and minimize response times. The authors implement the agent to support these services, apply it to a Blackberry Bold 9700 and to a Volvo system, and present and discuss the experimental results.
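As a rough illustration of the kind of feasibility check such an agent has to run after each reconfiguration, the sketch below applies two textbook EDF tests, not the authors' algorithm. Total utilization above 1 proves infeasibility, and total density of at most 1 is a sufficient condition for EDF on a uniprocessor even with arbitrary deadlines. The `Task` type and the function names are hypothetical.

```python
from fractions import Fraction
from typing import NamedTuple


class Task(NamedTuple):
    wcet: int      # worst-case execution time C
    period: int    # period T
    deadline: int  # relative deadline D (may be smaller or larger than T)


def utilization(tasks):
    """Total processor utilization U = sum(C / T)."""
    return sum(Fraction(t.wcet, t.period) for t in tasks)


def density(tasks):
    """Total density = sum(C / min(D, T))."""
    return sum(Fraction(t.wcet, min(t.deadline, t.period)) for t in tasks)


def edf_quick_check(tasks):
    """Classify a task set with two cheap textbook tests (not an exact test)."""
    if utilization(tasks) > 1:
        return "infeasible"    # necessary condition violated
    if density(tasks) <= 1:
        return "feasible"      # sufficient condition for EDF holds
    return "inconclusive"      # an exact processor-demand analysis is needed


print(edf_quick_check([Task(1, 4, 4), Task(2, 6, 6)]))
```

Exact rational arithmetic avoids rounding errors at the boundary case U = 1. An "inconclusive" result is where a more precise analysis, such as the one the abstract describes, would be required.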

2013 ◽  
pp. 236-274
Author(s):  
Hamza Gharsellaoui ◽  
Atef Gharbi ◽  
Olfa Mosbahi ◽  
Mohamed Khalgui ◽  
Antonio Valentini

This chapter deals with reconfigurable uniprocessor embedded real-time systems, classically implemented by different OS tasks that are assumed to be independent, asynchronous, and periodic in order to meet the functional and temporal properties described in user requirements. The authors define a schedulability algorithm for preemptable, asynchronous, and periodic reconfigurable task systems with arbitrary relative deadlines, scheduled on a uniprocessor by an optimal scheduling algorithm based on EDF principles and on dynamic reconfiguration. Two forms of automatic reconfiguration are assumed to be applied at run-time: the addition or removal of tasks, and the modification of their temporal parameters (worst-case execution time (WCET) and/or period). Nevertheless, when such a scenario is applied to save the system after hardware-software faults, or to improve its performance, some real-time properties can be violated. The authors define a new semantics of reconfiguration in which a crucial criterion is the automatic improvement of the system’s feasibility at run-time: an Intelligent Agent automatically checks the system’s feasibility after any reconfiguration scenario to verify that all tasks meet their deadlines. If a reconfiguration scenario applied at run-time violates these constraints, the Intelligent Agent dynamically provides users with technical solutions: removing some tasks according to a predefined heuristic (based on whether a task is soft or hard), or replacing the WCETs, periods, and/or deadlines of the tasks that violate the constraints with new ones, in order to meet deadlines and minimize response times. To handle all possible reconfiguration solutions, the authors propose an agent-based architecture that applies automatic reconfigurations in order to restore the system’s feasibility and satisfy user requirements. The authors developed the tool RT-Reconfiguration to support these contributions, which they apply to a Blackberry Bold 9700 and to a Volvo system as running examples, and they use the real-time simulator Cheddar to check the behavior of the whole system and to evaluate the performance of the algorithm (detailed descriptions are available at the website: http://beru.univ-brest.fr/~singhoff/cheddar). The authors present simulations of this architecture in which they evaluate the implemented agent. In addition, they present and discuss experimental results that compare the accuracy and performance of their algorithm with those of others.
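The removal option can be pictured with a small sketch. The chapter does not spell out its heuristic here, so the policy below is an assumption made for illustration only: soft tasks are dropped in order of decreasing utilization, and hard tasks are never dropped. It also assumes implicit deadlines (deadline equal to period), for which U ≤ 1 is an exact EDF feasibility test. All names are hypothetical.

```python
from fractions import Fraction
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    wcet: int     # worst-case execution time
    period: int   # period, also used as the deadline in this sketch
    hard: bool    # hard tasks must never be removed


def utilization(tasks):
    """Total processor utilization U = sum(wcet / period)."""
    return sum(Fraction(t.wcet, t.period) for t in tasks)


def reconfigure(current, added):
    """Add tasks, then drop soft tasks (largest utilization first) until U <= 1.

    Returns (kept, removed). The result can still be overloaded if the hard
    tasks alone exceed the processor capacity.
    """
    kept = list(current) + list(added)
    removed = []
    soft = sorted((t for t in kept if not t.hard),
                  key=lambda t: Fraction(t.wcet, t.period), reverse=True)
    for victim in soft:
        if utilization(kept) <= 1:
            break
        kept.remove(victim)
        removed.append(victim)
    return kept, removed


running = [Task("ctrl", 2, 5, True), Task("log", 1, 4, False)]
kept, removed = reconfigure(running, [Task("video", 3, 6, False)])
print([t.name for t in kept], [t.name for t in removed])
```

The other option named in the abstract, adjusting WCETs, periods, or deadlines instead of removing tasks, would replace the removal loop with a parameter search and is not shown.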


2021 ◽  
Author(s):  
Hongjie Zheng ◽  
Hanyu Chang ◽  
Yongqiang Yuan ◽  
Qingyun Wang ◽  
Yuhao Li ◽  
...  

Global navigation satellite systems (GNSS) play an indispensable role in providing positioning, navigation and timing (PNT) services to global users. Over the past few years, GNSS have developed rapidly, with abundant networks, modern constellations, and multi-frequency observations. To take full advantage of multi-constellation and multi-frequency GNSS, several new mathematical models have been developed, such as multi-frequency ambiguity resolution (AR) and uncombined data processing with raw observations. In addition, new GNSS products including the uncalibrated phase delay (UPD), the observable signal bias (OSB), and the integer recovery clock (IRC) have been generated and provided by analysis centers to support advanced GNSS applications.

However, the increasing number of GNSS observations poses a great challenge to the fast generation of multi-constellation and multi-frequency products. In this study, we propose an efficient solution for the fast updating of multi-GNSS real-time products that makes full use of advanced computing techniques. Firstly, instead of the traditional vector operations, the “level-3 operations” (matrix by matrix) of the Basic Linear Algebra Subprograms (BLAS) are used as much as possible in the least-squares (LSQ) processing, which improves efficiency thanks to central processing unit (CPU) optimization and faster memory data transmission. Furthermore, most steps of multi-GNSS data processing are transformed from serial mode to parallel mode to take advantage of the multi-core CPU architecture and graphics processing unit (GPU) computing resources. Moreover, we choose the OpenBLAS library for matrix computation because it performs well in a parallel environment.

The proposed method is then validated on a 3.30 GHz AMD CPU with 6 cores. The results demonstrate that the proposed method can substantially improve the processing efficiency of multi-GNSS product generation. For the precise orbit determination (POD) solution with 150 ground stations and 128 satellites (GPS/BDS/Galileo/GLONASS/QZSS) in ionosphere-free (IF) mode, the processing time can be shortened from 50 to 10 minutes, which can guarantee the hourly updating of multi-GNSS ultra-rapid orbit products. The processing time of uncombined POD can also be reduced by about 80%. Meanwhile, the multi-GNSS real-time clock products can easily be generated at a 5-second or even higher sampling rate. In addition, the processing efficiency of UPD and OSB products can be increased by a factor of 4 to 6.


Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 534 ◽  
Author(s):  
Yuan He ◽  
Shunyi Zheng ◽  
Fengbo Zhu ◽  
Xia Huang

The truncated signed distance field (TSDF) has been applied as a fast, accurate, and flexible geometric fusion method in 3D reconstruction of industrial products based on a hand-held laser line scanner. However, this method has some problems with the surface reconstruction of thin products: the surface mesh collapses into the interior of the model, resulting in topological errors such as overlaps, intersections, or gaps. Meanwhile, the existing TSDF method ensures real-time performance through significant graphics processing unit (GPU) memory usage, which limits the scale of the reconstructed scene. In this work, we propose three improvements to the existing TSDF methods: (i) a thin-surface attribution judgment method for real-time processing that solves the problem of interference between the opposite sides of a thin surface; we distinguish measurements originating from different parts of a thin surface by the angle between the surface normal and the observation line of sight; (ii) a post-processing method that automatically detects and repairs topological errors in areas where misjudgment of thin-surface attribution may occur; (iii) a framework that integrates central processing unit (CPU) and GPU resources to implement our 3D reconstruction approach, which ensures real-time performance and reduces GPU memory usage. The results show that this method provides more accurate 3D reconstruction of a thin surface, comparable to state-of-the-art laser line scanners with 0.02 mm accuracy. In terms of performance, the algorithm can guarantee a frame rate of more than 60 frames per second (FPS) with a GPU memory footprint under 500 MB. Overall, the proposed method achieves real-time, high-precision 3D reconstruction of a thin surface.
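To make improvement (i) more concrete, the following sketch combines the widely used weighted running-average TSDF update with a simple gate on the angle between the surface normal and the line of sight. It is an illustration under stated assumptions rather than the paper's method: the 80-degree threshold, the weight cap, and all names are hypothetical, and vectors are assumed to be unit length.

```python
import math
from dataclasses import dataclass


@dataclass
class Voxel:
    tsdf: float = 1.0    # truncated signed distance, normalized to [-1, 1]
    weight: float = 0.0  # accumulated integration weight


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def faces_sensor(normal, view_dir, max_angle_deg=80.0):
    """True if the normal points back toward the sensor within the threshold.

    view_dir is the unit direction from the sensor toward the surface.
    """
    to_sensor = tuple(-c for c in view_dir)
    return dot(normal, to_sensor) >= math.cos(math.radians(max_angle_deg))


def integrate(voxel, signed_distance, truncation, normal, view_dir,
              max_weight=64.0):
    """Fuse one measurement into a voxel; skip it if it belongs to the far side."""
    if not faces_sensor(normal, view_dir):
        return False  # measurement attributed to the opposite side: ignore
    d = max(-1.0, min(1.0, signed_distance / truncation))
    voxel.tsdf = (voxel.tsdf * voxel.weight + d) / (voxel.weight + 1.0)
    voxel.weight = min(voxel.weight + 1.0, max_weight)
    return True


v = Voxel()
integrate(v, 0.5, 1.0, normal=(0, 0, -1), view_dir=(0, 0, 1))  # front side
integrate(v, -0.5, 1.0, normal=(0, 0, 1), view_dir=(0, 0, 1))  # back side
print(v.tsdf, v.weight)
```

Rejecting back-facing measurements keeps the two sides of a thin sheet from averaging each other out, which is the interference the abstract describes.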


Symmetry ◽  
2019 ◽  
Vol 11 (4) ◽  
pp. 585
Author(s):  
Yufei Wu ◽  
Xiaofei Ruan ◽  
Yu Zhang ◽  
Huang Zhou ◽  
Shengyu Du ◽  
...  

The high demand for computational resources severely hinders the deployment of deep learning applications on resource-limited devices. In this work, we investigate the under-studied but practically important network efficiency problem and present a new, lightweight architecture for hand pose estimation. Our architecture is essentially a deeply-supervised pruned network in which less important layers and branches are removed to achieve faster real-time inference on resource-constrained devices without much loss of accuracy. We further optimize deployment to exploit the parallel execution capability of central processing units (CPUs). We conduct experiments on the NYU and ICVL datasets and develop a demo using the RealSense camera. Experimental results show that our lightweight network achieves an average running time of 32 ms (31.3 FPS, compared with 22.7 FPS for the original) before deployment optimization. Meanwhile, the model has only about half the parameters of the original one, with a mean joint error of 11.9 mm. After further optimization with OpenVINO, the optimized model runs at 56 FPS on CPUs, in contrast to 44 FPS on a graphics processing unit (GPU) (TensorFlow), and achieves the real-time goal.


Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1477
Author(s):  
Hongyang Guo ◽  
Qing Li ◽  
Yangjie Xu ◽  
Yongmei Huang ◽  
Shengping Du

In a line-of-sight correction system, the response time of the liquid crystal spatial light modulator under the normal driving voltage is so long that it degrades system performance. To address this issue, an overdriving method based on a Field-Programmable Gate Array (FPGA) is established. The principle of overdriving is to use a higher voltage difference to achieve a faster liquid crystal response. In this scheme, an overdriving look-up table is used to find the response time of the quantized phase, and the liquid crystal electrode is driven by Pulse-Width Modulation (PWM). All the processing is performed in the FPGA, which frees central processing unit (CPU) memory and responds faster. Simulations and experiments are presented to demonstrate the proposed method. The overdriving experiment shows that, under the overdriving voltage, the rising response time is reduced from 530 ms to 34 ms and the falling time from 360 ms to 38 ms. Typical light tracks are imitated to evaluate the performance of the line-of-sight correction platform. Results show that overdriving increases the −3 dB rejection frequency from 1.1 Hz to 2.6 Hz. The suppression ability with overdriving is about −20 dB at 0.1 Hz, whereas that with normal driving is only about −13 dB.


Electronics ◽  
2018 ◽  
Vol 7 (11) ◽  
pp. 274 ◽  
Author(s):  
Heoncheol Lee ◽  
Kipyo Kim ◽  
Yongsung Kwon ◽  
Eonpyo Hong

This paper addresses the real-time optimization of the message-chain structure to maximize throughput in data communications based on half-duplex command-response protocols. It proposes a new variant of the particle swarm optimization (PSO) algorithm for real-time optimization, implemented on field programmable gate arrays (FPGAs) so that it runs faster in parallel and avoids the delays caused by other tasks on a central processing unit (CPU). The proposed method was verified to find the optimal message-chain structure much faster than the original PSO, and to do so reliably across different system and algorithm parameters.
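For readers unfamiliar with the baseline, a minimal sketch of the standard (global-best) PSO update follows. It is not the variant proposed in the paper and does not model the FPGA implementation. The message-chain objective is not given in the abstract, so a simple sphere function stands in for it, and the parameter values are common textbook choices assumed for illustration.

```python
import random


def pso(objective, dim, bounds, n_particles=20, n_iters=100,
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over a box with standard global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # per-particle best positions
    best_val = [objective(p) for p in pos]      # per-particle best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def sphere(x):
    """Stand-in objective: sum of squares, minimized at the origin."""
    return sum(c * c for c in x)


best, value = pso(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, value)
```

Each particle's update depends only on its own state and the shared global best, which is what makes the algorithm a natural fit for the parallel hardware execution the abstract describes.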


Electronics ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 866 ◽  
Author(s):  
Heoncheol Lee ◽  
Kipyo Kim

This paper addresses the real-time optimization problem of finding the most efficient and reliable message-chain structure in data communications based on half-duplex command-response protocols, such as MIL-STD-1553B communication systems. It proposes a real-time Monte Carlo optimization method implemented on field programmable gate arrays (FPGAs), which not only runs very quickly but also avoids conflicts with other tasks on a central processing unit (CPU). Evaluation results showed that the proposed method consistently finds the optimal message-chain structure within a short, deterministic time, much faster than the conventional Monte Carlo optimization method on a CPU.


SIMULATION ◽  
2016 ◽  
Vol 93 (1) ◽  
pp. 69-84 ◽  
Author(s):  
Shailesh Tamrakar ◽  
Paul Richmond ◽  
Roshan M D’Souza

Agent-based models (ABMs) are increasingly being used to study population dynamics in complex systems, such as the human immune system. Previously, Folcik et al. (The basic immune simulator: an agent-based model to study the interactions between innate and adaptive immunity. Theor Biol Med Model 2007; 4: 39) developed a Basic Immune Simulator (BIS) and implemented it using the Recursive Porous Agent Simulation Toolkit (RePast) ABM simulation framework. However, frameworks such as RePast are designed to execute serially on central processing units and therefore cannot efficiently handle large model sizes. In this paper, we report on our implementation of the BIS using FLAME GPU, a parallel computing ABM simulator designed to execute on graphics processing units. To benchmark our implementation, we simulate the response of the immune system to a viral infection of generic tissue cells. We compared our results with those obtained from the original RePast implementation for statistical accuracy. We observe that our implementation has a 13× performance advantage over the original RePast implementation.

