Optimization Techniques for Verification of Out-of-Order Execution Machines

Journal of Electrical and Computer Engineering ◽

10.1155/2010/515021 ◽

2010 ◽

Vol 2010 ◽

pp. 1-7

Author(s):

Sudarshan K. Srinivasan

Keyword(s):

Computational Complexity ◽

Optimization Technique ◽

Optimization Techniques ◽

Deadlock Detection ◽

Direct Impact ◽

Instruction Set ◽

Instruction Set Architecture ◽

Speed Up ◽

Order Execution

We develop two optimization techniques, flush-machine and collapsed flushing, to improve the efficiency of automatic refinement-based verification of out-of-order (ooo) processor models. Refinement is a notion of equivalence that can be used to check that an ooo processor correctly implements all behaviors of its instruction set architecture (ISA), including deadlock detection. The optimization techniques work by reducing the computational complexity of the refinement map, a function central to refinement proofs that maps ooo processor model states to ISA states. This has a direct impact on the efficiency of verification, which is studied using 23 ooo processor models. Flush-machine is a novel optimization technique. Collapsed flushing has been employed previously in the context of in-order processors. We show how to apply collapsed flushing to ooo processor models. Using both optimizations together, we can handle 9 ooo models that could not be verified using standard flushing. Also, the optimizations provided a speedup of 23.29 over standard flushing.

Addressing Mode and Bit Extensions to the Thumb-2 Instruction Set Architecture

European Journal of Electrical Engineering and Computer Science ◽

10.24018/ejece.2021.5.2.308 ◽

2021 ◽

Vol 5 (2) ◽

pp. 13-17

Author(s):

Dae-Hwan Kim

Keyword(s):

Data Processing ◽

Embedded Processors ◽

Instruction Set ◽

Instruction Set Architecture ◽

Type Conversion ◽

Processing Instruction ◽

Aggregated Data ◽

Processing Operation ◽

Speed Up ◽

Zero Extension

Thumb-2 is the most recent instruction set architecture for ARM processors, which are among the most widely used embedded processors. In this paper, two extensions are proposed to improve the performance of the Thumb-2 instruction set architecture: addressing mode extensions and sign/zero extensions combined with data processing instructions. To speed up access to an element of aggregated data, the proposed approach first introduces three new addressing modes for load and store instructions: register-plus-immediate offset addressing mode, negative register offset addressing mode, and post-increment register offset addressing mode. Register-plus-immediate offset addressing mode permits two offsets, and negative register offset addressing mode allows the offset to be the negated content of a register. Post-increment register offset mode automatically modifies the offset address after the memory operation. The second extension is the sign/zero extension combined with a data processing instruction, which allows the result of a data processing operation to be sign/zero extended to accelerate a type conversion. Several least frequently used instructions are reduced to provide the encoding space for the new extensions. Experiments show that the proposed approach improves performance by an average of 8.6% when compared to the Thumb-2 instruction set architecture.
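
The three addressing modes named in this abstract can be pictured with a small effective-address model. This is only a sketch under stated assumptions: the register names, the dictionary-based register file and memory, and the exact post-increment update (advancing the offset register by a caller-supplied `step`) are hypothetical and are not taken from the paper's instruction encodings.

```python
# Illustrative model only. The register file, memory layout, and the
# post-increment step are assumptions, not the paper's encoding.

def ea_reg_plus_imm(regs, base, index, imm):
    """Register-plus-immediate offset: base plus two offsets (register and immediate)."""
    return regs[base] + regs[index] + imm


def ea_negative_reg(regs, base, index):
    """Negative register offset: the index register's content is subtracted."""
    return regs[base] - regs[index]


def load_post_increment(regs, memory, dest, base, index, step):
    """Post-increment register offset: access memory first, then update the offset."""
    address = regs[base] + regs[index]
    regs[dest] = memory[address]
    regs[index] += step  # assumed update rule: offset register advances by `step`
    return address


# Walk two consecutive 4-byte elements of a hypothetical array at 0x1000.
regs = {"r0": 0, "r1": 0x1000, "r2": 8}
memory = {0x1008: 42, 0x100C: 7}
first = load_post_increment(regs, memory, "r0", "r1", "r2", 4)
first_value = regs["r0"]
second = load_post_increment(regs, memory, "r0", "r1", "r2", 4)
second_value = regs["r0"]
```

In this model the second load needs no separate address-update instruction, which is the kind of saving the abstract attributes to the new modes.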

Application-Specific Instruction Set Architecture for an Ultralight Hardware Security Module

2020 IEEE International Symposium on Hardware Oriented Security and Trust (HOST) ◽

10.1109/host45689.2020.9300292 ◽

2020 ◽

Author(s):

Ahmed A. Ayoub ◽

Mark D. Aagaard

Keyword(s):

Hardware Security ◽

Instruction Set ◽

Instruction Set Architecture ◽

Specific Instruction ◽

Security Module ◽

Application Specific

Solving the Real Power Limitations in the Dynamic Economic Dispatch of Large-Scale Thermal Power Units under the Effects of Valve-Point Loading and Ramp-Rate Limitations

Sustainability ◽

10.3390/su13031274 ◽

2021 ◽

Vol 13 (3) ◽

pp. 1274

Author(s):

Loau Al-Bahrani ◽

Mehdi Seyedmahmoudian ◽

Ben Horan ◽

Alex Stojcevski

Keyword(s):

Large Scale ◽

Thermal Power ◽

Optimization Technique ◽

Economic Dispatch ◽

Pso Algorithm ◽

Search Space ◽

Economic Benefits ◽

Optimization Techniques ◽

Ramp Rate ◽

Dynamic Economic Dispatch

Few non-traditional optimization techniques are applied to the dynamic economic dispatch (DED) of large-scale thermal power units (TPUs), e.g., 1000 TPUs, that consider the effects of valve-point loading with ramp-rate limitations. This is a complicated multimodal problem. In this investigation, a novel optimization technique, namely, a multi-gradient particle swarm optimization (MG-PSO) algorithm with two stages for exploring and exploiting the search space area, is employed as an optimization tool. The M particles (explorers) in the first stage are used to explore new neighborhoods, whereas the M particles (exploiters) in the second stage are used to exploit the best neighborhood. The M particles’ negative gradient variation in both stages balances the global and local search capabilities. This algorithm’s validity is demonstrated on five medium-scale to very large-scale power systems. The MG-PSO algorithm effectively reduces the difficulty of handling the large-scale DED problem, and simulation results confirm this algorithm’s suitability for such a complicated multi-objective problem at varying fitness performance measures and consistency. This algorithm is also applied to estimate the required generation in 24 h to meet load demand changes. This investigation provides useful technical references for economic dispatch operators to update their power system programs in order to achieve economic benefits.
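
For readers unfamiliar with particle swarm optimization, the following is a minimal sketch of a standard global-best PSO, not the MG-PSO algorithm of this paper. The function name `pso`, the coefficient values, and the linearly decreasing inertia weight (a common way to move from exploration toward exploitation) are all assumptions chosen for illustration.

```python
import random


def pso(objective, dim, bounds, n_particles=20, iters=100, seed=0):
    """Standard global-best PSO (illustrative; not the paper's MG-PSO)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    history = [gbest_val]
    for t in range(iters):
        # Assumed schedule: inertia falls linearly from 0.9 to 0.4.
        w = 0.9 - 0.5 * t / max(iters - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + 2.0 * r1 * (pbest[i][d] - pos[i][d])
                             + 2.0 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_val, history = pso(sphere, dim=3, bounds=(-5.0, 5.0))
```

A real dispatch problem would replace `sphere` with a fuel-cost function and add the ramp-rate and power-balance constraints, which this sketch does not model.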

Investigating the Potential of Network Optimization for a Constrained Object Detection Problem

Journal of Imaging ◽

10.3390/jimaging7040064 ◽

2021 ◽

Vol 7 (4) ◽

pp. 64

Author(s):

Tanguy Ophoff ◽

Cédric Gullentops ◽

Kristof Van Beeck ◽

Toon Goedemé

Keyword(s):

Computational Complexity ◽

Object Detection ◽

Network Optimization ◽

Real Life ◽

Optimization Techniques ◽

Training Data ◽

Single Shot ◽

Standard Object ◽

Number Of Classes

Object detection models are usually trained and evaluated on highly complicated, challenging academic datasets, which results in deep networks requiring lots of computations. However, a lot of operational use-cases consist of more constrained situations: they have a limited number of classes to be detected, less intra-class variance, less lighting and background variance, constrained or even fixed camera viewpoints, etc. In these cases, we hypothesize that smaller networks could be used without deteriorating the accuracy. However, there are multiple reasons why this does not happen in practice. Firstly, overparameterized networks tend to learn better, and secondly, transfer learning is usually used to reduce the necessary amount of training data. In this paper, we investigate how much we can reduce the computational complexity of a standard object detection network in such constrained object detection problems. As a case study, we focus on a well-known single-shot object detector, YoloV2, and combine three different techniques to reduce the computational complexity of the model without reducing its accuracy on our target dataset. To investigate the influence of the problem complexity, we compare two datasets: a prototypical academic (Pascal VOC) and a real-life operational (LWIR person detection) dataset. The three optimization steps we exploited are: swapping all the convolutions for depth-wise separable convolutions, performing pruning, and using weight quantization. The results of our case study indeed substantiate our hypothesis that the more constrained a problem is, the more the network can be optimized. On the constrained operational dataset, combining these optimization techniques allowed us to reduce the computational complexity by a factor of 349, as compared to only a factor of 9.8 on the academic dataset. When running a benchmark on an Nvidia Jetson AGX Xavier, our fastest model runs more than 15 times faster than the original YoloV2 model, whilst increasing the accuracy by 5% Average Precision (AP).
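
To give a sense of how much the first of these steps can save, the sketch below compares multiply-accumulate counts for a standard convolution and its depth-wise separable replacement, using the well-known cost formulas. The layer shape (3×3 kernel, 64 input channels, 128 output channels, 52×52 output) is a hypothetical example and is not taken from the paper or from YoloV2.

```python
def conv_macs(k, c_in, c_out, h_out, w_out):
    """Multiply-accumulates of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out * h_out * w_out


def depthwise_separable_macs(k, c_in, c_out, h_out, w_out):
    """Depth-wise k x k convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in * h_out * w_out   # one k x k filter per input channel
    pointwise = c_in * c_out * h_out * w_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
standard = conv_macs(3, 64, 128, 52, 52)
separable = depthwise_separable_macs(3, 64, 128, 52, 52)
reduction = standard / separable   # equals 1 / (1/c_out + 1/k**2), about 8.4 here
```

A per-layer saving of this size is far below the overall factors reported in the abstract, which is consistent with pruning and quantization contributing on top of the convolution swap.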

Optimal Allocation of Multiple Types of Distributed Generations in Radial Distribution Systems Using a Hybrid Technique

Sustainability ◽

10.3390/su13126644 ◽

2021 ◽

Vol 13 (12) ◽

pp. 6644

Author(s):

Ali Selim ◽

Salah Kamel ◽

Amal A. Mohamed ◽

Ehab E. Elattar

Keyword(s):

Power System ◽

Radial Distribution ◽

Distribution Systems ◽

Optimal Allocation ◽

Optimization Technique ◽

Mathematical Formulation ◽

Optimization Techniques ◽

Sensitivity Factor ◽

Hybrid Technique ◽

Analytical Technique

In recent years, the integration of distributed generators (DGs) in radial distribution systems (RDS) has received considerable attention in power system research. The major purpose of DG integration is to decrease the power losses and improve the voltage profiles that directly lead to improving the overall efficiency of the power system. Therefore, this paper proposes a hybrid optimization technique based on analytical and metaheuristic algorithms for optimal DG allocation in RDS. In the proposed technique, the loss sensitivity factor (LSF) is utilized to reduce the search space of the DG locations, while the analytical technique is used to calculate initial DG sizes based on a mathematical formulation. Then, a metaheuristic sine cosine algorithm (SCA) is applied to identify the optimal DG allocation based on the LSF and analytical techniques instead of using random initialization. To prove the superiority and high performance of the proposed hybrid technique, two standard RDSs, IEEE 33-bus and 69-bus, are considered. Additionally, a comparison between the proposed techniques, standard SCA, and other existing optimization techniques is carried out. The main findings confirmed the enhancement in the convergence of the proposed technique compared with the standard SCA and the ability to allocate multiple DGs in RDS.

Model checking to find vulnerabilities in an instruction set architecture

2016 IEEE International Symposium on Hardware Oriented Security and Trust (HOST) ◽

10.1109/hst.2016.7495566 ◽

2016 ◽

Author(s):

Chris Bradfield ◽

Cynthia Sturton

Keyword(s):

Model Checking ◽

Instruction Set ◽

Instruction Set Architecture

Optimization of the Parameters of RISE Feedback Controller Using Genetic Algorithm

Mathematical Problems in Engineering ◽

10.1155/2016/3863147 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9

Author(s):

Fayiz Abu Khadra ◽

Jaber Abu Qudeiri ◽

Mohammed Alkahtani

Keyword(s):

Genetic Algorithm ◽

Numerical Simulations ◽

Control Algorithm ◽

Chaotic Systems ◽

Optimization Technique ◽

Optimization Techniques ◽

Algorithm Optimization ◽

External Disturbances ◽

Van Der Pol ◽

Control Methodology

A control methodology based on a nonlinear control algorithm and an optimization technique is presented in this paper. A controller called “the robust integral of the sign of the error” (in short, RISE) is applied to control chaotic systems. The optimum RISE controller parameters are obtained via genetic algorithm optimization techniques. The RISE control methodology is implemented on two chaotic systems, namely, the Duffing-Holmes and Van der Pol systems. Numerical simulations showed the good performance of the optimized RISE controller in the tracking task and its ability to ensure robustness with respect to bounded external disturbances.

Ethiopian Banknote Recognition Using Convolutional Neural Network and Its Prototype Development Using Embedded Platform

Journal of Sensors ◽

10.1155/2022/4505089 ◽

2022 ◽

Vol 2022 ◽

pp. 1-18

Author(s):

Dereje Tekilu Aseffa ◽

Harish Kalla ◽

Satyasis Mishra

Keyword(s):

Optimization Technique ◽

Research Work ◽

Optimization Techniques ◽

Magnetic Sensors ◽

Batch Size ◽

Raspberry Pi ◽

Prototype Development ◽

Embedded Platform ◽

Artificial Intelligence Technology ◽

Banknote Recognition

Money transactions can be performed by automated self-service machines such as ATMs for money deposits and withdrawals, banknote counters and coin counters, automatic vending machines, and automatic smart card charging machines. These devices provide four important functions: banknote recognition, counterfeit banknote detection, serial number recognition, and fitness classification. Therefore, we need a robust system that can recognize banknotes and classify them into denominations for use in these automated machines. However, the most widely available banknote detectors are hardware systems that use optical and magnetic sensors to detect and validate banknotes. These banknote detectors are usually designed for a specific country's banknotes. Reprogramming such a system to detect other banknotes is very difficult. In addition, researchers have developed banknote recognition systems using deep learning artificial intelligence technology such as CNN and R-CNN. However, in these systems, the dataset used for training is relatively small, and the accuracy of banknote recognition is lower. The existing systems also do not include implementation and development on embedded systems. In this research work, we collected various Ethiopian currencies with different ages and conditions and applied various optimization techniques to CNN architectures to identify fake notes. Experimental analysis has been carried out with different CNN models such as InceptionV3, MobileNetV2, XceptionNet, and ResNet50. MobileNetV2 with the RMSProp optimization technique and batch size 32 is found to be a robust and reliable Ethiopian banknote detector and achieved a superior accuracy of 96.4% in comparison to the other CNN models. The selected model, MobileNetV2 with RMSProp optimization, has been implemented on an embedded platform using a Raspberry Pi 3 B+ and other peripherals. Further, real-time identification of fake notes in a Web-based user interface (UI) has also been proposed in the research.

Adaptive Day-Ahead Prediction of Resilient Power Distribution Network Partitions

10.36227/techrxiv.13518611 ◽

2021 ◽

Author(s):

Chinmay Shah ◽

Richard Wies

Keyword(s):

Power Distribution ◽

Renewable Energy Sources ◽

Optimization Technique ◽

Distribution Network ◽

Distribution Networks ◽

Optimization Techniques ◽

Power Distribution Networks ◽

Power Distribution Network ◽

High Penetration ◽

Islanded Mode

The conventional power distribution network is being transformed drastically due to the high penetration of renewable energy sources (RES) and energy storage. Optimal scheduling and dispatch are important to better harness the energy from intermittent RES. Traditional centralized optimization techniques limit the size of the problem, and hence distributed techniques are adopted. The distributed optimization technique partitions the power distribution network into sub-networks, each of which solves its local subproblem and exchanges information with the neighboring sub-networks for the global update. This paper presents an adaptive spectral graph partitioning algorithm based on vertex migration that keeps the computational load balanced for synchronization while maintaining active power balance and sub-network resiliency. The parameters that define the resiliency metrics of power distribution networks are discussed and leveraged for better operation of sub-networks in grid-connected mode as well as islanded mode. The adaptive partition of the IEEE 123-bus network into resilient sub-networks is demonstrated in this paper.

Instruction-set architecture exploration strategies for deeply clustered VLIW ASIPs

2013 2nd Mediterranean Conference on Embedded Computing (MECO) ◽

10.1109/meco.2013.6601361 ◽

2013 ◽

Cited By ~ 7

Author(s):

Roel Jordans ◽

Rosilde Corvino ◽

Lech Jozwiak ◽

Henk Corporaal

Keyword(s):

Instruction Set ◽

Instruction Set Architecture ◽

Architecture Exploration
