Implementation of a DPU-Based Intelligent Thermal Imaging Hardware Accelerator on FPGA

Electronics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 105
Author(s):  
Abdelrahman S. Hussein ◽  
Ahmed Anwar ◽  
Yasmine Fahmy ◽  
Hassan Mostafa ◽  
Khaled Nabil Salama ◽  
...  

Thermal imaging has many applications, all of which leverage the heat map that this type of imaging constructs. It can be used in Internet of Things (IoT) applications to detect features of the surroundings. In such a case, Deep Neural Networks (DNNs) can carry out many visual analysis tasks, giving the system the capacity to make decisions. However, due to their huge computational cost, such networks should exploit custom hardware platforms to accelerate their inference and to reduce the overall energy consumption of the system. In this work, an energy-adaptive system is proposed that can intelligently configure itself based on the battery energy level. Besides achieving a maximum speedup of 6.38X, the proposed system reduces energy consumption by 97.81% compared to a conventional general-purpose CPU.
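In its simplest form, the battery-driven reconfiguration described above amounts to a threshold policy mapping the energy level to an operating mode. The sketch below is purely illustrative; the thresholds and mode names are hypothetical and not taken from the paper:

```python
def select_config(battery_pct):
    """Hypothetical policy for an energy-adaptive accelerator:
    pick an operating mode from the battery charge percentage.
    Thresholds and mode names are illustrative only."""
    if battery_pct > 60:
        return "full-precision, max clock"   # plenty of energy left
    if battery_pct > 25:
        return "reduced clock"               # trade speed for energy
    return "low-power pruned model"          # preserve remaining charge

# e.g. select_config(80) -> "full-precision, max clock"
```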

2012 ◽  
Vol 2 (1) ◽  
pp. 7-9 ◽  
Author(s):  
Satinderjit Singh

Median filtering is a commonly used technique in image processing. The main problem of the median filter is its high computational cost: sorting N pixels takes O(N·log N) time even with the most efficient sorting algorithms. When the median filter must be carried out in real time, a software implementation on general-purpose processors does not usually give good results. This paper presents an efficient algorithm for median filtering with a 3x3 filter kernel that uses only about 9 comparisons per pixel by exploiting spatial coherence between neighboring filter computations. The basic algorithm calculates two medians in one step and reuses sorted slices of three vertically neighboring pixels. An extension of this algorithm to 2D spatial coherence is also examined, which calculates four medians per step.
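The slice-reuse idea can be sketched as follows: each vertical slice of three pixels is sorted once and shared by the three horizontally adjacent windows that contain it. This is a minimal illustration of that spatial coherence, not the paper's exact comparison network (the final merge here uses a full sort for clarity rather than the ~9-comparison scheme):

```python
def sort3(a, b, c):
    # Sort three values with at most three comparisons.
    if a > b: a, b = b, a
    if b > c: b, c = c, b
    if a > b: a, b = b, a
    return (a, b, c)

def median3x3(img):
    """3x3 median filter that presorts each vertical slice of three
    pixels once per row band and reuses it across the horizontally
    adjacent windows sharing it (spatial coherence). Borders are
    left unfiltered for simplicity."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        # Presort every three-pixel vertical slice of this row band once.
        cols = [sort3(img[y - 1][x], img[y][x], img[y + 1][x])
                for x in range(w)]
        for x in range(1, w - 1):
            # Median of the 3x3 window = median of the 9 values
            # gathered from the three presorted slices.
            window = cols[x - 1] + cols[x] + cols[x + 1]
            out[y][x] = sorted(window)[4]
    return out
```

For the impulse-noise case the filter is designed for, a single outlier in the window is discarded: a 100 surrounded by the values 1..9 is replaced by the window median 6.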


2021 ◽  
Vol 11 (2) ◽  
pp. 23
Author(s):  
Duy-Anh Nguyen ◽  
Xuan-Tu Tran ◽  
Francesca Iacopi

Deep Learning (DL) has contributed to the success of many applications in recent years. The applications range from simple ones such as recognizing tiny images or simple speech patterns to ones with a high level of complexity such as playing the game of Go. However, this superior performance comes at a high computational cost, which makes porting DL applications to conventional hardware platforms a challenging task. Many approaches have been investigated, and the Spiking Neural Network (SNN) is one of the promising candidates. SNNs are the third generation of Artificial Neural Networks (ANNs), in which each neuron uses discrete spikes to communicate in an event-based manner. SNNs have the potential advantage of achieving better energy efficiency than their ANN counterparts. While SNN models generally suffer some loss of accuracy, new algorithms have helped to close the accuracy gap. For hardware implementations, SNNs have attracted much attention in the neuromorphic hardware research community. In this work, we review the basic background of SNNs, the current state and challenges of SNN training algorithms, and the current implementations of SNNs on various hardware platforms.
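The event-based communication described above is commonly modeled with a leaky integrate-and-fire (LIF) neuron: the membrane potential leaks over time, accumulates input current, and emits a discrete spike on crossing a threshold. A minimal sketch (leak factor, threshold, and reset rule are illustrative choices, not taken from the review):

```python
def lif_spikes(inputs, tau=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron over a sequence
    of input currents. Each step the potential decays by `tau`,
    accumulates the input, and emits a spike (1) with a reset to
    zero when it reaches `threshold`; otherwise it emits 0."""
    v, spikes = 0.0, []
    for i in inputs:
        v = tau * v + i          # leak, then integrate the input
        if v >= threshold:
            spikes.append(1)     # discrete event: the spike
            v = 0.0              # reset after firing
        else:
            spikes.append(0)
    return spikes

# A constant sub-threshold input fires only once enough charge
# has accumulated: lif_spikes([0.5, 0.5, 0.5, 0.5]) -> [0, 0, 1, 0]
```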


Author(s):  
Álinson S. Xavier ◽  
Ricardo Fukasawa ◽  
Laurent Poirrier

When generating multirow intersection cuts for mixed-integer linear optimization problems, an important practical question is deciding which intersection cuts to use. Even when restricted to cuts that are facet-defining for the corner relaxation, the number of potential candidates is still very large, especially for instances of large size. In this paper, we introduce a small subset of intersection cuts based on the infinity norm that works for relaxations having an arbitrary number of rows and, unlike many subclasses studied in the literature, takes into account the entire data from the simplex tableau. We describe an algorithm for generating these inequalities and run extensive computational experiments to evaluate their practical effectiveness on real-world instances. We conclude that this subset of inequalities yields, in terms of gap closure, around 50% of the benefit of using all valid inequalities for the corner relaxation simultaneously, but at a small fraction of the computational cost and with a very small number of cuts. Summary of Contribution: Cutting planes are one of the most important techniques used by modern mixed-integer linear programming solvers when solving a variety of challenging operations research problems. The paper advances the state of the art on general-purpose multirow intersection cuts by proposing a practical and computationally friendly method to generate them.


2020 ◽  
Vol 9 (1) ◽  
pp. 121-128
Author(s):  
Nur Dalila Abdullah ◽  
Ummi Raba'ah Hashim ◽  
Sabrina Ahmad ◽  
Lizawati Salahuddin

Selecting important features for classifying wood defects remains a challenging issue in the automated visual inspection domain. This study addresses the extraction and analysis of statistical texture features from images of wood defects. A series of procedures, including feature extraction using the Grey Level Dependence Matrix (GLDM) and feature analysis, was executed to investigate the displacement and quantisation parameters that could best classify wood defects. Samples were taken from the KembangSemangkuk (KSK), Meranti, and Merbau wood species. Findings from visual analysis and classification accuracy measures suggest that the feature set with displacement parameter d=2 and quantisation level q=128 gives the highest classification accuracy. However, at a lower computational cost, the feature set with quantisation level q=32 still shows acceptable classification accuracy.
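To make the roles of the displacement d and quantisation level q concrete, here is a sketch of accumulating a co-occurrence-style dependence matrix over one direction. It illustrates only how the two parameters enter the computation, not the paper's exact GLDM definition or feature set:

```python
def dependence_matrix(img, d=2, q=32, levels=256):
    """Quantise a greyscale image (pixel values in [0, levels)) down
    to `q` grey levels, then count pixel pairs separated by a
    horizontal displacement of `d` pixels into a q-by-q matrix.
    Texture features (contrast, energy, etc.) would then be
    computed from this matrix. Illustrative sketch only."""
    m = [[0] * q for _ in range(q)]
    scale = levels // q                 # width of each quantisation bin
    for row in img:
        for x in range(len(row) - d):
            i = row[x] // scale         # quantised level of the pixel
            j = row[x + d] // scale     # quantised level of its d-neighbour
            m[i][j] += 1
    return m
```

A coarser q shrinks the matrix (and hence the cost of computing features from it) quadratically, which is why the abstract's q=32 setting is so much cheaper than q=128.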


Author(s):  
Jason Agron ◽  
David Andrews ◽  
Markus Happe ◽  
Enno Lübbers ◽  
Marco Platzner

Embedded and Real-Time Systems (ERTS) have continued to expand at a vigorous rate. Designers of ERTS are continually challenged to provide new capabilities that can meet the expanding requirements and increased computational needs of each newly proposed application, but at a decreasing price/performance ratio. Conventional solutions using general-purpose processors or custom ASICs are less and less able to satisfy the contradictory requirements of performance, flexibility, power, development time, and cost. This chapter introduces the concept of generating semi-custom platforms driven from a traditional multithreaded programming model. This approach offers the advantage of achieving productivity levels close to those associated with software, by using an established programming model, but with a performance level close to custom hardware, through the use of a flexible hardware platform capable of adapting to specialized application requirements. We discuss the underlying concepts, requirements, and advantages of multithreading in the context of reconfigurable hardware, and present two approaches that provide multithreading support to hardware and software components at the operating-system level.


2012 ◽  
Vol 476-478 ◽  
pp. 392-396
Author(s):  
M. Azuddin

The temperature generated at the tool-workpiece interface has a significant effect on cutting performance. This paper presents tool-workpiece temperatures recorded by a thermal imaging camera under various cutting parameters. The machining was done on an ASSAB 720 steel workpiece for both continuous and interrupted cutting. Generally, as cutting speed, feed rate, and depth of cut increase, the tool-workpiece temperature increases for both continuous and interrupted cutting. Specifically, as the cutting speed increases from 250 m/min to 350 m/min, the tool-workpiece temperature increases by about 25% at each increment level. The tool-workpiece temperature increases by about 16% when the feed rate increases from 0.1 mm/rev to 0.2 mm/rev, while a 27% increase was recorded when the feed rate increases to 0.4 mm/rev. With increasing depth of cut, the tool-workpiece temperature shows an increase of between 50 °C and 65 °C for both continuous and interrupted cutting.


Author(s):  
Anjaneyulu Lankadasu ◽  
Laurent Krumenacker ◽  
Anil Kumar ◽  
Amita Tripathi

Accurate prediction of condensation plays an important role in the development of high-efficiency turbomachines working on condensable fluids, which demands modeling the poly-disperse character of the droplet number distribution function when modeling condensation. Two such models are considered in this work: the quadrature method of moments (QMOM) and the multi-fluid method (MFM). The vital difference between these two models lies in how they discretise the droplet size distribution. Further, numerical aspects such as ease of implementation in general-purpose computational fluid dynamics solvers, accuracy, and associated computational cost are discussed. To obtain accurate thermodynamic properties, the real-gas formulations defined in IAPWS-IF97 are used. These algorithms are implemented in the compressible Navier-Stokes solver of Fluidyn MP, and tests are carried out on a Laval nozzle and compared with experimental measurements.
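The discretisation difference between the two models can be illustrated through the moment formulation: QMOM transports a small set of raw moments of the droplet size distribution (represented by quadrature abscissas and weights), whereas an MFM bins the size axis directly. A minimal sketch of the moment computation, with purely illustrative numbers:

```python
def raw_moments(radii, weights, k_max=3):
    """Raw moments m_k = sum_i w_i * r_i**k of a droplet size
    distribution represented by quadrature abscissas `radii` and
    weights `weights`, as transported by QMOM. In a multi-fluid
    method, `radii` would instead be fixed bin centres and
    `weights` the per-bin number densities. Illustrative only."""
    return [sum(w * r ** k for r, w in zip(radii, weights))
            for k in range(k_max + 1)]

# Two equal-weight droplet classes of radius 1 and 2:
# m_0 (number), m_1 (mean size * number), m_2, m_3 (~ volume).
```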


Author(s):  
Tommaso Solcia ◽  
Pierangelo Masarati

Complex aeroservoelastic and mechatronic systems imply interaction between multidisciplinary or multifield subcomponents, whose dynamics are characterized by problem- and field-specific time scales and frequency ranges. As opposed to the so-called monolithic approach to the simulation of coupled problems, where a single formulation (and software solver) directly models the entire problem, the co-simulation approach makes it possible to exploit state-of-the-art formulations for specific fields by coupling them as appropriate to establish the required interaction between the subcomponents. The interaction problem between the different, and even incompatible, interfaces of subcomponent domains can be split into spatial and temporal aspects. This work focuses on the latter. When subdomains require different time scales to achieve the desired trade-off between accuracy and computational cost, multirate methods can be used so that one subdomain solver need not comply with excessively stringent requirements imposed by another. Many multirate methods are designed for monoblock systems and used in single-disciplinary simulations (e.g., electric networks), and their application to co-simulation setups may not be straightforward. A key problem in co-simulation, especially when the stability and free response of a system are addressed, as in aeroservoelasticity, is the numerical stability of the coupled solution process. This work investigates the linear stability properties of a multirate formulation called 'Double Extrapolation' (DE), which integrates each subproblem using second-order accurate, L-stable Backward Difference Formulas (BDF) while each subdomain extrapolates the behavior of the other. It was chosen because it eliminates most of the idle time of each subdomain solver.
The resulting performance gains are illustrated by applying the proposed method to the simulation of a complex aeroservoelastic system consisting of the aeroelastic model of a Horizontal Axis Wind Turbine (HAWT), developed using the general-purpose multibody formulation implemented in the free solver MBDyn, and a dynamic model of the electric generator, modeled in the free general-purpose block-diagram simulation environment ScicosLab. Both modeling environments are real-time capable; thus the proposed system represents an affordable and versatile solution for hardware-in-the-loop analysis and design of complex multidisciplinary systems.
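The multirate structure can be sketched on a toy problem: two coupled scalar subsystems, a fast one advanced with r micro steps per macro step and a slow one advanced once per macro step, each seeing the other's last exchanged value held constant. This is a zero-order-extrapolation sketch with implicit Euler standing in for the paper's BDF integrators; all coefficients are illustrative:

```python
def cosim_multirate(T=1.0, H=0.1, r=4, a=-1.0, c=0.5):
    """Multirate co-simulation of x' = a*x + c*y (fast subsystem,
    step H/r) and y' = a*y + c*x (slow subsystem, step H). At each
    macro step the solvers exchange states once, then advance
    independently using the other's frozen value (zero-order
    extrapolation). Implicit Euler replaces BDF2 for brevity."""
    h = H / r
    x = y = 1.0                    # initial conditions (illustrative)
    t = 0.0
    while t < T - 1e-12:
        y_ext, x_ext = y, x        # values exchanged at the macro step
        for _ in range(r):         # fast subsystem: r micro steps
            # implicit Euler: x_new = x + h*(a*x_new + c*y_ext)
            x = (x + h * c * y_ext) / (1.0 - h * a)
        # slow subsystem: one macro step against the frozen x
        y = (y + H * c * x_ext) / (1.0 - H * a)
        t += H
    return x, y
```

Because the fast solver never waits on intermediate slow-side values within a macro step, its idle time is removed, which is the motivation the abstract gives for the DE scheme.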


SPE Journal ◽  
2021 ◽  
pp. 1-18
Author(s):  
Xiaocong Lyu ◽  
Mark Khait ◽  
Denis Voskov

Summary Numerical simulation of coupled multiphase multicomponent flow and transport in porous media is a crucial tool for understanding and forecasting complex industrial applications related to the subsurface. The discretized governing equations are highly nonlinear and usually need to be solved with Newton's method, which entails high computational cost and complexity. In the presence of capillary and gravity forces, the nonlinearity of the problem is amplified even further, which usually leads to higher numerical cost. A recently proposed operator-based linearization (OBL) approach effectively improves the performance of complex physical modeling by transforming the discretized nonlinear conservation equations into a quasilinear form based on state-dependent operators. These operators are approximated by means of a discrete representation on a uniform mesh in physical parameter space; continuous representation is achieved through multilinear interpolation. This approach provides a unique framework for the multifidelity representation of physics in general-purpose simulation. The applicability of the OBL approach has been demonstrated for various subsurface energy applications with multiphase flow of mass and heat in the presence of buoyancy and diffusive forces. In this work, the OBL approach is extended to multiphase multicomponent systems with capillarity. Through comparisons with a legacy commercial simulator on a set of benchmark tests, we demonstrate that the extended OBL scheme significantly improves computational efficiency with controlled approximation accuracy and converges to the results of the conventional continuous approach as the resolution of the parametrization increases.
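The tabulate-then-interpolate idea behind OBL can be sketched in one parameter dimension: a nonlinear state-dependent operator is pre-evaluated on a uniform mesh, and at simulation time only a cheap piecewise-linear lookup is performed, with the mesh resolution controlling the approximation accuracy. Names and the 1D setting are illustrative simplifications:

```python
def tabulate(f, lo, hi, n):
    """Offline step: evaluate the state-dependent operator `f` on a
    uniform mesh of n+1 nodes over [lo, hi]."""
    step = (hi - lo) / n
    return ([f(lo + i * step) for i in range(n + 1)], lo, step)

def interp(table, p):
    """Online step: piecewise-linear (multilinear in 1D) lookup of
    the operator value at state p. Finer meshes converge to the
    continuous operator."""
    vals, lo, step = table
    i = min(int((p - lo) / step), len(vals) - 2)  # clamp to last cell
    w = (p - lo) / step - i                       # weight within the cell
    return (1.0 - w) * vals[i] + w * vals[i + 1]
```

In the full method the mesh lives in a multidimensional physical parameter space (pressure, compositions, ...), and the same interpolated values also supply the Jacobian entries for Newton's method, which is where the speedup comes from.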


2019 ◽  
Author(s):  
José Castillo-Lema ◽  
Augusto José Venâncio Neto ◽  
Flavio de Oliveira Silva ◽  
Sergio Takeo Kofuji

Network Functions Virtualization (NFV) offers an alternative way to design, deploy, and manage networking functions and services by leveraging virtualization technologies to consolidate network functions onto general-purpose hardware platforms. Over the past years, extensive effort has been made to evolve and mature NFV technologies over IP networks. However, few or no attempts have been made to incorporate NFV into Information-Centric Networks (ICN). This work explores the use and implementation of Virtual Network Functions (VNFs) in Content-Centric Networks (CCN) and proposes the use of the Named Function Networking (NFN) paradigm as a means to implement network functions and services in this kind of network, distributing the functions and services across the network's nodes and providing the flexibility to dynamically place functions in the network as required, without the need for a central controller.

