approximation techniques

2022 ◽  
Vol 27 (2) ◽  
pp. 1-33
Author(s):  
Zahra Ebrahimi ◽  
Dennis Klar ◽  
Mohammad Aasim Ekhtiyar ◽  
Akash Kumar

The rapid evolution of error-resilient programs, intertwined with their quest for high throughput, has motivated the use of Single Instruction, Multiple Data (SIMD) components in Field-Programmable Gate Arrays (FPGAs). In particular, to exploit the error resiliency of such applications, the cross-layer approximation paradigm has recently gained traction; its ultimate goal is to efficiently exploit approximation potential across layers of abstraction. From circuit to application level, valuable studies have proposed various approximation techniques, albeit with four drawbacks. First, most approximate multipliers and dividers operate only in SISD mode. Second, imprecise units are often substituted in only a single kernel of a multi-kernel application, with an end-to-end analysis of Quality of Results (QoR) but not of the gained performance. Third, state-of-the-art (SoA) strategies neglect the fact that each kernel contributes differently to the end-to-end QoR and performance metrics. Therefore, they lack a generic methodology for adjusting the approximation knobs to maximize performance gains under a user-defined quality constraint. Finally, multi-level techniques are not efficiently supported, from application to architecture to circuit level, in a cohesive cross-layer hierarchy. In this article, we propose Plasticine, a cross-layer methodology for multi-kernel applications, which addresses the aforementioned challenges by efficiently utilizing the synergistic effects of a chain of techniques across layers of abstraction. To this end, we propose an application sensitivity analysis and a heuristic that tailor the precision of the application's constituent kernels by finding the most tolerable degree of approximation for each consecutive kernel, while also satisfying the ultimate user-defined QoR. The chain of approximations is also effectively enabled in a cross-layer hierarchy, from application to architecture to circuit level, through the plasticity of SIMD multiplier-dividers, each supporting dynamic precision variability along with hybrid functionality. The end-to-end evaluations of Plasticine on three multi-kernel applications employed in bio-signal processing, image processing, and moving object tracking for Unmanned Air Vehicles (UAV) demonstrate 41%–64%, 39%–62%, and 70%–86% improvements in area, latency, and Area-Delay-Product (ADP), respectively, over 32-bit fixed precision, with negligible loss in QoR. To springboard future research in the reconfigurable and approximate computing communities, our implementations will be available and open-sourced at https://cfaed.tu-dresden.de/pd-downloads.


2022 ◽  
Vol 27 (2) ◽  
pp. 1-19
Author(s):  
Tiancong Bu ◽  
Kaige Yan ◽  
Jingweijia Tan

Dense SLAM is an important application in embedded environments. However, embedded platforms usually fail to provide enough computational resources for high-accuracy real-time dense SLAM, even with highly parallel architectures such as GPUs. To tackle this problem, one solution is to design proper approximation techniques for dense SLAM on embedded GPUs. In this work, we propose two novel approximation techniques, critical data identification and redundant branch elimination. We also analyze the error characteristics of two other techniques: loop skipping and thread approximation. Then, we propose SLaPP, an online adaptive approximation controller, which aims to keep the error under an acceptable threshold. The evaluation shows that SLaPP can achieve a 2.0× performance speedup and 30% energy saving on average compared to the case without approximation.
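
As a rough illustration of the loop-skipping idea mentioned in the abstract above, the following Python sketch approximates a mean by visiting only every k-th element and then picks the most aggressive skipping factor whose error stays under a threshold. This is not SLaPP or the authors' implementation: the function names, the candidate strides, and the offline comparison against an exact result are all assumptions made for illustration.

```python
def perforated_mean(values, stride):
    """Approximate the mean by visiting only every `stride`-th element."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    sampled = values[::stride]
    return sum(sampled) / len(sampled)


def pick_stride(values, max_rel_error, strides=(8, 4, 2, 1)):
    """Return (stride, approximation) for the largest stride within the error bound.

    Hypothetical calibration step: it compares against the exact mean,
    which a real online controller would have to estimate instead.
    """
    exact = sum(values) / len(values)
    for stride in strides:  # try the most aggressive skipping first
        approx = perforated_mean(values, stride)
        rel_error = abs(approx - exact) / abs(exact) if exact else abs(approx)
        if rel_error <= max_rel_error:
            return stride, approx
    return 1, exact


if __name__ == "__main__":
    data = list(range(1, 101))
    print(pick_stride(data, max_rel_error=0.02))
```

A larger stride does less work but samples the data more coarsely, so the threshold directly trades accuracy for speed.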


2022 ◽  
Vol 18 (1) ◽  
pp. 1-27
Author(s):  
Ran Xu ◽  
Rakesh Kumar ◽  
Pengcheng Wang ◽  
Peter Bai ◽  
Ganga Meghanath ◽  
...  

Videos take a long time to transport over the network; hence, running analytics on live video on embedded or mobile devices has become an important system driver. Such devices, e.g., surveillance cameras or AR/VR gadgets, are resource constrained, and although there has been significant work on creating lightweight deep neural networks (DNNs) for such clients, none of these can adapt to changing runtime conditions, e.g., changes in resource availability on the device, the content characteristics, or requirements from the user. In this article, we introduce ApproxNet, a video object classification system for embedded or mobile clients. It enables novel dynamic approximation techniques to achieve the desired inference latency and accuracy trade-off under changing runtime conditions. It achieves this by enabling two approximation knobs within a single DNN model rather than creating and maintaining an ensemble of models, e.g., MCDNN [MobiSys-16]. We show that ApproxNet can adapt seamlessly at runtime to these changes, provides low and stable latency for the image and video frame classification problems, and improves accuracy and latency over ResNet [CVPR-16], MCDNN [MobiSys-16], MobileNets [Google-17], NestDNN [MobiCom-18], and MSDNet [ICLR-18].


2022 ◽  
Author(s):  
Jiling Ding ◽  
Weihai Zhang

This paper considers prescribed performance tracking control for high-order uncertain nonlinear systems. For any initial system condition, a state feedback control is designed that guarantees the prescribed tracking performance and the boundedness of closed-loop signals. The proposed controller can be implemented without using any approximation techniques for estimating unknown nonlinearities. In this respect, a significant advantage of this article is that the explosion of complexity, which arises in the backstepping-like approaches typically employed for the control of uncertain nonlinear systems, is avoided, and a low-complexity controller is achieved. Moreover, in contrast to existing results in the literature, the restrictions on the powers of high-order nonlinear systems are relaxed, which gives the considered problem greater theoretical and practical value. The effectiveness of the proposed scheme is verified by simulation results.


Author(s):  
Sunghan Kim ◽  
Ki-Ahm Lee

This article is concerned with uniform $$C^{1,\alpha}$$ and $$C^{1,1}$$ estimates in periodic homogenization of fully nonlinear elliptic equations. The analysis is based on the compactness method, which involves linearization of the operator at each approximation step. Due to the nonlinearity of the equations, the linearized operators involve the Hessian of correctors, which appear in the previous step. The involvement of the Hessian of the correctors deteriorates the regularity of the linearized operator, and sometimes even changes its oscillating pattern. These issues are resolved with new approximation techniques, which yield a precise decomposition of the regular part and the irregular part of the homogenization process, along with a uniform control of the Hessian of the correctors at an intermediate level. The approximation techniques are new even in the context of linear equations. Our argument can be applied not only to concave operators, but also to a certain class of non-concave operators.


Author(s):  
Ying Hu ◽  
Xiaomin Shi ◽  
Zuo Quan Xu

This paper is concerned with a stochastic linear-quadratic (LQ) optimal control problem on infinite time horizon, with regime switching, random coefficients, and cone control constraint. To tackle the problem, two new extended stochastic Riccati equations (ESREs) on infinite time horizon are introduced. The existence of the nonnegative solutions, in both standard and singular cases, is proved through a sequence of ESREs on finite time horizon. Based on this result and some approximation techniques, we obtain the optimal state feedback control and optimal value for the stochastic LQ problem explicitly. Finally, we apply these results to solve a lifetime portfolio selection problem of tracking a given wealth level with regime switching and portfolio constraint.


Electronics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 39
Author(s):  
Ioannis Stratakos ◽  
Vasileios Leon ◽  
Giorgos Armeniakos ◽  
George Lentaris ◽  
Dimitrios Soudris

Every new generation of wireless communication standard aims to improve the overall performance and quality of service (QoS), compared to the previous generations. Increased data rates, numbers and capabilities of connected devices, new applications, and higher data volume transfers are some of the key parameters that are of interest. To satisfy these increased requirements, the synergy between wireless technologies and optical transport will dominate the 5G network topologies. This work focuses on a fundamental digital function in an orthogonal frequency-division multiplexing (OFDM) baseband transceiver architecture and aims at improving the throughput and circuit complexity of this function. Specifically, we consider high-order QAM demodulation and apply approximation techniques to achieve our goals. We adopt approximate computing as a design strategy to exploit the error resiliency of the QAM function and deliver significant gains in terms of critical performance metrics. In particular, we explore four demodulation algorithms and develop accurate floating- and fixed-point circuits in VHDL. In addition, we further explore the effects of introducing approximate arithmetic components. For our test case, we consider 64-QAM demodulators, and the results suggest that, in terms of accuracy, the most promising design provides bit error rates (BER) ranging from $$10^{-1}$$ to $$10^{-4}$$ for SNRs of 0–14 dB. Targeting a Xilinx Zynq Ultrascale+ ZCU106 (XCZU7EV) FPGA device, the approximate circuits achieve up to a 98% reduction in LUT utilization, compared to the accurate floating-point model of the same algorithm, and up to a 122% increase in operating frequency. In terms of power consumption, our most efficient circuit configurations consume 0.6–1.1 W when operating at their maximum clock frequency. Our results show that if the objective is to achieve high accuracy in terms of BER, the prevailing solution is the approximate LLR algorithm configured with fixed-point arithmetic and 8-bit truncation, which provides an 81% decrease in LUTs and a 13% increase in frequency and sustains a throughput of 323 Msamples/s.
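
To make the notion of fixed-point arithmetic with bit truncation more concrete, here is a minimal Python sketch of one possible reading: operands are quantized to fixed point, their least significant bits are zeroed, and the product is computed on the truncated values. The abstract does not say where the 8-bit truncation is applied in the circuits, so the word format (12 fractional bits), the truncation point, and all function names below are assumptions rather than the authors' design.

```python
def to_fixed(x, frac_bits):
    """Quantize a real value to a signed fixed-point integer with `frac_bits` fractional bits."""
    return int(round(x * (1 << frac_bits)))


def truncate_lsbs(value, bits):
    """Zero the `bits` least significant bits (arithmetic shift, rounds toward minus infinity)."""
    return (value >> bits) << bits


def approx_multiply(a, b, frac_bits=12, trunc_bits=8):
    """Multiply two reals in fixed point after truncating the operands' LSBs.

    Assumed format: `frac_bits` fractional bits per operand, so the raw
    product carries 2 * frac_bits fractional bits.
    """
    fa = truncate_lsbs(to_fixed(a, frac_bits), trunc_bits)
    fb = truncate_lsbs(to_fixed(b, frac_bits), trunc_bits)
    product = fa * fb
    return product / float(1 << (2 * frac_bits))


if __name__ == "__main__":
    print(approx_multiply(0.5, 0.25))  # exactly representable operands
    print(approx_multiply(0.3, 0.7))   # operands that lose precision
```

Truncating operand bits shrinks the multiplier that hardware would need, at the cost of an error that grows with the number of discarded bits.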


2021 ◽  
Vol 11 (23) ◽  
pp. 11294
Author(s):  
Zuo-Cheng Wen ◽  
Zhi-Heng Zhang ◽  
Xiang-Bing Zhou ◽  
Jian-Gang Gu ◽  
Shao-Peng Shen ◽  
...  

Recently, multivariate time-series (MTS) prediction has attracted much attention as a way to obtain richer semantics with similar or better performance. In this paper, we propose a tri-partition alphabet-based state (tri-state) prediction method for symbolic MTSs. First, for each variable, the set of all symbols, i.e., the alphabet, is divided into strong, medium, and weak using two user-specified thresholds. With the tri-partitioned alphabet, the tri-state takes the form of a matrix. One order contains all the variables. The other is a feature vector that includes the most likely occurring strong, medium, and weak symbols. Second, a tri-partition strategy based on the deviation degree is proposed. We introduce the piecewise and symbolic aggregate approximation techniques to aggregate and discretize the original MTS. This way, the symbol is stronger and has a bigger deviation. Moreover, most popular numerical or symbolic similarity or distance metrics can be combined. Third, we propose an along–across similarity model to obtain the k-nearest matrix neighbors. This model considers the associations among the time stamps and variables simultaneously. Fourth, we design two post-filling strategies to obtain a completed tri-state. The experimental results from the four-domain datasets show that (1) the tri-state has greater recall but lower precision; (2) the two post-filling strategies can slightly improve the recall; and (3) the along–across similarity model composed of the Triangle and Jaccard metrics is recommended first for new datasets.
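
The piecewise and symbolic aggregate approximation techniques named in the abstract above are standard time-series tools, and a minimal single-variable Python sketch may help show what they do. It z-normalizes a series, averages it over equal-length segments (PAA), and maps each segment mean to a letter using breakpoints that split a standard normal distribution into four equiprobable regions (SAX). The four-letter alphabet, the function names, and the requirement that the series length be divisible by the segment count are choices made here for illustration, not details taken from the paper.

```python
import bisect
import statistics

# Quartile breakpoints of the standard normal distribution (4 equiprobable regions).
BREAKPOINTS_4 = [-0.6745, 0.0, 0.6745]


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    if segments < 1 or len(series) % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    width = len(series) // segments
    return [sum(series[i * width:(i + 1) * width]) / width for i in range(segments)]


def sax(series, segments, breakpoints=BREAKPOINTS_4, alphabet="abcd"):
    """Symbolic aggregate approximation of one series as a string of letters."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    # A constant series has zero deviation; map it to all zeros.
    normalized = [(x - mean) / std for x in series] if std > 0 else [0.0] * len(series)
    return "".join(
        alphabet[bisect.bisect_left(breakpoints, value)]
        for value in paa(normalized, segments)
    )


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], segments=4))
```

Symbols at the ends of the alphabet correspond to segment means that deviate most from the series mean.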


2021 ◽  
Author(s):  
Petarpa Boonserm

This thesis describes the development of some basic mathematical tools of wide relevance to mathematical physics. Transmission and reflection coefficients are associated with quantum tunneling phenomena, while Bogoliubov coefficients are associated with the mathematically related problem of excitations of a parametric oscillator. While many approximation techniques for these quantities are known, very little is known about rigorous upper and lower bounds. In this thesis four separate problems relating to rigorous bounds on transmission, reflection and Bogoliubov coefficients are considered, divided into four separate themes: Bounding the Bogoliubov coefficients; Bounding the greybody factors for Schwarzschild black holes; Transformation probabilities and the Miller-Good transformation; Analytic bounds on transmission probabilities.

