Uniform Selection of Vertices for Watermark Embedding in 3-D Polygon Mesh Using IEEE754 Floating Point Representation

Author(s):  
Hitendra Garg ◽  
Krishna Kr. Khandelwal ◽  
Manish Gupta ◽  
Suneeta Agrawal
Author(s):  
Nadia Nedjah ◽  
Rodrigo Martins da Silva ◽  
Luiza de Macedo Mourelle

Artificial neural networks (ANNs) are a well-known bio-inspired model that simulates human brain capabilities such as learning and generalization. ANNs consist of a number of interconnected processing units, wherein each unit performs a weighted sum followed by the evaluation of a given activation function. This computation has a tremendous impact on implementation efficiency. Existing hardware implementations of ANNs attempt to speed up the computational process. However, these implementations require a huge silicon area, which makes them almost impossible to fit within the resources available on state-of-the-art FPGAs. In this chapter, a hardware architecture for ANNs is devised that takes advantage of the dedicated adder blocks, commonly called MACs, to compute both the weighted sum and the activation function. The proposed architecture requires a reduced silicon area because the MACs come for free as FPGA built-in cores. Our system uses integer (fixed point) mathematics and operates with fractions to represent real numbers. Hence, floating point representation is not employed, and every mathematical computation of the ANN hardware is based on combinational circuitry (performing only sums and multiplications). The hardware is fast because it is massively parallel. Moreover, the proposed architecture can adjust itself on the fly to the user-defined configuration of the neural network, i.e., the number of layers and neurons per layer of the ANN can be set with no extra hardware changes. This is a valuable characteristic in robot-like systems, since the same hardware may be exploited in different tasks. The hardware also requires a separate software system that controls the sequence of the hardware computation and provides inputs, weights and biases for the ANN in hardware. Thus, a co-design environment is necessary.
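
The integer-only weighted sum described in this abstract can be illustrated with a minimal sketch. It assumes a scaled-integer (Q-format) encoding with 8 fractional bits; the abstract does not state the chapter's actual number format, so `FRAC_BITS`, `to_fixed`, `to_float` and `weighted_sum` are hypothetical names chosen only for illustration.

```python
FRAC_BITS = 8            # assumed number of fractional bits (not from the chapter)
SCALE = 1 << FRAC_BITS   # 1.0 is stored as the integer 256


def to_fixed(x: float) -> int:
    """Encode a real number as a scaled integer."""
    return int(round(x * SCALE))


def to_float(q: int) -> float:
    """Decode a scaled integer back to a real number."""
    return q / SCALE


def weighted_sum(inputs, weights, bias):
    """Multiply-accumulate over fixed-point operands using only integer ops."""
    # Each product carries 2 * FRAC_BITS fractional bits, so the bias is
    # shifted left to the same scale before accumulation.
    acc = bias << FRAC_BITS
    for x, w in zip(inputs, weights):
        acc += x * w                 # one multiply-accumulate step
    return acc >> FRAC_BITS          # rescale back to FRAC_BITS fractional bits


xs = [to_fixed(v) for v in (0.5, -0.25)]
ws = [to_fixed(v) for v in (1.0, 2.0)]
b = to_fixed(0.125)
result = to_float(weighted_sum(xs, ws, b))
print(result)
```

Only additions, multiplications and shifts appear in `weighted_sum`, which is the property the abstract relies on when it avoids floating point hardware.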


Author(s):  
Benjamin P Mastripolito ◽  
Nicholas A. Koskelo ◽  
Dylan A. Weatherred ◽  
David A Pimentel ◽  
Daniel G. Sheppard ◽  
...  

Abstract Applications often require a fast, single-threaded search algorithm over sorted data, as is typical in table-lookup operations. We explore various search algorithms for a large number of search candidates over a relatively small array of logarithmically distributed sorted data. These include an innovative hash-based search that takes advantage of floating point representation to bin data by the exponent. Algorithms that can be optimized to take advantage of SIMD vector instructions are of particular interest. We then conduct a case study applying our results and analyzing algorithmic performance with the EOSPAC package. EOSPAC is a table look-up library for manipulation and interpolation of SESAME equation-of-state data. Our investigation yields a couple of algorithms that outperform the original EOSPAC Hunt-and-Locate implementation, with a best-case speedup of eight times. Our techniques should generalize to other instances of search algorithms seeking a performance boost from vectorization.
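
The idea of binning sorted data by floating point exponent can be sketched as follows. This is not the paper's implementation: it assumes strictly positive, finite values, uses `math.frexp` rather than direct bit extraction, and omits SIMD entirely. The names `build_exponent_bins` and `locate` are hypothetical.

```python
import bisect
import math


def build_exponent_bins(data):
    """Precompute, for each binary exponent, where that exponent starts.

    `data` must be a sorted list of positive finite floats (an assumption
    of this sketch). Returns (e_min, starts), where starts[k] is the index
    of the first element whose frexp exponent is >= e_min + k.
    """
    e_min = math.frexp(data[0])[1]
    e_max = math.frexp(data[-1])[1]
    starts = []
    i = 0
    for e in range(e_min, e_max + 2):
        while i < len(data) and math.frexp(data[i])[1] < e:
            i += 1
        starts.append(i)
    return e_min, starts


def locate(data, bins, x):
    """Return the largest index i with data[i] <= x, or -1 if x < data[0]."""
    e_min, starts = bins
    k = math.frexp(x)[1] - e_min      # the exponent acts as the hash key
    if k < 0:
        return -1                     # x lies below every table entry
    if k >= len(starts) - 1:
        return len(data) - 1          # x lies above every table entry
    lo, hi = starts[k], starts[k + 1]
    # Only entries sharing x's exponent need to be examined.
    return bisect.bisect_right(data, x, lo, hi) - 1


table = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
bins = build_exponent_bins(table)
print(locate(table, bins, 5.0))
```

For logarithmically distributed data each exponent bin holds only a few entries, so the final search touches a short slice of the table instead of the whole array.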


Author(s):  
Julio Villalba ◽  
Javier Hormigo

Abstract This article proposes a family of high-radix floating-point representations to efficiently deal with floating-point addition in FPGA devices with no native floating-point support. Since implementing a variable shifter (required in any FP adder) is very costly in FPGAs, high-radix formats considerably reduce the number of possible shifts, greatly decreasing execution time and area. Although the high-radix format also imposes a significant penalty on the implementation of multipliers, the experimental results show that the adder improvement outweighs the multiplication penalty for most practical and common cases (digital filters, matrix multiplications, etc.). We also provide the designer with guidelines on selecting a suitable radix as a function of the ratio between the number of additions and multiplications of the targeted algorithm. For applications with similar numbers of additions and multiplications, the high-radix version may be up to 26% faster while also having a wider dynamic range and a higher number of significant bits. Furthermore, thanks to the proposed efficient converters between the standard IEEE-754 format and our internal high-radix format, the cost of the input/output conversions in FPGA accelerators is negligible.
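
Why a higher radix reduces the number of possible alignment shifts can be seen in a small sketch. It assumes radix 16 purely for illustration; the article's actual formats and converters are not described in this abstract, and `to_radix16` and `alignment_shift_bits` are hypothetical helpers.

```python
import math

RADIX_BITS = 4  # assumed radix 16 = 2**4 (illustrative choice only)


def to_radix16(x: float):
    """Split x into (m, e) with x == m * 16**e and 1/16 <= |m| < 1 (or m == 0)."""
    f, e2 = math.frexp(x)             # x == f * 2**e2 with 0.5 <= |f| < 1
    e16 = -(-e2 // RADIX_BITS)        # ceil(e2 / 4) using integer arithmetic
    m = math.ldexp(f, e2 - RADIX_BITS * e16)
    return m, e16


def alignment_shift_bits(a: float, b: float) -> int:
    """Bits the smaller operand's significand is shifted before adding."""
    _, ea = to_radix16(a)
    _, eb = to_radix16(b)
    return RADIX_BITS * abs(ea - eb)  # always a multiple of 4 bits


m, e = to_radix16(10.0)
print(m, e)
print(alignment_shift_bits(1.0, 1000.0))
```

Because the exponent counts groups of 4 bits, the shifter only needs to support shift amounts that are multiples of 4, roughly a quarter as many distinct shifts as a radix-2 format with the same significand width.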


1980 ◽  
Vol 7 (3) ◽  
pp. 149-155 ◽  
Author(s):  
J.D. Johannes ◽  
C.Dennis Pegden ◽  
F.E. Petry
