An Efficient Hardware Implementation of CNN-Based Object Trackers for Real-Time Applications

Author(s):  
Al-Hussein El-Shafie ◽  
Mohamed Zaki ◽  
Serag Habib

<div>Object tracking research has remained active for a long time because of the many real-world variations that affect the tracking process, such as occlusion, appearance changes, illumination changes, and cluttered backgrounds. Given their wide range of applications, embedded implementations are typically pursued for tracking systems. Although object trackers based on Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance, they challenge embedded implementations because of their slow speed and large memory requirements. In this paper, we address these limitations on both the algorithm side and the circuit side. On the algorithm side, we adopt interpolation schemes that can significantly reduce the processing time and the memory storage requirements. We also evaluate the approximation of hardware-expensive computations, aiming for an efficient hardware implementation. Moreover, we modify the online-training scheme to achieve a constant processing time across all video frames. On the circuit side, we develop a hardware accelerator for the online training stage. We avoid transposed reading from the external memory to speed up data movement with no performance degradation. Our proposed hardware accelerator achieves 45.9 frames per second in training the fully connected layers.</div>
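The abstract's interpolation idea (compute features coarsely, interpolate to full resolution to save time and memory) can be illustrated with a generic bilinear-interpolation sketch. This is a standard technique, not the authors' exact scheme; the function name and interface are hypothetical.

```python
import numpy as np

def bilinear_resize(feat, out_h, out_w):
    """Resize a 2-D feature map with bilinear interpolation.

    Generic sketch: computing features on a coarse grid and then
    interpolating to the target resolution replaces a dense forward
    pass with a few multiply-adds per output sample.
    """
    in_h, in_w = feat.shape
    ys = np.linspace(0, in_h - 1, out_h)   # fractional source rows
    xs = np.linspace(0, in_w - 1, out_w)   # fractional source cols
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)      # clamp at the border
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]                # row interpolation weights
    wx = (xs - x0)[None, :]                # column interpolation weights
    top = feat[np.ix_(y0, x0)] * (1 - wx) + feat[np.ix_(y0, x1)] * wx
    bot = feat[np.ix_(y1, x0)] * (1 - wx) + feat[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy
```

The memory saving comes from storing only the coarse map; the fine-grained values are reconstructed on the fly.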

2021 ◽  


Author(s):  
Daqi Lin ◽  
Elena Vasiou ◽  
Cem Yuksel ◽  
Daniel Kopta ◽  
Erik Brunvand

Bounding volume hierarchies (BVH) are the most widely used acceleration structures for ray tracing due to their high construction and traversal performance. However, the bounding planes shared between parent and child bounding boxes are an inherent storage redundancy that limits further improvement in performance due to the memory cost of reading these redundant planes. Dual-split trees can encode the same space partitioning as BVHs, but in a compact form using less memory by eliminating the redundancies of the BVH structure representation. This reduction in memory storage and data movement translates to faster ray traversal and better energy efficiency. Yet, the performance benefits of dual-split trees are undermined by the processing required to extract the necessary information from their compact representation. This involves bit manipulations and branching instructions that are inefficient in software. We introduce hardware acceleration for dual-split trees and show that the performance advantages over BVHs are emphasized in a hardware ray tracing context that can take advantage of such acceleration. We provide details on how the operations needed for decoding dual-split tree nodes can be implemented in hardware and present experiments in a number of scenes with different sizes using path tracing. In our experiments, we have observed up to 31% reduction in render time and 38% energy saving using dual-split trees as compared to binary BVHs representing identical space partitioning.
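The redundancy the abstract describes can be sketched in a few lines: a binary BVH stores six planes per child, but the children differ from the parent box only along the split axis. A minimal, simplified sketch (the actual dual-split node encoding, with its bit-packed node types, is more involved):

```python
from typing import List, Tuple

# A box is a (min_corner, max_corner) pair of 3-D points.
Box = Tuple[List[float], List[float]]

def derive_children(parent: Box, axis: int,
                    left_max: float, right_min: float) -> Tuple[Box, Box]:
    """Reconstruct two child boxes from a dual-split style node.

    The node stores only the two planes that differ from the parent:
    the left child's upper bound and the right child's lower bound
    along one axis. All other planes are inherited from the parent,
    which is exactly the redundancy a compact representation removes.
    """
    lo, hi = parent
    left = (lo[:], hi[:])        # copy, then overwrite one plane
    right = (lo[:], hi[:])
    left[1][axis] = left_max     # shrink left child's max along axis
    right[0][axis] = right_min   # shrink right child's min along axis
    return left, right
```

During traversal the decoder carries the current box down the tree and rebuilds each child on the fly, trading a little decoding work (cheap in dedicated hardware) for much less memory traffic.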


2021 ◽  
pp. 101-107
Author(s):  
Mohammad Alshehri ◽  

Presently, precise localization and tracking have become essential to smartphone-assisted navigation that maximizes accuracy in real-time environments. Fingerprint-based localization is the most commonly available model for accomplishing effective outcomes. With this motivation, this study focuses on designing an efficient smartphone-assisted indoor localization and tracking model using the glowworm swarm optimization (ILT-GSO) algorithm. The ILT-GSO algorithm builds a GSO algorithm on the light-emissive characteristics of glowworms to determine the location. In addition, a Kalman filter is applied to smooth the estimation process and update the initial positions of the glowworms. A wide range of experiments was carried out, and the results were investigated in terms of distinct evaluation metrics. The simulation outcomes demonstrated considerable enhancement in real-time environments with reduced computational complexity. The ILT-GSO algorithm achieved increased localization performance with minimal error compared with recent techniques.


2013 ◽  
Vol 829 ◽  
pp. 784-789 ◽  
Author(s):  
Mahmoud Zolfaghari ◽  
Mahshid Chireh

ZnO belongs to the II-VI semiconductor group, with a direct band gap of 3.2-3.37 eV at 300 K and a high exciton binding energy of 60 meV. It has good transparency, high electron mobility, a wide band gap, and strong room-temperature luminescence. These properties make it attractive across a wide area of emerging applications. Doping ZnO with transition metals gives it room-temperature magnetic properties, making it a multifunctional material, i.e., one exhibiting coexisting magnetic, semiconducting, and optical properties. The samples can be synthesized in bulk, thin-film, and nanoforms, which show a wide range of ferromagnetic properties. Ferromagnetic semiconductors are important materials for spintronic and nonvolatile memory storage applications. Doping transition metal elements into ZnO offers a feasible means of tailoring the band gap for use in light emitters and UV detectors. As there is controversy over the energy-gap value owing to changes in the lattice parameters, we synthesized Mn-doped ZnO nanoparticles by the co-precipitation method at different concentrations to study the effect of lattice-parameter changes on the gap energy. The doped samples were studied by XRD, SEM, FT-IR, and UV-Vis spectroscopy. The XRD patterns confirm the doping of Mn into the ZnO structure. As the Mn concentration increases, the peak due to the Mn impurity in the FT-IR spectra becomes more pronounced, confirming the concentration variation. From the UV-Vis spectra we find that the gap energy increases with doping concentration, consistent with the Goldschmidt-Pauling rule; the increase depends on the dopant concentration and grows as the impurity amount increases.
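Extracting a gap energy from UV-Vis spectra, as described above, is commonly done with a Tauc plot: for a direct allowed transition, (αhν)² is linear in photon energy near the absorption edge, and extrapolating that line to zero gives Eg. The sketch below is the standard procedure, not this paper's specific analysis; the fit window must be chosen by inspecting the sample's linear region.

```python
import numpy as np

def tauc_direct_gap(energy_ev, absorbance, fit_window):
    """Estimate a direct optical band gap (eV) from UV-Vis data.

    Tauc relation for a direct gap: (alpha*h*nu)^2 = A*(h*nu - Eg),
    so a straight-line fit of (absorbance*energy)^2 against energy
    over the linear region crosses zero at Eg.
    """
    e = np.asarray(energy_ev, dtype=float)
    tauc = (np.asarray(absorbance, dtype=float) * e) ** 2
    lo, hi = fit_window
    mask = (e >= lo) & (e <= hi)            # restrict to the linear region
    slope, intercept = np.polyfit(e[mask], tauc[mask], 1)
    return -intercept / slope               # x-axis intercept = Eg
```

A shift of this intercept with Mn concentration is exactly the band-gap change the abstract reports.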


2015 ◽  
Vol 12 (1) ◽  
pp. 71-91 ◽  
Author(s):  
Daniel E. O'Leary

ABSTRACT Increasingly, there is interest in using information and communications technology (ICT) to help build a “better world.” As an example, the United Kingdom has initiated an “open data” movement to disclose financial information about federal and local governments and other organizations. This has led to the use of a wide range of technologies (Internet, Databases, Web 2.0, etc.) to facilitate disclosure. However, since there is a huge cost of generating and maintaining open data, there also is a concern: “will anyone do anything with the data?” In a speech in 2009, David Cameron, the Prime Minister of the United Kingdom, used the term “armchair auditor” to describe crowdsourcing analysis of that data. In that speech, Cameron (2009) noted: “Just imagine the effect that an army of armchair auditors is going to have on those expense claims.” Accordingly, as more and more countries and organizations generate open data, those “armchair auditors” could play an increasingly important role: to help crowdsource monitoring of government expenditures. This paper investigates a number of potential benefits and a number of emerging concerns associated with armchair auditors.


Entropy ◽  
2019 ◽  
Vol 21 (10) ◽  
pp. 988 ◽  
Author(s):  
Fazakis ◽  
Kanas ◽  
Aridas ◽  
Karlos ◽  
Kotsiantis

One of the major factors affecting the performance of classification algorithms is the amount of labeled data available during the training phase. It is widely accepted that labeling vast amounts of data is both expensive and time-consuming, since it requires human expertise. In a wide variety of scientific fields, unlabeled examples are easy to collect but hard to exploit in a way that improves the information contained in a dataset. In this context, a variety of learning methods have been studied in the literature that aim to efficiently utilize the vast amounts of unlabeled data during the learning process. The most common approaches tackle problems of this kind by applying either active learning or semi-supervised learning methods individually. In this work, a combination of active learning and semi-supervised learning methods is proposed, under a common self-training scheme, in order to efficiently utilize the available unlabeled data. The entropy and the distribution of class probabilities over the unlabeled set are used as effective and robust metrics to select the most suitable unlabeled examples for augmenting the initial labeled set. The superiority of the proposed scheme is validated by comparing it against the baseline approaches of supervised, semi-supervised, and active learning on a wide range of fifty-five benchmark datasets.
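The entropy-based selection idea combines the two paradigms naturally: low-entropy (confident) predictions are candidates for pseudo-labeling in self-training, while high-entropy (uncertain) ones are candidates for querying an oracle in active learning. A minimal generic sketch of that split (not the paper's exact selection rule):

```python
import numpy as np

def entropy_select(probas, n_confident, n_uncertain):
    """Rank unlabeled examples by Shannon entropy of their predictions.

    `probas` is an (n_samples, n_classes) array of class probabilities
    from the current model. Returns indices of the most confident
    examples (to pseudo-label) and the most uncertain ones (to send
    to the oracle).
    """
    p = np.clip(np.asarray(probas, dtype=float), 1e-12, 1.0)
    h = -np.sum(p * np.log(p), axis=1)      # entropy per example
    order = np.argsort(h)                   # ascending entropy
    confident = order[:n_confident]         # lowest entropy first
    uncertain = order[::-1][:n_uncertain]   # highest entropy first
    return confident, uncertain
```

Each self-training round would then retrain on the enlarged labeled set and recompute `probas` for the remaining pool.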


2018 ◽  
Vol 27 (08) ◽  
pp. 1850125
Author(s):  
Sakshi ◽  
Ravi Kumar

Adaptive filters have a wide range of applications in areas such as echo or interference cancellation, prediction, and system identification. Because of their high computational complexity, the hardware implementation of adaptive filters is not an easy task; however, it becomes essential in many cases where real-time execution is needed. This paper presents the design and hardware implementation of a 40th-order variable-step-size adaptive filter for de-noising acoustic signals. To ensure an area-efficient implementation, a novel structure is proposed. The proposed structure eliminates the requirement for extra registers to store delayed inputs, thereby reducing the silicon area. The structure is compared with direct-form and transposed-form structures by adapting the filter coefficients using four different variants of the least mean squares (LMS) algorithm. Subsequently, the filters are implemented on three different field-programmable gate arrays (FPGAs), viz. Spartan 6, Virtex 6, and Virtex 7, to find the best device family for implementing an adaptive noise canceller (ANC) by comparing speed, power, and area utilization. The synthesis results clearly reveal that the ANC designed using the proposed structure achieves a reduction in silicon area without incurring any significant overhead in terms of power or delay.
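The adaptation loop the abstract discusses can be sketched with the normalized LMS (NLMS) algorithm, a common variable-step-size variant: the effective step size mu/(eps + ||u||²) shrinks when the input is strong, stabilizing convergence across input power levels. This is a generic reference model, not the authors' hardware structure.

```python
import numpy as np

def nlms_filter(x, d, order=40, mu=0.5, eps=1e-6):
    """Normalized LMS adaptive filter (software reference model).

    x: input signal, d: desired signal. Returns the filter output y,
    the error signal e, and the final coefficient vector w.
    """
    n = len(x)
    w = np.zeros(order)              # adaptive coefficients
    y = np.zeros(n)
    e = np.zeros(n)
    for i in range(order, n):
        u = x[i - order:i][::-1]     # most recent `order` input samples
        y[i] = w @ u                 # filter output
        e[i] = d[i] - y[i]           # error drives the adaptation
        w += (mu / (eps + u @ u)) * e[i] * u  # normalized (variable) step
    return y, e, w
```

In an ANC configuration, d would be the noisy signal and x the noise reference; the error e is then the de-noised output. The hardware question the paper addresses is how to supply the delay line u each cycle without a bank of extra registers.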


Blood ◽  
2006 ◽  
Vol 108 (11) ◽  
pp. 3646-3646
Author(s):  
Safa Karandish ◽  
Nery Berrios ◽  
Sufira Kiran ◽  
Charisse Ayuste ◽  
Toby Hamblin ◽  
...  

Abstract We recently incorporated the use of the automated Sepax Cell Processing System (Biosafe SA, Switzerland) for red-cell and volume reduction of cord blood units (CBU) before cryopreservation. Having routinely used this new technique in the laboratory for about six months, we compared its results with those of our standard manual processing method (Rubinstein et al., PNAS 1995; 92: 10119–22). For both methods, hespan is added to the cells at a final concentration of 20% (v/v). With the Sepax system, after the addition of hespan, the cell bag is connected to the Sepax tubing set with the final freeze bag pre-attached to the set. After completion of the automated procedure, the buffy coat is collected in the final freeze bag, and cryoprotectant solution is then added directly to the freeze bag. In the manual method, the buffy coat and white-cell-rich plasma layer are collected after the first centrifugation step, and the white cells are separated from the plasma after the second centrifugation step. Cryoprotectant solution is then added to the cells before transfer to the final freeze bag for cryopreservation. The results for each method are summarized below:

Table 1
                                         Manual (n=1160)   Sepax (n=311)
Pre-Processing Volume (ml)               107±30            114±28
Pre-Processing TNC (×10e6)               1196±577          1315±519
TNC Recovery (%)                         80±8              83±8
TNC Viability (%) (7-AAD)                97±3              98±3
Total CD34 (×10e6)                       4.3±4             4.9±3.6
Total DFU (×10e6)                        70±0.9            61.5±20
Post-Processing RBC Volume (ml)          9±2               7.3±2
Total Processing Time (including setup)  60 minutes        30 minutes

It is important to note that there was no significant difference in TNC recovery over a wide range of pre-processing volumes (66–206 ml) or pre-processing TNC (440–3559×10e6). Since the Sepax device runs an automated procedure, issues could arise (e.g., short-term loss of electrical power) that would require us to reprocess the CBU before cryopreservation. The Sepax system allows for recovery and reprocessing of the cells using the 'Purge mode'. We used 5 CBU to evaluate TNC recovery and viability after purging the cells once and reprocessing.

Table 2
                             TNC Recovery (%)   TNC Viability (%)
First Buffy Coat             85±5               98±0.9
Post Purge and Reprocessing  76±5               98±0.9

Although the TNC recovery was lower after the second procedure, it was still within acceptable limits, and the viability of the cells had not changed. These data demonstrate that both methods are equivalent with respect to cell recovery. However, the Sepax system substantially reduces processing time and hands-on operator intervention. Additionally, the system provides closed-system processing, bar-code reading capability, and a run-data print-out suitable for GMP manufacturing settings.

