AIM: Annealing in Memory for Vision Applications

As the Moore’s law era will draw to a close, some domain-specific architectures even non-Von Neumann systems have been presented to keep the progress. This paper proposes novel annealing in memory (AIM) architecture to implement Ising calculation, which is based on Ising model and expected to accelerate solving combinatorial optimization problem. The Ising model has a symmetrical structure and realizes phase transition by symmetry breaking. AIM draws annealing calculation into memory to reduce the cost of information transfer between calculation unit and the memory, improves the ability of parallel processing by enabling each Static Random-Access Memory (SRAM) array to perform calculations. An approximate probability flipping circuit is proposed to avoid the system getting trapped in local optimum. Bit-serial design incurs only an estimated 4.24% area above the SRAM and allows the accuracy to be easily adjusted. Two vision applications are mapped for acceleration and results show that it can speed up Multi-Object Tracking (MOT) by 780× and Multiple People Head Detection (MPHD) by 161× with only 0.0064% and 0.031% energy consumption respectively over approximate algorithms.

Download Full-text

Sparse Incremental Delta-Bar-Delta for System Identification

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.665.643 ◽

2014 ◽

Vol 665 ◽

pp. 643-646

Author(s):

Ying Liu ◽

Yan Ye ◽

Chun Guang Li

Keyword(s):

System Identification ◽

Cost Function ◽

Learning Algorithm ◽

Learning System ◽

The Other ◽

Sparse System ◽

Speed Up ◽

Sparse System Identification ◽

The Cost ◽

Zero Attractor

Metalearning algorithm learns the base learning algorithm, targeted for improving the performance of the learning system. The incremental delta-bar-delta (IDBD) algorithm is such a metalearning algorithm. On the other hand, sparse algorithms are gaining popularity due to their good performance and wide applications. In this paper, we propose a sparse IDBD algorithm by taking the sparsity of the systems into account. Thenorm penalty is contained in the cost function of the standard IDBD, which is equivalent to adding a zero attractor in the iterations, thus can speed up convergence if the system of interest is indeed sparse. Simulations demonstrate that the proposed algorithm is superior to the competing algorithms in sparse system identification.

Download Full-text

Reliable analog resistive switching behaviors achieved using memristive devices in AlOx/HfOx bilayer structure for neuromorphic systems

Semiconductor Science and Technology ◽

10.1088/1361-6641/ac3cc7 ◽

2021 ◽

Author(s):

Meng Qi ◽

Tianquan Fu ◽

Huadong Yang ◽

ye tao ◽

Chunran Li ◽

...

Keyword(s):

Resistive Switching ◽

Learning Experience ◽

Random Access ◽

Resistive Random Access Memory ◽

Spike Timing ◽

Information Storage ◽

Bilayer Structure ◽

Neuromorphic Computing ◽

Von Neumann ◽

Simulation Based

Abstract Human brain synaptic memory simulation based on resistive random access memory (RRAM) has an enormous potential to replace traditional Von Neumann digital computer thanks to several advantages, including its simple structure, high-density integration, and the capability to information storage and neuromorphic computing. Herein, the reliable resistive switching (RS) behaviors of RRAM are demonstrated by engineering AlOx/HfOx bilayer structure. This allows for uniform multibit information storage. Further, the analog switching behaviors are capable of imitate several synaptic learning functions, including learning experience behaviors, short-term plasticity-long-term plasticity transition, and spike-timing-dependent-plasticity (STDP). In addition, the memristor based on STDP learning rules are implemented in image pattern recognition. These results may offer a promising potential of HfOx-based memristors for future information storage and neuromorphic computing applications.

Download Full-text

Constructiv and Technological Consideration on the Generation of Gear Made by the DLP 3D-Printed Methode

Materiale Plastice ◽

10.37358/mp.19.2.5203 ◽

2019 ◽

Vol 56 (2) ◽

pp. 440-443

Author(s):

Mircea Dorin Vasilescu

Keyword(s):

Optical Microscope ◽

Functional Domain ◽

Conserved Quantities ◽

Technological Parameters ◽

Specific Design ◽

Domain Specific ◽

Cylindrical Gears ◽

3D Printed ◽

Mechanical Elements ◽

The Cost

The aim of the work is conduct to highlight how the technological parameters has influence of 3D printed DLP on the generation of wheel, made from resin type material. In the first part of the paper is presents how to generate in terms of dimensional aspects specific design cylindrical gears, conical and worm gear. Generating elements intended to reduce the cost of manufacturing of these elements. Also are achieve the specific components of this work are put to test with a laboratory test stand which is presented in the paper in the third part of the paper. The tested gears generated by 3D-printed technique made with 3D printed with FDM or DLP technique. After the constructive aspects, proceed to the identification of conserved quantities, which have an impact both in terms of mechanical strength, but his cinematic, in order to achieve a product with kinematic features and good functional domain specific had in mind. The next part is carried out an analysis of the layers are generated using the DLP and FDM method using an optical microscope with magnification up to 500 times, specially adapted in order to achieve both visualization and measurement of specific elements. In the end part, it will highlight the main issues and the specific recommendations made to obtain such constructive mechanical elements.

Download Full-text

Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6289 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7839-7846

Author(s):

Junliang Guo ◽

Xu Tan ◽

Linli Xu ◽

Tao Qin ◽

Enhong Chen ◽

...

Keyword(s):

Machine Translation ◽

Fine Tuning ◽

Inference Process ◽

Neural Machine Translation ◽

Training Strategy ◽

Speed Up ◽

Good Improvement ◽

Tuning Process ◽

Translation Accuracy ◽

The Cost

Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models. Considering that AT models have higher accuracy and are easier to train than NAT models, and both of them share the same model configurations, a natural idea to improve the accuracy of NAT models is to transfer a well-trained AT model to an NAT model through fine-tuning. However, since AT and NAT models differ greatly in training strategy, straightforward fine-tuning does not work well. In this work, we introduce curriculum learning into fine-tuning for NAT. Specifically, we design a curriculum in the fine-tuning process to progressively switch the training from autoregressive generation to non-autoregressive generation. Experiments on four benchmark translation datasets show that the proposed method achieves good improvement (more than 1 BLEU score) over previous NAT baselines in terms of translation accuracy, and greatly speed up (more than 10 times) the inference process over AT baselines.

Download Full-text

Intrinsic Run-time Row Hammer PUFs: Leveraging the Row Hammer Effect for Run-Time Cryptography and Improved Security

10.20944/preprints201804.0369.v1 ◽

2018 ◽

Author(s):

Nikolaos Athanasios Anagnostopoulos ◽

Tolga Arul ◽

Yufan Fan ◽

Christian Hatzfeld ◽

André Schaller ◽

...

Keyword(s):

Random Access ◽

Physical Unclonable Functions ◽

Generation Times ◽

Speed Up ◽

Run Time ◽

Commercial Off The Shelf ◽

Cost Efficient ◽

Statistical Relationships ◽

The Times ◽

Time Required

Physical Unclonable Functions (PUFs) based on the retention times of the cells of a Dynamic Random Access Memory (DRAM) can be utilised for the implementation of cost-efficient and lightweight cryptographic protocols. However, as recent work has demonstrated, the times needed in order to generate their responses may prohibit their widespread usage. In order to address this issue, the Row Hammer PUF has been proposed by Schaller et al. [1], which leverages the row hammer effect in DRAM modules to reduce the retention times of their cells and, therefore, significantly speed up the generation times for the responses of PUFs based on these retention times. In this work, we extend the work of Schaller et al. by presenting a run-time accessible implementation of this PUF and further reducing the time required for the generation of its responses. Additionally, we also provide a more thorough investigation of the effects of temperature variations on the the Row Hammer PUF and briefly discuss potential statistical relationships between the cells used to implement it. As our results prove, the Row Hammer PUF could potentially provide an adequate level of security for Commercial Off-The-Shelf (COTS) devices, if its dependency on temperature is mitigated, and, may therefore, be commercially adopted in the near future.

Download Full-text

Application Layer Packet Processing Using PISA Switches

Sensors ◽

10.3390/s21238010 ◽

2021 ◽

Vol 21 (23) ◽

pp. 8010

Author(s):

Ismail Butun ◽

Yusuf Tuncel ◽

Kasim Oztoprak

Keyword(s):

Streaming Data ◽

The Novel ◽

Application Layer ◽

Domain Specific ◽

Switch Architecture ◽

Novel Approach ◽

Data Processor ◽

Application Identification ◽

The Cost ◽

Telecommunication Operators

This paper investigates and proposes a solution for Protocol Independent Switch Architecture (PISA) to process application layer data, enabling the inspection of application content. PISA is a novel approach in networking where the switch does not run any embedded binary code but rather an interpreted code written in a domain-specific language. The main motivation behind this approach is that telecommunication operators do not want to be locked in by a vendor for any type of networking equipment, develop their own networking code in a hardware environment that is not governed by a single equipment manufacturer. This approach also eases the modeling of equipment in a simulation environment as all of the components of a hardware switch run the same compatible code in a software modeled switch. The novel techniques in this paper exploit the main functions of a programmable switch and combine the streaming data processor to create the desired effect from a telecommunication operator perspective to lower the costs and govern the network in a comprehensive manner. The results indicate that the proposed solution using PISA switches enables application visibility in an outstanding performance. This ability helps the operators to remove a fundamental gap between flexibility and scalability by making the best use of limited compute resources in application identification and the response to them. The experimental study indicates that, without any optimization, the proposed solution increases the performance of application identification systems 5.5 to 47.0 times. This study promises that DPI, NGFW (Next-Generation Firewall), and such application layer systems which have quite high costs per unit traffic volume and could not scale to a Tbps level, can be combined with PISA to overcome the cost and scalability issues.

Download Full-text

Entanglement cost of generalised measurements

Quantum Information and Computation ◽

10.26421/qic3.5-2 ◽

2003 ◽

Vol 3 (5) ◽

pp. 405-422

Author(s):

R. Jozsa ◽

M. Koashi ◽

N. Linden ◽

S. Popescu ◽

S. Presnell ◽

...

Keyword(s):

Quantum Measurements ◽

Great Circle ◽

Bloch Sphere ◽

Numerical Techniques ◽

Bipartite Entanglement ◽

Von Neumann ◽

Tensor Product Space ◽

Large N ◽

Dimension 2 ◽

The Cost

Bipartite entanglement is one of the fundamental quantifiable resources of quantum information theory. We propose a new application of this resource to the theory of quantum measurements. According to Naimark's theorem any rank 1 generalised measurement (POVM) M may be represented as a von Neumann measurement in an extended (tensor product) space of the system plus ancilla. By considering a suitable average of the entanglements of these measurement directions and minimising over all Naimark extensions, we define a notion of entanglement cost E_{\min}(M) of M. We give a constructive means of characterising all Naimark extensions of a given POVM. We identify various classes of POVMs with zero and non-zero cost and explicitly characterise all POVMs in 2 dimensions having zero cost. We prove a constant upper bound on the entanglement cost of any POVM in any dimension. Hence the asymptotic entanglement cost (i.e. the large n limit of the cost of n applications of M, divided by n) is zero for all POVMs. The trine measurement is defined by three rank 1 elements, with directions symmetrically placed around a great circle on the Bloch sphere. We give an analytic expression for its entanglement cost. Defining a normalised cost of any $d$-dimensional POVM by E_{\min} (M)/\log_2 d, we show (using a combination of analytic and numerical techniques) that the trine measurement is more costly than any other POVM with d>2, or with d=2 and ancilla dimension 2. This strongly suggests that the trine measurement is the most costly of all POVMs.

Download Full-text

Convergence of Simulated Annealing with Feedback Temperature Schedules

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964800004836 ◽

1997 ◽

Vol 11 (3) ◽

pp. 279-304 ◽

Cited By ~ 4

Author(s):

M. Kolonko ◽

M. T. Tran

Keyword(s):

Simulated Annealing ◽

Cost Function ◽

Job Shop ◽

Job Shop Scheduling ◽

Optimization Method ◽

Search Process ◽

Local Optimum ◽

Fixed Sequence ◽

Temperature Parameter ◽

The Cost

It is well known that the standard simulated annealing optimization method converges in distribution to the minimum of the cost function if the probability a for accepting an increase in costs goes to 0. α is controlled by the “temperature” parameter, which in the standard setup is a fixed sequence of values converging slowly to 0. We study a more general model in which the temperature may depend on the state of the search process. This allows us to adapt the temperature to the landscape of the cost function. The temperature may temporarily rise such that the process can leave a local optimum more easily. We give weak conditions on the temperature schedules such that the process of solutions finally concentrates near the optimal solutions. We also briefly sketch computational results for the job shop scheduling problem.

Download Full-text

Local Best Particle Swarm Optimization Using Crown Jewel Defense Strategy

Critical Developments and Applications of Swarm Intelligence - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-5225-5134-8.ch002 ◽

2018 ◽

pp. 27-52 ◽

Cited By ~ 2

Author(s):

Jiarui Zhou ◽

Junshan Yang ◽

Ling Lin ◽

Zexuan Zhu ◽

Zhen Ji

Keyword(s):

Particle Swarm Optimization ◽

High Efficiency ◽

Particle Swarm ◽

Global Optimum ◽

Population Diversity ◽

Local Optimum ◽

Swarm Optimization ◽

Local Optima ◽

Von Neumann ◽

Swarm Intelligence Algorithm

Particle swarm optimization (PSO) is a swarm intelligence algorithm well known for its simplicity and high efficiency on various problems. Conventional PSO suffers from premature convergence due to the rapid convergence speed and lack of population diversity. It is easy to get trapped in local optima. For this reason, improvements are made to detect stagnation during the optimization and reactivate the swarm to search towards the global optimum. This chapter imposes the reflecting bound-handling scheme and von Neumann topology on PSO to increase the population diversity. A novel crown jewel defense (CJD) strategy is introduced to restart the swarm when it is trapped in a local optimum region. The resultant algorithm named LCJDPSO-rfl is tested on a group of unimodal and multimodal benchmark functions with rotation and shifting. Experimental results suggest that the LCJDPSO-rfl outperforms state-of-the-art PSO variants on most of the functions.

Download Full-text

How Much of What We Learn in Virtual Reality Transfers to Real-World Navigation?

Multisensory Research ◽

10.1163/22134808-20201445 ◽

2020 ◽

Vol 33 (4-5) ◽

pp. 479-503 ◽

Cited By ~ 1

Author(s):

Lukas Hejtmanek ◽

Michael Starrett ◽

Emilio Ferrer ◽

Arne D. Ekstrom

Keyword(s):

Virtual Reality ◽

Real World ◽

Information Transfer ◽

Spatial Knowledge ◽

Desktop Computer ◽

The Real ◽

Head Mounted Display ◽

Spatial Environment ◽

The Cost ◽

Study Participants

Abstract Past studies suggest that learning a spatial environment by navigating on a desktop computer can lead to significant acquisition of spatial knowledge, although typically less than navigating in the real world. Exactly how this might differ when learning in immersive virtual interfaces that offer a rich set of multisensory cues remains to be fully explored. In this study, participants learned a campus building environment by navigating (1) the real-world version, (2) an immersive version involving an omnidirectional treadmill and head-mounted display, or (3) a version navigated on a desktop computer with a mouse and a keyboard. Participants first navigated the building in one of the three different interfaces and, afterward, navigated the real-world building to assess information transfer. To determine how well they learned the spatial layout, we measured path length, visitation errors, and pointing errors. Both virtual conditions resulted in significant learning and transfer to the real world, suggesting their efficacy in mimicking some aspects of real-world navigation. Overall, real-world navigation outperformed both immersive and desktop navigation, effects particularly pronounced early in learning. This was also suggested in a second experiment involving transfer from the real world to immersive virtual reality (VR). Analysis of effect sizes of going from virtual conditions to the real world suggested a slight advantage for immersive VR compared to desktop in terms of transfer, although at the cost of increased likelihood of dropout. Our findings suggest that virtual navigation results in significant learning, regardless of the interface, with immersive VR providing some advantage when transferring to the real world.

Download Full-text