A novel architecture for a high-performance network processing unit: Flexibility at multiple levels of abstraction

Learning at Multiple Levels of Abstraction: The Case of Verb Argument Constructions

PsycEXTRA Dataset ◽

10.1037/e527342012-862 ◽

2007 ◽

Author(s):

Amy Perfors ◽

Charles Kemp ◽

Elizabeth Wonnacott ◽

Joshua B. Tenenbaum

Keyword(s):

Levels Of Abstraction ◽

Multiple Levels

A High Performance Image Processing Unit for On-orbit Servicing

57th International Astronautical Congress ◽

10.2514/6.iac-06-d1.2.03 ◽

2006 ◽

Cited By ~ 1

Author(s):

Hiroshi Yamamoto ◽

Yasufumi Nagai ◽

Shinichi Kimura ◽

Hiroshi Takahashi ◽

Satoko Mizumoto ◽

...

Keyword(s):

Image Processing ◽

High Performance ◽

Processing Unit

Multiple Levels of Abstraction in Algorithmic Problem Solving

Proceedings of the 2017 ACM SIGCSE Technical Symposium on Computer Science Education - SIGCSE '17 ◽

10.1145/3017680.3017801 ◽

2017 ◽

Cited By ~ 5

Author(s):

David Ginat ◽

Yoav Blau

Keyword(s):

Problem Solving ◽

Algorithmic Problem ◽

Levels Of Abstraction ◽

Multiple Levels

A Model for Analysing the Collective Dynamic Behaviour and Characterising the Exploitation of Population-Based Algorithms

Evolutionary Computation ◽

10.1162/evco_a_00107 ◽

2014 ◽

Vol 22 (1) ◽

pp. 159-188 ◽

Cited By ~ 5

Author(s):

Mikdam Turkey ◽

Riccardo Poli

Keyword(s):

Population Dynamics ◽

Dynamic Behaviour ◽

Fitness Landscape ◽

Population Based ◽

Levels Of Abstraction ◽

Collective Dynamic ◽

Model Studies ◽

Proposed Model ◽

Wide Range ◽

Multiple Levels

Several previous studies have focused on modelling and analysing the collective dynamic behaviour of population-based algorithms. However, an empirical approach for identifying and characterising such behaviour is surprisingly lacking. In this paper, we present a new model to capture this collective behaviour, and to extract and quantify features associated with it. The proposed model studies the topological distribution of an algorithm's activity from both a genotypic and a phenotypic perspective, and represents population dynamics using multiple levels of abstraction. The model can have different instantiations. Here it has been implemented using a modified version of self-organising maps. These are used to represent and track the population's motion in the fitness landscape as the algorithm works on a problem. Based on this model, we developed a set of features that characterise the population's collective dynamic behaviour. By analysing them and revealing their dependency on fitness distributions, we were then able to define an indicator of the exploitation behaviour of an algorithm. This is an entropy-based measure that assesses the dependency on fitness distributions of different features of population dynamics. To test the proposed measures, we examined evolutionary algorithms with different crossover operators, selection pressure levels and population handling techniques, which lead populations to exhibit a wide range of exploitation-exploration behaviours.
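The abstract does not give the formula for its entropy-based indicator, so the following is only a generic illustration of the underlying idea, not the authors' measure. It assumes each individual in the population has already been assigned to a map cell (for example, its best-matching unit in a self-organising map) and computes the Shannon entropy of the resulting occupancy distribution. The function names `occupancy_entropy` and `normalised_entropy` are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def occupancy_entropy(cell_ids: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of how a population is spread over map cells.

    cell_ids holds one cell identifier per individual (an assumption of this
    sketch). Low entropy means the population is concentrated in few cells.
    """
    total = len(cell_ids)
    if total == 0:
        raise ValueError("population must not be empty")
    counts = Counter(cell_ids)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalised_entropy(cell_ids: Sequence[Hashable], n_cells: int) -> float:
    """Entropy divided by its maximum, log2(n_cells), giving a value in [0, 1]."""
    if n_cells < 2:
        raise ValueError("need at least two cells to normalise")
    return occupancy_entropy(cell_ids) / math.log2(n_cells)
```

Under these assumptions, a value near 0 would suggest a population concentrated in a small region of the map, while a value near 1 would suggest it is spread evenly across the cells.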

Simulation modeling at multiple levels of abstraction

1998 Winter Simulation Conference. Proceedings (Cat. No.98CH36274) ◽

10.1109/wsc.1998.745013 ◽

2002 ◽

Cited By ~ 15

Author(s):

P. Benjamin ◽

M. Erraguntla ◽

D. Delen ◽

R. Mayer

Keyword(s):

Simulation Modeling ◽

Levels Of Abstraction ◽

Multiple Levels

A lightweight approach to performance portability with targetDP

The International Journal of High Performance Computing Applications ◽

10.1177/1094342016682071 ◽

2016 ◽

Vol 32 (2) ◽

pp. 288-301

Author(s):

Alan Gray ◽

Kevin Stratford

Keyword(s):

Particle Physics ◽

Message Passing ◽

Graphics Processing Units ◽

High Performance ◽

Large Scale ◽

Message Passing Interface ◽

Graphics Processing Unit ◽

Processing Unit ◽

Performance Portability ◽

Graphics Processing

Leading high performance computing systems achieve their status through use of highly parallel devices such as NVIDIA graphics processing units or Intel Xeon Phi many-core CPUs. The concept of performance portability across such architectures, as well as traditional CPUs, is vital for the application programmer. In this paper we describe targetDP, a lightweight abstraction layer which allows grid-based applications to target data parallel hardware in a platform-agnostic manner. We demonstrate the effectiveness of our pragmatic approach by presenting performance results for a complex fluid application (with which the model was co-designed), plus a separate lattice quantum chromodynamics particle physics code. For each application, a single source code base is seen to achieve portable performance, as assessed within the context of the Roofline model. targetDP can be combined with the Message Passing Interface (MPI) to allow use on systems containing multiple nodes: we demonstrate this through provision of scaling results on traditional and graphics processing unit-accelerated large scale supercomputers.
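For readers unfamiliar with the Roofline model mentioned in the abstract, a minimal sketch follows. It uses the standard Roofline bound, where attainable performance is the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The function names and the numbers in the usage note are illustrative assumptions, not figures from the paper.

```python
def roofline_attainable(peak_gflops: float,
                        bandwidth_gbs: float,
                        intensity_flop_per_byte: float) -> float:
    """Upper bound on performance (GFLOP/s) under the Roofline model.

    A kernel is memory-bound when bandwidth * intensity is below the compute
    peak, and compute-bound otherwise.
    """
    if min(peak_gflops, bandwidth_gbs, intensity_flop_per_byte) < 0:
        raise ValueError("inputs must be non-negative")
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def fraction_of_roofline(measured_gflops: float,
                         peak_gflops: float,
                         bandwidth_gbs: float,
                         intensity_flop_per_byte: float) -> float:
    """Measured performance as a fraction of the Roofline bound."""
    bound = roofline_attainable(peak_gflops, bandwidth_gbs,
                                intensity_flop_per_byte)
    if bound == 0:
        raise ValueError("Roofline bound is zero for these inputs")
    return measured_gflops / bound
```

With hypothetical values of 1000 GFLOP/s peak, 200 GB/s bandwidth and 0.5 FLOP/byte intensity, the bound is 100 GFLOP/s, so a kernel measured at 50 GFLOP/s would sit at half of its roofline. Comparing this fraction across devices is one way to judge whether a single code base performs portably.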

Revisiting the CompCars Dataset for Hierarchical Car Classification: New Annotations, Experiments, and Results

Sensors ◽

10.3390/s21020596 ◽

2021 ◽

Vol 21 (2) ◽

pp. 596

Author(s):

Marco Buzzelli ◽

Luca Segantin

Keyword(s):

Real World ◽

High Performance ◽

Ad Hoc ◽

Future Research ◽

Levels Of Detail ◽

Excellent Starting Point ◽

Starting Point ◽

Hierarchical Nature ◽

Multiple Levels ◽

Entire Dataset

We address the task of classifying car images at multiple levels of detail, ranging from the top-level car type down to the specific car make, model, and year. We analyze existing datasets for car classification, and identify CompCars as an excellent starting point for our task. We show that convolutional neural networks achieve an accuracy above 90% on the finest-level classification task. This high performance, however, is scarcely representative of real-world situations, as it is evaluated on a biased training/test split. In this work, we revisit the CompCars dataset by first defining a new training/test split, which better represents real-world scenarios by setting a more realistic baseline at 61% accuracy on the new test set. We also propagate the existing (but limited) type-level annotation to the entire dataset, and we finally provide a car-tight bounding box for each image, automatically defined through an ad hoc car detector. To evaluate this revisited dataset, we design and implement three different approaches to car classification, two of which exploit the hierarchical nature of car annotations. Our experiments show that higher-level classification in terms of car type positively impacts classification at a finer grain, now reaching 70% accuracy. The achieved performance constitutes a baseline benchmark for future research, and our enriched set of annotations is made available for public download.
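To make "classification at multiple levels of detail" concrete, the sketch below scores predictions at each level of a label hierarchy. It is an illustration under stated assumptions, not the evaluation code of the paper: each label is assumed to be a tuple ordered from coarse to fine (for example type, make, model), a prediction counts as correct at a level only if it matches the truth at that level and every coarser one, and the function name and labels are hypothetical.

```python
from typing import List, Sequence, Tuple

Label = Tuple[str, ...]  # ordered coarse to fine, e.g. (type, make, model)


def accuracy_per_level(truth: Sequence[Label],
                       predicted: Sequence[Label]) -> List[float]:
    """Accuracy at each hierarchy level, from coarsest to finest.

    Level k is counted as correct when the first k label components agree,
    so accuracy can only stay equal or drop as the level gets finer.
    """
    if not truth or len(truth) != len(predicted):
        raise ValueError("truth and predicted must be non-empty and equal length")
    depth = len(truth[0])
    n = len(truth)
    return [
        sum(t[:k] == p[:k] for t, p in zip(truth, predicted)) / n
        for k in range(1, depth + 1)
    ]


# Hypothetical labels, for illustration only.
truth = [("suv", "make_a", "model_x"), ("sedan", "make_b", "model_y")]
predicted = [("suv", "make_a", "model_z"), ("suv", "make_b", "model_y")]
per_level = accuracy_per_level(truth, predicted)
```

In this toy case the first image is right down to the make but wrong on the model, and the second is wrong at the type level, so it is counted as wrong at every finer level as well.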

Embedded GPU Implementation for High-Performance Ultrasound Imaging

Electronics ◽

10.3390/electronics10080884 ◽

2021 ◽

Vol 10 (8) ◽

pp. 884

Author(s):

Stefano Rossi ◽

Enrico Boni

Keyword(s):

High Performance ◽

Graphics Processing Unit ◽

Digital Signal ◽

Processing Unit ◽

Embedded Computing ◽

Field Programmable ◽

Peripheral Component Interconnect ◽

Programmable Gate Arrays ◽

Graphics Processing ◽

Signal Processors

Methods of increasing complexity are currently being proposed for ultrasound (US) echographic signal processing. Graphics Processing Unit (GPU) resources allowing massive exploitation of parallel computing are ideal candidates for these tasks. Many high-performance US instruments, including open scanners like ULA-OP 256, have an architecture based only on Field-Programmable Gate Arrays (FPGAs) and/or Digital Signal Processors (DSPs). This paper proposes the implementation of the embedded NVIDIA Jetson Xavier AGX module on board ULA-OP 256. The system architecture was revised to allow the introduction of a new Peripheral Component Interconnect Express (PCIe) communication channel, while maintaining backward compatibility with all other embedded computing resources already on board. Moreover, the Input/Output (I/O) peripherals of the module make the ultrasound system independent, freeing the user from the need to use an external controlling PC.

FPGA implementation and image encryption application of a new PRNG based on a memristive Hopfield neural network with a special activation gradient

Chinese Physics B ◽

10.1088/1674-1056/ac3cb2 ◽

2021 ◽

Author(s):

Fei Yu ◽

Zinan Zhang ◽

Hui Shen ◽

Yuanyuan Huang ◽

Shuo Cai ◽

...

Keyword(s):

Neural Network ◽

Image Encryption ◽

High Performance ◽

Random Sequence ◽

Random Number Generator ◽

Security Analysis ◽

Hopfield Neural Network ◽

Data Encryption ◽

Design Tool ◽

Processing Unit

In this paper, a memristive Hopfield neural network with a special activation gradient (MHNN) is proposed by adding a suitable memristor to the Hopfield neural network (HNN) with a special activation gradient. The MHNN is simulated, dynamically analyzed, and implemented on an FPGA. Then, a new pseudo-random number generator (PRNG) based on the MHNN is proposed. The post-processing unit of the PRNG is composed of a nonlinear post-processor and an XOR calculator, which effectively ensures the randomness of the PRNG. The experiments in this paper comply with the IEEE 754-1985 high precision 32-bit floating point standard and are done on the Vivado design tool using a Xilinx XC7Z020CLG400-2 FPGA chip and the Verilog-HDL hardware programming language. The random sequence generated by the proposed PRNG has passed the NIST SP800-22 test suite and security analysis, proving its randomness and high performance. Finally, an image encryption system based on the PRNG is proposed and implemented on an FPGA, which proves the value of the image encryption system in the field of data encryption connected to the Internet of Things (IoT).

Exploring technology related design-space limitations of high performance network processing

ESSCIRC 2007 - 33rd European Solid-State Circuits Conference ◽

10.1109/esscirc.2007.4430285 ◽

2007 ◽

Author(s):

John V. McCanny ◽

Sakir Sezer ◽

Maire O'Neill

Keyword(s):

High Performance ◽

Design Space ◽

Related Design ◽

Network Processing