Cooperative Mechanism of Local Memory and Cache in Network Processors

2013 ◽  
Vol 380-384 ◽  
pp. 1969-1972
Author(s):  
Bo Yuan ◽  
Jin Dou Fan ◽  
Bin Liu

Traditional network processors (NPs) adopt either a local memory mechanism or a cache mechanism as their hierarchical memory structure. The local memory mechanism typically offers only a small on-chip memory space, which is unsuitable for complex and varied applications. The cache mechanism is better at handling temporary data that must be read and written frequently, but in deep packet processing a cache miss occurs each time a segment of a packet is read. We propose a cooperative mechanism of local memory and cache, in which packet data and temporary data are stored in local memory and cache, respectively. Analysis and experimental evaluation show that the cooperative mechanism improves the performance of network processors and reduces processing latency at little extra resource cost.
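
A minimal simulation sketch of the cooperative idea described in this abstract, assuming a simple direct-mapped cache model; the names (Scratchpad, Cache, CooperativeMemory) and latency values are illustrative, not from the paper:

class Scratchpad:
    """Local memory: every access hits, at a fixed low latency."""
    def __init__(self, latency=1):
        self.latency = latency

    def access(self, addr):
        return self.latency

class Cache:
    """Direct-mapped cache backed by slow external memory."""
    def __init__(self, num_lines=64, line_size=32,
                 hit_latency=1, miss_latency=50):
        self.num_lines, self.line_size = num_lines, line_size
        self.hit_latency, self.miss_latency = hit_latency, miss_latency
        self.tags = [None] * num_lines
        self.misses = 0

    def access(self, addr):
        line = addr // self.line_size
        index, tag = line % self.num_lines, line // self.num_lines
        if self.tags[index] == tag:
            return self.hit_latency
        self.tags[index] = tag          # fill the line on a miss
        self.misses += 1
        return self.miss_latency

class CooperativeMemory:
    """Packet data -> local memory; temporary data -> cache."""
    def __init__(self):
        self.local, self.cache = Scratchpad(), Cache()

    def access(self, addr, is_packet_data):
        mem = self.local if is_packet_data else self.cache
        return mem.access(addr)

The point the paper makes falls out of this model: deep packet processing reads each packet segment essentially once, so routing those reads through a cache produces a compulsory miss per segment, while the scratchpad absorbs them at hit latency and leaves the cache to the frequently reused temporary data.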

2021 ◽  
Vol 25 (4) ◽  
pp. 1031-1045
Author(s):  
Helang Lai ◽  
Keke Wu ◽  
Lingli Li

Emotion recognition in conversations is crucial, as there is an urgent need to improve the overall experience of human-computer interaction. A promising direction in this field is to develop a model that can effectively extract adequate context for a test utterance. We introduce a novel model, termed hierarchical memory networks (HMN), to address the problem of recognizing utterance-level emotions. HMN divides the context into different aspects and employs different step lengths to represent the weights of these aspects. To model self-dependencies, HMN uses independent local memory networks for these aspects. To capture interpersonal dependencies, HMN employs global memory networks that integrate the local outputs into global storages. These storages generate contextual summaries and help to find the emotionally dependent utterance most relevant to the test utterance. With an attention-based multi-hop scheme, the storages are then merged with the test utterance by an addition operation at each iteration. Experiments on the IEMOCAP dataset show that our model outperforms the compared methods in accuracy.
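
A minimal NumPy sketch of the attention-based multi-hop merging step described above: in each hop, the test-utterance representation attends over the global memory storages and is updated by addition. The dimensions and the softmax attention form are assumptions, not the paper's exact formulation:

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_hop_merge(query, memory, hops=3):
    """query: (d,) test-utterance vector; memory: (n, d) contextual storages."""
    q = query.copy()
    for _ in range(hops):
        attn = softmax(memory @ q)        # relevance of each stored summary
        context = attn @ memory           # attention-weighted read of the storages
        q = q + context                   # merge by addition, as the abstract states
    return q

rng = np.random.default_rng(0)
memory = rng.standard_normal((10, 64))    # e.g., 10 contextual summaries
query = rng.standard_normal(64)           # the test utterance
fused = multi_hop_merge(query, memory)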


2012 ◽  
Vol 2012 ◽  
pp. 1-12 ◽  
Author(s):  
Shaily Mittal ◽  
Nitin

Nowadays, manufacturers of Multiprocessor System-on-Chip (MPSoC) architectures focus mainly on providing increased concurrency, rather than increased clock speed, for embedded systems. However, managing concurrency is a difficult task, and one major issue is synchronizing concurrent accesses to shared memory. An important characteristic of any system design process is memory configuration and data-flow management. Although it is very important to select a correct memory configuration, it can be equally imperative to choreograph the data flow between the various levels of memory in an optimal manner. Memory Map is a multiprocessor simulator that choreographs data flow in the individual caches of multiple processors and in shared memory systems. The simulator allows the user to specify cache reconfigurations and the number of processors within the application program, and it evaluates the cache miss and hit rates for each configuration phase, taking reconfiguration costs into account. The code is open source and written in Java.
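
An illustrative Python sketch of the kind of bookkeeping such a simulator performs (the actual tool is written in Java): per-phase cache configurations, hit/miss rates per phase, and a fixed reconfiguration penalty. The class name, the flush-on-reconfigure policy, and the cost value are assumptions for illustration:

class PhaseCacheSim:
    RECONFIG_COST = 1000      # assumed cost (in cycles) per reconfiguration

    def __init__(self):
        self.tags, self.cfg = [], None
        self.stats, self.total_cost = [], 0

    def reconfigure(self, num_lines, line_size):
        if self.cfg is not None:
            self.total_cost += self.RECONFIG_COST
        self.cfg = (num_lines, line_size)
        self.tags = [None] * num_lines    # cache starts cold after reconfiguration
        self.stats.append({"cfg": self.cfg, "hits": 0, "misses": 0})

    def access(self, addr):
        num_lines, line_size = self.cfg
        line = addr // line_size
        idx, tag = line % num_lines, line // num_lines
        phase = self.stats[-1]
        if self.tags[idx] == tag:
            phase["hits"] += 1
        else:
            self.tags[idx] = tag
            phase["misses"] += 1

    def hit_rates(self):
        """Hit rate per configuration phase, ignoring empty phases."""
        return [(s["cfg"], s["hits"] / (s["hits"] + s["misses"]))
                for s in self.stats if s["hits"] + s["misses"]]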


2019 ◽  
Vol 29 (03) ◽  
pp. 1950013
Author(s):  
Shane Carroll ◽  
Wei-Ming Lin

In this paper, we propose a machine learning algorithm to control instruction fetch bandwidth in a simultaneous multithreaded (SMT) CPU. In an SMT CPU, multiple threads occupy pools of hardware resources in the same clock cycle. Under some conditions, one or more threads may undergo a period of inefficiency, e.g., a cache miss, thereby using shared resources inefficiently and degrading the performance of other threads. If these inefficiencies can be identified at runtime, the offending thread can be temporarily blocked from fetching new instructions into the pipeline and given time to recover, preventing shared system resources from being wasted on a stalled thread. We propose a machine learning approach to determine when a thread should be blocked from fetching new instructions. The model is trained offline, and its parameters are embedded in the CPU, where they can be queried with runtime statistics to determine whether a thread is running inefficiently and should be temporarily blocked from fetching. We propose two models: a simple linear model and a higher-capacity neural network. We test each model in a simulation environment and show that system performance can increase by up to 19% on average with a feasible implementation of the proposed algorithm.
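
A minimal sketch of how the offline-trained linear variant could gate fetching: a fixed weight vector scores per-thread runtime statistics each cycle, and a thread whose score crosses a threshold is blocked from fetching. The feature set, weights, and threshold are illustrative assumptions, not the paper's trained parameters:

# Weights learned offline and embedded in hardware (hypothetical values).
WEIGHTS = {"l2_misses": 0.8, "rob_occupancy": 0.5, "iq_occupancy": 0.4}
BIAS, THRESHOLD = -1.0, 0.0

def should_block_fetch(stats):
    """stats: per-thread runtime counters sampled this cycle, normalized."""
    score = BIAS + sum(WEIGHTS[k] * stats[k] for k in WEIGHTS)
    return score > THRESHOLD    # gate the thread while it recovers

# Example: a thread with many outstanding L2 misses and full queues gets gated.
print(should_block_fetch({"l2_misses": 2.0, "rob_occupancy": 0.9,
                          "iq_occupancy": 0.7}))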


2009 ◽  
Vol 14 (5) ◽  
pp. 575-585 ◽  
Author(s):  
Bo Xu ◽  
Jian Chang ◽  
Shimeng Huang ◽  
Yibo Xue ◽  
Jun Li

Electronics ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 112
Author(s):  
Takanori Nakazawa ◽  
Suhua Tang ◽  
Sadao Obana

Recently, inter-vehicle communication, which helps to avoid collision accidents (through driving-safety support systems) and facilitates self-driving (through dissemination of road and traffic information), has attracted much attention. In this paper, in order to collect road/traffic information efficiently in a request/response manner, a basic method, Content-Centric Networking (CCN) for Vehicular networks (CV), is first proposed, which applies the CCN cache function to inter-vehicle communication; content naming and routing that take vehicle mobility into account are investigated. On this basis, the CV method is extended (called ECV) to avoid the cache-miss problem caused by vehicle movement, and further enhanced (called ECV+) to exploit the cache buffers in vehicles more efficiently, caching content according to a probability decided by the channel usage rate. Extensive evaluations with the network simulator Scenargie, using a realistic OpenStreetMap-based map, confirm that the CV method and its extensions (ECV, ECV+) effectively reduce the average number of hops of data packets (by up to 47%, 63%, and 83%, respectively) and greatly improve the content acquisition success rate (by up to 356%, 444%, and 689%, respectively), compared with a method without a cache mechanism.
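
A minimal sketch of the ECV+-style probabilistic caching decision described above. The exact mapping from channel usage rate to caching probability is the paper's design and is not reproduced here; a simple linear mapping and the content name are assumed purely for illustration:

import random

def caching_probability(channel_usage_rate):
    """Map an observed channel usage rate (0.0-1.0) to a caching probability.
    Assumption: a linear mapping, so that caching effort adapts to how busy
    the channel is; the real ECV+ mapping may differ."""
    return min(1.0, max(0.0, channel_usage_rate))

def maybe_cache(cache, content_name, data, channel_usage_rate):
    """Probabilistically store passing content in the vehicle's cache buffer."""
    if random.random() < caching_probability(channel_usage_rate):
        cache[content_name] = data

cache = {}
maybe_cache(cache, "/road/segment42/traffic", b"...", channel_usage_rate=0.6)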

