Buffer Allocation
Recently Published Documents


TOTAL DOCUMENTS: 279 (FIVE YEARS: 35)

H-INDEX: 28 (FIVE YEARS: 2)

2021 ◽  
Vol 20 (5s) ◽  
pp. 1-24
Author(s):  
Xinyi Zhang ◽  
Yawen Wu ◽  
Peipei Zhou ◽  
Xulong Tang ◽  
Jingtong Hu

Multi-head self-attention (the attention mechanism) has been employed in a variety of fields such as machine translation, language modeling, and image processing due to its superiority in feature extraction and sequential data analysis. This strength stems from the large number of parameters and the sophisticated model architecture behind the attention mechanism. To deploy the attention mechanism efficiently on resource-constrained devices, existing works propose to reduce the model size by building a customized smaller model or by compressing a large standard model. A customized smaller model is usually optimized for a specific task and requires effort in model parameter exploration. Model compression reduces model size without hurting the robustness of the model architecture and can be applied efficiently to different tasks. The compressed weights in the model are usually regularly shaped (e.g., rectangular), but the dimension sizes vary (e.g., the rectangles differ in height and width). Such a compressed attention mechanism can be deployed efficiently on CPU/GPU platforms because their memory and computing resources can be assigned flexibly on demand. However, on Field-Programmable Gate Arrays (FPGAs), the data buffer allocation and the computing kernel are fixed at run time to achieve maximum energy efficiency. After compression, the weights are much smaller and differ in size, which leads to inefficient utilization of the FPGA on-chip buffer. Moreover, the differing weight heights and widths may lead to inefficient execution of the FPGA computing kernel. Because of the large number of weights in the attention mechanism, building a unique buffer and computing kernel for each compressed weight on an FPGA is not feasible. In this work, we jointly consider the impact of compression on buffer allocation and on the required computing kernel while compressing the attention mechanism. A novel structural pruning method with memory footprint awareness is proposed, and the associated FPGA accelerator is designed.
The experimental results show that our work can compress the Transformer (an attention-based model) by 95x. The developed accelerator fully utilizes the FPGA resources, processing the sparse attention mechanism with a run-time throughput of 1.87 TOPS on a ZCU102 FPGA.
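The core idea of buffer-aware structural pruning can be illustrated with a toy sketch. The function below is hypothetical, not the paper's algorithm: it prunes whole fixed-size blocks of a weight matrix by L1 norm, so that every surviving block has the same shape and can map onto a single fixed on-chip buffer tile.

```python
# Hypothetical sketch of memory-footprint-aware block pruning: zero out
# whole (bh x bw) blocks so that all surviving blocks share one shape
# matching a fixed FPGA on-chip buffer tile.

def block_prune(W, bh, bw, keep_ratio):
    """Keep the top `keep_ratio` fraction of (bh, bw) blocks, ranked by
    L1 norm; zero the rest. Matrix dims must be multiples of bh and bw."""
    rows, cols = len(W), len(W[0])
    assert rows % bh == 0 and cols % bw == 0
    scores = []
    for bi in range(rows // bh):
        for bj in range(cols // bw):
            s = sum(abs(W[bi * bh + r][bj * bw + c])
                    for r in range(bh) for c in range(bw))
            scores.append((s, bi, bj))
    scores.sort(reverse=True)
    n_keep = max(1, int(len(scores) * keep_ratio))
    kept = {(bi, bj) for _, bi, bj in scores[:n_keep]}
    pruned = [[0.0] * cols for _ in range(rows)]
    for bi, bj in kept:
        for r in range(bh):
            for c in range(bw):
                pruned[bi * bh + r][bj * bw + c] = W[bi * bh + r][bj * bw + c]
    return pruned, kept

# Example: keep only the densest 2x2 block (25%) of a 4x4 matrix.
W = [[1, 0, 5, 6],
     [0, 1, 7, 8],
     [2, 0, 0, 1],
     [0, 2, 1, 0]]
pruned, kept = block_prune(W, 2, 2, 0.25)  # kept == {(0, 1)}
```

Because every surviving block has the same (bh, bw) footprint, a single buffer and computing kernel sized for that tile suffices, which is the constraint the abstract describes.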


Author(s):  
Paramanand Patil ◽  
Satyanarayan Padaganur ◽  
Umesh Dixit ◽  
Achyut Yaragal

Author(s):  
José Omar Hernández-Vázquez ◽  
Salvador Hernández-González ◽  
José Israel Hernández-Vázquez ◽  
Vicente Figueroa-Fernández ◽  
Claudia Iveth Cancino de la Fuente

Footwear production is subject to the variability inherent in any process, and producers often need tools that allow them to make the right decisions. This work documents the process of optimizing the buffer allocation in a shoe manufacturing line to minimize the cycle time of the system, applying a metamodeling approach. It was found that the Front sewing operation, and the interaction between the Lining sewing operation and the assembly operation, have the greatest effect on the flow time of the product through the process; the optimal assignment of spaces follows a non-uniform arrangement along the line, saturating the slower stations; and the cycle time behaves non-linearly with respect to the total number of spaces (N) in the line, reaching a minimum at a certain value of N.
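The metamodeling approach can be sketched roughly as follows (the data and coefficients here are invented for illustration, not the paper's results): fit a quadratic response surface to simulated cycle times as a function of the total number of buffer spaces N, then take the vertex of the parabola as the estimated optimal N.

```python
# Illustrative metamodel sketch: quadratic fit of cycle time vs. total
# buffer spaces N, using made-up simulation output.
import numpy as np

# Hypothetical simulation runs: (total buffer spaces N, observed cycle time)
N = np.array([4, 6, 8, 10, 12, 14], dtype=float)
cycle_time = np.array([9.8, 8.1, 7.2, 6.9, 7.1, 7.8])

# cycle_time ~ c2*N^2 + c1*N + c0 (coefficients returned highest degree first)
c2, c1, c0 = np.polyfit(N, cycle_time, 2)

# Convexity (c2 > 0) means the cycle time has a minimum at the vertex.
N_star = -c1 / (2 * c2)
print(f"estimated optimal total buffer spaces: N ~ {N_star:.1f}")
```

This mirrors the abstract's finding that cycle time is non-linear in N and reaches a minimum at a certain value; in a real study the fitted surface would also include station-level terms such as the sewing/assembly interaction.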


Author(s):  
Junqi Hu ◽  
Sigrún Andradóttir ◽  
Hayriye Ayhan

Standard server assignment policies for multi-server queueing stations include the noncollaborative policy, where the servers work in parallel on different jobs, and the fully collaborative policy, where the servers work together on the same job. However, if each job can be decomposed into subtasks with no precedence relationships, a third form of server coordination is possible, named task assignment, where the servers work in parallel on different subtasks of the same job. We identify the task assignment policy that maximizes the long-run average throughput of a queueing station with finite internal buffers when blocked servers can be idled or reassigned to replace, or collaborate with, other servers on unblocked subtasks. We then compare the server coordination policies and show that task assignment is best when the servers are highly specialized; otherwise, the fully collaborative or noncollaborative policy is preferable, depending on whether the synergy level among the servers is high. We also provide numerical results that quantify this comparison. Finally, we address buffer allocation in longer lines with precedence relationships between some of the tasks, and present numerical results suggesting that our single-station comparisons generalize to longer lines.
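A toy numeric comparison of the three coordination policies (the deterministic-rate model below is my own simplification, not the paper's queueing analysis) shows why task assignment wins when servers are highly specialized:

```python
# Toy comparison: two servers, jobs with two subtasks,
# mu[i][j] = deterministic rate of server i on subtask j.

def noncollaborative(mu):
    # Each server processes whole jobs alone (both subtasks in sequence);
    # the station throughput is the sum of the individual rates.
    return sum(1.0 / (1.0 / m[0] + 1.0 / m[1]) for m in mu)

def fully_collaborative(mu, synergy):
    # Both servers work together on each subtask in turn; their rates add,
    # scaled by a synergy factor (synergy < 1 models interference).
    r0 = synergy * (mu[0][0] + mu[1][0])
    r1 = synergy * (mu[0][1] + mu[1][1])
    return 1.0 / (1.0 / r0 + 1.0 / r1)

def task_assignment(mu):
    # Server 0 always does subtask 0 and server 1 subtask 1, in parallel;
    # in the long run the slower subtask is the bottleneck.
    return min(mu[0][0], mu[1][1])

# Highly specialized servers: each is fast only on "its" subtask.
mu_special = [[4.0, 0.5], [0.5, 4.0]]
print(task_assignment(mu_special))           # 4.0
print(noncollaborative(mu_special))          # ~0.89
print(fully_collaborative(mu_special, 0.6))  # 1.35
```

With specialists, forcing each server to do both subtasks (noncollaborative) or dragging both servers through every subtask (collaborative, with sub-additive synergy) wastes their skill, matching the abstract's comparison.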


2021 ◽  
Vol 11 (6) ◽  
pp. 2748
Author(s):  
Dug Hee Moon ◽  
Dong Ok Kim ◽  
Yang Woo Shin

The estimation of the production rate (or throughput) is important in manufacturing system design. Herein, we consider the manufacturing system of an automotive body shop in which two types of car are produced, and one type (the engine-powered car) is gradually replaced by the other (the electric car). In this body shop, two different underbody lines are installed because the underbody structures of the two car types differ completely; the side body line and main body line, however, are shared by both cars. Furthermore, we assume that the underbody lines are reconfigurable as the product mix of the electric car increases. A simulation-based meta-model, in the form of a quadratic polynomial function, is developed to estimate the production rate. In the meta-modelling process, we group some buffer locations and represent each group as one variable to reduce the number of variables in the meta-model. The meta-models are then used to optimize two types of buffer allocation problems, and optimal solutions are obtained easily.
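The grouping-plus-optimization step can be sketched as follows (the two grouped variables and the quadratic coefficients are invented for illustration): each grouped buffer variable stands for several buffer locations, a fitted quadratic meta-model predicts the production rate, and the allocation is optimized by enumeration under a total-buffer constraint.

```python
# Sketch of buffer-location grouping: b1 and b2 are shared buffer sizes
# for two hypothetical groups of locations; the quadratic meta-model
# coefficients below are made up for illustration.
from itertools import product

def throughput(b1, b2):
    # Fitted quadratic meta-model of the production rate (illustrative).
    return 50 + 3.0 * b1 + 2.5 * b2 - 0.20 * b1 * b1 \
           - 0.15 * b2 * b2 - 0.05 * b1 * b2

TOTAL = 15  # total buffer spaces available across both groups

# With only two grouped variables, exhaustive enumeration is cheap.
rate, b1, b2 = max((throughput(x, y), x, y)
                   for x, y in product(range(TOTAL + 1), repeat=2)
                   if x + y <= TOTAL)
print(f"best allocation: group1={b1}, group2={b2}, rate={rate:.2f}")
```

Grouping is what makes this tractable: with one variable per individual buffer location, the search space (and the number of meta-model terms to fit) would grow combinatorially.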


Author(s):  
Shaohui Xi ◽  
James MacGregor Smith ◽  
Qingxin Chen ◽  
Ning Mao ◽  
Huiyu Zhang ◽  
...  

Author(s):  
Ai-Lin Yu ◽  
Hui-Yu Zhang ◽  
Qing-Xin Chen ◽  
Ning Mao ◽  
Shao-Hui Xi
