A Partially Shared Thin Reconfigurable Array For Multicore Processor

Author(s):  
Francisco Carlos Junior ◽  
Ivan Silva ◽  
Ricardo Jacobi

Reconfigurable architectures have been widely used as single-core processor accelerators. In the multi-core era, however, it is necessary to review the way that reconfigurable arrays are integrated into multi-core processors. Generally, a set of reconfigurable functional units is employed in the same way as in single-core processors. Unfortunately, this practice considerably increases the area. Moreover, in applications whose threads have unbalanced workloads, this approach can lead to inefficient use of the reconfigurable architecture in cores with a low or even idle workload. To cope with this issue, this work proposes and evaluates a partially shared thin reconfigurable array, which allows reconfigurable resources to be shared among the processor's cores. Sharing is performed dynamically by the configuration scheduler hardware. The results show that the sharing mechanism provided 76% energy savings and improved performance by 41% on average when compared with a version without the proposed reconfigurable array. A comparison with a version of the reconfigurable array without the sharing mechanism shows that the sharing mechanism improved system performance by up to 11.16%.

2010 ◽  
Vol 439-440 ◽  
pp. 1223-1229
Author(s):  
Shuo Li ◽  
Gao Chao Xu ◽  
Yu Shuang Dong ◽  
Feng Wu

With the development of microelectronics technology, Chip Multi-Processor (CMP) or multi-core design has become a mainstream choice for major microprocessor vendors. However, in a chip multiprocessor with a shared cache structure, competing accesses from different applications degrade system performance, resulting in non-optimal performance and unpredictable execution time. Cache partitioning techniques can exclusively partition the shared cache among multiple competing applications. In this paper, we first introduce the problems caused by cache pollution in multicore processor structures; we then present the different methods of cache partitioning in multicore processor structures, categorizing them based on different metrics. Finally, we discuss some possible directions for future research in the area.
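
As an illustration of what exclusive partitioning of a shared cache can look like, the sketch below assigns cache ways to applications greedily by marginal miss reduction. It is not taken from the paper above: the function name `partition_ways`, the per-application miss curves, and the greedy policy are all assumptions chosen for illustration, and greedy allocation is only guaranteed optimal when each miss curve has diminishing returns.

```python
def partition_ways(miss_curves, total_ways):
    """Greedily split `total_ways` cache ways among applications.

    miss_curves[app][w] is the (hypothetical, e.g. profiled) miss count of
    `app` when it owns w ways, for w = 0..total_ways.
    """
    alloc = {app: 0 for app in miss_curves}
    for _ in range(total_ways):
        # Give the next way to the application whose misses drop the most.
        best = max(
            alloc,
            key=lambda a: miss_curves[a][alloc[a]] - miss_curves[a][alloc[a] + 1],
        )
        alloc[best] += 1
    return alloc


# Hypothetical miss curves for two competing applications.
curves = {
    "streaming": [100, 90, 88, 87, 87],  # little reuse, gains little from cache
    "reuse": [100, 60, 35, 30, 29],      # strong reuse, gains a lot at first
}
print(partition_ways(curves, 4))
```

With these made-up curves the application with strong reuse receives three of the four ways, which is the kind of outcome a utility-oriented metric is meant to produce.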


2011 ◽  
Vol 2011 ◽  
pp. 1-15
Author(s):  
Ismail Ktata ◽  
Fakhreddine Ghaffari ◽  
Bertrand Granado ◽  
Mohamed Abid

Applications executed on embedded systems require dynamicity and flexibility according to user and environment needs. A dynamically reconfigurable architecture could satisfy these requirements but needs efficient management mechanisms. In this paper, we propose a dedicated application modeling technique that helps to establish a predictive scheduling approach to manage a dynamically reconfigurable architecture named OLLAF. OLLAF is designed to support an operating system that deals with complex embedded applications. This model will be used for predictive scheduling based on an early estimation of the application's dynamicity. A vision system of a mobile robot application has been used to validate the presented model and scheduling approach. We have demonstrated that our modeling enables efficient predictive scheduling on a robot vision application with a mean error of 6.5%.


1999 ◽  
Vol 121 (3) ◽  
pp. 171-175 ◽  
Author(s):  
Mingsheng Liu ◽  
David E. Claridge

This paper presents the physical models for the maximum potential thermal energy savings from optimizing the hot deck and cold deck reset schedules for dual duct variable air volume systems. The maximum potential savings can be determined by using these models combined with basic system operating parameters and bin data. The system performance can be evaluated by comparing the actual savings with the maximum potential savings. The energy savings from optimal cold deck and hot deck reset schedules in multi-zone buildings should be at least 75 percent of the maximum potential savings.


2014 ◽  
Vol 699 ◽  
pp. 828-833 ◽  
Author(s):  
Sumeru ◽  
Markus ◽  
Farid Nasir Ani ◽  
Henry Nasution

Air conditioning systems account for approximately 50% of the total energy consumption of buildings. The split-type air conditioner is the most widely used type in residential and commercial buildings. As a result, enhancing the performance of air conditioners will yield significant energy savings. The use of an ejector as an expansion device in split-type air conditioners is one method of increasing system performance. Exergy analysis of a split-type air conditioner using an ejector as an expansion device, at room and outdoor temperatures of 24 °C and 34 °C, respectively, yielded an exergy reduction of up to 40.6%. Also, the exergy losses in the compressor had the highest impact on the performance improvement of the split-type air conditioner.


2012 ◽  
Vol 2012 ◽  
pp. 1-31 ◽  
Author(s):  
Lu-Ting Ko ◽  
Jwu-E Chen ◽  
Hsi-Chin Hsin ◽  
Yaw-Shih Shieh ◽  
Tze-Yun Sung

Discrete cosine transform (DCT) and inverse DCT (IDCT) have been widely used in many image processing systems and real-time computation of nonlinear time series. In this paper, the unified DCT/IDCT algorithm based on the subband decompositions of a signal is proposed. It is derived from the data flow of subband decompositions with factorized coefficient matrices in a recursive manner. The proposed algorithm only requires $4^{(\log_2 n)-1}-1$ and $(4^{(\log_2 n)-1}-1)/3$ multiplication time for $n$-point DCT and IDCT, with a single multiplier and a single processor, respectively. Moreover, the peak signal-to-noise ratio (PSNR) of the proposed algorithm outperforms the conventional DCT/IDCT. As a result, the subband-based approach to DCT/IDCT is preferable to the conventional approach in terms of computational complexity and system performance. The proposed reconfigurable architecture of linear array DCT/IDCT processor has been implemented by FPGA.
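
To get a feel for the two multiplication counts quoted in the abstract above, the sketch below evaluates them for a few transform sizes. It assumes the expressions are meant as 4^((log2 n) - 1) - 1 and (4^((log2 n) - 1) - 1)/3, which is a reading of the notation rather than something confirmed here; the function name `multiplication_counts` is hypothetical.

```python
def multiplication_counts(n):
    """Evaluate the two counts for an n-point transform (n a power of two).

    Assumed formulas: 4**(log2(n) - 1) - 1 and that value divided by 3.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two with n >= 2")
    k = n.bit_length() - 1       # k = log2(n) for a power of two
    first = 4 ** (k - 1) - 1
    second = first // 3          # exact: 4**m - 1 is always divisible by 3
    return first, second


for n in (8, 16, 64):
    print(n, multiplication_counts(n))
```

Under this reading, an 8-point transform gives 15 and 5, and a 64-point transform gives 1023 and 341, so both counts grow as roughly n²/4.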


2015 ◽  
Vol 24 (03) ◽  
pp. 1550043 ◽  
Author(s):  
Chen Yang ◽  
Leibo Liu ◽  
Yansheng Wang ◽  
Shouyi Yin ◽  
Peng Cao ◽  
...  

The major bottleneck of coarse-grained reconfigurable arrays (CGRAs) is the excessive configuration overhead; as a result, computing potential cannot be fully utilized. At run-time, the function of CGRAs can be fully and dynamically reconfigured by changing contexts. Therefore, the frequency of context switching on CGRAs is very high. On the other hand, the configuration time of CGRAs is very long. This paper proposes three configuration approaches to reduce interval latency when switching configuration contexts: input data relocation (IDR), line-based context switching (LCS), and loop interval minimization (LIM). IDR relocates input data to the first stage of the pipeline; as a result, the delay time for the input data of the next data flow graph (DFG) is reduced. LCS is a line-based switching mechanism for adjacent independent DFGs that reduces the interval of context switching, thereby expanding the depth of the pipeline. LIM is used to minimize the interval of loops. Simulations on a coarse-grained reconfigurable processor called reconfigurable multimedia system (REMUS) show that 1080p@30 fps for H.264 high profile video decoding can be achieved at a working frequency of 200 MHz. For the AVS and MPEG2 decoding algorithms, much higher performance, i.e., 1080p@39 fps and 1080p@41 fps, respectively, can be achieved.
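
The reported operating point can be turned into a rough per-macroblock cycle budget, which shows why configuration latency matters at this frame rate. The sketch below is plain arithmetic on the figures in the abstract above (200 MHz, 1080p, 30 fps) plus the standard 16×16 H.264 macroblock size. The helper name `cycles_per_macroblock` is hypothetical, and the result is an average budget, not a measurement from REMUS.

```python
import math


def cycles_per_macroblock(clock_hz, fps, width, height, mb_size=16):
    """Return (average cycle budget per macroblock, macroblocks per frame)."""
    # Frame dimensions are rounded up to whole macroblocks (1080 -> 68 rows).
    mbs = math.ceil(width / mb_size) * math.ceil(height / mb_size)
    return clock_hz / (fps * mbs), mbs


budget, mbs = cycles_per_macroblock(200_000_000, 30, 1920, 1080)
print(f"{mbs} macroblocks per frame, about {budget:.0f} cycles per macroblock")
```

A 1080p frame holds 8160 macroblocks, so at 30 fps and 200 MHz each macroblock has on the order of 800 cycles available on average, including any time spent switching contexts.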


2011 ◽  
Vol 361-363 ◽  
pp. 1047-1050
Author(s):  
Bin Liao

The practice of using household billing to promote heating energy savings has become a focus of discussion in current national energy conservation efforts. Nowadays the average energy consumption in China is 2 to 3 times that of developed countries with the same weather conditions, equivalent to the level of developed countries in 60 to 70 years. We report a daily heat-energy consumption measurement test conducted in Beijing since 2009. The results show that 90% of the households tested never adjusted their valves to regulate the heat exchange systems over two winters, while about 5% adjusted their valves at least once. Thus, the way households use central heating does not meet the need to save heat energy.


2018 ◽  
Vol 7 (4) ◽  
pp. 2100
Author(s):  
Safaa S. Omran ◽  
Ahmed K. Abdul-abbas

A hardware design of a multicore 32-bit processor is implemented to achieve low-latency, high-throughput QR decomposition (QRD) based on two algorithms: Gram-Schmidt (GS) and Givens Rotation (GR). The orthogonal matrices are computed on the first core using the Gram-Schmidt algorithm, and the upper triangular matrices are computed on the second core using the Givens Rotation algorithm. This multicore processor design can achieve a throughput of 50M QRD/s for (4 × 4) matrices at a running frequency of 200 MHz.
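
The split described above, Q from Gram-Schmidt and R from Givens rotations, can be sketched in software as follows. This is a floating-point illustration in plain Python, not the paper's hardware design: the function names, the use of modified Gram-Schmidt, and the example matrix are assumptions. The matrix is chosen to be positive definite so that both methods agree on the sign of the last diagonal entry of R.

```python
import math


def gram_schmidt_q(a):
    """Orthogonal factor Q of a square full-rank matrix (modified Gram-Schmidt)."""
    n = len(a)
    q_cols = []
    for j in range(n):
        w = [a[i][j] for i in range(n)]           # column j of A
        for q in q_cols:                          # remove earlier directions
            proj = sum(qi * wi for qi, wi in zip(q, w))
            w = [wi - proj * qi for qi, wi in zip(q, w)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        q_cols.append([wi / norm for wi in w])
    return [[q_cols[j][i] for j in range(n)] for i in range(n)]


def givens_r(a):
    """Upper triangular factor R obtained by zeroing sub-diagonal entries."""
    n = len(a)
    r = [row[:] for row in a]
    for j in range(n):
        for i in range(j + 1, n):
            x, y = r[j][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue
            c, s = x / h, y / h                   # rotation that maps (x, y) to (h, 0)
            for k in range(n):
                rj, ri = r[j][k], r[i][k]
                r[j][k] = c * rj + s * ri
                r[i][k] = -s * rj + c * ri
            r[i][j] = 0.0                         # exact zero instead of rounding noise
    return r


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


A = [[4.0, 1.0, 0.0, 1.0],
     [1.0, 5.0, 1.0, 0.0],
     [0.0, 1.0, 6.0, 1.0],
     [1.0, 0.0, 1.0, 7.0]]
Q, R = gram_schmidt_q(A), givens_r(A)
QR = matmul(Q, R)
err = max(abs(QR[i][j] - A[i][j]) for i in range(4) for j in range(4))
print(f"max |QR - A| = {err:.2e}")
```

The two factors are computed independently of each other here, which is what would allow them to run on separate cores. As a side note on the reported figures, 50M QRD/s at 200 MHz corresponds to one completed decomposition every 4 clock cycles on average.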


Author(s):  
Mònica Paz Barroso ◽  
John R. Wilson

New demands on modern manufacturing systems have emphasised the need for higher levels of overall system reliability. The main focus of this paper is the reliability of manufacturing personnel and the way in which it interrelates with overall system performance. A framework, Human Error and Disturbance Occurrence in Manufacturing Systems (HEDOMS), is proposed, which integrates human reliability with overall system performance, relating human error to disturbance occurrence and handling. The HEDOMS framework has been extended into a toolkit to enable the identification of potential for human error and disturbance occurrence in manufacturing systems, as well as the definition of suitable error reduction measures.


2002 ◽  
Vol 8 (4) ◽  
pp. 279-291 ◽  
Author(s):  
PHILIP EDMONDS ◽  
ADAM KILGARRIFF

Has system performance on Word Sense Disambiguation (WSD) reached a limit? Automatic systems don't perform nearly as well as humans on the task, and from the results of the SENSEVAL exercises, recent improvements in system performance appear negligible or even negative. Still, systems do perform much better than the baselines, so something is being done right. System evaluation is crucial to explain these results and to show the way forward. Indeed, the success of any project in WSD is tied to the evaluation methodology used, and especially to the formalization of the task that the systems perform. The evaluation of WSD has turned out to be as difficult as designing the systems in the first place.

