The ROSACE case study: From Simulink specification to multi/many-core execution

Author(s):  
Claire Pagetti ◽  
David Saussie ◽  
Romain Gratia ◽  
Eric Noulard ◽  
Pierre Siron
2021 ◽  
Vol 36 (1) ◽  
pp. 33-43
Author(s):  
Jian-Bin Fang ◽  
Xiang-Ke Liao ◽  
Chun Huang ◽  
De-Zun Dong

Author(s):  
Haoyuan Ying ◽  
Klaus Hofmann ◽  
Thomas Hollstein

Due to the growing demand for high performance and low power consumption in embedded systems, many-core architectures have been proposed as the most suitable solutions. As the design focus of many-core embedded systems shifts from computation-centric to communication-centric, Network-on-Chip (NoC) is one of the best interconnect techniques for such architectures because of its scalability and high communication bandwidth. Formalized and optimized system-level design methods for NoC-based many-core embedded systems are needed to improve system performance and reduce power consumption. To examine these design optimization methods in depth, this chapter presents a case study of optimizing many-core embedded systems based on a 3-Dimensional (3D) NoC with an irregular vertical link distribution topology through task mapping, core placement, routing, and topology generation. Results of cycle-accurate simulation experiments demonstrate the validity and efficiency of the design methods. For the case study configuration, applying the design optimization methods saves up to 60% of the vertical links while maintaining system efficiency, compared with 3D NoCs with full vertical link connection.


2013 ◽  
Vol 7 (4) ◽  
pp. 143-154
Author(s):  
Han‐Yee Kim ◽  
Young‐Hwan Kim ◽  
HeonChang Yu ◽  
Taeweon Suh

2017 ◽  
Vol 104 ◽  
pp. 234-251
Author(s):  
Ajay Panyala ◽  
Daniel Chavarría-Miranda ◽  
Joseph B. Manzano ◽  
Antonino Tumeo ◽  
Mahantesh Halappanavar

2015 ◽  
Vol 2015 ◽  
pp. 1-12
Author(s):  
Hari Radhakrishnan ◽  
Damian W. I. Rouson ◽  
Karla Morris ◽  
Sameer Shende ◽  
Stavros C. Kassinos

This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.
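The abstract above attributes the scalability gain to replacing a sequential summation with a parallel binary-tree algorithm. The authors' Fortran coarray code is not shown here, so the following is only a minimal Python sketch of the general idea, simulated serially. The function name `tree_sum` and the list of per-image partial sums are illustrative assumptions, not taken from the paper.

```python
def tree_sum(partials):
    """Combine per-worker partial sums pairwise, binary-tree style.

    Returns (total, rounds). A sequential reduction needs n - 1 dependent
    additions; the tree needs only ceil(log2(n)) rounds, because the
    additions within one round are independent of each other.
    """
    values = list(partials)  # hypothetical: one partial sum per image/worker
    n = len(values)
    stride = 1
    rounds = 0
    while stride < n:
        # In a real parallel run, each addition in this inner loop would be
        # performed concurrently by a different worker; here it is simulated.
        for i in range(0, n - stride, 2 * stride):
            values[i] += values[i + stride]
        stride *= 2
        rounds += 1
    total = values[0] if values else 0
    return total, rounds


if __name__ == "__main__":
    # 32 workers, matching the 32-core server mentioned in the abstract.
    total, rounds = tree_sum(range(1, 33))
    print(total, rounds)
```

With 32 partial sums the tree finishes in 5 rounds, whereas a sequential accumulation on a single worker would need 31 dependent additions. That difference in critical-path length is the kind of effect the abstract describes.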


2013 ◽  
Vol 66 (1) ◽  
pp. 431-487
Author(s):  
Arslan Munir ◽  
Farinaz Koushanfar ◽  
Ann Gordon-Ross ◽  
Sanjay Ranka

2011 ◽  
Vol 24 (12) ◽  
pp. 1317-1333
Author(s):  
Bryan Marker ◽  
Ernie Chan ◽  
Jack Poulson ◽  
Robert Geijn ◽  
Rob F. Van der Wijngaart ◽  
...  
