On the Transformation Optimization for Stencil Computation

Electronics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 38
Author(s):  
Huayou Su ◽  
Kaifang Zhang ◽  
Songzhu Mei

Stencil computation optimizations have been investigated extensively, and various approaches have been proposed. Loop transformation is a vital class of optimization in modern production compilers and has been employed successfully within them. In this paper, we combine the two aspects to study the potential benefits that some common transformation recipes may have for stencils. The recipes consist of loop unrolling, loop fusion, address precalculation, redundancy elimination, instruction reordering, load balance, and a forward and backward update algorithm named semi-stencil. Experimental evaluations of diverse stencil kernels, including 1D, 2D, and 3D computation patterns, on two typical ARM and Intel platforms, demonstrate the respective effects of the transformation recipes. For the single transformation recipes we analyze, an average speedup of 1.65× is obtained, with a best of 1.88×. The compound recipes demonstrate a maximum speedup of 1.92×.
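As a rough illustration of one recipe named above, loop unrolling, the following Python sketch applies it to a 1D three-point stencil. The kernel, the coefficients `c0` and `c1`, and the function names are assumptions made for illustration; the abstract does not specify the paper's kernels. Python is used only to show the shape of the transformation; any speedup would be expected in compiled code, not here.

```python
def stencil_1d(src, c0=0.5, c1=0.25):
    """Baseline 3-point stencil; boundary points are copied unchanged."""
    n = len(src)
    dst = list(src)
    for i in range(1, n - 1):
        dst[i] = c0 * src[i] + c1 * (src[i - 1] + src[i + 1])
    return dst


def stencil_1d_unrolled(src, c0=0.5, c1=0.25):
    """Same stencil with the loop body unrolled by a factor of two."""
    n = len(src)
    dst = list(src)
    i = 1
    # Main loop: update two interior points (i and i + 1) per iteration.
    while i + 1 < n - 1:
        dst[i] = c0 * src[i] + c1 * (src[i - 1] + src[i + 1])
        dst[i + 1] = c0 * src[i + 1] + c1 * (src[i] + src[i + 2])
        i += 2
    # Remainder loop: handle a leftover interior point, if any.
    while i < n - 1:
        dst[i] = c0 * src[i] + c1 * (src[i - 1] + src[i + 1])
        i += 1
    return dst
```

The remainder loop is what keeps the unrolled version correct when the number of interior points is not a multiple of the unroll factor.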

2009 ◽  
Vol 9 (2) ◽  
pp. 98-114 ◽  
Author(s):  
Lars Winkler Pettersson ◽  
Andreas Kjellin ◽  
Mats Lind ◽  
Stefan Seipel

Multi-Viewer Display Environments (MVDE) provide unique opportunities to present personalized information to several users concurrently in the same physical display space. MVDEs can support correct 3D visualizations to multiple users, present correctly oriented text and symbols to all viewers and allow individually chosen subsets of information in a shared context. MVDEs aim at supporting collaborative visual analysis, and when used to visualize disjoint information in partitioned visualizations they even necessitate collaboration. When solving visual tasks collaboratively in an MVDE, overall performance is affected not only by the inherent effects of the graphical presentation but also by the interaction between the collaborating users. We present results from an empirical study in which we compared views lacking shared visual references in disjoint sets of information to views with mutually shared information. We investigated the potential benefits of 2D and 3D visualizations in a collaborative task, as well as the effects of partitioning visualizations in terms of task performance, interaction behavior, and clutter reduction. In our study of a collaborative task that required only a minimum of information to be shared, we found that partitioned views with a lack of shared visual references were significantly less efficient than integrated views. However, the study showed that subjects were equally capable of solving the task at low error levels in partitioned and integrated views. An explorative analysis revealed that the amount of visual clutter was reduced heavily in partitioned visualization, whereas verbal and deictic communication between subjects increased. It also showed that the type of visualization (2D/3D) strongly affects interaction behavior. An interesting result is that collaboration on complex geo-time visualizations is as efficient in 2D as in 3D.


Cells ◽  
2018 ◽  
Vol 7 (12) ◽  
pp. 225 ◽  
Author(s):  
Yuan Liu ◽  
Ye-Guang Chen

Colorectal cancer (CRC) is one of the most common cancers, with high incidence and mortality in both males and females. As various factors have been found to contribute to CRC development, personalized therapies are critical for efficient treatment. To achieve this purpose, the establishment of patient-derived tumor models is critical for diagnosis and drug testing. The establishment of three-dimensional (3D) organoid cultures and two-dimensional (2D) monolayer cultures of patient-derived epithelial tissues is a breakthrough for expanding living materials for later use. This review provides an overview of the different types of 2D- and 3D-based intestinal stem cell cultures, their potential benefits, and their drawbacks for personalized medicine in the treatment of intestinal disorders.


2020 ◽  
Author(s):  
Italo Epicoco ◽  
Francesca Mele ◽  
Silvia Mocavero ◽  
Marco Chiarelli ◽  
Alessandro D'Anca ◽  
...  

In the roadmap of modern parallel architecture development, the computing power of a node grows much more quickly than main memory performance (capacity, bandwidth). This leads to an ever-widening gap between computing and memory resources, so efficient use of cache memory is becoming ever more essential as an optimization technique.
The NEMO model uses a finite difference integration method and a regular Cartesian grid for space discretization. The NEMO code reflects this choice: a generic field is represented in memory as a 3D array, and the code is mainly composed of three-level nested loops. These loops often include only a few operations in the body; the results are stored in a temporary 3D array and then used in subsequent loops until the final calculation.
The aim of this work is to make better use of the cache memory by fusing DO loops together. Loop fusion is a transformation that takes two or more adjacent loops with the same iteration space traversal and combines their bodies into a single loop.
Fusing the loops is not trivial, and it may require introducing additional redundant operations to resolve data dependencies. Unfortunately, these redundant operations degrade overall performance. To avoid them, we can adopt pointers to arrays and implement a pointer rotation at each loop iteration.
We have developed the loop fusion transformation in an advection kernel extracted from the NEMO oceanic model. We have compared 3 different versions of the optimized advection kernel, with 3 different levels of loop fusion.
The first prototype is the implementation where extreme fusion is applied and all loops in the routine have been fused. In this version, the operations are replicated up to 3 times. In the second prototype, the buffer rotation has been applied only in the outermost loop. In the third prototype, the buffer rotation has also been implemented for the second dimension, and this version introduces only a limited number of redundant operations.
The tests have been performed on the Athena cluster located at the CMCC supercomputing center. The supercomputing infrastructure is based on Intel Xeon E5-2670 processors. The memory hierarchy is composed of 32KB of L1 cache, 256KB of L2 cache, and 20MB of L3 cache shared among the cores. The results clearly demonstrated the effectiveness of the loop fusion approach, which reaches a speedup of 2x with a high number of cores. The third prototype has proven to be the most promising solution. Prototypes 1 and 2 provide a good improvement up to 256 cores; beyond that, the redundant operations lead to a loss of performance.
A deeper analysis measuring the Last Level Cache misses also showed that the loop transformation significantly reduced the number of cache misses.
Despite the good results achieved with the loop fusion optimization, we remark that this optimization is strictly linked to the computing architecture. A fully portable performance improvement can be ensured by the adoption of a DSL (Domain Specific Language).
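The trade-off described above between redundant operations and buffer rotation can be sketched on a small 2D example. This is a minimal illustration in Python, not the NEMO advection kernel: the pointwise function `stage`, the difference-of-rows second loop, and all function names are hypothetical, and the input is assumed to be a non-empty rectangular list of rows.

```python
def stage(x):
    """Hypothetical first-loop body: a cheap pointwise operation."""
    return 2.0 * x + 1.0


def unfused(a):
    """Two separate loop nests communicating through a full temporary array."""
    nj, ni = len(a), len(a[0])
    tmp = [[stage(a[j][i]) for i in range(ni)] for j in range(nj)]
    out = [[0.0] * ni for _ in range(nj)]
    for j in range(1, nj):
        for i in range(ni):
            out[j][i] = tmp[j][i] - tmp[j - 1][i]
    return out


def fused_redundant(a):
    """Fused loop nest: no temporary array, but stage() is evaluated twice per point."""
    nj, ni = len(a), len(a[0])
    out = [[0.0] * ni for _ in range(nj)]
    for j in range(1, nj):
        for i in range(ni):
            out[j][i] = stage(a[j][i]) - stage(a[j - 1][i])
    return out


def fused_rotated(a):
    """Fused loop nest with two row buffers rotated in the outermost loop."""
    nj, ni = len(a), len(a[0])
    out = [[0.0] * ni for _ in range(nj)]
    prev = [stage(x) for x in a[0]]  # staged values of row j - 1
    cur = [0.0] * ni                 # staged values of row j
    for j in range(1, nj):
        for i in range(ni):
            cur[i] = stage(a[j][i])
            out[j][i] = cur[i] - prev[i]
        prev, cur = cur, prev        # rotate buffers: no copy, no recomputation
    return out
```

All three functions compute the same result; the rotated version replaces the full temporary array with two row-sized buffers while still evaluating `stage` only once per point.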


Author(s):  
P.M. Rice ◽  
MJ. Kim ◽  
R.W. Carpenter

Extrinsic gettering of Cu on near-surface dislocations in Si has been the topic of recent investigation. It was shown that the Cu precipitated heterogeneously on dislocations as Cu silicide along with voids, and also with a secondary planar precipitate of unknown composition. Here we report the results of investigations of the sense of the strain fields about the large (~100 nm) silicide precipitates, and further analysis of the small (~10-20 nm) planar precipitates. Numerous dark field images were analyzed in accordance with Ashby and Brown's criteria for determining the sense of the strain fields about precipitates. While the situation is complicated by the presence of dislocations and secondary precipitates, micrographs like those shown in Fig. 1(a) and 1(b) tend to show anomalously wide strain fields with the dark side on the side of negative g, indicating the strain fields about the silicide precipitates are vacancy in nature. This is in conflict with information reported on the η'' phase (the Cu silicide phase presumed to precipitate within the bulk) whose interstitial strain field is considered responsible for the interstitial Si atoms which cause the bounding dislocation to expand during star colony growth.


2021 ◽  
Author(s):  
Ruoyang Liu ◽  
Ke Tian Tan ◽  
Yifan Gong ◽  
Yongzhi Chen ◽  
Zhuoer Li ◽  
...  

Covalent organic frameworks offer a molecular platform for integrating organic units into periodically ordered yet extended 2D and 3D polymers to create topologically well-defined polygonal lattices and built-in discrete micropores and/or mesopores.


2014 ◽  
Vol 4 (1) ◽  
pp. 23-29
Author(s):  
Constance Hilory Tomberlin

There are a multitude of reasons that a teletinnitus program can be beneficial, not only to the patients, but also within the hospital and audiology department. The ability to use technology for the purpose of tinnitus management allows for improved appointment access for all patients, especially those who live at a distance; has been shown to be more cost effective when the patients' travel is otherwise monetarily compensated; and allows for multiple patients to be seen in the same time slots, allowing for greater access to the clinic for the patients wishing to be seen in-house. There is also the patients' excitement in being part of a new technology-based program. The Gulf Coast Veterans Health Care System (GCVHCS) saw the potential benefits of incorporating a teletinnitus program and began implementation in 2013. There were a few hurdles to work through during the beginning organizational process and the initial execution of the program. Since the establishment of the teletinnitus program, the GCVHCS has seen an enhancement in patient care, a reduction in travel compensation, improvements in clinic utilization and clinic availability, genuine excitement about the use of a new healthcare medium amongst staff and patients, and greater overall patient satisfaction.


2012 ◽  
Author(s):  
Michael Sackllah ◽  
Denny Yu ◽  
Charles Woolley ◽  
Steven Kasten ◽  
Thomas J. Armstrong

Author(s):  
Denny Yu ◽  
Michael Sackllah ◽  
Charles Woolley ◽  
Steven Kasten ◽  
Thomas J. Armstrong
