Shared Memory Parallelism

Author(s):  
Wesley Petersen ◽  
Peter Arbenz

Shared memory machines typically have relatively few processors, say 2–128. An intrinsic characteristic of these machines is a strategy for memory coherence and a fast, tightly coupled network for distributing data from a commonly accessible memory system. Our test examples were run on two HP Superdome clusters: Stardust is a production machine with 64 PA-8700 processors, and Pegasus is a 32-CPU machine with the same kind of processors. The HP9000 is grouped into cells, each with 4 CPUs and a common memory per cell, connected to a CCNUMA crossbar network. The network consists of sets of 4×4 crossbars and is shown in Figure 4.2.

An effective bandwidth test, the EFF_BW benchmark [116], groups processors into two equally sized sets. Arbitrary pairings are made between elements from each group (Figure 4.3), and the cross-sectional bandwidth of the network is measured for a fixed number of processors and varying message sizes. The results from the HP9000 machine Stardust are shown in Figure 4.4. It is clear from this figure that the cross-sectional bandwidth of the network is quite high. Although not apparent from Figure 4.4, the latency for this test (the intercept near Message Size = 0) is not high. Because of the low incremental resolution of MPI_Wtime, multiple test runs must be done to quantify the latency. Dr Byrde’s tests show that minimum latency is ≳ 1.5 μs.

A clearer example of a shared memory architecture is the Cray X1 machine, shown in Figures 4.5 and 4.6. In Figure 4.6, the shared memory design is obvious. Each multi-streaming processor (MSP) shown in Figure 4.5 has 4 processors (custom-designed processor chips forged by IBM) and 4 corresponding caches. Although not clear from available diagrams, vector memory access apparently permits cache by-pass; hence the term streaming in MSP. That is, vector registers are loaded directly from memory: see, for example, Figure 3.4. On each board (called a node) are 4 such MSPs and 16 memory modules, which share a common (coherent) memory view. Coherence is maintained only within each board, not across multi-board systems.
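
The pairing scheme and the latency intercept described for the EFF_BW test can be illustrated with a minimal Python sketch. It is not the benchmark's actual code: the function names `pair_halves` and `fit_latency_bandwidth` are hypothetical, and the sketch assumes the simple linear model t = latency + size / bandwidth, so that the latency is the intercept at message size 0 and the bandwidth is the inverse slope. Timings would come from repeated runs (for example, averaged MPI_Wtime differences); none are measured here.

```python
import random


def pair_halves(num_procs, seed=0):
    """Split ranks 0..num_procs-1 into two equal sets and pair them at random.

    Hypothetical helper: mirrors the 'two equally sized sets, arbitrary
    pairings' description, not the benchmark's real pairing code.
    """
    if num_procs % 2 != 0:
        raise ValueError("need an even number of processors")
    half = num_procs // 2
    left = list(range(half))
    right = list(range(half, num_procs))
    random.Random(seed).shuffle(right)  # arbitrary partner for each left rank
    return list(zip(left, right))


def fit_latency_bandwidth(sizes, times):
    """Least-squares fit of the assumed model t = latency + size / bandwidth.

    sizes: message sizes in bytes; times: transfer times in seconds,
    ideally each averaged over many runs because the timer is coarse.
    Returns (latency in seconds, bandwidth in bytes per second).
    """
    n = len(sizes)
    mean_s = sum(sizes) / n
    mean_t = sum(times) / n
    sxx = sum((s - mean_s) ** 2 for s in sizes)
    sxt = sum((s - mean_s) * (t - mean_t) for s, t in zip(sizes, times))
    slope = sxt / sxx                    # seconds per byte
    latency = mean_t - slope * mean_s    # intercept at message size 0
    return latency, 1.0 / slope
```

Averaging many timed repetitions per message size before fitting is one way to work around a timer whose resolution is coarse compared with a single short transfer.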

Author(s):  
J.-F. Revol ◽  
Y. Van Daele ◽  
F. Gaill

The only form of cellulose which could unequivocally be ascribed to the animal kingdom is the tunicin that occurs in the tests of the tunicates. Recently, high-resolution solid-state ¹³C NMR revealed that tunicin belongs to the Iβ form of cellulose as opposed to the Iα form found in Valonia and bacterial celluloses. The high perfection of the tunicin crystallites led us to study their cross-sectional shape and to compare it with the shape of those in Valonia ventricosa (V.v.), the goal being to relate the cross-section of cellulose crystallites to the two allomorphs Iα and Iβ. In the present work the source of tunicin was the test of the ascidian Halocynthia papillosa (H.p.). Diffraction contrast imaging in the bright field mode was applied to ultrathin sections of the V.v. cell wall and H.p. test with cellulose crystallites perpendicular to the plane of the sections. The electron microscope, a Philips 400T, was operated at 120 kV in a low intensity beam condition.


1960 ◽  
Vol 19 (3) ◽  
pp. 803-809
Author(s):  
D. J. Matthews ◽  
R. A. Merkel ◽  
J. D. Wheat ◽  
R. F. Cox

2018 ◽  
Author(s):  
Sang Hoon Lee ◽  
Jeff Blackwood ◽  
Stacey Stone ◽  
Michael Schmidt ◽  
Mark Williamson ◽  
...  

Abstract: The cross-sectional and planar analysis of current-generation 3D device structures can be performed using a single Focused Ion Beam (FIB) mill. This is achieved using a diagonal milling technique that exposes a multilayer planar surface as well as the cross-section. This provides image data that allow for an efficient method to monitor the fabrication process and find device design errors. The process greatly reduces sample-to-data time, decreasing it from days to hours while still providing precise defect and structure data.

