Balancing productivity and performance on the Cell Broadband Engine

Author(s):  
Sadaf R Alam ◽  
Jeremy S Meredith ◽  
Jeffrey S Vetter
2009 ◽  
Vol 17 (1-2) ◽  
pp. 43-57 ◽  
Author(s):  
Michael Kistler ◽  
John Gunnels ◽  
Daniel Brokenshire ◽  
Brad Benton

In this paper we present the design and implementation of the Linpack benchmark for the IBM BladeCenter QS22, which incorporates two IBM PowerXCell 8i processors. The PowerXCell 8i is a new implementation of the Cell Broadband Engine™ architecture and contains a set of special-purpose processing cores known as Synergistic Processing Elements (SPEs). The SPEs can be used as computational accelerators to augment the main PowerPC processor. The added computational capability of the SPEs results in a peak double-precision floating-point capability of 108.8 GFLOPS. We explain how we modified the standard open source implementation of Linpack to accelerate key computational kernels using the SPEs of the PowerXCell 8i processors. We describe in detail the implementation and performance of the computational kernels and also explain how we employed the SPEs for high-speed data movement and reformatting. The result of these modifications is a Linpack benchmark optimized for the IBM PowerXCell 8i processor that achieves 170.7 GFLOPS on a BladeCenter QS22 with 32 GB of DDR2 SDRAM memory. Our implementation of Linpack also supports clusters of QS22s, and was used to achieve a result of 11.1 TFLOPS on a cluster of 84 QS22 blades. We compare our results on a single BladeCenter QS22 with the base Linpack implementation without SPE acceleration to illustrate the benefits of our optimizations.
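
As a rough check on the figures quoted in this abstract, the following sketch compares the reported Linpack results with the theoretical peak. It assumes, which the abstract does not state explicitly, that the 108.8 GFLOPS peak applies to each PowerXCell 8i processor and that both processors of a QS22 contribute to the blade peak. The function and variable names are illustrative only.

```python
def linpack_efficiency(achieved_gflops, peak_per_processor_gflops, processors):
    """Fraction of theoretical peak reached by a Linpack run."""
    blade_peak = peak_per_processor_gflops * processors
    return achieved_gflops / blade_peak


# Figures quoted in the abstract.
ACHIEVED_SINGLE_BLADE_GFLOPS = 170.7
PEAK_PER_PROCESSOR_GFLOPS = 108.8   # assumed to be per processor
PROCESSORS_PER_BLADE = 2            # "incorporates two IBM PowerXCell 8i processors"
CLUSTER_TFLOPS = 11.1
CLUSTER_BLADES = 84

single_blade_efficiency = linpack_efficiency(
    ACHIEVED_SINGLE_BLADE_GFLOPS, PEAK_PER_PROCESSOR_GFLOPS, PROCESSORS_PER_BLADE
)

# Average sustained rate per blade in the 84-blade cluster run.
cluster_gflops_per_blade = CLUSTER_TFLOPS * 1000.0 / CLUSTER_BLADES

print(f"single-blade efficiency: {single_blade_efficiency:.1%}")
print(f"cluster rate per blade:  {cluster_gflops_per_blade:.1f} GFLOPS")
```

Under that assumption the single-blade run reaches roughly 78% of peak, and the cluster run averages about 132 GFLOPS per blade.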


Author(s):  
H. M. Thieringer

It has repeatedly been shown that very fine electron probes can be produced with conventional electron microscopes, allowing various micro-techniques such as micro recording, X-ray microanalysis and convergent beam diffraction. In this paper the function and performance of a SIEMENS ELMISKOP 101 used as a scanning transmission microscope (STEM) is described. This mode of operation has some advantages over conventional transmission microscopy (CTEM), especially for the observation of thick specimens, in spite of somewhat longer image recording times. Fig. 1 shows schematically the ray path and the additional electronics of an ELMISKOP 101 working as a STEM. With a point-cathode, and using condenser I and the objective lens as a demagnifying system, an electron probe with a half-width of about 25 Å and a typical current of 5×10⁻¹¹ A at 100 kV can be obtained in the back focal plane of the objective lens.


Author(s):  
Huang Min ◽  
P.S. Flora ◽  
C.J. Harland ◽  
J.A. Venables

A cylindrical mirror analyser (CMA) has been built with a parallel recording detection system. It is being used for angular resolved electron spectroscopy (ARES) within a SEM. The CMA has been optimised for imaging applications; the inner cylinder contains a magnetically focused and scanned, 30 kV, SEM electron-optical column. The CMA has a large inner radius (50.8 mm) and a large collection solid angle (Ω > 1 sr). An energy resolution (ΔE/E) of 1-2% has been achieved. The design and performance of the combination SEM/CMA instrument have been described previously, and the CMA and detector system have been used for low voltage electron spectroscopy. Here we discuss the use of the CMA for ARES and present some preliminary results. The CMA has been designed for an axis-to-ring focus and uses an annular type detector. This detector consists of a channel-plate/YAG/mirror assembly which is optically coupled to either a photomultiplier for spectroscopy or a TV camera for parallel detection.


Author(s):  
Joe A. Mascorro ◽  
Gerald S. Kirby

Embedding media based upon an epoxy resin of choice and the acid anhydrides dodecenyl succinic anhydride (DDSA) and nadic methyl anhydride (NMA), catalyzed by the tertiary amine 2,4,6-Tri(dimethylaminomethyl) phenol (DMP-30), are widely used in biological electron microscopy. These media possess a viscosity character that can impair tissue infiltration, particularly if original Epon 812 is utilized as the base resin. Other resins that are considerably less viscous than Epon 812 are now available as replacements. Likewise, nonenyl succinic anhydride (NSA) and dimethylaminoethanol (DMAE) are more fluid than their counterparts DDSA and DMP-30 commonly used in earlier formulations. This work utilizes novel epoxy and anhydride combinations in order to produce embedding media with desirable flow rate and viscosity parameters that, in turn, would allow the medium to optimally infiltrate tissues. Specifically, embedding media based on EmBed 812 or LX 112 with NSA (in place of DDSA) and DMAE (replacing DMP-30), with NMA remaining constant, are formulated and offered as alternatives for routine biological work. Individual epoxy resins (Table I) or complete embedding media (Tables II-III) were tested for flow rate and viscosity. The novel media were further examined for their ability to infiltrate tissues and polymerize, for their sectioning and staining character, and for their strength and stability under the electron beam and column vacuum. For physical comparisons, a volume (9 ml) of either resin or media was aspirated into a capillary viscometer oriented vertically. The material was then allowed to flow out freely under the influence of gravity, and the flow time necessary for the volume to exit was recorded (Col B,C; Tables). In addition, the volume flow rate (ml flowing/second; Col D, Tables) was measured.
Viscosity (η) could then be determined by using the Hagen-Poiseuille relation for laminar flow, η = c·ρ/Q, where c = a geometric constant from an instrument calibration with water, ρ = mass density, and Q = volume flow rate. Mass and density of the materials were determined as well (Col F,G; Tables). Infiltration schedules utilized were short (1/2 hr 1:1, 3 hrs full resin), intermediate (1/2 hr 1:1, 6 hrs full resin), or long (1/2 hr 1:1, 6 hrs full resin) in total time. Polymerization schedules ranging from 15 hrs (overnight) through 24, 36, or 48 hrs were tested. Sections demonstrating gold interference colors were collected on unsupported 200-300 mesh grids and stained sequentially with uranyl acetate and lead citrate.
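
The viscosity calculation described in this abstract (viscosity = c × density / Q, with c obtained from a water calibration) can be illustrated with a short sketch. All numeric values below are hypothetical placeholders chosen for readability; they are not taken from the Tables, and the function names are illustrative only.

```python
def calibrate_constant(viscosity_ref, density_ref, flow_rate_ref):
    """Geometric constant c from a reference liquid: c = eta * Q / rho."""
    return viscosity_ref * flow_rate_ref / density_ref


def viscosity(c, density, flow_rate):
    """Viscosity from the relation eta = c * rho / Q."""
    return c * density / flow_rate


VOLUME_ML = 9.0  # volume aspirated into the viscometer, as in the abstract

# Hypothetical water calibration: 1.0 mPa*s, 1.0 g/ml, 9 ml exits in 10 s.
water_flow_rate = VOLUME_ML / 10.0          # ml/s
c = calibrate_constant(1.0, 1.0, water_flow_rate)

# Hypothetical resin: 1.2 g/ml, 9 ml exits in 1500 s.
resin_flow_rate = VOLUME_ML / 1500.0        # ml/s
resin_viscosity = viscosity(c, 1.2, resin_flow_rate)

print(f"c = {c:.3f}, resin viscosity = {resin_viscosity:.1f} mPa*s")
```

Because c is fixed by the calibration, a slower outflow (smaller Q) or a denser material gives a proportionally higher viscosity.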


Author(s):  
D. E. Newbury ◽  
R. D. Leapman

Trace constituents, which can be very loosely defined as those present at concentration levels below 1 percent, often exert influence on structure, properties, and performance far greater than what might be estimated from their proportion alone. Defining the role of trace constituents in the microstructure, or indeed even determining their location, makes great demands on the available array of microanalytical tools. These demands become increasingly challenging as the dimensions of the volume element to be probed become smaller. For example, a cubic volume element of silicon with an edge dimension of 1 micrometer contains approximately 5×10¹⁰ atoms. High performance secondary ion mass spectrometry (SIMS) can be used to measure trace constituents to levels of hundreds of parts per billion from such a volume element (e.g., detection of at least 100 atoms to give 10% reproducibility with an overall detection efficiency of 1%, considering ionization, transmission, and counting).
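
The arithmetic behind the silicon example can be reproduced with a short sketch. The density and molar mass of silicon and the Avogadro constant are standard reference values that do not appear in the abstract. The reading that 100 detected atoms at 1% overall efficiency require 10,000 trace atoms in the volume is an interpretation of the parenthetical example, not a statement by the authors.

```python
AVOGADRO = 6.02214076e23        # atoms per mole
SI_DENSITY_G_PER_CM3 = 2.33     # standard reference value for silicon
SI_MOLAR_MASS_G_PER_MOL = 28.0855

# A cube with a 1 micrometer edge: (1e-4 cm)^3 = 1e-12 cm^3.
edge_cm = 1e-4
volume_cm3 = edge_cm ** 3

atoms_in_cube = (
    SI_DENSITY_G_PER_CM3 * volume_cm3 / SI_MOLAR_MASS_G_PER_MOL * AVOGADRO
)

# Figures from the abstract: at least 100 detected atoms, 1% overall efficiency.
detected_atoms = 100
detection_efficiency = 0.01
trace_atoms_required = detected_atoms / detection_efficiency

detection_limit_ppb = trace_atoms_required / atoms_in_cube * 1e9

print(f"atoms in cube:   {atoms_in_cube:.2e}")
print(f"detection limit: {detection_limit_ppb:.0f} ppb (atomic)")
```

The result, about 5×10¹⁰ atoms and a limit near 200 parts per billion, is consistent with the "hundreds of parts per billion" quoted in the abstract.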


1986 ◽  
Vol 50 (5) ◽  
pp. 264-267 ◽  
Author(s):  
GH Westerman ◽  
TG Grandy ◽  
JV Lupo ◽  
RE Mitchell
