Thanos: High-Performance CPU-GPU Based Balanced Graph Partitioning Using Cross-Decomposition

Author(s):  
Dae Hee Kim ◽  
Rakesh Nagi ◽  
Deming Chen
2021 ◽  
Vol 22 (4) ◽  
pp. 413-424
Author(s):  
Siddheshwar Vilas Patil ◽  
Dinesh B. Kulkarni

In high-performance computing (HPC) and parallel computing, much of the decision-making concerns distributing the payload (input) uniformly across the available resources, chiefly processors; HPC deals with the hardware and its better utilization. In parallel computing, a larger, complex problem is broken down into multiple smaller calculations that are executed simultaneously on several processors. Efficient use of the processors is vital to achieving maximum throughput, which requires uniform load distribution across the available processors, i.e. load balancing. Load balancing in parallel computing is modeled as a graph partitioning problem, in which the node weights represent the computing cost at each node and the edge weights represent the communication cost between connected nodes. The goal is to partition the graph G into k partitions such that: I) the sum of the node weights is approximately equal for each partition, and II) the sum of the weights of the edges that cross between partitions is minimal. In this paper, a novel node-weighted and edge-weighted k-way balanced graph partitioning (NWEWBGP) algorithm of complexity O(n × n) is proposed. The algorithm works for all relevant values of k, and meets or improves on earlier algorithms in terms of balanced partitioning and lowest edge-cut. For evaluation and validation, the outcome is compared with ground-truth benchmarks.
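The two partitioning criteria stated in the abstract can be made concrete with a small sketch. This is not the NWEWBGP algorithm itself, only a way to score a given partition against criteria I and II; the function name `partition_quality` and the toy graph are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def partition_quality(node_weight, edges, part):
    """Score a partition: per-partition node-weight sums and total cut weight.

    node_weight: dict node -> computing cost
    edges: list of (u, v, communication cost)
    part: dict node -> partition id
    """
    load = defaultdict(float)
    for node, w in node_weight.items():
        load[part[node]] += w          # criterion I: loads should be near-equal
    # criterion II: total weight of edges whose endpoints lie in different parts
    cut = sum(w for u, v, w in edges if part[u] != part[v])
    return dict(load), cut

# Hypothetical 4-node example split into k = 2 partitions.
node_weight = {"a": 2, "b": 1, "c": 1, "d": 2}
edges = [("a", "b", 3), ("b", "c", 1), ("c", "d", 3), ("a", "d", 1)]
part = {"a": 0, "b": 0, "c": 1, "d": 1}

loads, cut = partition_quality(node_weight, edges, part)
print(loads, cut)
```

In this toy case both partitions carry the same node weight, and only the two light edges are cut, which is the kind of outcome a balanced partitioner aims for.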


Author(s):  
Е.Н. Головченко ◽  
М.В. Якобовский

The problem of rational mesh decomposition (load balancing) arises in parallel mesh-based numerical simulation of problems in continuum mechanics, pulsed power engineering, electrodynamics, and other fields on high-performance computing systems. The number of processors on which a computational problem will run is usually not known in advance. It therefore makes sense to partition the mesh once, beforehand, into a large number of microdomains, which are then used to form subdomains. The graph partitioning methods implemented in the state-of-the-art parallel partitioning tools ParMETIS, Jostle, PT-Scotch and Zoltan are based on multilevel (hierarchical) algorithms, which have the shortcoming of forming unconnected subdomains. Another shortcoming of these packages is that they produce strongly imbalanced partitions. The GridSpiderPar program package was developed for parallel decomposition of large meshes. Computational experiments compared partitions into microdomains, partitions of microdomain graphs into subdomains, and direct partitions into subdomains for several meshes ($10^8$ vertices, $10^9$ elements), obtained with GridSpiderPar and with the packages ParMETIS, Zoltan and PT-Scotch. Partition quality was assessed by the imbalance in the number of vertices per subdomain, the number of unconnected subdomains and the edge-cut, as well as by the parallel performance of gas-dynamics simulations run with the meshes distributed over cores according to the different partitions. The results demonstrate the advantages of the proposed algorithms.
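One of the quality measures named above, the number of unconnected subdomains, can be illustrated with a short sketch. It is an assumption-laden illustration, not code from GridSpiderPar: the function name `components_per_domain` and the five-vertex mesh are hypothetical. A subdomain is counted as unconnected when its vertices form more than one connected component using only edges internal to that subdomain.

```python
from collections import defaultdict, deque

def components_per_domain(edges, domain):
    """Count connected components inside each subdomain.

    edges: list of (u, v) mesh-graph edges
    domain: dict vertex -> subdomain id
    """
    adj = defaultdict(list)
    for u, v in edges:
        if domain[u] == domain[v]:     # keep only edges internal to a subdomain
            adj[u].append(v)
            adj[v].append(u)
    seen = set()
    comps = defaultdict(int)
    for start in domain:
        if start in seen:
            continue
        comps[domain[start]] += 1      # new component found in this subdomain
        seen.add(start)
        queue = deque([start])
        while queue:                   # breadth-first search within the subdomain
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    return dict(comps)

# Hypothetical path-shaped mesh 0-1-2-3-4 where subdomain "A" is split by "B".
mesh_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
domain = {0: "A", 1: "A", 2: "B", 3: "A", 4: "A"}

comps = components_per_domain(mesh_edges, domain)
unconnected = [d for d, c in comps.items() if c > 1]
print(comps, unconnected)
```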


Author(s):  
A. V. Crewe ◽  
M. Isaacson ◽  
D. Johnson

A double focusing magnetic spectrometer has been constructed for use with a field emission electron gun scanning microscope in order to study the electron energy loss mechanism in thin specimens. It is of the uniform field sector type with curved pole pieces. The shape of the pole pieces is determined by requiring that all particles be focused to a point at the image slit (point 1). The resultant shape gives perfect focusing in the median plane (Fig. 1) and first order focusing in the vertical plane (Fig. 2).


Author(s):  
N. Yoshimura ◽  
K. Shirota ◽  
T. Etoh

One of the most important requirements for a high-performance EM, especially an analytical EM using a fine beam probe, is to prevent specimen contamination by providing a clean high vacuum in the vicinity of the specimen. However, in almost all commercial EMs, the pressure in the vicinity of the specimen under observation is usually more than ten times higher than the pressure measured at the pumping line. The EM column inevitably requires the use of greased Viton O-rings for fine movement, specimens and films need to be exchanged frequently, and several attachments may also be exchanged. For these reasons, a high-speed pumping system, as well as a clean vacuum system, is now required. A newly developed electron microscope, the JEM-100CX, features clean high vacuum in the vicinity of the specimen, realized by the use of a CASCADE type diffusion pump system which has been essentially improved over its predecessor employed on the JEM-100C.


Author(s):  
John W. Coleman

In the design engineering of high performance electromagnetic lenses, the direct conversion of electron optical design data into drawings for reliable hardware is oftentimes difficult, especially in terms of how to mount parts to each other, how to tolerance dimensions, and how to specify finishes. An answer to this is in the use of magnetostatic analytics, corresponding to boundary conditions for the optical design. With such models, the magnetostatic force on a test pole along the axis may be examined, and in this way one may obtain priority listings for holding dimensions, relieving stresses, etc. The development of magnetostatic models most easily proceeds from the derivation of scalar potentials of separate geometric elements. These potentials can then be combined at will because of the superposition characteristic of conservative force fields.
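The superposition argument can be written out as a short sketch. The symbols below are assumptions introduced for illustration and do not appear in the abstract: $\Phi_i$ is the magnetic scalar potential of the $i$-th geometric element, $\mathbf{H}$ the field in a current-free region, and $p$ the strength of the test pole placed on the axis $z$.

```latex
\Phi(\mathbf{r}) = \sum_{i=1}^{N} \Phi_i(\mathbf{r}),
\qquad
\mathbf{H}(\mathbf{r}) = -\nabla \Phi(\mathbf{r}) = \sum_{i=1}^{N} \mathbf{H}_i(\mathbf{r}),
\qquad
F_z(z) \propto p\, H_z(0,0,z) = -p\, \frac{\partial \Phi}{\partial z}\bigg|_{(0,0,z)}
```

Because the gradient is linear, the axial force on the test pole is the sum of the contributions of the separate elements, so each element's potential can be derived once and reused in any combination.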


Author(s):  
J W Steeds ◽  
R Vincent

We review the analytical powers which will become more widely available as medium voltage (200-300 kV) TEMs with facilities for CBED on a nanometre scale come onto the market. Of course, high performance cold field emission STEMs have now been in operation for about twenty years, but it is only in relatively few laboratories that special modification has permitted the performance of CBED experiments. Most notable amongst these pioneering projects is the work in Arizona by Cowley and Spence and, more recently, that in Cambridge by Rodenburg and McMullan. There are a large number of potential advantages of a high intensity, small diameter, focussed probe. We discuss first the advantages for probes larger than the projected unit cell of the crystal under investigation. In this situation we are able to perform CBED on local regions of good crystallinity. Zone axis patterns often contain information which is very sensitive to thickness changes as small as 5 nm. In conventional CBED, with a 10 nm source, it is very likely that the information will be degraded by thickness averaging within the illuminated area.


Author(s):  
Klaus-Ruediger Peters

A new generation of high performance field emission scanning electron microscopes (FSEM) is now commercially available (JEOL 890, Hitachi S 900, ISI OS 130-F), characterized by an "in lens" position of the specimen where probe diameters are reduced and signal collection improved. Additionally, low voltage operation is extended to 1 kV. Compared to the first generation of FSEM (JEOL JSM 30, Hitachi S 800), which utilized a specimen position below the final lens, specimen size had to be reduced but useful magnification could be impressively increased in both low (1-4 kV) and high (5-40 kV) voltage operation, i.e. from 50,000 to 200,000 x and from 250,000 to 1,000,000 x respectively. At high accelerating voltage and magnification, contrasts on biological specimens are well characterized¹ and are produced by the entering probe electrons in the outermost surface layer within ~1 nm depth. Backscattered electrons produce only a background signal. Under these conditions (FIG. 1) image quality is similar to conventional TEM (FIG. 2) and only limited at magnifications >1,000,000 x by probe size (0.5 nm) or non-localization effects (~0.5 nm).


Author(s):  
G.K.W. Balkau ◽  
E. Bez ◽  
J.L. Farrant

The earliest account of the contamination of electron microscope specimens by the deposition of carbonaceous material during electron irradiation was published in 1947 by Watson who was then working in Canada. It was soon established that this carbonaceous material is formed from organic vapours, and it is now recognized that the principal source is the oil-sealed rotary pumps which provide the backing vacuum. It has been shown that the organic vapours consist of low molecular weight fragments of oil molecules which have been degraded at hot spots produced by friction between the vanes and the surfaces on which they slide. As satisfactory oil-free pumps are unavailable, it is standard electron microscope practice to reduce the partial pressure of organic vapours in the microscope in the vicinity of the specimen by using liquid-nitrogen cooled anti-contamination devices. Traps of this type are sufficient to reduce the contamination rate to about 0.1 Å per min, which is tolerable for many investigations.


Author(s):  
Lee D. Peachey ◽  
Lou Fodor ◽  
John C. Haselgrove ◽  
Stanley M. Dunn ◽  
Junqing Huang

Stereo pairs of electron microscope images provide valuable visual impressions of the three-dimensional nature of specimens, including biological objects. Beyond this one seeks quantitatively accurate models and measurements of the three-dimensional positions and sizes of structures in the specimen. In our laboratory, we have sought to combine high resolution video cameras with high performance computer graphics systems to improve both the ease of building 3D reconstructions and the accuracy of 3D measurements, by using multiple tilt images of the same specimen tilted over a wider range of angles than can be viewed stereoscopically. Ultimately we also wish to automate the reconstruction and measurement process, and have initiated work in that direction. Figure 1 is a stereo pair of 400 kV images from a 1 micrometer thick transverse section of frog skeletal muscle stained with the Golgi stain. This stain selectively increases the density of the transverse tubular network in these muscle cells, and it is this network that we reconstruct in this example.

