Performance of MC2 and the ECMWF IFS Forecast Model on the Fujitsu VPP700 and NEC SX-4M

2000 ◽  
Vol 8 (1) ◽  
pp. 23-30 ◽  
Author(s):  
Michel Desgagné ◽  
Stephen Thomas ◽  
Michel Valin

The NEC SX-4M cluster and Fujitsu VPP700 supercomputers are both based on custom vector processors using low-power CMOS technology. Their basic architectures and programming models are, however, somewhat different. A multi-node SX-4M cluster contains up to 32 processors per shared-memory node, with a maximum of 16 nodes connected via the proprietary NEC IXS fibre channel crossbar network. A hybrid combination of inter-node MPI message passing with intra-node tasking or threads is possible. The Fujitsu VPP700 is a fully distributed-memory vector machine with a crossbar interconnect which also supports MPI. The parallel performance of the MC2 model for high-resolution mesoscale forecasting over large domains and of the IFS RAPS 4.0 benchmark is presented for several different machine configurations. These include an SX-4/32, an SX-4/32M cluster and up to 100 PEs of the VPP700. Our results indicate that performance degradation for both models on a single SX-4 node is primarily due to memory contention within the internal crossbar switch. Multi-node SX-4 performance is slightly better than single-node performance. Longer vector lengths and SDRAM memory on the VPP700 result in lower per-processor execution rates. Both models achieve close to ideal scaling on the VPP700.
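The phrase "close to ideal scaling" can be made concrete with the usual speedup and parallel-efficiency ratios. The following is a minimal sketch, not the authors' methodology: the function names are illustrative, and the timings are hypothetical placeholders rather than measurements from the paper.

```python
def speedup(t_ref: float, t_p: float) -> float:
    """Speedup of a run taking t_p seconds relative to a reference run taking t_ref seconds."""
    return t_ref / t_p


def parallel_efficiency(t_ref: float, p_ref: int, t_p: float, p: int) -> float:
    """Efficiency of a p-processor run relative to a p_ref-processor reference run.

    A value of 1.0 corresponds to ideal (linear) scaling.
    """
    return (t_ref * p_ref) / (t_p * p)


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, NOT measurements from the paper.
    timings = {1: 100.0, 2: 52.0, 4: 27.0}
    t1 = timings[1]
    for p, t in sorted(timings.items()):
        print(p, round(speedup(t1, t), 2), round(parallel_efficiency(t1, 1, t, p), 2))
```

Taking the reference processor count as a parameter allows efficiency to be computed when the smallest feasible run already uses several processors, which is common for large-domain forecasts that do not fit on one PE.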

2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Rongshan Wei ◽  
Shizhong Guo ◽  
Shanzhi Yang

This paper presents an integrated Hall switch sensor based on SMIC 0.18 µm CMOS technology. The system includes a front-end Hall element and a back-end signal processing circuit. By optimizing the structure of the Hall element and using orthogonal coupling and the spinning current technique, the offset voltage can be suppressed effectively. The simulation results showed that the Hall switch can eliminate offset voltage greater than 1 mV at 3.3 V supply voltage. Two modes of the Hall switch circuit, the awake mode and the sleep mode, were realized by using clock logic signals without compromising the performance of the Hall switch, thereby reducing power consumption. The test results showed that the operate point and the release point of the switch were within the range of 3–7 mT at 3.3 V supply voltage, with a current consumption of 7.89 µA.
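The idea behind spinning-current offset suppression can be illustrated with an idealized numerical model. This sketch assumes the textbook behaviour in which the Hall signal keeps its sign across phases while the offset alternates sign, so that averaging the phases cancels the offset. It is not the circuit described in the paper, it ignores orthogonal coupling, and all names and values are hypothetical.

```python
def ideal_phase_voltages(v_hall: float, v_offset: float, n_phases: int = 4) -> list:
    """Idealized per-phase output voltages (assumption, not a device model).

    The Hall signal keeps its sign in every phase, while the offset
    alternates sign from one phase to the next.
    """
    return [v_hall + ((-1) ** k) * v_offset for k in range(n_phases)]


def spinning_current_average(phase_voltages: list) -> float:
    """Average the phase outputs; alternating offsets cancel over an even number of phases."""
    return sum(phase_voltages) / len(phase_voltages)


if __name__ == "__main__":
    # Hypothetical values in volts: 2 mV Hall signal, 1 mV offset.
    phases = ideal_phase_voltages(v_hall=0.002, v_offset=0.001)
    print(spinning_current_average(phases))
```

In a real device the cancellation is only partial, because the offset is not perfectly antisymmetric between phases; the sketch shows the principle only.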


2017 ◽  
Vol 27 (02) ◽  
pp. 1850027
Author(s):  
Mehdi Habibi ◽  
Khatereh Akbari ◽  
Marzieh Mokhtari ◽  
Peyman Moallem

Smart image sensors with low data rate output are well suited to security and surveillance tasks, since at lower data rates, power consumption is reduced and the image sensor can be operated with limited energy resources such as solar panels. In this paper, a new data transfer scheme is presented to reduce the data rate of the pixels that have undergone a value change. Although different pixel-difference detecting architectures have been reported previously, it is shown that the given method is more effective in terms of power dissipation and data transfer rate reduction. The proposed architecture is evaluated as a [Formula: see text]-pixel sensor in a standard CMOS technology, and a comparison with other data transfer approaches is performed in the same process and configuration.
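The general idea of sending only pixels whose value has changed can be sketched in software as simple frame differencing. This is a generic illustration under stated assumptions, not the transfer scheme proposed in the paper: the function name, the `threshold` parameter and the `(row, column, value)` event format are all hypothetical.

```python
def changed_pixels(prev, curr, threshold=0):
    """Return (row, col, new_value) for each pixel whose change exceeds threshold.

    prev and curr are equally sized 2D lists of integer intensities.
    Only the returned events would need to be transmitted, not the full frame.
    """
    events = []
    for r, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for c, (old, new) in enumerate(zip(row_prev, row_curr)):
            if abs(new - old) > threshold:
                events.append((r, c, new))
    return events


if __name__ == "__main__":
    # Hypothetical 2x2 frames: a single pixel changes between them.
    previous_frame = [[0, 0], [0, 0]]
    current_frame = [[0, 5], [0, 0]]
    print(changed_pixels(previous_frame, current_frame))
```

For a mostly static scene the event list is far shorter than the frame, which is the source of the data rate reduction; the on-chip detection and encoding in an actual sensor would differ from this software loop.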


2010 ◽  
Vol 54 (5) ◽  
pp. 564-567 ◽  
Author(s):  
Chan-Yuan Hu ◽  
Jone F. Chen ◽  
Shih-Chih Chen ◽  
Shoou-Jinn Chang ◽  
Kay-Ming Lee ◽  
...  

1999 ◽  
Vol 103 (1027) ◽  
pp. 443-447 ◽  
Author(s):  
W. McMillan ◽  
M. Woodgate ◽  
B. E. Richards ◽  
B. J. Gribben ◽  
K. J. Badcock ◽  
...  

Motivated by a lack of sufficient local and national computing facilities for computational fluid dynamics simulations, the Affordable Systems Computing Unit (ASCU) was established to investigate low-cost alternatives. The options considered have all involved cluster computing, a term which refers to the grouping of a number of components into a managed system capable of running both serial and parallel applications. The present work aims to demonstrate the utility of commodity processors for dedicated batch processing. The cluster has proved to be extremely cost-effective, enabling large three-dimensional flow simulations on a computer costing less than £25k at current market prices. The experience gained on this system in terms of single-node performance, message passing and parallel performance will be discussed. In particular, comparisons with the performance of other systems will be made. Several medium- to large-scale CFD simulations performed using the new cluster will be presented to demonstrate the potential of commodity-processor-based parallel computers for aerodynamic simulation.


Author(s):  
Saulo R. M. Barros ◽  
David Dent ◽  
Lars Isaksen ◽  
Guy Robinson ◽  
Fritz G. Wollenweber
