From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

2013 ◽  
Vol 21 (1-2) ◽  
pp. 1-16 ◽  
Author(s):  
Marek Blazewicz ◽  
Ian Hinder ◽  
David M. Koppelman ◽  
Steven R. Brandt ◽  
Milosz Ciznicki ◽  
...  

Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high-performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the efficient use of large-scale CPU/GPU systems for complex applications without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.
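
As a rough illustration of what "higher-order finite differences" means in the abstract above, the following Python sketch compares a second-order and a fourth-order central difference for a first derivative. It is not Chemora's generated code: the function names are hypothetical, and the stencil coefficients are the standard textbook ones rather than anything taken from the paper.

```python
import math


def d1_second_order(f, x, h):
    """Second-order central difference: error shrinks like h**2."""
    return (f(x + h) - f(x - h)) / (2 * h)


def d1_fourth_order(f, x, h):
    """Fourth-order central difference: error shrinks like h**4."""
    return (-f(x + 2 * h) + 8 * f(x + h) - 8 * f(x - h) + f(x - 2 * h)) / (12 * h)


if __name__ == "__main__":
    x, h = 1.0, 0.01
    exact = math.cos(x)  # derivative of sin(x)
    print("2nd-order error:", abs(d1_second_order(math.sin, x, h) - exact))
    print("4th-order error:", abs(d1_fourth_order(math.sin, x, h) - exact))
```

The wider stencil needs two neighbours on each side instead of one, which is the usual trade-off: higher accuracy per grid point in exchange for more memory traffic and wider halo regions between blocks.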

2019 ◽  
Vol 16 (3) ◽  
pp. 117-123
Author(s):  
Tsung-Ching Huang ◽  
Ting Lei ◽  
Leilai Shao ◽  
Sridhar Sivapurapu ◽  
Madhavan Swaminathan ◽  
...  

Abstract High-performance, low-cost flexible hybrid electronics (FHE) are desirable for applications such as the internet of things and wearable electronics. The carbon nanotube (CNT) thin-film transistor (TFT) is a promising candidate for high-performance FHE because of its high carrier mobility, superior mechanical flexibility, and material compatibility with low-cost printing and solution processes. Flexible sensors and peripheral CNT-TFT circuits, such as decoders, drivers, and sense amplifiers, can be printed and hybrid-integrated with thinned (<50 μm) silicon chips on soft, thin, and flexible substrates for a wide range of applications, from flexible displays to wearable medical devices. Here, we report (1) a process design kit (PDK) to enable FHE design automation for large-scale FHE circuits and (2) solution process-proven intellectual property blocks for TFT circuit design, including Pseudo-Complementary Metal-Oxide-Semiconductor (Pseudo-CMOS) flexible digital logic and analog amplifiers. The FHE-PDK is fully compatible with popular silicon design tools for the design and simulation of hybrid-integrated flexible circuits.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Manosh Kumar Biswas ◽  
Mita Bagchi ◽  
Ujjal Kumar Nath ◽  
Dhiman Biswas ◽  
Sathishkumar Natarajan ◽  
...  

Abstract Lily belongs to the family Liliaceae and propagates mainly vegetatively. A sufficient number of polymorphic, informative, and functional molecular markers is therefore essential for studying a wide range of genetic parameters in Lilium species. We developed, characterized, and designed SSR (simple sequence repeat) markers from online genetic resources to analyze the genetic diversity and population structure of Lilium species. Di-nucleotide repeat motifs were more frequent (4684) within the 0.14 Gb (gigabases) transcriptome than other repeats, which was two times higher than for tetra-repeat motifs. The frequencies of di-(AG/CT), tri-(AGG/CTT), tetra-(AAAT), penta-(AGAGG), and hexa-(AGAGGG) repeats were 34.9%, 7.0%, 0.4%, 0.3%, and 0.2%, respectively. A total of 3607 non-redundant SSR primer pairs were designed based on sequences from the CDS, 5′-UTR, and 3′-UTR regions, covering 34%, 14%, and 23%, respectively. Among them, a subset of 245 SSR primers was validated by polymerase chain reaction (PCR) amplification; 167 primers gave the expected PCR amplicon and 101 primers showed polymorphism. Each locus contained 2 to 12 alleles, with an average PIC (polymorphic information content) value of 0.82. A total of 87 lily accessions were subjected to genetic diversity analysis using the polymorphic SSRs and separated into seven groups with heterozygosity of 0.73 to 0.79. Our data on large-scale SSR-based genetic diversity and population structure analysis may help to accelerate lily breeding programs by utilizing different genomes, understanding genetics, and characterizing germplasm efficiently.
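
The two quantities this abstract leans on, SSR motifs and PIC, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the minimum repeat count of 6, and the toy sequence are hypothetical, and the PIC formula used is the common one from Botstein et al. (1980), which the abstract does not explicitly name.

```python
import re


def find_ssrs(seq, motif_len=2, min_repeats=6):
    """Return (start, motif, repeat_count) for tandem repeats of a motif.

    min_repeats is an assumed threshold, not a value from the abstract.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAA
        hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


def pic(allele_freqs):
    """Polymorphic information content (Botstein et al. 1980 formula)."""
    sum_sq = sum(p * p for p in allele_freqs)
    cross = sum(
        2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
        for i in range(len(allele_freqs))
        for j in range(i + 1, len(allele_freqs))
    )
    return 1 - sum_sq - cross


if __name__ == "__main__":
    toy = "TTGAC" + "AG" * 6 + "CCTA"  # hypothetical sequence
    print(find_ssrs(toy))
    print(pic([0.5, 0.5]))
```

With two equally frequent alleles the PIC is 0.375, the maximum for a biallelic locus; values as high as the reported average of 0.82 require several alleles at comparable frequencies.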


Aerospace ◽  
2018 ◽  
Vol 5 (4) ◽  
pp. 104 ◽  
Author(s):  
Ilias Lappas ◽  
Michail Bozoudis

The development of a parametric model for the variable portion of the Cost Per Flying Hour (CPFH) of an ‘unknown’ aircraft platform and its application to diverse types of fixed and rotary wing aircraft development programs (F-35A, Su-57, Dassault Rafale, T-X candidates, AW189, Airbus RACER among others) is presented. The novelty of this paper lies in the utilization of a diverse sample of aircraft types, aiming to obtain a ‘universal’ Cost Estimating Relationship (CER) applicable to a wide range of platforms. Moreover, the model does not produce absolute cost figures but rather analogy ratios versus the F-16’s CPFH, broadening the model’s applicability. The model will enable an analyst to carry out timely and reliable Operational and Support (O&S) cost estimates for a wide range of ‘unknown’ aircraft platforms at their early stages of conceptual design, despite the lack of actual data from the utilization and support life cycle stages. The statistical analysis is based on Ordinary Least Squares (OLS) regression, conducted with R software (v3.5.1, released on 2 July 2018). The model’s output is validated against officially published CPFH data of several existing ‘mature’ aircraft platforms, including one of the most prolific fighter jet types all over the world, the F-16C/D, which is also used as a reference to compare CPFH estimates of various next generation aircraft platforms. Actual CPFH data of the Hellenic Air Force (HAF) have been used to develop the parametric model, the application of which is expected to significantly inform high level decision making regarding aircraft procurement, budgeting and future force structure planning, including decisions related to large scale aircraft modifications and upgrades.


1976 ◽  
Vol 98 (2) ◽  
pp. 229-238 ◽  
Author(s):  
G. J. Walker

The influence of free stream disturbances on transition is discussed and it is noted that significant regions of laminar flow may exist on axial turbomachine blades despite the high level of disturbance to which they are subjected. A family of surface velocity distributions giving unseparated flow on the suction surface of an axial compressor blade is derived using data from detailed boundary layer measurements on the blading of a single-stage machine. The distributions are broadly similar to those adopted by Wortmann in designing high performance isolated aerofoil sections for operation at much higher Reynolds numbers. The theoretical performance of blades having the specified surface velocity distributions is computed for a wide range of conditions, and the effects of varying Reynolds number and other design parameters are analyzed. The results suggest the possibility of obtaining useful improvements in performance over that of conventional compressor blade sections. The computed performance values show an almost unique relation between the blade losses and the suction surface diffusion ratio. However the correlation of losses with the equivalent diffusion ratio is found to break down at high values of the latter parameter.


2014 ◽  
Vol 2014 ◽  
pp. 1-15 ◽  
Author(s):  
Vinícius da Fonseca Vieira ◽  
Carolina Ribeiro Xavier ◽  
Nelson Francisco Favilla Ebecken ◽  
Alexandre Gonçalves Evsukoff

Community structure detection is one of the major research areas of network science and is particularly useful in applications involving large real networks. This work presents an in-depth study of the most discussed algorithms for community detection based on the modularity measure: Newman’s spectral method with a fine-tuning stage and the method of Clauset, Newman, and Moore (CNM) with its variants. The computational complexity of the algorithms is analysed to develop a high-performance code that accelerates their execution without compromising the quality of the results, as judged by the modularity measure. The implemented code generates partitions with modularity values consistent with the literature and handles networks of more than 1 million nodes with Newman’s spectral method. The code was applied to a wide range of real networks, and the performance of the algorithms is evaluated.
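
The modularity measure that both algorithms optimize can be computed directly from an edge list. The sketch below is a minimal version for an undirected, unweighted graph without self-loops, with each edge listed once. The function name and the toy graph are hypothetical, and this is not the high-performance code described in the paper.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / (2 m))**2 ].

    edges:      list of (u, v) pairs, each undirected edge listed once
    community:  dict mapping node -> community label
    L_c is the number of edges inside community c, d_c the sum of the
    degrees of its nodes, and m the total number of edges.
    """
    m = len(edges)
    intra = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (toy example).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(edges, split))  # 5/14, about 0.357
```

Placing every node in a single community gives Q = 0, which is why modularity-based methods only accept a split or merge when it raises Q.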


2012 ◽  
Vol 192-193 ◽  
pp. 545-550 ◽  
Author(s):  
Mario Rosso ◽  
Ildiko Peter ◽  
Gianluigi Chiarmetta ◽  
Ivano Gattelli

This paper presents an analysis of a new rheocasting process suitable for manufacturing high-performance automotive parts. The process is suited to producing components from Al alloys. An important aspect is the possibility of obtaining a fairly wide range of thicknesses, starting from 2.5 mm. The alloy used is the well-known A356, with a low Fe content (maximum 0.08 wt%). T6 heat treatments have been performed, and the soundness of the parts has been certified by non-destructive tests. These parts are produced to be mounted on a well-known top-level sports car. Non-standard samples for mechanical tests were machined directly from the components. Following the mechanical tests, fracture surface analysis was carried out by SEM to observe morphological details and to evaluate the influence of the process and of the alloy conditions on the fracture behaviour. Morphological analysis was performed on polished transverse sections of the samples. The results showed a high level of mechanical strength for all series of components. The reliability of the process is very high at a convenient manufacturing rate. The weldability of the parts has been demonstrated.


2019 ◽  
Vol 7 (1) ◽  
pp. 55-70
Author(s):  
Moh. Zikky ◽  
M. Jainal Arifin ◽  
Kholid Fathoni ◽  
Agus Zainal Arifin

High-Performance Computers (HPC) are computer systems built to handle heavy computational loads. HPC provides high-performance technology and shortens computing time. This technology is often used in large-scale industries and in activities that require high-level computing, such as rendering virtual reality. In this research, we present a Tawaf virtual reality simulation with 1000 pilgrims and realistic surroundings of Masjidil-Haram, modeled in 3D, as an interactive and immersive simulation technology. The main purpose of this study is to measure and understand the processing time of this virtual reality implementation of tawaf activities on various platforms, such as a computer and an Android smartphone. The results showed that an agent on the outer line (outer rotation) around the Kaabah mostly takes the least time, although it must cover a longer distance than an agent closer in. This happens because an agent closer to the Kaabah faces denser crowds. In this case, obstacles therefore have more impact than distance.


Polymers ◽  
2021 ◽  
Vol 13 (20) ◽  
pp. 3465
Author(s):  
Jianli Cui ◽  
Xueli Nan ◽  
Guirong Shao ◽  
Huixia Sun

Researchers are showing increasing interest in high-performance flexible pressure sensors owing to their potential uses in wearable electronics, bionic skin, and human–machine interaction. However, the vast majority of these flexible pressure sensors require extensive nano-architectural design, which makes their manufacturing both complicated and time-consuming. Thus, a low-cost technology that can be applied on a large scale is highly desirable for manufacturing flexible pressure-sensitive materials with high sensitivity over a wide range of pressures. This work uses a three-dimensional elastic porous carbon nanotube (CNT) sponge as the conductive layer to fabricate a novel flexible piezoresistive sensor. The CNT sponge was synthesized by chemical vapor deposition, and the basic principle governing the sensing behavior of the CNT sponge-based pressure sensor was illustrated by in situ scanning electron microscopy. The CNT sponge-based sensor has a quick response time of ~105 ms, a high sensitivity extending across a broad pressure range (809 kPa⁻¹ at pressures below 10 kPa), and outstanding durability over 4,000 cycles. Furthermore, a 16-pixel wireless sensor system was designed, and its potential applications in visualizing pressure distribution and in human–machine communication were demonstrated.


Author(s):  
Yassine Sabri ◽  
Aouad Siham

Multi-area and multi-faceted remote sensing (RS) datasets are widely used due to the increasing demand for accurate and up-to-date information on resources and the environment for regional and global monitoring. In general, the processing of RS data involves a complex multi-step sequence that includes several independent processing steps, depending on the type of RS application. The processing of RS data for regional disaster and environmental monitoring is recognized as computationally and data demanding. By combining cloud computing and HPC technology, we propose a method to solve these problems efficiently: a large-scale RS data processing system suitable for various applications and offered as a real-time, on-demand service. The ubiquitous, elastic, and highly transparent nature of the cloud computing model makes it possible to run massive RS data management and data processing, with monitoring of dynamic environments, in any cloud via the web interface. Hilbert-based data indexing methods are used to optimally query and access RS images, RS data products, and intermediate data. The core of the cloud service provides a parallel file system for large RS data and an interface for accessing RS data from time to time to improve data locality. It collects data and optimizes I/O performance. Our experimental analysis demonstrated the effectiveness of our platform.
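
The abstract does not say how its Hilbert-based indexing is built, so the following is only a sketch of the general idea: mapping 2D tile coordinates to a position along a Hilbert curve so that tiles close together on the grid tend to be close together in storage. It uses the standard iterative coordinate-to-index conversion; the function name and the tile-grid interpretation are assumptions.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    distance along the Hilbert curve, in the range 0 .. n*n - 1."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


if __name__ == "__main__":
    n = 4  # hypothetical 4x4 tile grid
    order = sorted(((x, y) for x in range(n) for y in range(n)),
                   key=lambda c: hilbert_index(n, *c))
    print(order)  # tiles in Hilbert order; consecutive tiles share an edge
```

Sorting tiles by this key before writing them out is one simple way such an index can improve locality: a rectangular query then touches a small number of contiguous key ranges rather than scattered rows.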


2021 ◽  
Vol 15 ◽  
Author(s):  
Giordana Florimbi ◽  
Emanuele Torti ◽  
Stefano Masoli ◽  
Egidio D'Angelo ◽  
Francesco Leporati

In modern computational modeling, neuroscientists need to reproduce the long-lasting activity of large-scale networks, where neurons are described by highly complex mathematical models. These aspects strongly increase the computational load of the simulations, which can be performed efficiently by exploiting parallel systems to reduce processing times. Graphics Processing Unit (GPU) devices meet this need by providing High Performance Computing on the desktop. In this work, the authors describe the development of a novel Granular layEr Simulator implemented on a multi-GPU system, capable of reconstructing the cerebellar granular layer in 3D space and reproducing its neuronal activity. The reconstruction is characterized by a high level of novelty and realism, considering axonal/dendritic field geometries oriented in 3D space and following convergence/divergence rates provided in the literature. Neurons are modeled using Hodgkin and Huxley representations. The network is validated by reproducing typical behaviors that are well documented in the literature, such as the center-surround organization. The reconstruction of a network whose volume is 600 × 150 × 1,200 μm³, with 432,000 granules, 972 Golgi cells, 32,399 glomeruli, and 4,051 mossy fibers, takes 235 s on an Intel i9 processor. The 10 s activity reproduction takes only 4.34 h and 3.37 h on a single-GPU and a multi-GPU desktop system (with one or two NVIDIA RTX 2080 GPUs, respectively). Moreover, the code takes only 3.52 h and 2.44 h if run on one or two NVIDIA V100 GPUs, respectively. The relevant speedups reached (up to ~38× in the single-GPU version and ~55× in the multi-GPU version) clearly demonstrate that GPU technology is highly suitable for realistic large network simulations.
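
The runtimes quoted above also allow a quick check of how much the second GPU helps. The short Python sketch below uses only the four runtimes from the abstract; the function name and the definition of parallel efficiency as speedup divided by GPU count are assumptions made for illustration. The ~38× and ~55× figures are relative to a baseline the abstract does not specify, so they are not recomputed here.

```python
# Wall-clock hours for the 10 s activity reproduction, as quoted in the abstract.
runtimes_h = {
    "RTX 2080": {"one_gpu": 4.34, "two_gpus": 3.37},
    "V100": {"one_gpu": 3.52, "two_gpus": 2.44},
}


def two_gpu_scaling(one_gpu_h, two_gpus_h):
    """Return (speedup from adding the second GPU, parallel efficiency)."""
    speedup = one_gpu_h / two_gpus_h
    efficiency = speedup / 2  # ideal scaling on two GPUs would give 1.0
    return speedup, efficiency


if __name__ == "__main__":
    for name, t in runtimes_h.items():
        s, e = two_gpu_scaling(t["one_gpu"], t["two_gpus"])
        print(f"{name}: {s:.2f}x from the second GPU, efficiency {e:.0%}")
```

The second GPU yields roughly 1.29× on the RTX 2080 system and 1.44× on the V100 system, well short of the ideal 2×, which suggests that part of the workload does not scale with GPU count.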

