code optimization
Recently Published Documents


TOTAL DOCUMENTS: 293 (FIVE YEARS: 33)
H-INDEX: 17 (FIVE YEARS: 1)

Author(s): Yuta Hirokawa, Atsushi Yamada, Shunsuke Yamada, Masashi Noda, Mitsuharu Uemoto, et al.

In the field of optical science, it is becoming increasingly important to observe and manipulate matter at the atomic scale using ultrashort pulsed light. For the first time, we have performed an ab initio simulation of extended systems that simultaneously solves the Maxwell equations for the light electromagnetic fields, the time-dependent Kohn-Sham equation for the electrons, and the Newton equation for the ions. The most time-consuming parts of the simulation were the stencil and nonlocal pseudopotential operations on the electron orbitals, as well as the fast Fourier transforms of the electron density. The code was thoroughly optimized for the Fujitsu A64FX processor to achieve the highest performance. A simulation of an amorphous SiO2 thin film composed of more than 10,000 atoms was performed using 27,648 nodes of the Fugaku supercomputer. The simulation achieved excellent time-to-solution, with performance close to the maximum possible given the memory-bandwidth bound, as well as excellent weak scalability.
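As an illustration of the stencil workload mentioned above, a second-order finite-difference Laplacian applied to an orbital on a periodic 3-D grid can be sketched as follows. This is a minimal NumPy sketch for exposition only; the actual code runs hand-optimized kernels on the A64FX, and the function name and grid spacing are illustrative assumptions.

```python
import numpy as np

def laplacian_3d(orbital, h):
    """Second-order finite-difference Laplacian on a periodic 3-D grid,
    the kind of stencil applied to electron orbitals in real-space
    grid methods (illustrative sketch, not the optimized kernel)."""
    lap = np.zeros_like(orbital)
    for axis in range(3):
        # central difference along each axis; np.roll gives periodic wrap
        lap += (np.roll(orbital, 1, axis) - 2.0 * orbital
                + np.roll(orbital, -1, axis)) / h**2
    return lap
```

A constant field has zero Laplacian, which gives a quick sanity check of the stencil weights.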


Author(s): Hanting Zhao, Zhuo Wang, Hongrui Zhang, Menglin Wei, Siyuan Jiang, et al.

2021
Author(s): Rudnei Dias da Cunha, Elismar R. Oliveira

Abstract We present algorithms to compute approximations of invariant measures and their attractors for IFS and GIFS, making the deterministic algorithm tractable through code optimization strategies and careful use of data structures and search algorithms. The results show that these algorithms allow these (G)IFS to be used within reasonable running times.
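The deterministic algorithm referred to above iterates the Hutchinson operator on a point set. A minimal sketch follows; the grid-rounding deduplication stands in for the paper's data-structure optimizations, and the Sierpinski-triangle maps are an assumed example, not one taken from the paper.

```python
def ifs_attractor(maps, x0, depth, precision=6):
    """Deterministic algorithm: repeatedly apply every map of the IFS to a
    point set (the Hutchinson operator), deduplicating points on a rounded
    grid so the set size stays tractable (illustrative sketch)."""
    points = {x0}
    for _ in range(depth):
        points = {tuple(round(c, precision) for c in f(p))
                  for f in maps for p in points}
    return points

# Assumed example: Sierpinski triangle, three contractions of ratio 1/2
sierpinski = [
    lambda p: (0.5 * p[0], 0.5 * p[1]),
    lambda p: (0.5 * p[0] + 0.5, 0.5 * p[1]),
    lambda p: (0.5 * p[0] + 0.25, 0.5 * p[1] + 0.5),
]
pts = ifs_attractor(sierpinski, (0.0, 0.0), 8)
```

Without the deduplicating set, the point count would grow as 3^depth; with it, overlapping images collapse and the iteration stays cheap.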


2021
Author(s): Yuxuan Jing, Rami M. Younis

Abstract Automatic differentiation (AD) software libraries augment arithmetic operations with their derivatives, thereby relieving the programmer of deriving, implementing, debugging, and maintaining derivative code. With this encapsulation, however, the responsibility for code optimization falls more heavily on the AD system itself (as opposed to the programmer and the compiler). Moreover, given that there are multiple contexts in reservoir simulation software for which derivatives are required (e.g. property-package and discrete-operator evaluations), the AD infrastructure must also be adaptable. An operator-overloading AD design is proposed and tested to provide scalability and computational efficiency seamlessly across memory- and compute-bound applications. This is achieved by 1) use of portable, standard programming-language constructs (the C++17 and OpenMP 4.5 standards), 2) adopting a vectorized programming interface, 3) lazy evaluation via expression templates, and 4) multiple memory-alignment and layout policies. Empirical analysis is conducted on kernels spanning a range of arithmetic intensities and working-set sizes. Cache-aware roofline analysis shows that the performance and scalability attained are reliably ideal. In terms of floating-point operations executed per second, the performance of the AD system matches optimized hand-written code. Finally, the implementation is benchmarked using the Automatically Differentiable Expression Templates Library (ADETL).
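The paper's design uses C++17 expression templates; to illustrate the operator-overloading idea itself in a language-agnostic way, here is a minimal forward-mode dual-number sketch in Python. The `Dual` class, its scalar-only scope, and the absence of lazy evaluation are all simplifications for exposition, not ADETL's API.

```python
class Dual:
    """Minimal forward-mode AD via operator overloading: every value
    carries its derivative, so ordinary arithmetic expressions compute
    derivative code automatically (illustrative sketch only)."""

    def __init__(self, val, dot=0.0):
        self.val = val   # primal value
        self.dot = dot   # derivative with respect to the seeded input

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

x = Dual(3.0, 1.0)          # seed dx/dx = 1
y = x * x + 2.0 * x + 1.0   # y = x^2 + 2x + 1
# y.val == 16.0 and y.dot == 8.0, since dy/dx = 2x + 2 at x = 3
```

The appeal noted in the abstract is visible even here: the expression for `y` is written once, and the derivative comes along for free without any hand-derived code.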


Author(s): Tatiana Nikolaevna Romanova, Dmitry Igorevich Gorin

A method for optimizing the filling of a machine word with independent instructions is proposed, which increases program performance by packing the maximum number of independent instructions into one bundle. The paper also confirms the hypothesis that switching the compiler to random register allocation increases packing density, which in turn decreases the program's running time.
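The packing idea can be sketched as greedy bundle formation over a hypothetical three-address instruction model. The register names, hazard rules, and bundle width below are illustrative assumptions, not the paper's ISA or algorithm.

```python
def pack_bundles(instrs, width):
    """Greedily pack up to `width` mutually independent instructions per
    machine word (bundle). Each instruction is (name, reads, writes);
    an instruction may join a bundle only if it has no RAW/WAR/WAW
    hazard with anything already seen in this pass, so reordering
    past skipped instructions stays safe (illustrative sketch)."""
    remaining = list(instrs)
    bundles = []
    while remaining:
        bundle = []
        blocked_reads, blocked_writes = set(), set()
        next_remaining = []
        for name, reads, writes in remaining:
            hazard = (reads & blocked_writes        # RAW
                      or writes & blocked_writes    # WAW
                      or writes & blocked_reads)    # WAR
            if not hazard and len(bundle) < width:
                bundle.append((name, reads, writes))
            else:
                next_remaining.append((name, reads, writes))
            blocked_reads |= reads
            blocked_writes |= writes
        bundles.append(bundle)
        remaining = next_remaining
    return bundles

# Hypothetical example: the first two instructions are independent,
# the third reads both of their results and must wait a cycle.
instrs = [
    ("add r1, r2, r3", {"r2", "r3"}, {"r1"}),
    ("mul r4, r5, r6", {"r5", "r6"}, {"r4"}),
    ("sub r7, r1, r4", {"r1", "r4"}, {"r7"}),
]
bundles = pack_bundles(instrs, width=2)
```

Denser packing means fewer bundles, hence the running-time decrease the paper reports; random register allocation helps by spreading registers so fewer accidental name dependences block packing.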


Author(s): Vadim Bulavintsev, Dmitry Zhdanov

We propose a generalized method for adapting and optimizing algorithms for efficient execution on modern graphics processing units (GPUs). The method consists of several steps. First, build a control-flow graph (CFG) of the algorithm. Next, transform the CFG into a tree of loops, merging non-parallelizable loops into parallelizable ones. Finally, map the resulting loop tree onto the tree of GPU computational units, unrolling the algorithm's loops as necessary for the match. The mapping should be performed bottom-up, from the lowest GPU architecture levels to the highest, to minimize off-chip memory accesses and maximize register-file usage. The method provides the programmer with a convenient and robust mental framework and strategy for GPU code optimization. We demonstrate the method by adapting the DPLL backtracking search algorithm for the Boolean satisfiability problem (SAT) to a GPU. The resulting GPU version of DPLL outperforms the CPU version in raw tree-search performance sixfold on regular Boolean satisfiability problems and twofold on irregular ones.
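For reference, the DPLL algorithm named above can be sketched in a few lines. This is a minimal textbook version (unit propagation plus branching), standing in for the sequential CPU baseline; it is not the authors' GPU adaptation.

```python
def dpll(clauses):
    """Textbook DPLL backtracking search. Clauses are lists of nonzero
    ints; a negative int is a negated variable. Returns True iff the
    formula is satisfiable (illustrative sketch)."""
    clauses = [set(c) for c in clauses]
    # Unit propagation: a one-literal clause forces that literal true.
    while True:
        units = [next(iter(c)) for c in clauses if len(c) == 1]
        if not units:
            break
        lit = units[0]
        new = []
        for c in clauses:
            if lit in c:
                continue             # clause satisfied, drop it
            if -lit in c:
                c = c - {-lit}       # literal falsified, shrink clause
                if not c:
                    return False     # empty clause: conflict, backtrack
            new.append(c)
        clauses = new
    if not clauses:
        return True                  # every clause satisfied
    # Branch: try a literal from the first clause, then its negation.
    lit = next(iter(clauses[0]))
    return (dpll([list(c) for c in clauses] + [[lit]])
            or dpll([list(c) for c in clauses] + [[-lit]]))
```

The `or` of the two recursive calls is exactly the tree search whose raw performance the abstract compares between CPU and GPU.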

