optimized code
Recently Published Documents


TOTAL DOCUMENTS: 93 (last five years: 20)
H-INDEX: 14 (last five years: 2)

2021 ◽  
Author(s):  
Alexander T. Leighton ◽  
Yun William Yu

Electronic health records (EHR) are often siloed across a network of hospitals, but researchers may wish to perform aggregate count queries on those records in their entirety, e.g., how many patients have diabetes? Prior work has established a strong approach to answering these queries in the form of probabilistic sketching algorithms like LogLog and HyperLogLog; however, it has remained an open question how these algorithms should be made truly private. While many works in the computational biology community, as well as the computer science community at large, have attempted to solve this problem using differential privacy, these methods involve adding noise and still reveal a non-trivial amount of information. Here, we prototype a new protocol using fully homomorphic encryption that is trivially secure even in the setting of quantum-capable adversaries, as it reveals no information other than what can be trivially gained from the final numerical estimate. Simulating up to 16 parties on a single CPU thread takes no longer than 20 minutes to return an estimate with an expected 6% approximation error; furthermore, the protocol is parallelizable across both parties and cores, so, in practice, with optimized code, we might expect sub-minute processing time for each party.
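
As an illustration of the sketching side of such a protocol, the minimal Python sketch below builds plain (unencrypted) HyperLogLog registers per site and merges them by element-wise maximum; the paper's contribution is performing this kind of merge-and-estimate step under fully homomorphic encryption, which is not reproduced here. Parameter choices and function names are illustrative, not taken from the paper.

```python
import hashlib

def _hash64(item: str) -> int:
    """Stable 64-bit hash of an item (e.g., a pseudonymous patient identifier)."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")

def hll_sketch(items, p=10):
    """Build HyperLogLog registers (m = 2**p of them) for a stream of items."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        h = _hash64(item)
        idx = h >> (64 - p)                      # first p bits select the register
        rest = h & ((1 << (64 - p)) - 1)         # remaining bits
        rank = (64 - p) - rest.bit_length() + 1  # position of the leftmost 1-bit
        registers[idx] = max(registers[idx], rank)
    return registers

def hll_estimate(registers):
    """Standard HyperLogLog estimate (bias corrections omitted for brevity)."""
    m = len(registers)
    alpha = 0.7213 / (1 + 1.079 / m)
    return alpha * m * m / sum(2.0 ** -r for r in registers)

# Two hospitals sketch their own patient lists; registers merge by element-wise
# maximum, so the union count is estimated without pooling raw records.
site_a = hll_sketch(f"patient-{i}" for i in range(50_000))
site_b = hll_sketch(f"patient-{i}" for i in range(30_000, 80_000))
merged = [max(a, b) for a, b in zip(site_a, site_b)]
print(round(hll_estimate(merged)))  # roughly 80,000 distinct patients
```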


2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-30
Author(s):  
Michael D. Brown ◽  
Matthew Pruett ◽  
Robert Bigelow ◽  
Girish Mururu ◽  
Santosh Pande

Despite extensive testing and correctness certification of their functional semantics, a number of compiler optimizations have been shown to violate security guarantees implemented in source code. While prior work has shed light on how such optimizations may introduce semantic security weaknesses into programs, there remains a significant knowledge gap concerning the impacts of compiler optimizations on non-semantic properties with security implications. In particular, little is currently known about how code generation and optimization decisions made by the compiler affect the availability and utility of the reusable code segments, called gadgets, required for implementing code reuse attack methods such as return-oriented programming. In this paper, we bridge this gap through a study of the impacts of compiler optimization on code reuse gadget sets. We analyze and compare 1,187 variants of 20 different benchmark programs built with two production compilers (GCC and Clang) to determine how their optimization behaviors affect the code reuse gadget sets present in program variants, with respect to both quantitative and qualitative metrics. Our study exposes an important and unexpected problem: compiler optimizations introduce new gadgets at a high rate and produce code containing gadget sets that are generally more useful to an attacker than those in unoptimized code. Using differential binary analysis, we identify several undesirable behaviors at the root of this phenomenon. In turn, we propose and evaluate several strategies to mitigate these behaviors. In particular, we show that post-production binary recompilation can effectively mitigate these behaviors with negligible performance impact, resulting in optimized code with significantly smaller and less useful gadget sets.
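
To make the notion of a "gadget set" concrete, the deliberately naive Python sketch below counts ret-terminated byte windows in two builds of the same (synthetic) code and reports the patterns introduced by optimization. Real gadget finders disassemble the binary properly; the byte strings and constants here are placeholders, not data from the study.

```python
from collections import Counter

RET = 0xC3           # x86-64 'ret' opcode
MAX_GADGET_LEN = 5   # bytes kept before each ret

def naive_gadgets(code: bytes) -> Counter:
    """Count ret-terminated byte windows as stand-in 'gadgets' (no disassembly)."""
    gadgets = Counter()
    for i, byte in enumerate(code):
        if byte == RET:
            start = max(0, i - MAX_GADGET_LEN)
            gadgets[code[start:i + 1]] += 1
    return gadgets

# Tiny synthetic stand-ins for the .text sections of an unoptimized and an
# optimized build; in a real study these would be read from the compiled binaries.
unoptimized = bytes.fromhex("5548 89e5 8b45 fc5d c390 5548 89e5 5dc3")
optimized = bytes.fromhex("8b07 c366 90f3 0f1e fa5f c3")

introduced = set(naive_gadgets(optimized)) - set(naive_gadgets(unoptimized))
print(f"{len(introduced)} gadget byte pattern(s) appear only in the optimized build")
```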


2021 ◽  
Vol 15 ◽  
Author(s):  
Lea Steffen ◽  
Robin Koch ◽  
Stefan Ulbrich ◽  
Sven Nitzsche ◽  
Arne Roennau ◽  
...  

Animal brains still outperform even the most performant machines despite operating at significantly lower speeds. Nonetheless, impressive progress has been made in robotics in the areas of vision, motion planning, and path planning over the last decades. Brain-inspired Spiking Neural Networks (SNN) and the parallel hardware necessary to exploit their full potential have promising features for robotic applications. Besides brain-inspired neuromorphic hardware, the most obvious platform for deploying SNN, Graphics Processing Units (GPU) are also well suited to parallel computing. Libraries for generating CUDA-optimized code, such as GeNN, and affordable embedded systems make GPUs an attractive alternative due to their low price and availability. While a few performance tests exist, there has been a lack of benchmarks targeting robotic applications. We compare the performance of a neural Wavefront algorithm, as a representative of use cases in robotics, on different hardware suitable for running SNN simulations. The SNN used for this benchmark is modeled in the simulator-independent declarative language PyNN, which allows using the same model for different simulator backends. Our emphasis is the comparison between Nest, running on a serial CPU, SpiNNaker, as a representative of neuromorphic hardware, and an implementation in GeNN. Beyond that, we also investigate the differences between GeNN deployments on different hardware. The different simulators and hardware are compared with regard to total simulation time, average energy consumption per run, and the length of the resulting path. We hope that the insights gained about the performance details of parallel hardware solutions contribute to developing more efficient SNN implementations for robotics.
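
For readers unfamiliar with the benchmark workload, the sketch below implements the classical grid-based wavefront planner in plain Python; the paper's contribution is running a spiking-neuron analogue of this expansion on different SNN backends (PyNN/Nest/SpiNNaker/GeNN), which is not reproduced here.

```python
from collections import deque

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def wavefront(grid, goal):
    """Classical wavefront expansion on an occupancy grid (0 = free, 1 = obstacle).
    Returns a cost map holding each free cell's step distance to the goal, the
    quantity a spiking 'wave' of activations propagates in the neural variant."""
    rows, cols = len(grid), len(grid[0])
    cost = [[None] * cols for _ in range(rows)]
    cost[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and cost[nr][nc] is None:
                cost[nr][nc] = cost[r][c] + 1
                queue.append((nr, nc))
    return cost

def extract_path(cost, start):
    """Greedily descend the cost map from the start cell to the goal."""
    path, (r, c) = [start], start
    while cost[r][c] != 0:
        candidates = [(r + dr, c + dc) for dr, dc in MOVES
                      if 0 <= r + dr < len(cost) and 0 <= c + dc < len(cost[0])
                      and cost[r + dr][c + dc] is not None]
        r, c = min(candidates, key=lambda p: cost[p[0]][p[1]])
        path.append((r, c))
    return path

grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
cost = wavefront(grid, goal=(2, 3))
print(extract_path(cost, start=(0, 0)))  # shortest 4-connected path around the wall
```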


2021 ◽  
pp. 1-11
Author(s):  
Thanh-Trung DO ◽  
Viet-Hung Vu ◽  
Zhaoheng Liu

Abstract A new symbolic differentiation algorithm is proposed in this paper to automatically generate the inverse dynamics of flexible joint robots in symbolic form, and the results obtained can be used in real-time applications. The proposed method, with 𝒪(n) computational complexity, is developed based on the recursive Newton-Euler algorithm, the chain rule of differentiation, and a computer algebra system. The input of the proposed algorithm consists of symbolic matrices describing the kinematic and dynamic parameters of the robot. The output is the inverse dynamics solution written in portable and optimized code (C-code/Matlab-code). An exemplary numerical simulation of the inverse dynamics of the Kuka LWR4 robot with seven flexible joints is conducted in Matlab, in which the computational time per cycle of inverse dynamics is about 0.02 milliseconds. The numerical example matches existing methods very closely while requiring much less computation time and complexity.
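
The flavor of the approach can be conveyed with a one-joint toy model. The SymPy sketch below symbolically eliminates the motor coordinate of a single flexible joint (Spong's model), applies the chain rule of differentiation, and emits C code for the resulting torque; it is a hand-rolled illustration under those assumptions, not the authors' 𝒪(n) recursive Newton-Euler algorithm, and all symbol names are placeholders.

```python
import sympy as sp

t = sp.symbols("t")
Il, Im, K, m, g, l = sp.symbols("I_l I_m K m g l", positive=True)
q = sp.Function("q")(t)  # desired link trajectory

# link : I_l*q'' + m*g*l*sin(q) + K*(q - theta) = 0  -> eliminate theta symbolically
# motor: I_m*theta'' + K*(theta - q) = tau           -> differentiate via the chain rule
theta = q + (Il * sp.diff(q, t, 2) + m * g * l * sp.sin(q)) / K
tau = sp.simplify(Im * sp.diff(theta, t, 2) + K * (theta - q))

# Replace time derivatives of q with plain symbols (q, qd, ..., qdddd) so the
# torque can be emitted as C code taking the trajectory and its derivatives.
qs = sp.symbols("q qd qdd qddd qdddd")
replacements = [(sp.diff(q, t, k), qs[k]) for k in range(4, 0, -1)] + [(q, qs[0])]
tau_flat = tau.subs(replacements)

print(sp.ccode(tau_flat, assign_to="tau"))  # portable C expression for the joint torque
```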


2021 ◽  
Author(s):  
Pengcheng Guo ◽  
Fengfan Yang ◽  
Chunli Zhao ◽  
Waheed Ullah

Abstract This paper proposes a distributed RS coding scheme composed of two different Reed-Solomon (RS) codes over a fast Rayleigh fading channel. In practically any distributed coding scheme, an appropriate encoding strategy at the relay plays a vital role in achieving an optimized code at the destination. Therefore, the authors propose an efficient approach for the proper selection of information at the relay based on a subspace approach. Using this approach as a benchmark, another, more practical, low-complexity selection approach is also proposed. Monte Carlo simulations demonstrate that the distributed RS coding scheme under the two approaches achieves nearly the same bit error rate (BER) performance. Furthermore, to jointly decode the source and relay codes at the destination, two different decoding algorithms, referred to as the naive and smart algorithms, are proposed. The simulation results reveal the advantage of the smart algorithm compared to the naive one. The proposed distributed RS coding scheme with the smart algorithm outperforms its non-cooperative counterpart by a gain of 2.4-3.2 dB under identical conditions. Moreover, the proposed distributed RS coding scheme outperforms multiple existing distributed coding schemes, making it an excellent candidate for future distributed coding in wireless communications.
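
A purely structural toy of the scheme is sketched below using the reedsolo package: the source protects its message with one RS code, and the relay re-encodes a subset of the information symbols with a second RS code. The subset selection here is a trivial placeholder for the paper's subspace-based criterion, the channel is omitted, and the joint naive/smart decoding at the destination is not reproduced.

```python
from reedsolo import RSCodec

K1, NSYM1 = 16, 8   # source code C1: 16 information bytes + 8 parity bytes
K2, NSYM2 = 8, 8    # relay  code C2: 8 selected bytes    + 8 parity bytes

c1, c2 = RSCodec(NSYM1), RSCodec(NSYM2)

message = bytes(range(K1))            # source information symbols
source_codeword = c1.encode(message)  # broadcast to relay and destination

# The relay is assumed to have decoded correctly; it forwards a re-encoded subset
# of the information symbols. Taking the first half is only a placeholder for the
# paper's subspace-based selection criterion.
selected = message[:K2]
relay_codeword = c2.encode(selected)

# The destination would now jointly decode (source_codeword, relay_codeword);
# the paper's naive and smart algorithms differ in how they exploit this overlap.
print(len(source_codeword), len(relay_codeword))  # 24 and 16 bytes
```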


2021 ◽  
Vol 19 (3) ◽  
pp. 413-420
Author(s):  
Oscar Jesus Castro ◽  
Ines Fernando Vega

Author(s):  
Markus Holzer ◽  
Martin Bauer ◽  
Harald Köstler ◽  
Ulrich Rüde

A high-performance implementation of a multiphase lattice Boltzmann method based on the conservative Allen-Cahn model, supporting high density ratios and high Reynolds numbers, is presented. Meta-programming techniques are used to automatically generate optimized code for CPUs and GPUs. The coupled model is specified in a high-level symbolic description and optimized through automatic transformations. The memory footprint of the resulting algorithm is reduced through the fusion of compute kernels. A roofline analysis demonstrates the excellent efficiency of the generated code on a single GPU. The resulting single-GPU code has been integrated into the multiphysics framework waLBerla to run massively parallel simulations on large domains. Communication hiding and GPUDirect-enabled MPI yield near-perfect scaling behavior. Scaling experiments are conducted on the Piz Daint supercomputer with up to 2048 GPUs, simulating several hundred fully resolved bubbles. Further, validation of the implementation is shown in a physically relevant scenario: a three-dimensional rising air bubble in water.
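
The memory-footprint argument behind kernel fusion can be illustrated with a plain NumPy stencil on a simplified Allen-Cahn-style relaxation (not the paper's conservative, hydrodynamically coupled model): the unfused version materializes an intermediate field, while the fused version evaluates everything in one expression. NumPy itself still creates temporaries; the pay-off arises in generated GPU kernels, where the fused expression becomes a single pass over memory.

```python
import numpy as np

def laplacian(phi):
    """Second-order central-difference Laplacian with periodic boundaries."""
    return (np.roll(phi, 1, 0) + np.roll(phi, -1, 0) +
            np.roll(phi, 1, 1) + np.roll(phi, -1, 1) - 4.0 * phi)

def step_unfused(phi, dt=1e-3, eps=0.05):
    # Two logical kernels: the first materializes an intermediate field that the
    # second reads back, which is the extra memory traffic fusion removes.
    lap = laplacian(phi)
    dwell = 2.0 * phi * (1.0 - phi) * (1.0 - 2.0 * phi)  # double-well derivative
    return phi + dt * (eps * lap - dwell)

def step_fused(phi, dt=1e-3, eps=0.05):
    # One fused kernel: everything is evaluated in a single expression, so the
    # generated GPU code needs only one sweep over the field.
    return phi + dt * (eps * laplacian(phi)
                       - 2.0 * phi * (1.0 - phi) * (1.0 - 2.0 * phi))

phi = np.random.default_rng(0).random((256, 256))
assert np.allclose(step_unfused(phi), step_fused(phi))  # same result, fewer sweeps
```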


2021 ◽  
pp. 3-23
Author(s):  
Philip Munksgaard ◽  
Svend Lund Breddam ◽  
Troels Henriksen ◽  
Fabian Cristian Gieseke ◽  
Cosmin Oancea

Abstract Functional languages allow rewrite-rule systems that aggressively generate a multitude of semantically equivalent but differently optimized code versions. In the context of GPGPU execution, this paper addresses the important question of how to compose these code versions into a single program that (near-)optimally discriminates between them across different datasets. Rather than aiming at a general autotuning framework reliant on stochastic search, we argue that in some cases a more effective solution can be obtained by customizing the tuning strategy for the compiler transformation producing the code versions. We present a simple and highly composable strategy which requires that the (dynamic) program property used to discriminate between code versions conforms to a certain monotonicity assumption. When this assumption holds, our strategy guarantees that if an optimal solution exists, it will be found. If an optimal solution does not exist, our strategy produces human-tractable and deterministic results that provide insight into what went wrong and how it can be fixed. We apply our tuning strategy to the incremental-flattening transformation supported by the publicly available Futhark compiler and compare it with a previous black-box tuning solution that uses the popular OpenTuner library. We demonstrate the feasibility of our solution on a set of standard datasets of real-world applications and public benchmark suites, such as Rodinia and FinPar. We show that our approach shortens the tuning time by a factor of 6× on average and, more importantly, in five out of eleven cases it produces programs that are up to 10× faster than the ones produced by the OpenTuner-based technique.
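
The essence of tuning under a monotonicity assumption can be sketched generically: if "version A beats version B" flips from false to true at most once as the dataset grows, the crossover can be located by bisection over measured sizes. The Python below is a schematic illustration with made-up cost models, not the Futhark/incremental-flattening machinery.

```python
def pick_threshold(sizes, a_wins):
    """Smallest measured dataset size at which code version A wins, assuming the
    predicate a_wins(n) is monotone in n (False up to some point, then True)."""
    sizes = sorted(sizes)
    lo, hi = 0, len(sizes)
    while lo < hi:                      # classic bisection over the monotone predicate
        mid = (lo + hi) // 2
        if a_wins(sizes[mid]):
            hi = mid
        else:
            lo = mid + 1
    return sizes[lo] if lo < len(sizes) else None   # None: A never wins

# Stand-in cost models (hypothetical): version A has a high fixed overhead but
# better scaling; version B starts cheap but scales worse.
time_a = lambda n: 5_000 + 0.5 * n
time_b = lambda n: 2.0 * n

sizes = [1_000, 2_000, 4_000, 8_000, 16_000, 32_000]
threshold = pick_threshold(sizes, lambda n: time_a(n) <= time_b(n))
print(f"run version A for dataset sizes >= {threshold}")   # 4000 with these models
```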


Author(s):  
Jun Zhang ◽  
Tian Lu

The evaluation of the molecular electrostatic potential (ESP) is a performance bottleneck for many computational chemistry tasks such as RESP charge fitting or QM/MM simulations. In this paper, an efficient algorithm for...
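
Since the abstract is truncated here, the sketch below only illustrates the classical quantity involved: the ESP evaluated from point charges at a set of probe points (the expression RESP fitting reproduces), vectorized with NumPy broadcasting. It is not the paper's algorithm.

```python
import numpy as np

def esp_from_point_charges(probe_points, charge_positions, charges):
    """V(r) = sum_i q_i / |r - R_i| in atomic units, evaluated at all probe
    points at once via broadcasting (the quantity RESP fitting reproduces)."""
    diff = probe_points[:, None, :] - charge_positions[None, :, :]  # (P, N, 3)
    dist = np.linalg.norm(diff, axis=-1)                            # (P, N)
    return (charges[None, :] / dist).sum(axis=1)                    # (P,)

# Toy system: a crude dipole of two partial charges, probed at two points (Bohr).
positions = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
charges = np.array([+0.5, -0.5])
probes = np.array([[0.0, 0.0, 3.0], [2.0, 0.0, 0.5]])
print(esp_from_point_charges(probes, positions, charges))
```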

