Exploiting Weak Diffusion of Gimli: Improved Distinguishers and Preimage Attacks

The Gimli permutation proposed in CHES 2017 was designed for cross-platform performance. One main strategy to achieve such a goal is to utilize a sparse linear layer (Small-Swap and Big-Swap), which occurs every two rounds. In addition, the round constant addition occurs every four rounds and only one 32-bit word is affected by it. The above two facts have been recently exploited to construct a distinguisher for the full Gimli permutation with time complexity 2^64. By utilizing a new property of the SP-box, we demonstrate that the time complexity of the full-round distinguisher can be further reduced to 2^52 while a significant bias still remains. Moreover, for the 18-round Gimli permutation, we could construct a distinguisher even with only 2 queries. Apart from the permutation itself, the weak diffusion can also be utilized to accelerate the preimage attacks on reduced Gimli-Hash and Gimli-XOF-128 with a divide-and-conquer method. As a consequence, the preimage attacks on reduced Gimli-Hash and Gimli-XOF-128 can reach up to 5 rounds and 9 rounds, respectively. Since Gimli is included in the second round candidates in NIST’s Lightweight Cryptography Standardization process, we expect that our analysis can further advance the understanding of Gimli. To the best of our knowledge, the distinguishing attacks and preimage attacks are the best so far.


An Efficient Multibody Divide and Conquer Algorithm

Volume 6: 5th International Conference on Multibody Systems, Nonlinear Dynamics, and Control, Parts A, B, and C ◽

10.1115/detc2005-84546 ◽

2005 ◽

Cited By ~ 1

Author(s):

James H. Critchley

Keyword(s):

Multibody Dynamics ◽

Time Complexity ◽

Future Generation ◽

Parallel Computers ◽

Parallel Processors ◽

Divide And Conquer ◽

Parallel Computer ◽

Divide And Conquer Algorithm ◽

Computer Resources ◽

Theoretical Minimum

A new and efficient form of Featherstone’s multibody Divide and Conquer Algorithm (DCA) is presented. The DCA was the first algorithm to achieve theoretically optimal logarithmic time complexity with a theoretical minimum of parallel computer resources for general problems of multibody dynamics; however, the DCA is extremely inefficient in the presence of small to modest parallel computers. The new efficient DCA approach (DCAe) demonstrates that large DCA subsystems can be constructed using fast sequential techniques and realize substantial speed increases in the presence of as few as two parallel processors. Previously the DCA was a tool intended for a future generation of parallel computers; this enhanced version promises practical and competitive performance with the parallel computers of today.


Distributed Operational Space Formulation of Serial Manipulators

Journal of Computational and Nonlinear Dynamics ◽

10.1115/1.4025577 ◽

2013 ◽

Vol 9 (2) ◽

Cited By ~ 2

Author(s):

Kishor D. Bhalerao ◽

James Critchley ◽

Denny Oetomo ◽

Roy Featherstone ◽

Oussama Khatib

Keyword(s):

Parallel Algorithms ◽

Time Complexity ◽

Degrees Of Freedom ◽

Null Space ◽

Divide And Conquer ◽

Serial Manipulators ◽

Divide And Conquer Algorithm ◽

Operational Space ◽

Space Formulation ◽

Space Dynamics

This paper presents a new parallel algorithm for the operational space dynamics of unconstrained serial manipulators, which outperforms contemporary sequential and parallel algorithms in the presence of two or more processors. The method employs a hybrid divide and conquer algorithm (DCA) multibody methodology which brings together the best features of the DCA and fast sequential techniques. The method achieves a logarithmic time complexity, O(log(n)), in the number of degrees of freedom (n) for computing the operational space inertia (Λ_e) of a serial manipulator in the presence of O(n) processors. The paper also addresses the efficient sequential and parallel computation of the dynamically consistent generalized inverse (J̄_e) of the task Jacobian, the associated null space projection matrix (N_e), and the joint actuator forces (τ_null) which only affect the manipulator posture. The sequential algorithms for computing J̄_e, N_e, and τ_null are of O(n), O(n^2), and O(n) computational complexity, respectively, while the corresponding parallel algorithms are of O(log(n)), O(n), and O(log(n)) time complexity in the presence of O(n) processors.
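
For orientation, the quantities named in this abstract are commonly defined as below in the operational space literature. This is a sketch under assumptions, not the paper's own derivation: the symbols M (joint-space inertia matrix), J_e (task Jacobian), and τ_0 (an arbitrary joint torque vector) do not appear in the abstract and are introduced here for illustration only.

```latex
\begin{align}
\Lambda_e &= \left( J_e M^{-1} J_e^{T} \right)^{-1}
  && \text{operational space inertia} \\
\bar{J}_e &= M^{-1} J_e^{T} \Lambda_e
  && \text{dynamically consistent generalized inverse} \\
N_e &= I - \bar{J}_e J_e
  && \text{null space projection matrix} \\
\tau_{\text{null}} &= N_e^{T} \tau_0
  && \text{joint forces affecting only the posture}
\end{align}
```

Under these assumed definitions, N_e is an n-by-n matrix, which is consistent with its O(n^2) sequential cost quoted in the abstract: every entry has to be written at least once.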


Status report on the first round of the NIST lightweight cryptography standardization process

10.6028/nist.ir.8268 ◽

2019 ◽

Cited By ~ 4

Author(s):

Meltem Sönmez Turan ◽

Kerry A McKay ◽

Çağdaş Çalık ◽

Donghoon Chang ◽

Larry Bassham

Keyword(s):

Status Report ◽

Lightweight Cryptography ◽

Standardization Process


Binary Search In Linked List

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d7296.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 2684-2686

Keyword(s):

Time Complexity ◽

Binary Search ◽

Divide And Conquer ◽

Linear Search ◽

Linked List

This paper presents an approach to implementing binary search on a linked list. Binary search is a divide-and-conquer approach to finding an element in a sorted list. Binary search can be performed on a linked list, but it then has O(n) time complexity, the same as linear search, which makes it inefficient there. The main problem is that a linked list does not support indexing, so reaching any given element requires an O(n) traversal. In this paper, a method is implemented through which binary search can be done with a time complexity of O(log_2 n). This is done with the help of an auxiliary array, which indexes the linked list so that any node can be reached in O(1) time, reducing the complexity of binary search to O(log_2 n) and hence increasing the efficiency of binary search in linked
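
The auxiliary-array idea described in this abstract can be sketched as follows. This is a minimal illustration and not the paper's implementation, whose details the abstract does not give; the names `Node`, `build_list`, `build_index`, and `binary_search` are hypothetical. Note that building the index is itself a one-time O(n) pass, so the O(log_2 n) bound applies only to each search performed afterwards.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Singly linked list node (hypothetical helper type)."""
    value: int
    next: Optional["Node"] = None


def build_list(values: List[int]) -> Optional[Node]:
    """Build a linked list from values, preserving their order."""
    head: Optional[Node] = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def build_index(head: Optional[Node]) -> List[Node]:
    """Collect references to every node in an auxiliary array.

    This is a single O(n) traversal; afterwards index[i] reaches
    the i-th node in O(1).
    """
    index: List[Node] = []
    node = head
    while node is not None:
        index.append(node)
        node = node.next
    return index


def binary_search(index: List[Node], target: int) -> Optional[Node]:
    """Binary search over the auxiliary array of a sorted linked list.

    Returns the matching node, or None if target is absent.
    Each probe is O(1), so the search takes O(log n) steps.
    """
    lo, hi = 0, len(index) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if index[mid].value == target:
            return index[mid]
        if index[mid].value < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```

The trade-off in this sketch is O(n) extra memory for the array of node references, and the index has to be rebuilt or updated whenever nodes are inserted or removed.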


Pyjamask: Block Cipher and Authenticated Encryption with Highly Efficient Masked Implementation

IACR Transactions on Symmetric Cryptology ◽

10.46586/tosc.v2020.is1.31-59 ◽

2020 ◽

pp. 31-59

Author(s):

Dahmun Goudarzi ◽

Jérémy Jean ◽

Stefan Kölbl ◽

Thomas Peyrin ◽

Matthieu Rivain ◽

...

Keyword(s):

Block Cipher ◽

High Order ◽

Side Channel ◽

Design Rationale ◽

Authenticated Encryption ◽

Lightweight Cryptography ◽

Standardization Process ◽

Very High ◽

Better Than ◽

Provably Secure

This paper introduces Pyjamask, a new block cipher family and authenticated encryption proposal submitted to the NIST lightweight cryptography standardization process. Pyjamask targets side-channel resistance as one of its main goals. More precisely, it strongly minimizes the number of nonlinear gates used in its internal primitive in order to allow efficient masked implementations, especially for high-order masking in software. Compared to other block ciphers, our proposal thus has among the smallest number of binary AND computations per input bit at the time of writing. Even though Pyjamask minimizes such an important criterion, it remains rather lightweight and efficient, thanks to a general bitslice construction that enables the computation of all nonlinear gates in parallel. For authenticated encryption, we adopt the provably secure AEAD mode OCB, which has been extensively studied and has the benefit of offering full parallelization. Of course, other block cipher-based modes can be considered as well if other performance profiles are to be targeted. The paper first gives the specification of the Pyjamask block cipher and the associated AEAD proposal. We also provide a detailed design rationale for the block cipher which is guided by our aim of software efficiency in the presence of high-order masking. The security of the design is analyzed against most commonly known cryptanalysis techniques. We finally describe efficient (masked) implementations in software and provide implementation results with aggressive performances for masking of very high orders (up to 128). We also provide a rough estimation of the hardware performances which remain much better than those of an AES round-based implementation.


On Parallel Methods of Multibody Dynamics

Volume 5: 19th Biennial Conference on Mechanical Vibration and Noise, Parts A, B, and C ◽

10.1115/detc2003/vib-48317 ◽

2003 ◽

Cited By ~ 1

Author(s):

James H. Critchley ◽

Kurt S. Anderson

Keyword(s):

Time Complexity ◽

Multibody System ◽

Practical Importance ◽

Divide And Conquer ◽

Optimal Time ◽

Worst Case ◽

Parallel Methods ◽

Divide And Conquer Algorithm ◽

Coordinate Reduction ◽

Parallel Arrays

Optimal time efficient parallel computation methods for large multibody system dynamics are defined and investigated in detail. Comparative observations are made which demonstrate significant deficiencies in operating regions of practical importance and a new parallel algorithm is generated to address them. The new method of Recursive Coordinate Reduction Parallelism (RCRP) outperforms or directly reduces to the fastest general multibody algorithms available for small parallel resources and obtains O(log_k(n)) time complexity in the presence of larger parallel arrays. Performance of this method relative to the Divide and Conquer Algorithm is illustrated with an operations count for the worst case of a multibody chain system.


Hardware Benchmarking of Round 2 Candidates in the NIST Lightweight Cryptography Standardization Process

10.23919/date51398.2021.9473930 ◽

2021 ◽

Author(s):

Kamyar Mohajerani ◽

Richard Haeussler ◽

Rishub Nagpal ◽

Farnoud Farahmand ◽

Abubakr Abdulgadir ◽

...

Keyword(s):

Lightweight Cryptography ◽

Standardization Process


An Efficient Multibody Divide and Conquer Algorithm and Implementation

Journal of Computational and Nonlinear Dynamics ◽

10.1115/1.3079823 ◽

2009 ◽

Vol 4 (2) ◽

Cited By ~ 10

Author(s):

James H. Critchley ◽

Kurt S. Anderson ◽

Adarsh Binani

Keyword(s):

Multibody Dynamics ◽

Time Complexity ◽

Future Generation ◽

Parallel Computers ◽

Divide And Conquer ◽

Parallel Computer ◽

Recursive Method ◽

Divide And Conquer Algorithm ◽

Computer Resources ◽

Theoretical Minimum

A new and efficient form of Featherstone’s multibody divide and conquer algorithm (DCA) is presented and evaluated. The DCA was the first algorithm to achieve theoretically the optimal logarithmic time complexity with a theoretical minimum of parallel computer resources for general problems of multibody dynamics; however, the DCA is extremely inefficient in the presence of small to modest parallel computers. This alternative efficient DCA (DCAe) approach demonstrates that large DCA subsystems can be constructed using fast sequential techniques to realize a substantial increase in speed. The usefulness of the DCAe is directly demonstrated in an application to a four processor workstation and compared with the results from the original DCA and a fast sequential recursive method. Previously the DCA was a tool intended for a future generation of parallel computers; this enhanced version delivers practical and competitive performance with the parallel computers of today.


Security analysis of Subterranean 2.0

Designs Codes and Cryptography ◽

10.1007/s10623-021-00892-6 ◽

2021 ◽

Author(s):

Ling Song ◽

Yi Tu ◽

Danping Shi ◽

Lei Hu

Keyword(s):

Security Analysis ◽

Differential Analysis ◽

Round Function ◽

State Recovery ◽

Linear Layer ◽

Non Linear ◽

Standardization Process ◽

Open Question ◽

First Time ◽

Bias Evaluation

Subterranean 2.0 is a cipher suite that can be used for hashing, authenticated encryption, MAC computation, etc. It was designed by Daemen, Massolino, Mehrdad, and Rotella, and has been selected as a candidate in the second round of NIST’s lightweight cryptography standardization process. Subterranean 2.0 is a duplex-based construction and utilizes a single-round permutation in the duplex. It is the simplicity of the round function that makes it an attractive target of cryptanalysis. In this paper, we examine the single-round permutation in various phases of Subterranean 2.0 and specify three related attack scenarios that deserve further investigation: keystream biases in the keyed squeezing phase, state collisions in the keyed absorbing phase, and one-round differential analysis in the nonce-misuse setting. To facilitate cryptanalysis in the first two scenarios, we propose a set of size-reduced toy versions of Subterranean 2.0: Subterranean-m. Then we make an observation for the first time on the resemblance between the non-linear layer in the round function of Subterranean 2.0 and SIMON’s round function. Inspired by the existing work on SIMON, we propose explicit formulas for computing the exact correlation of linear trails of Subterranean 2.0 and other ciphers utilizing similar non-linear operations. We then construct our models for searching trails to be used in the keystream bias evaluation and state collision attacks. Our results show that most instances of Subterranean-m are secure in the first two attack scenarios but there exist instances that are not. Further, we find a flaw in the designers’ reasoning about Subterranean 2.0’s linear bias but support the designers’ claim that there is no linear bias measurable from at most 2^96 data blocks. Due to the time-consuming search, the security of Subterranean 2.0 against the state collision attack in keyed modes still remains an open question. Finally, we observe that one-round differentials allow the recovery of state bits in the nonce-misuse setting. By proposing nested one-round differentials, we obtain a sufficient number of state bits, leading to a practical state recovery with only 20 repetitions of the nonce and 88 blocks of data. It is noted that our work does not threaten the security of Subterranean 2.0.


Design and Implementation of an Efficient Multibody Divide and Conquer Algorithm

Volume 5: 6th International Conference on Multibody Systems, Nonlinear Dynamics, and Control, Parts A, B, and C ◽

10.1115/detc2007-35128 ◽

2007 ◽

Cited By ~ 2

Author(s):

James H. Critchley ◽

Adarsh Binani ◽

Kurt Anderson

Keyword(s):

Time Complexity ◽

Future Generation ◽

Parallel Computers ◽

Divide And Conquer ◽

Parallel Computer ◽

Recursive Method ◽

Design And Implementation ◽

Divide And Conquer Algorithm ◽

Computer Resources ◽

Theoretical Minimum

A new and efficient form of Featherstone’s multibody Divide and Conquer Algorithm (DCA) is presented. The DCA was the first algorithm to achieve theoretically optimal logarithmic time complexity with a theoretical minimum of parallel computer resources for general problems of multibody dynamics; however, the DCA is extremely inefficient in the presence of small to modest parallel computers. This alternative efficient DCA approach (DCAe) demonstrates that large DCA subsystems can be constructed using fast sequential techniques to realize a substantial increase in speed. The usefulness of the DCAe is directly demonstrated in an application to a four-processor workstation and compared with results from the original DCA and a fast sequential recursive method. Previously the DCA was a tool intended for a future generation of parallel computers; this enhanced version delivers practical and competitive performance with the parallel computers of today.
