Self-Spectre, Write-Execute and the Hidden State

Abstract The recent Meltdown and Spectre vulnerabilities have highlighted a real and present threat in on-chip memory cache units, which can ultimately provide a hidden state, albeit one readable only via memory timing instructions [Kocher, P., Genkin, D., Gruss, D., Haas, W., Hamburg, M., Lipp, M., Mangard, S., Prescher, T., Schwarz, M., Yarom, Y.: Spectre attacks: Exploiting speculative execution, CoRR, abs/1801.01203, 2018]. The exploits, although somewhat complex and slow, are demonstrably reliable on nearly all processors produced in the last two decades. As the large microprocessor companies move to close this security vulnerability, an interesting question arises when one looks beyond it as strictly a means of reading protected memory: could the inherent design of the processor make it possible to hide arbitrary calculations in this speculative and parallel side channel? Even without using protected memory or exploiting the vulnerability, which has been the focus so far, there could very well be a whole class of techniques that exploit the side channel. Such behavior could be largely unpreventable, since the technology would either become self-defeating or require a more complicated and expensive on-chip cache memory system that properly cleans itself after speculation. The ability to train the branch predictor to speculate incorrectly is almost certain given hardware limitations, and thus provides exactly this pathway. A novel approach looks at just how much computation can be done speculatively, with the result stored via indirect reads and made available through the memory cache. A multi-threaded approach can allow a multi-stage computation pipeline in which each computation is passed to a read-out thread and then to the next computation thread [Swanson, S., McDowell, L. K., Swift, M. M., Eggers, S. J., Levy, H. M.: An evaluation of speculative instruction execution on simultaneous multithreaded processors, ACM Trans. Comput. Syst. 21 (2003), 314–340]. Through channels like this, an application can surreptitiously perform arbitrary calculations, or even leak data, without any standard tracing tool being capable of monitoring the subtle changes. As in a variation of the Heisenberg uncertainty principle from physics, even a tool capable of reading the cache state would not only be incredibly inefficient but would also tamper with and modify that state. Tools such as in-circuit emulators or specially designed cache emulators would be needed to unmask the speculative reads, which are further difficult to visualize on a linear timeline. Specifically, the AES and RSA algorithms are studied with respect to these ideas, looking at success rates for various calculation batches under speculative execution, together with a summary view of the rather severe performance penalties of such methods. Either approach could provide strong white-box cryptography when considering a binary, non-source-code form. In terms of white-box methods, both could make it significantly challenging to locate the code or deduce its inner workings. Further, both methods can easily and surreptitiously leak or hide data within shared memory in a seemingly innocuous manner.

Low-Process–Voltage–Temperature-Sensitivity Multi-Stage Timing Monitor for System-on-Chip Applications

Electronics ◽

10.3390/electronics10131587 ◽

2021 ◽

Vol 10 (13) ◽

pp. 1587

Author(s):

Duo Sheng ◽

Hsueh-Ru Lin ◽

Li Tai

Keyword(s):

High Performance ◽

Power Reduction ◽

System On Chip ◽

Timing Information ◽

Multi Stage ◽

Dynamic Voltage ◽

And Performance ◽

On Chip ◽

Maximum Measurement ◽

Maximum Measurement Error

High-performance and complex system-on-chip (SoC) designs require a high-throughput and stable timing monitor to reduce the impact of uncertain timing and to implement the dynamic voltage and frequency scaling (DVFS) scheme for overall power reduction. This paper presents a multi-stage timing monitor that combines three timing-monitoring stages to achieve a high timing-monitoring resolution and a wide timing-monitoring range simultaneously. Additionally, because the proposed timing monitor has high immunity to process–voltage–temperature (PVT) variation, it provides more stable time-monitoring results. The time-monitoring resolution and range of the proposed timing monitor are 47 ps and 2.2 µs, respectively, and the maximum measurement error is 0.06%. Therefore, the proposed multi-stage timing monitor not only provides the timing information of the specified signals to maintain the functionality and performance of the SoC, but also makes the operation of the DVFS scheme more efficient and accurate in SoC design.
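As a rough check on the figures quoted in this abstract, dividing the range by the resolution gives the number of distinguishable timing steps and the width of a binary code needed to represent them. The sketch below uses only the 47 ps and 2.2 µs values stated above; the function name `timing_steps` and the assumption of a uniform binary encoding are illustrative and are not taken from the paper.

```python
def timing_steps(resolution_ps, range_ps):
    """Return (whole resolution steps that fit in the range, bits to encode them).

    Both arguments are integers in picoseconds so the division is exact.
    """
    steps = range_ps // resolution_ps
    bits = steps.bit_length()  # bits needed to represent values 0..steps
    return steps, bits


# 47 ps resolution, 2.2 µs = 2,200,000 ps range (values from the abstract).
steps, bits = timing_steps(47, 2_200_000)
print(steps, bits)  # 46808 16
```

Under these assumptions the monitor spans roughly 46,800 steps, which would fit in a 16-bit code; how the three stages actually split that range is not stated in the abstract.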

ThermalAttackNet: Are CNNs Making It Easy to Perform Temperature Side-Channel Attack in Mobile Edge Devices?

Future Internet ◽

10.3390/fi13060146 ◽

2021 ◽

Vol 13 (6) ◽

pp. 146

Author(s):

Somdip Dey ◽

Amit Kumar Singh ◽

Klaus McDonald-Maier

Keyword(s):

Information Flow ◽

Heat Dissipation ◽

Side Channel ◽

Side Channel Attack ◽

Side Channel Attacks ◽

Information Flow Control ◽

On Chip ◽

Temperature Side ◽

Over Time ◽

Memory Efficient

Side-channel attacks remain a challenge to information flow control and security in mobile edge devices to this date. One such important security flaw could be exploited through temperature side-channel attacks, where heat dissipation and propagation from the processing cores are observed over time in order to deduce security flaws. In this paper, we study how computer vision-based convolutional neural networks (CNNs) could be used to perform temperature (thermal) side-channel attacks on different Linux governors in mobile edge devices utilizing a multi-processor system-on-chip (MPSoC). We also design a power- and memory-efficient CNN model that is capable of performing a thermal side-channel attack on the MPSoC and can be used by industry practitioners and academics as a benchmark for designing methodologies to secure against such an attack in MPSoCs.

A linear-time eigenvalue solver for finite-element-based analysis of large-scale wave propagation problems in on-chip interconnect structures

2008 IEEE Antennas and Propagation Society International Symposium ◽

10.1109/aps.2008.4619427 ◽

2008 ◽

Author(s):

Jongwon Lee ◽

V. Balakrishnan ◽

Cheng-Kok Koh ◽

Dan Jiao

Keyword(s):

Finite Element ◽

Wave Propagation ◽

Large Scale ◽

Linear Time ◽

On Chip ◽

Eigenvalue Solver

Operating System for Runtime Reconfigurable Multiprocessor Systems

International Journal of Reconfigurable Computing ◽

10.1155/2011/121353 ◽

2011 ◽

Vol 2011 ◽

pp. 1-16 ◽

Cited By ~ 16

Author(s):

Diana Göhringer ◽

Michael Hübner ◽

Etienne Nguepi Zeutebouo ◽

Jürgen Becker

Keyword(s):

Operating System ◽

Resource Management ◽

Multiprocessor System ◽

Task Mapping ◽

Access Port ◽

Novel Approach ◽

Hardware Resource ◽

Hardware Architectures ◽

On Chip ◽

Internal Configuration

Operating systems traditionally handle the task scheduling of one or more application instances on processor-like hardware architectures. RAMPSoC, a novel runtime adaptive multiprocessor System-on-Chip, exploits dynamic reconfiguration on FPGAs to generate, start and terminate hardware and software tasks. The hardware tasks have to be transferred to the reconfigurable hardware via a configuration access port. The software tasks can be loaded into the local memory of the respective IP core either via the configuration access port or via the on-chip communication infrastructure (e.g. a Network-on-Chip). Recent series of Xilinx FPGAs, such as Virtex-5, provide two Internal Configuration Access Ports, which cannot be accessed simultaneously. To prevent conflicts, access to these ports as well as hardware resource management needs to be controlled, e.g. by a special-purpose operating system running on an embedded processor. For that purpose, and to handle the relations between temporally and spatially scheduled operations, the novel approach of an operating system is of high importance. This special-purpose operating system, called CAP-OS (Configuration Access Port-Operating System), is presented in this paper; it supports the clients using the configuration port with the services of priority-based access scheduling, hardware task mapping and resource management.
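The priority-based access scheduling mentioned in this abstract can be illustrated with a small priority queue that serializes requests for one exclusive port. This is only a minimal sketch: the abstract does not describe how CAP-OS implements its scheduler, and the class name `PortScheduler`, the task names, and the convention that a lower number means higher priority are all assumptions made for the example.

```python
import heapq
import itertools


class PortScheduler:
    """Serializes requests for one exclusive port, highest priority first."""

    def __init__(self):
        self._queue = []
        self._order = itertools.count()  # tie-breaker: FIFO within a priority

    def request(self, task, priority):
        # Assumption of this sketch: lower number = higher priority.
        heapq.heappush(self._queue, (priority, next(self._order), task))

    def grant_next(self):
        """Return the next task allowed to use the port, or None if idle."""
        if not self._queue:
            return None
        _, _, task = heapq.heappop(self._queue)
        return task


sched = PortScheduler()
sched.request("load_sw_task_B", priority=2)
sched.request("reconfigure_hw_task_A", priority=0)
sched.request("load_sw_task_C", priority=2)
order = [sched.grant_next() for _ in range(3)]
print(order)  # ['reconfigure_hw_task_A', 'load_sw_task_B', 'load_sw_task_C']
```

Because only one request is granted at a time, two clients can never drive the port simultaneously; the counter keeps requests of equal priority in arrival order.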

SecNVM: Power Side-Channel Elimination Using On-Chip Capacitors for Highly Secure Emerging NVM

IEEE Transactions on Very Large Scale Integration (VLSI) Systems ◽

10.1109/tvlsi.2021.3087734 ◽

2021 ◽

pp. 1-11

Author(s):

Karthikeyan Nagarajan ◽

Farid Uddin Ahmed ◽

Mohammad Nasim Imtiaz Khan ◽

Asmit De ◽

Masud H. Chowdhury ◽

...

Keyword(s):

Side Channel ◽

On Chip

Leakage Power and Side Channel Security of Nanoscale Cryptosystem-on-Chip (CoC)

2009 IEEE Computer Society Annual Symposium on VLSI ◽

10.1109/isvlsi.2009.46 ◽

2009 ◽

Author(s):

Amir Khatib Zadeh ◽

Catherine Gebotys

Keyword(s):

Leakage Power ◽

Side Channel ◽

On Chip

Leveraging on-chip voltage regulators as a countermeasure against side-channel attacks

Proceedings of the 52nd Annual Design Automation Conference on - DAC '15 ◽

10.1145/2744769.2744866 ◽

2015 ◽

Cited By ~ 27

Author(s):

Weize Yu ◽

Orhun Aras Uzun ◽

Selçuk Köse

Keyword(s):

Voltage Regulators ◽

Side Channel ◽

Side Channel Attacks ◽

On Chip

Multi-stage arc magma evolution recorded by apatite in volcanic rocks

Geology ◽

10.1130/g46998.1 ◽

2020 ◽

Vol 48 (4) ◽

pp. 323-327 ◽

Cited By ~ 2

Author(s):

Chetan L. Nathwani ◽

Matthew A. Loader ◽

Jamie J. Wilkinson ◽

Yannick Buret ◽

Robert H. Sievwright ◽

...

Keyword(s):

Fractional Crystallization ◽

Volcanic Rocks ◽

Explosive Volcanism ◽

Ore Deposits ◽

Magma Evolution ◽

Arc Magmas ◽

Deep Crust ◽

Multi Stage ◽

Novel Approach ◽

Arc Magma

Abstract Protracted magma storage in the deep crust is a key stage in the formation of evolved, hydrous arc magmas that can result in explosive volcanism and the formation of economically valuable magmatic-hydrothermal ore deposits. High magmatic water content in the deep crust results in extensive amphibole ± garnet fractionation and the suppression of plagioclase crystallization as recorded by elevated Sr/Y ratios and high Eu (high Eu/Eu*) in the melt. Here, we use a novel approach to track the petrogenesis of arc magmas using apatite trace element chemistry in volcanic formations from the Cenozoic arc of central Chile. These rocks formed in a magmatic cycle that culminated in high-Sr/Y magmatism and porphyry ore deposit formation in the Miocene. We use Sr/Y, Eu/Eu*, and Mg in apatite to track discrete stages of arc magma evolution. We apply fractional crystallization modeling to show that early-crystallizing apatite can inherit a high-Sr/Y and high-Eu/Eu* melt chemistry signature that is predetermined by amphibole-dominated fractional crystallization in the lower crust. Our modeling shows that crystallization of the in situ host-rock mineral assemblage in the shallow crust causes competition for trace elements in the melt that leads to apatite compositions diverging from bulk-magma chemistry. Understanding this decoupling behavior is important for the use of apatite as an indicator of metallogenic fertility in arcs and for interpretation of provenance in detrital studies.

A New Fault Tolerant Routing Algorithm for Networks on Chip

International Journal of Embedded and Real-Time Communication Systems ◽

10.4018/ijertcs.2019070105 ◽

2019 ◽

Vol 10 (3) ◽

pp. 68-85

Author(s):

Chakib Nehnouh ◽

Mohamed Senouci

Keyword(s):

Network Architecture ◽

Fault Tolerant ◽

Routing Algorithm ◽

Transient Faults ◽

Networks On Chip ◽

Congestion Detection ◽

Novel Approach ◽

On Chip ◽

Detection Mechanisms ◽

Correct Data

To provide correct data transmission and to handle the communication requirements, the routing algorithm should find a new path to steer packets from the source to the destination in a faulty network. Many solutions have been proposed to overcome faults in networks-on-chip (NoCs). This article introduces a new fault-tolerant routing algorithm that tolerates permanent and transient faults in NoCs. This solution, called DINRA, can satisfy congestion avoidance and fault tolerance simultaneously. In this work, a novel approach inspired by Catnap is proposed for NoCs, using local and global congestion detection mechanisms with a hierarchical sub-network architecture. The evaluation (on reliability, latency and throughput) shows the effectiveness of this approach in improving NoC performance compared to the state of the art. In addition, with the test module and fault register integrated in the basic architecture, the routers are able to detect faults dynamically and re-route packets to fault-free and congestion-free zones.
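The general idea of steering packets around faulty routers can be illustrated with a breadth-first search over a 2D mesh. This is not the DINRA algorithm, whose details are not given in the abstract, and it ignores congestion entirely; the function name `route`, the mesh model, and the set of faulty coordinates are assumptions made for the sketch.

```python
from collections import deque


def route(width, height, src, dst, faulty):
    """Shortest list of (x, y) hops from src to dst that avoids faulty routers.

    Returns None if no fault-free path exists in the width x height mesh.
    """
    if src in faulty or dst in faulty:
        return None
    parent = {src: None}
    frontier = deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:
            # Walk the parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in faulty and nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return None


# 3x3 mesh with the router at (1, 0) marked faulty.
path = route(3, 3, (0, 0), (2, 0), faulty={(1, 0)})
print(path)  # [(0, 0), (0, 1), (1, 1), (2, 1), (2, 0)]
```

A real NoC router would make this decision hop by hop from local fault and congestion information, not from a global view of the mesh as assumed here.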

Large-Scale Multi-View Subspace Clustering in Linear Time

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5867 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4412-4419 ◽

Cited By ~ 3

Author(s):

Zhao Kang ◽

Wangtao Zhou ◽

Zhitong Zhao ◽

Junming Shao ◽

Meng Han ◽

...

Keyword(s):

Large Scale ◽

State Of The Art ◽

Linear Time ◽

Subspace Clustering ◽

Data Sets ◽

Clustering Methods ◽

Single View ◽

Novel Approach ◽

Points Of View ◽

Effectiveness And Efficiency

A plethora of multi-view subspace clustering (MVSC) methods have been proposed over the past few years. Researchers manage to boost clustering accuracy from different points of view. However, many state-of-the-art MVSC algorithms, which typically have quadratic or even cubic complexity, are inefficient and inherently difficult to apply at large scales. In the era of big data, the computational issue becomes critical. To fill this gap, we propose a large-scale MVSC (LMVSC) algorithm with linear order complexity. Inspired by the idea of the anchor graph, we first learn a smaller graph for each view. Then, a novel approach is designed to integrate those graphs so that we can implement spectral clustering on a smaller graph. Interestingly, it turns out that our model also applies to the single-view scenario. Extensive experiments on various large-scale benchmark data sets validate the effectiveness and efficiency of our approach with respect to state-of-the-art clustering methods.