global memory
Recently Published Documents

TOTAL DOCUMENTS: 181 (five years: 49)
H-INDEX: 17 (five years: 2)

2022 ◽  
Vol 6 (POPL) ◽  
pp. 1-31
Author(s):  
Yuting Wang ◽  
Ling Zhang ◽  
Zhong Shao ◽  
Jérémie Koenig

Memory models play an important role in verified compilation of imperative programming languages. A representative one is the block-based memory model of CompCert, the state-of-the-art verified C compiler. Despite its success, the abstraction over memory space provided by CompCert's memory model is still primitive and inflexible. In essence, it uses a fixed representation for identifying memory blocks in a global memory space and a globally shared state for distinguishing between used and unused blocks. Therefore, any reasoning about memory must work uniformly over the global memory; it is impossible to reason individually about different sub-regions of memory (i.e., the stack and global definitions). This not only incurs unnecessary complexity in compiler verification, but also poses significant difficulty for supporting verified compilation of open or concurrent programs that need to work with contextual memory, as manifested in many previous extensions of CompCert. To remove these limitations, we propose an enhancement to the block-based memory model based on nominal techniques; we call it the nominal memory model. By adopting key concepts of nominal techniques, such as atomic names and supports, to model the memory space, we are able to 1) generalize the representation of memory blocks to any type satisfying the properties of atomic names and 2) remove the global constraints on managing memory blocks, enabling flexible memory structures for open and concurrent programs. To demonstrate the effectiveness of the nominal memory model, we develop a series of extensions of CompCert based on it. These extensions show that the nominal memory model 1) supports a general framework for verified compilation of C programs, 2) enables intuitive reasoning about compiler transformations on partial memory, and 3) enables modular reasoning about programs working with contextual memory.
We also demonstrate that these extensions require limited changes to the original CompCert, making the verification techniques based on the nominal memory model easy to adopt.
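The core shift, identifying blocks by atomic names with finite supports instead of a globally shared counter, can be illustrated with a small toy sketch. This is illustrative Python with invented names (`NominalMemory`, `disjoint_union`), not CompCert's Coq formalization:

```python
class NominalMemory:
    """Toy model of the nominal memory model's key idea: memory
    blocks are keyed by atomic names (any values with equality),
    and freshness is checked against this memory's own support
    (the finite set of names it uses) rather than a globally
    shared next-block counter."""

    def __init__(self):
        self.blocks = {}  # atomic name -> block contents

    def support(self):
        return set(self.blocks)

    def alloc(self, name, contents=None):
        # a name only has to be fresh for *this* memory region
        assert name not in self.support(), "name must be fresh"
        self.blocks[name] = contents
        return name

    def disjoint_union(self, other):
        # two regions (e.g. stack and globals) compose soundly
        # exactly when their supports are disjoint
        assert self.support().isdisjoint(other.support())
        merged = NominalMemory()
        merged.blocks = {**self.blocks, **other.blocks}
        return merged


# reason about stack and global regions independently, then combine
stack_mem, global_mem = NominalMemory(), NominalMemory()
stack_mem.alloc(("stack", 0), contents=[0] * 4)
global_mem.alloc(("glob", "x"), contents=[42])
whole = stack_mem.disjoint_union(global_mem)
```

Because each region checks freshness only against its own support, the stack and the globals above can be built and verified separately, mirroring the modular reasoning the abstract describes.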


Author(s):  
S. Savickaite ◽  
C. Morrison ◽  
E. Lux ◽  
J. Delafield-Butt ◽  
D. R. Simmons

Abstract This paper describes a smart tablet-based drawing app to digitally record participants' engagement with the Rey-Osterrieth complex figure (ROCF) task, a well-characterised perceptual memory task that assesses local and global memory. Digitisation of the task allows for improved ecological validity, especially with children attracted to tablet devices. Further, digital translation of the task affords new measures, including accuracy and computation of the fine motor control kinematics employed to carry out the drawing. Here, we report a feasibility study to test the relationship between two neurodevelopmental conditions: autism spectrum disorder (ASD) and attention-deficit/hyperactivity disorder (ADHD). The smart tablet app was employed with 39 adult participants (aged 18-35) characterised for autistic and ADHD traits, and scored using the ROCF perceptual and organisational scoring systems. Trait scores and conditions were predictor variables in linear regression models. Positive correlations were found between the attention-to-detail, attention-switching and communication subscales of the autistic trait questionnaire and organisational scores on the ROCF task. These findings suggest that autistic traits might be linked to differential performance on the ROCF task. Novelty and future applications of the app are discussed.


2022 ◽  
Vol 98 (1) ◽  
pp. 263-280
Author(s):  
Katrin Antweiler

Abstract This article investigates local endeavours for Holocaust memory in post-apartheid South Africa in their relation to global memory imperatives that are, among others, produced by supranational organizations such as UNESCO and the International Holocaust Remembrance Alliance. Drawing on a larger case-study on globalized memory, I analyse to what extent a generalized mnemonic framework is reflected in South Africa's 2007 curriculum reform, namely its inclusion of the Holocaust and subsequent memory politics. In order to illuminate the coloniality of memorialization, I trace the epistemic location of the narrative that suggests that Holocaust memory nourishes democratic values and human rights—maybe even more so than local memories of violence and oppression such as colonization and apartheid. In this regard, I found that while many activists for Holocaust memory continuously and sometimes uncritically advocate for its global implementation, a decolonial perspective enables us to understand the power dynamics constitutive of universal moral norms around Holocaust memory that tacitly transmit global demands to local contexts. I therefore suggest that, within the global colonial matrix of power, a universally advised practice of memorializing the Holocaust to specific ends can be regarded as a technique of governmentality, because it risks limiting utopian thought beyond the Euro-modern paradigm.


2021 ◽  
Vol 2021 ◽  
pp. 1-18
Author(s):  
Nugool Sataporn ◽  
Worasait Suwannik ◽  
Montri Maleewong

We present Compute Unified Device Architecture (CUDA) implementations of a well-balanced finite volume method for solving a shallow water model. The CUDA platform allows programs to run in parallel on the GPU. Four versions of the CUDA algorithm are presented in addition to a CPU implementation, each improved from the previous one. We present the following techniques for optimizing a CUDA program: limiting register usage, changing the global memory access pattern, and loop unrolling. The accuracy of all programs is investigated in three test cases: a circular dam break on a dry bed, a circular dam break on a wet bed, and a dam break flow over three humps. The last parallel version shows a 3.84x speedup over the first CUDA implementation. We use our program to simulate a real-world problem based on an assumed partial breakage of the Srinakarin Dam located in Kanchanaburi province, Thailand. The simulation shows that the strong interaction between massive water flows and bottom elevations under wet and dry conditions is well captured by the well-balanced scheme, while the optimized parallel program achieves a 57.32x speedup over the serial version.
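Of the three optimizations listed, changing the global memory access pattern is the most GPU-specific: a warp's loads coalesce into few transactions only when neighbouring threads touch neighbouring addresses. A rough back-of-the-envelope model of this effect (plain Python with a hypothetical `warp_transactions` helper; real hardware adds caching and alignment rules):

```python
def warp_transactions(addresses, segment_bytes=128):
    """Count the aligned memory segments a warp's accesses touch.

    Simplified coalescing model: per-thread addresses that fall in
    the same aligned segment are served by a single transaction.
    """
    return len({addr // segment_bytes for addr in addresses})

WARP = 32   # threads per warp
ELEM = 4    # bytes per float

# coalesced: thread i reads element i -> one 128-byte transaction
coalesced = [i * ELEM for i in range(WARP)]
# strided: thread i reads element 32*i -> one transaction per thread
strided = [i * WARP * ELEM for i in range(WARP)]

print(warp_transactions(coalesced))  # 1
print(warp_transactions(strided))    # 32
```

Under this model the strided pattern moves 32x more memory traffic for the same useful data, which is why reordering global memory accesses is a standard step when tuning stencil-style solvers like this one.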


Author(s):  
Antonio Fuentes-Alventosa ◽  
Juan Gómez-Luna ◽  
José Maria González-Linares ◽  
Nicolás Guil ◽  
R. Medina-Carnicer

Abstract CAVLC (Context-Adaptive Variable Length Coding) is a high-performance entropy method for video and image compression. It is the most commonly used entropy method in the video standard H.264. In recent years, several hardware accelerators for CAVLC have been designed. In contrast, high-performance software implementations of CAVLC (e.g., GPU-based) are scarce. A high-performance GPU-based implementation of CAVLC is desirable in several scenarios. On the one hand, it can be exploited as the entropy component in GPU-based H.264 encoders, which are a very suitable solution when GPU built-in H.264 hardware encoders lack certain necessary functionality, such as data encryption and information hiding. On the other hand, a GPU-based implementation of CAVLC can be reused in a wide variety of GPU-based compression systems for encoding images and videos in formats other than H.264, such as medical images. This is not possible with hardware implementations of CAVLC, as they are non-separable components of hardware H.264 encoders. In this paper, we present CAVLCU, an efficient implementation of CAVLC on GPU, which is based on four key ideas. First, we use only one kernel to avoid the long-latency global memory accesses required to transmit intermediate results among different kernels, and the costly launches and terminations of additional kernels. Second, we apply an efficient synchronization mechanism for thread-blocks (in this paper, to prevent confusion, a block of pixels of a frame will be referred to simply as a block and a GPU thread block as a thread-block) that process adjacent frame regions (in horizontal and vertical dimensions) to share results in global memory space. Third, we fully exploit the available global memory bandwidth by using vectorized loads to move the quantized transform coefficients directly to registers. Fourth, we use register tiling to implement the zigzag sorting, thus obtaining high instruction-level parallelism.
An exhaustive experimental evaluation showed that our approach is between 2.5× and 5.4× faster than the only state-of-the-art GPU-based implementation of CAVLC.
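The zigzag ordering mentioned in the fourth idea is the standard serialization order for a 4x4 block of quantized transform coefficients in H.264. A reference sketch of the ordering itself (plain Python, not the authors' register-tiled CUDA kernel):

```python
def zigzag_order(n=4):
    """Zigzag traversal order for an n x n coefficient block:
    walk the anti-diagonals, alternating direction each time,
    so low-frequency coefficients come first."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes an anti-diagonal
        cells = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            cells.reverse()  # even diagonals run bottom-left to top-right
        order.extend(cells)
    return order
```

Flattened to row-major indices, `zigzag_order(4)` yields the familiar H.264 4x4 sequence 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15; the paper's contribution is performing this reordering in registers rather than through memory.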


Nanomaterials ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 3029
Author(s):  
Xudong Wang ◽  
Xueyang Shen ◽  
Suyang Sun ◽  
Wei Zhang

Chalcogenide phase-change material (PCM)-based random access memory (PCRAM) has entered the global memory market as storage-class memory (SCM), holding great promise for future neuro-inspired computing and non-volatile photonic applications. The thermal stability of the amorphous phase of PCMs is a key property requiring further improvement. In this work, we focus on indium, an alloying ingredient extensively exploited in PCMs. Starting from the prototype GeTe alloy, we incorporated indium to form three typical compositions along the InTe-GeTe tie line: InGe3Te4, InGeTe2 and In3GeTe4. The evolution of structural details and the optical properties of the three In-Ge-Te alloys in amorphous and crystalline form were thoroughly analyzed via ab initio calculations. This study proposes a chemical composition possessing both improved thermal stability and sizable optical contrast for PCM-based non-volatile photonic applications.


Author(s):  
Mengjia Yin ◽  
Xianbin Xu ◽  
Tao Zhang ◽  
Conghuan Ye

Establishing a performance evaluation model is a focus of current research. In this paper, performance bottlenecks are analyzed quantitatively, providing programmers with guidance for optimizing them. The paper takes matrix computation as an example, distinguishing dense from sparse matrices. For dense matrices, the performance is first analyzed quantitatively, and an evaluation model is developed that covers the instruction pipeline, shared memory, and global memory. For sparse matrices, the paper targets the four storage formats CSR, ELL, COO, and HYB; from observations of runs on large datasets, it identifies the relationship between running time, dataset form, and storage model, and fits model functions for them. Practical tests show that the error between the execution time predicted by the model functions and the actual running time stays within a stable, finite deviation threshold, demonstrating that the model has practical value.
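For context on the storage models the functions are fitted over, the CSR format and a matrix-vector product over it can be sketched as follows (illustrative Python with invented helper names, not the GPU kernels the model was calibrated on):

```python
def to_csr(dense):
    """Compressed Sparse Row: store only nonzeros (values), their
    column indices, and per-row offsets into those arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row's nonzeros
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x using the CSR arrays; work per row is proportional
    to that row's nonzero count, which is why running time depends
    on dataset form as well as matrix size."""
    return [sum(values[k] * x[col_idx[k]]
                for k in range(row_ptr[r], row_ptr[r + 1]))
            for r in range(len(row_ptr) - 1)]
```

ELL, COO and HYB trade this layout's compactness for different memory access regularity, which is precisely the variation the paper's model functions try to capture.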


2021 ◽  
Author(s):  
Tse-Yuan Wang ◽  
Chun-Feng Wu ◽  
Che-Wei Tsao ◽  
Yuan-Hao Chang ◽  
Tei-Wei Kuo

2021 ◽  
Vol 25 (4) ◽  
pp. 1031-1045
Author(s):  
Helang Lai ◽  
Keke Wu ◽  
Lingli Li

Emotion recognition in conversations is crucial, as there is an urgent need to improve the overall experience of human-computer interactions. A promising direction in this field is to develop a model that can effectively extract adequate context for a test utterance. We introduce a novel model, termed hierarchical memory networks (HMN), to address the issue of recognizing utterance-level emotions. HMN divides the contexts into different aspects and employs different step lengths to represent the weights of these aspects. To model the self-dependencies, HMN uses independent local memory networks for these aspects. Further, to capture the interpersonal dependencies, HMN employs global memory networks to integrate the local outputs into global storage. This storage can generate contextual summaries and helps find the emotionally dependent utterance most relevant to the test utterance. With an attention-based multi-hop scheme, the storage is then merged with the test utterance via an addition operation across iterations. Experiments on the IEMOCAP dataset show that our model outperforms the compared methods in accuracy.
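The attention-based multi-hop read with additive merging described above can be sketched as a minimal toy (plain Python with invented helper names; the actual HMN adds learned projections and the hierarchical local/global split):

```python
import math

def softmax(scores):
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def memory_hop(query, memory):
    """One attention hop: weight stored context vectors by their
    similarity to the query, then add the weighted summary back
    to the query (the additive merge in the iterations)."""
    weights = softmax([dot(query, m) for m in memory])
    summary = [sum(w * m[i] for w, m in zip(weights, memory))
               for i in range(len(query))]
    return [q + s for q, s in zip(query, summary)]

def multi_hop(query, memory, hops=3):
    """Repeated hops let the refined query attend to context it
    missed on earlier passes."""
    for _ in range(hops):
        query = memory_hop(query, memory)
    return query
```

Each hop pulls the query toward the stored vectors it most resembles, which is how the global storage surfaces the emotionally relevant context utterance.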

