A novel approach to mining access patterns

Author(s):  
Xiuming Yu ◽  
Meijing Li ◽  
Hyeongsoo Kim ◽  
Dong Gyu Lee ◽  
Jeong Seok Park ◽  
...  
2020 ◽  
Vol 641 ◽  
pp. A66 ◽  
Author(s):  
B. Vandenbroucke ◽  
P. Camps

Context. Monte Carlo radiative transfer (MCRT) is a widely used technique to model the interaction between radiation and a medium. It plays an important role in astrophysical modelling and in comparing these models with observations. Aims. We present a novel approach to MCRT that addresses the challenging memory-access patterns of traditional MCRT algorithms, which prevent MCRT simulations from performing optimally on modern hardware with a complex memory architecture. Methods. We reformulated the MCRT photon-packet life cycle as a task-based algorithm, whereby the computation is broken down into small tasks that are executed concurrently. Photon packets are stored in intermediate buffers, and tasks propagate photon packets through small parts of the computational domain, moving them from one buffer to another in the process. Results. Using the implementation of the new algorithm in the photoionization MCRT code CMACIONIZE 2.0, we show that decomposing the MCRT grid into small parts leads to a significant performance gain during the photon-packet propagation phase, which constitutes the bulk of an MCRT algorithm, because memory caches are used more efficiently. Our new algorithm is faster by a factor of 2 to 4 than an equivalent traditional algorithm and shows good strong scaling up to 30 threads. We briefly discuss adjustments to our new algorithm and extensions to other astrophysical MCRT applications. Conclusions. We show that optimising the memory-access patterns of a memory-bound algorithm such as MCRT can yield significant performance gains.
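The buffer-and-task idea in the Methods paragraph can be illustrated with a minimal sketch. This is not the CMACIONIZE 2.0 implementation: the function name `propagate`, the 1D grid, the `(cell, direction)` packet representation, and the sequential task queue are all assumptions made for illustration, and the real algorithm executes tasks concurrently on a 3D grid with physical interactions.

```python
from collections import deque


def propagate(num_cells, cells_per_subgrid, packets):
    """Hypothetical 1D illustration of task-based packet propagation.

    packets: iterable of (cell, direction) pairs, direction must be +1 or -1.
    Returns (visits per cell, number of non-empty tasks executed).
    """
    num_subgrids = (num_cells + cells_per_subgrid - 1) // cells_per_subgrid
    # One intermediate buffer per subgrid holds packets waiting to be propagated.
    buffers = [[] for _ in range(num_subgrids)]
    visits = [0] * num_cells

    for cell, direction in packets:
        buffers[cell // cells_per_subgrid].append((cell, direction))

    # Each queue entry is a task: "propagate whatever is buffered for this subgrid".
    tasks = deque(i for i in range(num_subgrids) if buffers[i])
    tasks_run = 0

    while tasks:
        sub = tasks.popleft()
        batch, buffers[sub] = buffers[sub], []
        if not batch:
            continue  # buffer was already drained by an earlier task
        tasks_run += 1
        lo = sub * cells_per_subgrid
        hi = min(lo + cells_per_subgrid, num_cells)
        for cell, direction in batch:
            # Work only on cells of this subgrid, so the task touches a
            # small, contiguous part of the grid.
            while lo <= cell < hi:
                visits[cell] += 1
                cell += direction
            if 0 <= cell < num_cells:
                # Packet crossed into a neighbouring subgrid: move it to that buffer.
                target = cell // cells_per_subgrid
                buffers[target].append((cell, direction))
                tasks.append(target)
            # Otherwise the packet left the domain and is dropped.

    return visits, tasks_run
```

For example, `propagate(8, 4, [(0, 1), (7, -1)])` sends one packet rightwards and one leftwards across two subgrids of four cells; every cell is visited twice and three non-empty tasks are executed. Because each task only reads and writes one subgrid's cells, its working set stays small, which is the property the abstract credits for better cache use.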


2021 ◽  
Vol 14 (6) ◽  
pp. 864-877
Author(s):  
Lujia Yin ◽  
Yiming Zhang ◽  
Zhaoning Zhang ◽  
Yuxing Peng ◽  
Peng Zhao

Although GPUs and accelerators are more efficient for deep learning (DL), commercial clouds such as Facebook and Amazon now heavily use CPUs in DL computation because large numbers of CPUs would otherwise sit idle during off-peak periods. Following this trend, CPU vendors have not only released high-performance many-core CPUs but also developed efficient math kernel libraries. However, current DL platforms cannot scale well to a large number of CPU cores, making many-core CPUs inefficient in DL computation. We analyze the memory access patterns of various layers and identify the root cause of the low scalability: the per-layer barriers implicitly imposed by current platforms, which assign a single instance (i.e., one batch of input data) to a CPU. The barriers cause severe memory bandwidth contention and CPU starvation in the access-intensive layers (such as activation and BN). This paper presents a novel approach called ParaX, which boosts the performance of DL on many-core CPUs by effectively alleviating bandwidth contention and CPU starvation. Our key idea is to assign one instance to each CPU core instead of to the entire CPU, so as to remove the per-layer barriers on the execution of the many cores. ParaX designs an ultralight scheduling policy that sufficiently overlaps the access-intensive layers with the compute-intensive ones to avoid contention, and proposes a NUMA-aware gradient server mechanism for training that leverages shared memory to substantially reduce the overhead of per-iteration parameter synchronization. We have implemented ParaX on MXNet. Extensive evaluation on a two-NUMA Intel 8280 CPU shows that ParaX improves the training/inference throughput for all tested models (for image recognition and natural language processing) by 1.73× to 2.93×.


2019 ◽  
Vol 476 (24) ◽  
pp. 3705-3719 ◽  
Author(s):  
Avani Vyas ◽  
Umamaheswar Duvvuri ◽  
Kirill Kiselyov

Platinum-containing drugs such as cisplatin and carboplatin are routinely used for the treatment of many solid tumors, including squamous cell carcinoma of the head and neck (SCCHN). However, SCCHN resistance to platinum compounds is well documented. The resistance to platinum has been linked to the activity of the divalent transporter ATP7B, which pumps platinum from the cytoplasm into lysosomes, decreasing its concentration in the cytoplasm. Several cancer models show increased expression of ATP7B; however, the reason for such an increase is not known. Here we show a strong positive correlation between mRNA levels of TMEM16A and ATP7B in human SCCHN tumors. TMEM16A overexpression and depletion in SCCHN cell lines caused parallel changes in the ATP7B mRNA levels. The ATP7B increase in TMEM16A-overexpressing cells was reversed by suppression of NADPH oxidase 2 (NOX2), by the antioxidant N-acetyl-cysteine (NAC), and by copper chelation using cuprizone and bathocuproine sulphonate (BCS). Pretreatment with either chelator significantly increased sensitivity to cisplatin, particularly in the context of TMEM16A overexpression. We propose that increased oxidative stress in TMEM16A-overexpressing cells liberates the chelated copper in the cytoplasm, leading to the transcriptional activation of ATP7B expression. This, in turn, decreases the efficacy of platinum compounds by promoting their vesicular sequestration. We think that this new explanation of the mechanism of platinum resistance in SCCHN tumors identifies a novel approach to treating these tumors.


2020 ◽  
Vol 51 (3) ◽  
pp. 544-560 ◽  
Author(s):  
Kimberly A. Murphy ◽  
Emily A. Diehm

Purpose. Morphological interventions promote gains in morphological knowledge and in other oral and written language skills (e.g., phonological awareness, vocabulary, reading, and spelling), yet we have a limited understanding of critical intervention features. In this clinical focus article, we describe a relatively novel approach to teaching morphology that considers its role as the key organizing principle of English orthography. We also present a clinical example of such an intervention delivered during a summer camp at a university speech and hearing clinic. Method. Graduate speech-language pathology students provided a 6-week morphology-focused orthographic intervention to children in first through fourth grade (n = 10) who demonstrated word-level reading and spelling difficulties. The intervention focused children's attention on morphological families, teaching how morphology is interrelated with phonology and etymology in English orthography. Results. Comparing pre- and posttest scores, children demonstrated improvement in reading and/or spelling abilities, with the largest gains observed in spelling affixes within polymorphemic words. Children and their caregivers reacted positively to the intervention. Therefore, data from the camp offer preliminary support for teaching morphology within the context of written words, and the intervention appears to be a feasible approach for simultaneously increasing morphological knowledge, reading, and spelling. Conclusion. Children with word-level reading and spelling difficulties may benefit from a morphology-focused orthographic intervention, such as the one described here. Research on the approach is warranted, and clinicians are encouraged to explore its possible effectiveness in their practice. Supplemental Material: https://doi.org/10.23641/asha.12290687


2015 ◽  
Vol 21 ◽  
pp. 128
Author(s):  
Kaniksha Desai ◽  
Halis Akturk ◽  
Ana Maria Chindris ◽  
Shon Meek ◽  
Robert Smallridge ◽  
...  