data access patterns
Recently Published Documents


2021 ◽  
Vol 14 (11) ◽  
pp. 2491-2504
Author(s):  
Pranjal Gupta ◽  
Amine Mhedhbi ◽  
Semih Salihoglu

We revisit column-oriented storage and query processing techniques in the context of contemporary graph database management systems (GDBMSs). Similar to column-oriented RDBMSs, GDBMSs support read-heavy analytical workloads that, however, have fundamentally different data access patterns than traditional analytical workloads. We first derive a set of desiderata for optimizing storage and query processors of GDBMSs based on their access patterns. We then present the design of columnar storage, compression, and query processing techniques based on these desiderata. In addition to showing direct integration of existing techniques from columnar RDBMSs, we also propose novel ones that are optimized for GDBMSs. These include a novel list-based query processor, which avoids expensive data copies of traditional block-based processors under many-to-many joins, a new data structure we call single-indexed edge property pages and an accompanying edge ID scheme, and a new application of Jacobson's bit vector index for compressing NULL values and empty lists. We integrated our techniques into the GraphflowDB in-memory GDBMS. Through extensive experiments, we demonstrate the scalability and query performance benefits of our techniques.
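
The general idea behind using a rank-indexed bit vector to compress NULL values can be sketched as follows. This is a minimal illustration only, not the GraphflowDB implementation: the class name `NullCompressedColumn`, the block size `BLOCK`, and the per-block prefix counts are assumptions made for this example, and the rank structure is a simplified one-level variant of Jacobson's index. Non-NULL values are stored densely, a bit vector marks which positions are non-NULL, and a rank query maps a logical position to its offset in the dense array.

```python
from typing import List, Optional, Sequence

BLOCK = 16  # positions per block; hypothetical value chosen for illustration


class NullCompressedColumn:
    """Hypothetical column that stores only non-NULL values plus a bit vector."""

    def __init__(self, values: Sequence[Optional[int]]) -> None:
        self.n = len(values)
        # bits[i] is True when position i holds a non-NULL value.
        self.bits: List[bool] = [v is not None for v in values]
        # Dense storage: NULLs take no space here.
        self.dense: List[int] = [v for v in values if v is not None]
        # prefix[b] = number of non-NULL positions before block b.
        self.prefix: List[int] = []
        count = 0
        for i, bit in enumerate(self.bits):
            if i % BLOCK == 0:
                self.prefix.append(count)
            count += bit

    def rank(self, i: int) -> int:
        """Number of non-NULL positions strictly before position i."""
        block = i // BLOCK
        # Precomputed count up to the block start, plus a short in-block scan.
        return self.prefix[block] + sum(self.bits[block * BLOCK:i])

    def get(self, i: int) -> Optional[int]:
        """Return the value at logical position i, or None if it is NULL."""
        if not 0 <= i < self.n:
            raise IndexError(i)
        if not self.bits[i]:
            return None
        return self.dense[self.rank(i)]


values = [None, 7, None, None, 3] + [None] * 20 + [9]
column = NullCompressedColumn(values)
```

Here `column.get(i)` touches at most one prefix count and one block of bits, so lookups stay cheap regardless of how sparse the column is. A production version would pack the bits into machine words and use popcount instead of a Python list scan.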


2021 ◽  
Vol 14 (10) ◽  
pp. 1900-1912
Author(s):  
Yingqiang Zhang ◽  
Chaoyi Ruan ◽  
Cheng Li ◽  
Xinjun Yang ◽  
Wei Cao ◽  
...  

It is challenging for cloud-native relational databases to meet the ever-increasing needs of scaling compute and memory resources independently and elastically. The recent emergence of memory disaggregation architecture, relying on high-speed RDMA network, offers opportunities to build cost-effective and elastic cloud-native databases. There exist proposals to let unmodified applications run transparently on disaggregated systems. However, running relational database kernel atop such proposals experiences notable performance degradation and time-consuming failure recovery, offsetting the benefits of disaggregation. To address these challenges, in this paper, we propose a novel database architecture called LegoBase, which explores the co-design of database kernel and memory disaggregation. It pushes the memory management back to the database layer for bypassing the Linux I/O stack and re-using or designing (remote) memory access optimizations with an understanding of data access patterns. LegoBase further splits the conventional ARIES fault tolerance protocol to independently handle the local and remote memory failures for fast recovery of compute instances. We implemented LegoBase atop MySQL. We compare LegoBase against MySQL running on a standalone machine and the state-of-the-art disaggregation proposal Infiniswap. Our evaluation shows that even with a large fraction of data placed on the remote memory, LegoBase's system performance in terms of throughput (up to 9.41% drop) and P99 latency (up to 11.58% increase) is comparable to the monolithic MySQL setup, and significantly outperforms (1.99x-2.33x, respectively) the deployment of MySQL over Infiniswap. Meanwhile, LegoBase introduces an up to 3.87x and 5.48x speedup of the recovery and warm-up time, respectively, over the monolithic MySQL and MySQL over Infiniswap, when handling failures or planned re-configurations.


2020 ◽  
Vol 11 (04) ◽  
pp. 680-691
Author(s):  
Luca Calzoni ◽  
Gilles Clermont ◽  
Gregory F. Cooper ◽  
Shyam Visweswaran ◽  
Harry Hochheiser

Background: Complex electronic medical records (EMRs) presenting large amounts of data create risks of cognitive overload. We are designing a Learning EMR (LEMR) system that utilizes models of intensive care unit (ICU) physicians' data access patterns to identify and then highlight the most relevant data for each patient.
Objectives: We used insights from literature and feedback from potential users to inform the design of an EMR display capable of highlighting relevant information.
Methods: We used a review of relevant literature to guide the design of preliminary paper prototypes of the LEMR user interface. We observed five ICU physicians using their current EMR systems in preparation for morning rounds. Participants were interviewed and asked to explain their interactions and challenges with the EMR systems. Findings informed the revision of our prototypes. Finally, we conducted a focus group with five ICU physicians to elicit feedback on our designs and to generate ideas for our final prototypes using participatory design methods.
Results: Participating physicians expressed support for the LEMR system. Identified design requirements included the display of data essential for every patient together with diagnosis-specific data and new or significantly changed information. Respondents expressed preferences for fishbones to organize labs, mouseovers to access additional details, and unobtrusive alerts minimizing color-coding. To address the concern about possible physician overreliance on highlighting, participants suggested that non-highlighted data should remain accessible. Study findings led to revised prototypes, which will inform the development of a functional user interface.
Conclusion: In the feedback we received, physicians supported pursuing the concept of a LEMR system. By introducing novel ways to support physicians' cognitive abilities, such a system has the potential to enhance physician EMR use and lead to better patient outcomes. Future plans include laboratory studies of both the utility of the proposed designs on decision-making, and the possible impact of any automation bias.


2020 ◽  
Vol 13 (12) ◽  
pp. 1656-1671
Author(s):  
Jizhe Xia ◽  
Sicheng Huang ◽  
Shaobiao Zhang ◽  
Xiaoming Li ◽  
Jianrong Lyu ◽  
...  

2020 ◽  
Vol 245 ◽  
pp. 09015
Author(s):  
Philippe Canal ◽  
Elizabeth Sexton-Kennedy ◽  
Jonathan Madsen ◽  
Soon Yung Jun ◽  
Guilherme Lima ◽  
...  

The upcoming generation of exascale HPC machines will all have most of their computing power provided by GPGPU accelerators. In order to be able to take advantage of this class of machines for HEP Monte Carlo simulations, we started to develop a Geant pilot application as a collaboration between HEP and the Exascale Computing Project. We will use this pilot to study and characterize how the machines’ architecture affects performance. The pilot will encapsulate the minimum set of physics and software framework processes necessary to describe a representative HEP simulation problem. The pilot will then be used to exercise communication, computation, and data access patterns. The project’s main objective is to identify re-engineering opportunities that will increase event throughput by improving single node performance and being able to make efficient use of the next generation of accelerators available in Exascale facilities.


2020 ◽  
Vol 245 ◽  
pp. 03005
Author(s):  
Pascal Paschos ◽  
Benedikt Riedel ◽  
Mats Rynge ◽  
Lincoln Bryant ◽  
Judith Stephen ◽  
...  

In this paper we showcase the support in Open Science Grid (OSG) of Midscale collaborations, the region of computing and storage scale where multi-institutional researchers collaborate to execute their science workflows on the grid without having dedicated technical support teams of their own. Collaboration Services enables such collaborations to take advantage of the distributed resources of the Open Science Grid by facilitating access to submission hosts, deploying their applications, and supporting their data management requirements. Distributed computing software adopted from large-scale collaborations, such as CVMFS, Rucio, and xCache, lowers the barrier for intermediate-scale research to integrate with existing infrastructure.

