ACM SIGOPS Operating Systems Review
Latest Publications

Total documents: 2402 (five years: 30)
H-index: 75 (five years: 1)

Published by the Association for Computing Machinery
ISSN: 0163-5980

2021 ◽  
Vol 55 (1) ◽  
pp. 88-98
Author(s):  
Mohammed Islam Naas ◽  
François Trahay ◽  
Alexis Colin ◽  
Pierre Olivier ◽  
Stéphane Rubini ◽  
...  

Tracing is a popular method for evaluating, investigating, and modeling the performance of today's storage systems. Tracing has become crucial with the increasing complexity of modern storage applications and systems, which manipulate ever-increasing amounts of data and are subject to extreme performance requirements. Many tracing tools focus on either the user level or the kernel level; however, we observe the lack of a unified tracer targeting both levels, which prevents a comprehensive understanding of modern applications' storage performance profiles. In this paper, we present EZIOTracer, a unified I/O tracer for both (Linux) kernel and user space, targeting data-intensive applications. EZIOTracer is composed of a userland tracer and a kernel-space tracer, complemented with a trace analysis framework able to merge the output of the two tracers and, in particular, to relate user-level events to kernel-level ones and vice versa. On the kernel side, EZIOTracer relies on eBPF to offer safe, low-overhead, low-memory-footprint, and flexible tracing capabilities. Using the FIO benchmark, we demonstrate the ability of EZIOTracer to track down I/O performance issues by relating events recorded at both the kernel and user levels. We show that this can be achieved with a relatively low overhead that ranges from 2% to 26% depending on the I/O intensity.
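As a rough illustration of the kernel-side approach described in the abstract (not EZIOTracer's actual code), the sketch below uses eBPF through the bcc toolkit to log each block I/O request issued by the kernel; a unified tracer would timestamp such events and merge them with user-level ones. It assumes bcc is installed and the script is run as root.

```python
# Minimal eBPF tracing sketch via bcc (illustrative, not EZIOTracer).
from bcc import BPF

bpf_text = r"""
// Fires each time a block I/O request is issued to the device driver.
TRACEPOINT_PROBE(block, block_rq_issue) {
    // sector and nr_sector come from the tracepoint's format definition.
    bpf_trace_printk("sector=%llu nr_sector=%u\n",
                     args->sector, args->nr_sector);
    return 0;
}
"""

b = BPF(text=bpf_text)
print("Tracing block I/O... Ctrl-C to stop")
b.trace_print()  # stream kernel-side events; a real tool would correlate
                 # them with timestamped user-level I/O events
```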


2021 ◽  
Vol 55 (1) ◽  
pp. 84-87
Author(s):  
Xiaowei Zhu ◽  
Zhisong Fu ◽  
Zhenxuan Pan ◽  
Jin Jiang ◽  
Chuntao Hong ◽  
...  

Graph processing has been widely adopted in various financial scenarios at Ant Group to detect malicious and prohibited user behaviors. The combination of low-latency requirements, large data volumes, and high throughput raises rigorous challenges for efficient online graph processing. This paper gives a brief introduction to the issues we have encountered, our current solutions, and some future directions we are exploring.


2021 ◽  
Vol 55 (1) ◽  
pp. 61-67
Author(s):  
Benjamin Bowman ◽  
H. Howie Huang

Cybersecurity professionals are inundated with large amounts of data and require intelligent algorithms capable of distinguishing vulnerable from patched, normal from anomalous, and malicious from benign. Unfortunately, not all machine learning (ML) and artificial intelligence (AI) algorithms are created equal, and in this position paper we posit that a new breed of ML, specifically graph-based machine learning (Graph AI), is poised to make a significant impact in this domain. We will discuss the primary differentiators between traditional ML and graph ML, and provide reasons and justifications for why the latter is well-suited to many aspects of cybersecurity. We will present several example applications and results of graph ML in cybersecurity, followed by a discussion of the challenges that lie ahead.


2021 ◽  
Vol 55 (1) ◽  
pp. 47-60
Author(s):  
Loc Hoang ◽  
Roshan Dathathri ◽  
Gurbinder Gill ◽  
Keshav Pingali

Graph analytics systems must analyze graphs with billions of vertices and edges, which require several terabytes of storage. Distributed-memory clusters are often used for analyzing such large graphs, since the main memory of a single machine is usually restricted to a few hundred gigabytes. This requires partitioning the graph among the machines in the cluster. Existing graph analytics systems use a built-in partitioner that incorporates a particular partitioning policy, but the best policy depends on the algorithm, input graph, and platform. Therefore, built-in partitioners are not sufficiently flexible. Stand-alone graph partitioners are available, but they too implement only a few policies. CuSP is a fast streaming edge partitioning framework that permits users to specify the desired partitioning policy at a high level of abstraction and quickly generates high-quality graph partitions. For example, it can partition wdc12, the largest publicly available web-crawl graph with 4 billion vertices and 129 billion edges, in under 2 minutes for clusters with 128 machines. Our experiments show that it can produce quality partitions 6× faster on average than the state-of-the-art stand-alone partitioner in the literature while supporting a wider range of partitioning policies.
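A minimal sketch of the abstraction the abstract describes, with illustrative names rather than CuSP's actual API: the partitioner streams over the edge list once, and the partitioning policy is expressed as two small user-supplied functions, one picking a master host per vertex and one picking an owner host per edge.

```python
# Simplified streaming edge partitioning sketch (illustrative, not CuSP's API).
from collections import defaultdict

def partition(edges, num_hosts, master_of, owner_of):
    """Stream over edges once, assigning each to a host per the given policy."""
    parts = defaultdict(list)          # host id -> list of edges
    masters = {}                       # vertex -> master host (memoized)
    for u, v in edges:
        mu = masters.setdefault(u, master_of(u, num_hosts))
        mv = masters.setdefault(v, master_of(v, num_hosts))
        parts[owner_of(u, v, mu, mv, num_hosts)].append((u, v))
    return parts, masters

# Example policy: hash vertices to masters, place each edge on its source's
# master (an outgoing edge cut). Other policies change only these two functions.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
parts, masters = partition(
    edges, num_hosts=2,
    master_of=lambda v, n: hash(v) % n,
    owner_of=lambda u, v, mu, mv, n: mu,
)
print(parts)
```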


2021 ◽  
Vol 55 (1) ◽  
pp. 68-76
Author(s):  
Marco Serafini

Graph Neural Networks (GNNs) are a new and increasingly popular family of deep neural network architectures for performing learning on graphs. Training them efficiently is challenging due to the irregular nature of graph data. The problem becomes even more challenging when scaling to large graphs that exceed the capacity of single devices. Standard approaches to distributed DNN training, like data and model parallelism, do not directly apply to GNNs. Instead, two different approaches have emerged in the literature: whole-graph and sample-based training. In this paper, we review and compare the two approaches. Scalability is challenging with both approaches, but we make the case that research should focus on sample-based training, since it is the more promising approach. Finally, we review recent systems supporting sample-based training.
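The following sketch illustrates the data-loading step of sample-based training in plain Python (framework, model, and fan-out values are assumed): for each mini-batch of seed vertices, a fixed number of neighbors is sampled per layer, so only a small subgraph rather than the whole graph reaches the training device.

```python
# Neighbor-sampling sketch for sample-based GNN training (illustrative only).
import random

def sample_blocks(adj, seeds, fanouts):
    """adj: dict vertex -> list of neighbors; fanouts: neighbors per GNN layer."""
    blocks = []
    frontier = list(seeds)
    for fanout in fanouts:                       # one sampling round per layer
        sampled = {}
        for v in frontier:
            nbrs = adj.get(v, [])
            k = min(fanout, len(nbrs))
            sampled[v] = random.sample(nbrs, k)  # fixed fan-out bounds the cost
        blocks.append(sampled)
        frontier = [u for nbrs in sampled.values() for u in nbrs]
    return blocks                                # fed to the GNN layers in reverse

adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
print(sample_blocks(adj, seeds=[0], fanouts=[2, 2]))
```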


2021 ◽  
Vol 55 (1) ◽  
pp. 38-46
Author(s):  
Yiqiu Wang ◽  
Shangdi Yu ◽  
Laxman Dhulipala ◽  
Yan Gu ◽  
Julian Shun

In many applications of graph processing, the input data is often generated from an underlying geometric point data set. However, existing high-performance graph processing frameworks assume that the input data is given as a graph. Therefore, to use these frameworks, the user must write or use external programs based on computational geometry algorithms to convert their point data set to a graph, which requires more programming effort and can also lead to performance degradation. In this paper, we present our ongoing work on the GeoGraph framework for shared-memory multicore machines, which seamlessly supports routines for parallel geometric graph construction and parallel graph processing within the same environment. GeoGraph supports graph construction based on k-nearest neighbors, Delaunay triangulation, and β-skeleton graphs. It can then pass these generated graphs to over 25 graph algorithms. GeoGraph contains high-performance parallel primitives and algorithms implemented in C++, and includes a Python interface. We present four examples of using GeoGraph, along with experimental results showing good parallel speedups and improvements over the Higra library. We conclude with a vision of future directions for research in bridging graph and geometric data processing.
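As a simple illustration of the geometric construction step (not GeoGraph's API), the sketch below builds a k-nearest-neighbor graph from a small 2D point set by brute force; GeoGraph performs such constructions in parallel with efficient geometric algorithms before handing the result to its graph algorithms.

```python
# Brute-force k-NN graph construction from points (illustrative sketch).
import math

def knn_graph(points, k):
    """Return a directed edge list connecting each point to its k nearest points."""
    edges = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])   # edge to each of the k nearest
    return edges

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
print(knn_graph(points, k=2))   # edge list, ready for a graph algorithm
```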


2021 ◽  
Vol 55 (1) ◽  
pp. 21-37
Author(s):  
Daniel Mawhirter ◽  
Sam Reinehr ◽  
Connor Holmes ◽  
Tongping Liu ◽  
Bo Wu

Subgraph matching is a fundamental task in many applications; it identifies all the embeddings of a query pattern in an input graph. Compilation-based subgraph matching systems generate specialized implementations for the provided patterns and often substantially outperform other systems. However, the generated code causes significant computation redundancy, and the compilation process incurs too much overhead to be used online, both due to the inherent symmetry in the structure of the query pattern. In this paper, we propose an optimizing query compiler, named GraphZero, to completely address these limitations through symmetry breaking based on group theory. GraphZero implements three novel techniques. First, its schedule explorer efficiently prunes the schedule space without missing any high-performance schedule. Second, it automatically generates and enforces a set of restrictions to eliminate computation redundancy. Third, it generalizes orientation, a surprisingly effective optimization that was previously used only for clique patterns, to apply to arbitrary patterns. Evaluation on multiple query patterns shows that GraphZero outperforms two state-of-the-art compilation-based and non-compilation-based systems by up to 40× and 2654×, respectively.
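The following sketch illustrates symmetry breaking on the simplest pattern, a triangle (illustrative code, not GraphZero's generated output): without restrictions, each triangle is matched 3! = 6 times because its three vertices are interchangeable, while enforcing an ordering on the matched vertices enumerates it exactly once.

```python
# Triangle counting with symmetry-breaking restrictions (illustrative sketch).
def count_triangles(adj):
    """adj: dict vertex -> set of neighbors (undirected graph)."""
    count = 0
    for v0 in adj:
        for v1 in adj[v0]:
            if v1 <= v0:            # restriction: v0 < v1
                continue
            for v2 in adj[v0] & adj[v1]:
                if v2 <= v1:        # restriction: v1 < v2
                    continue
                count += 1          # each triangle enumerated exactly once
    return count

adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}
print(count_triangles(adj))  # 2 triangles: (0,1,2) and (1,2,3)
```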


2021 ◽  
Vol 55 (1) ◽  
pp. 99-107
Author(s):  
Luis Thomas ◽  
Sebastien Gougeaud ◽  
Stéphane Rubini ◽  
Philippe Deniel ◽  
Jalil Boukhobza

The emergence of Exascale machines in HPC will have the foreseen consequence of putting more pressure on the storage systems in place, not only in terms of capacity but also of bandwidth and latency. With a limited budget, one cannot imagine using only storage-class memory, which leads to the use of a heterogeneous tiered storage hierarchy. In order to make the most efficient use of the high-performance tier in this storage hierarchy, we need to be able to place user data on the right tier at the right time. In this paper, we assume a 2-tier storage hierarchy with a high-performance tier and a high-capacity archival tier. Files are placed on the high-performance tier at creation time and moved to the capacity tier once their lifetime expires (that is, once they are no longer accessed). The main contribution of this paper lies in the design of a file lifetime prediction model based solely on the file's path, using a Convolutional Neural Network. Results show that our solution strikes a good trade-off between accuracy and under-estimation. Compared to previous work, our model reaches a comparable accuracy (around 98.60%, versus 98.84%) while reducing under-estimations by almost 10× to 2.21% (compared to 21.86%). The reduction in under-estimations is crucial, as it avoids misplacing files in the capacity tier while they are still in use.
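A minimal sketch of the kind of model the abstract describes (architecture, hyperparameters, and class count are assumed, not the paper's): a character-level CNN that maps a file path string to a lifetime class.

```python
# Character-level CNN over file paths (illustrative sketch, assumed hyperparameters).
import torch
import torch.nn as nn

class PathLifetimeCNN(nn.Module):
    def __init__(self, vocab_size=128, embed_dim=16, num_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, 64, kernel_size=5, padding=2)
        self.pool = nn.AdaptiveMaxPool1d(1)        # max over path positions
        self.fc = nn.Linear(64, num_classes)       # one logit per lifetime class

    def forward(self, paths):                      # paths: (batch, max_len) char ids
        x = self.embed(paths).transpose(1, 2)      # -> (batch, embed_dim, max_len)
        x = torch.relu(self.conv(x))
        x = self.pool(x).squeeze(-1)               # -> (batch, 64)
        return self.fc(x)

def encode(path, max_len=256):
    """ASCII-encode and pad a path string to a fixed-length id tensor."""
    ids = [min(ord(c), 127) for c in path[:max_len]]
    return torch.tensor(ids + [0] * (max_len - len(ids)))

model = PathLifetimeCNN()
batch = torch.stack([encode("/scratch/job42/checkpoint.h5"),
                     encode("/home/user/results/final.csv")])
print(model(batch).shape)  # torch.Size([2, 4])
```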


2021 ◽  
Vol 55 (1) ◽  
pp. 11-20
Author(s):  
Xiaolin Jiang ◽  
Chengshuo Xu ◽  
Rajiv Gupta

While much of the research on graph analytics over large power-law graphs has focused on developing algorithms for evaluating a single global graph query, in practice we may be faced with a stream of queries. We observe that, due to their global nature, vertex-specific graph queries present an opportunity for sharing work across queries. To take advantage of this opportunity, we have developed the VRGQ framework, which accelerates the evaluation of a stream of queries via coarse-grained value reuse. In particular, the results of queries for a small set of source vertices are reused to speed up all future queries. We present a two-step algorithm that, in its first step, initializes the query result based upon value reuse and then, in the second step, iteratively evaluates the query to convergence. The reused results for a small number of queries are held in a reuse table. Our experiments with the best reuse configurations on four power-law graphs and thousands of graph queries of five kinds yielded average speedups of 143×, 13.2×, 6.89×, 1.43×, and 1.18×.
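The sketch below illustrates the reuse idea on single-source shortest paths (illustrative code, not VRGQ's implementation): distances from a precomputed "hub" source are kept in a reuse table; a new query is seeded with the triangle-inequality upper bound d(s, v) ≤ d(s, hub) + d(hub, v) (valid on undirected graphs), and the second step then relaxes to convergence, typically in far fewer rounds than starting from infinity.

```python
# Coarse-grained value reuse for shortest-path queries (illustrative sketch).
import math

def sssp(edges, n, src, init=None):
    """Bellman-Ford-style relaxation; `init` may hold valid upper bounds."""
    d = init[:] if init else [math.inf] * n
    d[src] = 0.0
    changed = True
    while changed:                                   # step 2: iterate to convergence
        changed = False
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v], changed = d[u] + w, True
    return d

def reuse_init(reuse_table, src, n):
    """Step 1: seed the query from every hub's precomputed distances."""
    init = [math.inf] * n
    for dist_from_hub in reuse_table.values():
        to_src = dist_from_hub[src]                  # = d(src, hub) when undirected
        for v in range(n):
            init[v] = min(init[v], to_src + dist_from_hub[v])
    return init

edges = [(0, 1, 1.0), (1, 0, 1.0), (1, 2, 2.0), (2, 1, 2.0), (2, 3, 1.0), (3, 2, 1.0)]
reuse_table = {0: sssp(edges, 4, 0)}                 # precomputed hub: vertex 0
print(sssp(edges, 4, 3, init=reuse_init(reuse_table, 3, 4)))
```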


2021 ◽  
Vol 55 (1) ◽  
pp. 77-83
Author(s):  
Rong Chen ◽  
Haibo Chen

Querying graph data is becoming increasingly prevalent and important across many application domains, such as social networking, urban monitoring, electronic payment, and the Semantic Web. In the last few years, we have been working on improving the performance of graph querying by leveraging new hardware features and system designs. Moving towards this goal, we have designed and developed Wukong, a distributed in-memory framework that provides low latency and high throughput for concurrent query processing over large and fast-evolving graph data. This article gives an overview of our architecture and presents four systems that aim to satisfy diverse and challenging requirements on graph querying (e.g., high concurrency, evolving graphs, workload heterogeneity, and locality preservation). Our systems also significantly outperform state-of-the-art systems in both latency and throughput, usually by orders of magnitude.

