Proceedings of the ACM on Measurement and Analysis of Computing Systems

One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491046 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-34

Author(s):

Bingqian Lu ◽

Jianyi Yang ◽

Weiwen Jiang ◽

Yiyu Shi ◽

Shaolei Ren

Keyword(s):

State Of The Art ◽

Autonomous Driving ◽

Pareto Optimal ◽

Video Content ◽

Fast Evaluation ◽

Video Content Analysis ◽

Search Spaces ◽

Neural Architecture ◽

Real World Applications ◽

Prohibitive Cost

Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device is common practice in the state of the art, it is a very time-consuming process that scales poorly to extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity: the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can re-use architectures searched for one proxy device on new target devices, without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique to significantly boost the latency monotonicity. Finally, we validate our approach and conduct experiments with devices of different platforms on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as the existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device.
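As a rough illustration of what "correlated latency rankings" can mean in practice, the sketch below computes Spearman's rank correlation between the latencies of the same architectures measured on two devices. The abstract does not say which metric the authors use, so the choice of Spearman's coefficient, the function names, and the latency values are all assumptions made for illustration.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks of `values`, averaging ranks within ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical latencies (ms) of four architectures on a proxy and a target device.
proxy_ms = [10.0, 20.0, 30.0, 40.0]
target_ms = [15.0, 44.0, 31.0, 80.0]  # architectures 2 and 3 swap order
print(spearman(proxy_ms, target_ms))
```

A value close to 1 would indicate that ranking architectures by proxy latency closely matches ranking them by target latency, which is the situation in which reusing the proxy's search results is plausible.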

Trade or Trick?

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491051 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-26

Author(s):

Pengcheng Xia ◽

Haoyu Wang ◽

Bingyu Gao ◽

Weihang Su ◽

Zhou Yu ◽

...

Keyword(s):

Machine Learning ◽

Security And Privacy ◽

Smart Contracts ◽

Early Stages ◽

Asset Trading ◽

Digital Asset ◽

Guilt By Association ◽

Privacy Issues ◽

Digital Assets

The prosperity of the cryptocurrency ecosystem drives the need for digital asset trading platforms. Beyond centralized exchanges (CEXs), decentralized exchanges (DEXs) have been introduced to allow users to trade cryptocurrency without transferring the custody of their digital assets to middlemen, thus eliminating the security and privacy issues of traditional CEXs. Uniswap, as the most prominent cryptocurrency DEX, continues to attract scammers, with fraudulent cryptocurrencies flooding into the ecosystem. In this paper, we take the first step to detect and characterize scam tokens on Uniswap. We first collect all the transactions related to the Uniswap V2 exchange and investigate the landscape of cryptocurrency trading on Uniswap from different perspectives. Then, we propose an accurate approach for flagging scam tokens on Uniswap based on a guilt-by-association heuristic and a machine-learning-powered technique. We have identified over 10K scam tokens listed on Uniswap, which suggests that roughly 50% of the tokens listed on Uniswap are scam tokens. All the scam tokens and liquidity pools are created specifically for "rug pull" scams, and some scam tokens have tricks and backdoors embedded in their smart contracts. We further observe that thousands of collusion addresses help carry out the scams in league with the scam token/pool creators. The scammers have gained a profit of at least $16 million from 39,762 potential victims. Our observations in this paper suggest the urgency to identify and stop scams in the decentralized finance ecosystem, and our approach can act as a whistleblower that identifies scam tokens at their early stages.

Tuxedo: Maximizing Smart Contract Computation in PoW Blockchains

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491053 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-30

Author(s):

Sourav Das ◽

Nitin Awathare ◽

Ling Ren ◽

Vinay J. Ribeiro ◽

Umesh Bellur

Keyword(s):

System Parameter ◽

Security Analysis ◽

Scaling Up ◽

Interarrival Time ◽

End To End Delay ◽

Smart Contract ◽

Key Innovation ◽

Synchronous Network ◽

And Performance ◽

Harmful Effects

Proof-of-Work (PoW) based blockchains typically allocate only a tiny fraction (e.g., less than 1% for Ethereum) of the average interarrival time (I) between blocks for validating smart contracts present in transactions. In such systems, block validation and PoW mining are typically performed sequentially, the former by CPUs and the latter by ASICs. Even a small increase in validation time (τ) introduces the well-known Verifier's Dilemma and, as we demonstrate, causes more forking and hurts fairness. Large τ also reduces the tolerance for safety against a Byzantine adversary. Solutions that offload validation to a set of non-chain nodes (a.k.a. off-chain approaches) suffer from trust and performance issues that are non-trivial to resolve. In this paper, we present Tuxedo, the first on-chain protocol to theoretically scale τ/I ≈ 1 in PoW blockchains. The key innovation in Tuxedo is to perform CPU-based block processing in parallel to ASIC mining. We achieve this by allowing miners to delay validation of transactions in a block by up to ζ blocks, where ζ is a system parameter. We perform security analysis of Tuxedo considering all possible adversarial strategies in a synchronous network with maximum end-to-end delay Δ and demonstrate that Tuxedo achieves security equivalent to known results for longest chain PoW Nakamoto consensus. Our prototype implementation of Tuxedo atop Ethereum demonstrates that it can scale τ without suffering the harmful effects of naive scaling up of τ/I in existing blockchains.

Real-time Bidding for Time Constrained Impression Contracts in First and Second Price Auctions - Theory and Algorithms

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491049 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-37

Author(s):

Ryan J. Kinnear ◽

Ravi R. Mazumdar ◽

Peter Marbach

Keyword(s):

Real Time ◽

Optimization Problem ◽

Optimal Solution ◽

Piecewise Constant Function ◽

Convex Optimization Problem ◽

Piecewise Constant ◽

Infinite Dimensional ◽

Optimal Behavior ◽

Price Auction ◽

Second Price Auctions

We study the optimal bids and allocations in a real-time auction for heterogeneous items subject to the requirement that specified collections of items of given types be acquired within given time constraints. The problem is cast as a continuous time optimization problem that can, under certain weak assumptions, be reduced to a convex optimization problem. Focusing on the standard first and second price auctions, we first show, using convex duality, that the optimal (infinite dimensional) bidding policy can be represented by a single finite vector of so-called "pseudo-bids". Using this result we are able to show that the optimal solution in the second price case turns out to be a very simple piecewise constant function of time. This contrasts with the first price case that is more complicated. Despite the fact that the optimal solution for the first price auction is genuinely dynamic, we show that there remains a close connection between the two cases and that, empirically, there is almost no difference between optimal behavior in either setting. This suggests that it is adequate to bid in a first price auction as if it were in fact second price. Finally, we detail methods for implementing our bidding policies in practice with further numerical simulations illustrating the performance.

Cerberus

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491050 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-33

Author(s):

Chen Griner ◽

Johannes Zerwas ◽

Andreas Blenk ◽

Manya Ghobadi ◽

Stefan Schmid ◽

...

Keyword(s):

Traffic Patterns ◽

Dynamic Demand ◽

Alternative Approaches

The bandwidth and latency requirements of modern datacenter applications have led researchers to propose various topology designs using static, dynamic demand-oblivious (rotor), and/or dynamic demand-aware switches. However, given the diverse nature of datacenter traffic, there is little consensus about how these designs would fare against each other. In this work, we analyze the throughput of existing topology designs under different traffic patterns and study their unique advantages and potential costs in terms of bandwidth and latency "tax". To overcome the identified inefficiencies, we propose Cerberus, a unified, two-layer leaf-spine optical datacenter design with three topology types. Cerberus systematically matches different traffic patterns with their most suitable topology type: e.g., latency-sensitive flows are transmitted via a static topology, all-to-all traffic via a rotor topology, and elephant flows via a demand-aware topology. We show analytically and in simulations that Cerberus can improve throughput significantly compared to alternative approaches and operate datacenters at higher loads while being throughput-proportional.

Competitive Algorithms for Online Multidimensional Knapsack Problems

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491042 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-30

Author(s):

Lin Yang ◽

Ali Zeynali ◽

Mohammad H. Hajiesmaili ◽

Ramesh K. Sitaraman ◽

Don Towsley

Keyword(s):

Lower Bound ◽

Knapsack Problem ◽

Competitive Ratio ◽

Job Scheduling ◽

Natural Generalization ◽

Constant Factor ◽

Single Dimension ◽

Multidimensional Knapsack ◽

Minimum Capacity ◽

Machine Allocation

In this paper, we study the online multidimensional knapsack problem (called OMdKP), in which there is a knapsack whose capacity is represented in m dimensions, where each dimension may have a different capacity. Then, n items with different scalar profit values and m-dimensional weights arrive in an online manner, and the goal is to admit or decline items upon their arrival such that the total profit obtained by admitted items is maximized and the capacity of the knapsack across all dimensions is respected. This is a natural generalization of the classic single-dimension knapsack problem and finds several relevant applications such as virtual machine allocation, job scheduling, and all-or-nothing flow maximization over a graph. We develop two algorithms for OMdKP that use linear and exponential reservation functions to make online admission decisions. Our competitive analysis shows that the linear and exponential algorithms achieve competitive ratios of O(θα) and O(log(θα)), respectively, where α is the ratio between the aggregate knapsack capacity and the minimum capacity over a single dimension and θ is the ratio between the maximum and minimum item unit values. We also characterize a lower bound for the competitive ratio of any online algorithm solving OMdKP and show that the competitive ratio of our algorithm with exponential reservation function matches the lower bound up to a constant factor.
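To make the idea of reservation-based admission concrete, here is a minimal sketch in which each dimension is priced according to its current utilization and an item is admitted only if its profit covers the total reservation price of the capacity it would consume. The abstract does not give the paper's actual reservation functions, so the linear and exponential forms below, the summed per-dimension pricing rule, and all names and parameters (`low`, `high`, `Knapsack`, `offer`) are illustrative assumptions rather than the authors' algorithm.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def linear_price(z: float, low: float, high: float) -> float:
    """Assumed linear reservation price: grows from `low` to `high` with utilization z."""
    return low + (high - low) * z


def exponential_price(z: float, low: float, high: float) -> float:
    """Assumed exponential reservation price: low * (high / low) ** z."""
    return low * (high / low) ** z


@dataclass
class Knapsack:
    capacity: List[float]                          # capacity per dimension
    price: Callable[[float, float, float], float]  # reservation function
    low: float                                     # assumed minimum unit value
    high: float                                    # assumed maximum unit value
    used: List[float] = field(init=False)

    def __post_init__(self) -> None:
        self.used = [0.0] * len(self.capacity)

    def offer(self, profit: float, weight: List[float]) -> bool:
        """Decide irrevocably whether to admit an arriving item."""
        # Reject anything that would violate capacity in some dimension.
        if any(u + w > c for u, w, c in zip(self.used, weight, self.capacity)):
            return False
        # Price the requested capacity at each dimension's current utilization.
        cost = sum(
            w * self.price(u / c, self.low, self.high)
            for u, w, c in zip(self.used, weight, self.capacity)
        )
        if profit < cost:
            return False
        self.used = [u + w for u, w in zip(self.used, weight)]
        return True


sack = Knapsack(capacity=[10.0, 10.0], price=exponential_price, low=1.0, high=4.0)
print(sack.offer(3.0, [1.0, 1.0]))  # profit covers the reservation cost
print(sack.offer(1.0, [1.0, 1.0]))  # profit is below the reservation cost
```

The design intent of such a rule is that capacity becomes more expensive as it fills up, so low-value items are only accepted while the knapsack is still mostly empty.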

Offline and Online Algorithms for SSD Management

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491045 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-28

Author(s):

Tomer Lange ◽

Joseph (Seffi) Naor ◽

Gala Yadgar

Keyword(s):

Greedy Algorithm ◽

Prediction Error ◽

Large Scale ◽

Optimal Algorithm ◽

Empirical Evaluation ◽

Deterministic Algorithm ◽

Management Problem ◽

Solid State Drives ◽

Wide Range ◽

The Greedy Algorithm

Flash-based solid state drives (SSDs) have gained a central role in the infrastructure of large-scale datacenters, as well as in commodity servers and personal devices. The main limitation of flash media is its inability to support update-in-place: after data has been written to a physical location, it has to be erased before new data can be written to it. Moreover, SSDs support read and write operations at page granularity, while erasures are performed on entire blocks, which often contain hundreds of pages. When erasing a block, any valid data it stores must be rewritten to a clean location. As an SSD eventually wears out as the number of erasures grows, the efficiency of the management algorithm has a significant impact on its endurance. In this paper we first formally define the SSD management problem. We then explore this problem from an algorithmic perspective, considering it in both offline and online settings. In the offline setting, we present a near-optimal algorithm that, given any input, performs a negligible number of rewrites (relative to the input length). We also discuss the hardness of the offline problem. In the online setting, we first consider algorithms that have no prior knowledge about the input. We prove that no deterministic algorithm outperforms the greedy algorithm in this setting, and discuss the possible benefit of randomization. We then augment our model, assuming that each request for a page arrives with a prediction of the next time the page is updated. We design an online algorithm that uses such predictions, and show that its performance improves as the prediction error decreases. We also show that the performance of our algorithm is never worse than that guaranteed by the greedy algorithm, even when the prediction error is large. We complement our theoretical findings with an empirical evaluation of our algorithms, comparing them with the state-of-the-art scheme. The results confirm that our algorithms exhibit improved performance for a wide range of input traces.
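The abstract does not define the greedy algorithm. A common reading in the SSD literature is greedy garbage collection, which erases the block holding the fewest valid pages so that the fewest pages need rewriting. The sketch below assumes that reading; the data layout (`valid_pages` mapping block IDs to sets of valid page IDs) and the function names are hypothetical.

```python
from typing import Dict, Set, Tuple


def pick_victim(valid_pages: Dict[int, Set[int]]) -> int:
    """Return the block with the fewest valid pages (ties broken by lowest block ID)."""
    return min(valid_pages, key=lambda block: (len(valid_pages[block]), block))


def erase_greedily(valid_pages: Dict[int, Set[int]]) -> Tuple[int, int]:
    """Erase the greedy victim and return (victim block, number of pages rewritten).

    The victim's valid pages must be copied to a clean location before the
    erase. This sketch only counts those rewrites and clears the block.
    """
    victim = pick_victim(valid_pages)
    rewrites = len(valid_pages[victim])
    valid_pages[victim] = set()
    return victim, rewrites


# Hypothetical drive state: block ID -> valid page IDs still stored in it.
state = {0: {1, 2, 3}, 1: {4}, 2: {5, 6}}
print(erase_greedily(state))
```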

Dissecting Cloud Gaming Performance with DECAF

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491043 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-27

Author(s):

Hassan Iqbal ◽

Ayesha Khalid ◽

Muhammad Shahzad

Keyword(s):

High Variability ◽

Frame Rate ◽

Video Streams ◽

Round Trip ◽

Packet Losses ◽

Available Bandwidth ◽

The Past ◽

Manual Intervention ◽

Comprehensive Measurement ◽

Round Trip Delay

Cloud gaming platforms have witnessed tremendous growth over the past two years with a number of large Internet companies including Amazon, Facebook, Google, Microsoft, and Nvidia publicly launching their own platforms. While cloud gaming platforms continue to grow, visibility into their performance and relative comparison is lacking. This is largely due to the absence of systematic measurement methodologies that can be applied generally. As such, in this paper, we implement DECAF, a methodology to systematically analyze and dissect the performance of cloud gaming platforms across different game genres and game platforms. DECAF is highly automated and requires minimal manual intervention. By applying DECAF, we measure the performance of three commercial cloud gaming platforms, Google Stadia, Amazon Luna, and Nvidia GeForceNow, and uncover a number of important findings. First, we find that processing delays in the cloud comprise the majority of the total round trip delay experienced by users, accounting for as much as 73.54% of total user-perceived delay. Second, we find that video streams delivered by cloud gaming platforms are characterized by high variability of bitrate, frame rate, and resolution. Platforms struggle to consistently serve 1080p/60 frames per second streams across different game genres even when the available bandwidth is 8-20× that of the platform's recommended settings. Finally, we show that game platforms exhibit performance cliffs by reacting poorly to packet losses, in some cases dramatically reducing the delivered bitrate by up to 6.6× when loss rates increase from 0.1% to 1%. Our work has important implications for cloud gaming platforms and opens the door for further research on comprehensive measurement methodologies for cloud gaming.

Xatu: Richer Neural Network Based Prediction for Video Streaming

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491056 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-26

Author(s):

Yun Seong Nam ◽

Jianfei Gao ◽

Chandan Bothra ◽

Ehab Ghabashneh ◽

Sanjay Rao ◽

...

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Video Streaming ◽

Real World ◽

State Of The Art ◽

Clustering Method ◽

Download Time ◽

Abr Algorithm ◽

Fully Connected ◽

Prediction Approach

The performance of Adaptive Bitrate (ABR) algorithms for video streaming depends on accurately predicting the download time of video chunks. Existing prediction approaches (i) assume chunk download times are dominated by network throughput; and (ii) cluster sessions a priori (e.g., based on ISP and CDN) and only learn from sessions in the same cluster. We make three contributions. First, through analysis of data from real-world video streaming sessions, we show (i) a priori clustering prevents learning from related clusters; and (ii) factors such as the Time to First Byte (TTFB) are key components of chunk download times but not easily incorporated into existing prediction approaches. Second, we propose Xatu, a new prediction approach that jointly learns a neural network sequence model with an interpretable automatic session clustering method. Xatu learns clustering rules across all sessions it deems relevant, and models sequences with multiple chunk-dependent features (e.g., TTFB) rather than just throughput. Third, evaluations using the above datasets and emulation experiments show that Xatu improves prediction accuracy by 23.8% relative to CS2P (a state-of-the-art predictor). We show Xatu provides substantial performance benefits when integrated with multiple ABR algorithms, including MPC (a well-studied ABR algorithm) and FuguABR (a recent algorithm using stochastic control), relative to their default predictors (CS2P and a fully connected neural network, respectively). Further, Xatu combined with MPC outperforms Pensieve, an ABR algorithm based on deep reinforcement learning.

Understanding the Practices of Global Censorship through Accurate, End-to-End Measurements

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491055 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-25

Author(s):

Lin Jin ◽

Shuai Hao ◽

Haining Wang ◽

Chase Cotton

Keyword(s):

Large Scale ◽

False Positive Rate ◽

False Negative ◽

False Negative Rate ◽

Ground Truth ◽

Positive Rate ◽

End To End ◽

Internet Censorship ◽

Different Response ◽

Vantage Points

It is challenging to conduct a large-scale Internet censorship measurement, as it involves triggering censors through artificial requests and identifying abnormalities from corresponding responses. Due to the lack of ground truth on the expected responses from legitimate services, previous studies typically require heavy, unscalable manual inspection to identify false positives while still leaving false negatives undetected. In this paper, we propose Disguiser, a novel framework that enables end-to-end measurement to accurately detect censorship activities and reveal the censor deployment without manual effort. The core of Disguiser is a control server that replies with a static payload to provide the ground truth of server responses. As such, we send requests from various types of vantage points across the world to our control server, and censorship activities can be recognized if a vantage point receives a different response. In particular, we design and conduct a cache test to pre-exclude the vantage points that could be affected by cache proxies along the network path. Then we perform application traceroute towards our control server to explore censors' behaviors and their deployment. With Disguiser, we conduct 58 million measurements from vantage points in 177 countries. We observe 292 thousand censorship activities that block DNS, HTTP, or HTTPS requests inside 122 countries, achieving a 10^-6 false positive rate and zero false negative rate. Furthermore, Disguiser reveals the censor deployment in 13 countries.
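The core check described above, comparing what a vantage point receives against a known static payload, can be sketched as follows. This is only an illustration of the comparison step: the payload bytes, the label names, and the treatment of a missing response as its own category are assumptions, and the sketch omits the cache test and application traceroute.

```python
import hashlib
from typing import Optional

# Hypothetical static payload that the control server always returns.
STATIC_PAYLOAD = b"disguiser-control-payload"
EXPECTED_DIGEST = hashlib.sha256(STATIC_PAYLOAD).hexdigest()


def classify_response(body: Optional[bytes]) -> str:
    """Label one measurement by comparing the received body with the ground truth.

    `body` is None when the vantage point received no response at all.
    """
    if body is None:
        return "no_response"
    if hashlib.sha256(body).hexdigest() == EXPECTED_DIGEST:
        return "untampered"
    # Any other content differs from the known ground truth.
    return "tampered"


print(classify_response(STATIC_PAYLOAD))
print(classify_response(b"<html>Access denied</html>"))
print(classify_response(None))
```

Because the expected response is fixed and controlled by the measurer, any deviation can be attributed to something on the path rather than to a change on the server side.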


Published By Association For Computing Machinery
