performance bottlenecks
Recently Published Documents

TOTAL DOCUMENTS: 98 (FIVE YEARS: 31)
H-INDEX: 11 (FIVE YEARS: 3)

2021
Author(s): Yifan Sun, Yixuan Zhang, Ali Mosallaei, Michael D. Shah, Cody Dunne, et al.

Graphics Processing Units (GPUs) have been widely used to accelerate artificial intelligence, physics simulation, medical imaging, and information visualization applications. To improve GPU performance, GPU hardware designers need to identify performance issues by inspecting huge volumes of simulator-generated traces. Visualizing the execution traces can reduce users' cognitive burden and help them make sense of the behavior of GPU hardware components. In this paper, we first formalize the process of GPU performance analysis and characterize the design requirements of visualizing execution traces, based on a survey study and interviews with GPU hardware designers. We contribute data and task abstractions for GPU performance analysis. Based on our task analysis, we propose Daisen, a framework that supports data collection from GPU simulators and provides visualization of the simulator-generated GPU execution traces. Daisen features a data abstraction and trace format that can record simulator-generated GPU execution traces. Daisen also includes a web-based visualization tool that helps GPU hardware designers examine GPU execution traces, identify performance bottlenecks, and verify performance improvements. Our qualitative evaluation with GPU hardware designers demonstrates that the design of Daisen reflects their typical workflow. Using Daisen, participants were able to effectively identify potential performance bottlenecks and opportunities for performance improvement. The open-source implementation of Daisen can be found at gitlab.com/akita/vis. Supplemental materials, including a demo video, survey questions, an evaluation study guide, and a post-study evaluation survey, are available at osf.io/j5ghq.
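The core analysis the abstract describes — aggregating simulator-generated trace records per hardware component to surface bottlenecks — can be sketched in a few lines. This is a minimal illustration, not Daisen's actual trace format; the component names and record fields are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class TraceTask:
    component: str   # hypothetical component names, e.g. "SM0.ALU", "DRAM"
    start: float     # simulated start time (cycles)
    end: float       # simulated end time (cycles)

def busiest_components(tasks, top=3):
    """Rank components by total busy time accumulated over the trace."""
    busy = {}
    for t in tasks:
        busy[t.component] = busy.get(t.component, 0.0) + (t.end - t.start)
    return sorted(busy.items(), key=lambda kv: -kv[1])[:top]

trace = [
    TraceTask("SM0.ALU", 0, 120),
    TraceTask("L2Cache", 10, 300),
    TraceTask("DRAM", 50, 400),
]
print(busiest_components(trace))
```

A visualization front end like the one described would render these per-component busy intervals on a timeline rather than print a ranking, but the underlying aggregation is the same.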


2021, Vol 446, pp. 165-191
Author(s): Zhaokang Wang, Yunpan Wang, Chunfeng Yuan, Rong Gu, Yihua Huang

2021, Vol 8 (5), pp. 59
Author(s): Andrea Telatin, Piero Fariselli, Giovanni Birolo

Sequence file formats (FASTA and FASTQ) are commonly used in bioinformatics, molecular biology, and biochemistry. With the advent of next-generation sequencing (NGS) technologies, the number of FASTQ datasets produced and analyzed has grown exponentially, urging the development of dedicated software to handle, parse, and manipulate such files efficiently. Several bioinformatics packages are available to filter and manipulate FASTA and FASTQ files, yet some essential tasks remain poorly supported, leaving gaps that any workflow analyzing NGS datasets must fill with custom scripts. This can introduce harmful variability and performance bottlenecks at pivotal steps. Here we present a suite of tools, called SeqFu (Sequence Fastx utilities), that provides a broad range of commands to perform both common and specialist operations with ease, and that is designed to be easily integrated into high-performance analytical pipelines. SeqFu includes high-performance implementations of algorithms to interleave and deinterleave FASTQ files, merge Illumina lanes, and perform various quality controls (identification of degenerate primers, analysis of length statistics, extraction of portions of the datasets). SeqFu dereplicates sequences from multiple files while keeping track of their provenance. SeqFu is developed in Nim for high-performance processing, is freely available, and can be installed with the popular package manager Miniconda.
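Interleaving, one of the operations the abstract highlights, simply alternates paired records from the R1 and R2 FASTQ files. A toy sketch of the idea (not SeqFu's Nim implementation, and with in-memory line lists standing in for real files):

```python
def read_fastq(lines):
    """Yield 4-line FASTQ records as (header, sequence, plus, quality)."""
    it = iter(lines)
    for header in it:
        yield header, next(it), next(it), next(it)

def interleave(r1_lines, r2_lines):
    """Alternate paired records from two FASTQ inputs (R1 then R2)."""
    out = []
    for rec1, rec2 in zip(read_fastq(r1_lines), read_fastq(r2_lines)):
        out.extend(rec1)
        out.extend(rec2)
    return out

r1 = ["@read1/1", "ACGT", "+", "IIII"]
r2 = ["@read1/2", "TGCA", "+", "IIII"]
print(interleave(r1, r2))
```

A production tool would stream records instead of buffering them and validate that the paired headers actually match, which is where the performance and correctness concerns mentioned above come in.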


2021, Vol 9, pp. 774-789
Author(s): Daniel Deutsch, Tania Bedrax-Weiss, Dan Roth

Abstract: A desirable property of a reference-based evaluation metric that measures the content quality of a summary is that it should estimate how much information the summary has in common with a reference. Traditional text-overlap metrics such as ROUGE fail to achieve this because they are limited to matching tokens, either lexically or via embeddings. In this work, we propose a metric that evaluates the content quality of a summary using question answering (QA). QA-based methods directly measure a summary's information overlap with a reference, making them fundamentally different from text-overlap metrics. We demonstrate the experimental benefits of QA-based metrics through an analysis of our proposed metric, QAEval. QAEval outperforms current state-of-the-art metrics on most evaluations using benchmark datasets, while being competitive on others due to limitations of state-of-the-art models. Through a careful analysis of each component of QAEval, we identify its performance bottlenecks and estimate that its potential upper-bound performance surpasses all other automatic metrics, approaching that of the gold-standard Pyramid Method.
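A QA-based metric of this kind generates questions from the reference, answers them using the summary, and scores the predicted answers against the reference answers. The question generation and answering steps need learned models, but the final answer-scoring step is commonly a token-overlap F1 (as in SQuAD-style QA evaluation). A minimal sketch of that scoring function — an illustration of the general recipe, not QAEval's exact implementation:

```python
from collections import Counter

def token_f1(predicted: str, gold: str) -> float:
    """Token-overlap F1 between a predicted answer and a gold answer."""
    p, g = predicted.lower().split(), gold.lower().split()
    overlap = sum((Counter(p) & Counter(g)).values())  # shared tokens, with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)

def qa_score(answer_pairs) -> float:
    """Average answer-level F1 over all (predicted, gold) pairs for a summary."""
    return sum(token_f1(p, g) for p, g in answer_pairs) / len(answer_pairs)
```

Because the score is computed over answers to content questions rather than over raw n-grams, two summaries can share little surface wording yet still score highly if they convey the same facts.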


Author(s): Abdulgaffar Muhammad

The treasury single account (TSA) model was introduced in the Federal Republic of Nigeria to mitigate financial leakages, promote probity, prevent misappropriation of government revenue, and consolidate government accounts, in a bid to prevent embezzlement and high-handedness by revenue-generating agencies. This work examined the performance, bottlenecks, and prospects of the treasury single account policy in Nigeria. The paper is qualitative, relying heavily on secondary data, and is underpinned by systems theory. The study concluded that the implementation of the treasury single account has blocked financial leakages and promoted probity and accountability in the public financial system to a very large extent. Consequently, the paper suggests a synergy between the executive and the legislature to enforce and ensure compliance with the provisions of the TSA by ministries, extra-ministerial departments, and financial institutions.


2020, Vol 17 (4), pp. 2567-2583
Author(s): Georgios Patounas, Xenofon Foukas, Ahmed Elmokashfi, Mahesh K. Marina

2020, Vol 2020 (4), pp. 69-88
Author(s): Piyush Kumar Sharma, Shashwat Chaudhary, Nikhil Hassija, Mukulika Maity, Sambuddho Chakravarty

Abstract: Anonymous VoIP calls over the Internet hold great significance for privacy-conscious users, whistle-blowers, and political activists alike. Prior research deems popular anonymization systems like Tor unsuitable for providing the performance guarantees that real-time applications like VoIP need. These claims are backed by studies that may no longer be valid due to constant advancements in Tor; moreover, we believe those studies lacked the requisite diversity and comprehensiveness. The conclusions drawn from them led researchers to propose novel, tailored solutions. However, no such system is available for immediate use, and operating one would incur significant costs for recruiting users and volunteer relays to provide the necessary anonymity guarantees. It thus becomes imperative that the exact performance of VoIP over Tor be quantified and analyzed, so that potential performance bottlenecks can be amended. We therefore conducted an extensive empirical study across various in-lab and real-world scenarios to shed light on VoIP performance over Tor. In over half a million calls spanning 12 months, across seven countries and covering about 6650 Tor relays, we observed that Tor supports good voice quality (Perceptual Evaluation of Speech Quality (PESQ) > 3 and one-way delay < 400 ms) in more than 85% of cases. Further analysis indicates that for most Tor relays, contention due to cross-traffic was low enough to support VoIP calls, which are in any case transmitted at low rates (< 120 Kbps). Our findings are supported by concordant measurements using iperf that show more than adequate available bandwidth in most cases. Hence, unlike prior efforts, our research reveals that Tor is suitable for supporting anonymous VoIP calls.
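The study's headline number — the share of calls with PESQ > 3 and one-way delay < 400 ms — is a simple success-rate computation over per-call measurements. A sketch with made-up sample data (the thresholds come from the abstract; the call tuples are illustrative, not the study's dataset):

```python
def call_quality_rate(calls, pesq_min=3.0, delay_max_ms=400.0):
    """Fraction of (pesq, one_way_delay_ms) calls meeting both quality thresholds."""
    ok = sum(1 for pesq, delay in calls if pesq > pesq_min and delay < delay_max_ms)
    return ok / len(calls)

# Illustrative measurements: two calls pass, one fails on PESQ, one on delay.
calls = [(3.4, 250), (2.1, 500), (3.8, 390), (3.1, 410)]
print(call_quality_rate(calls))  # 0.5
```

At the study's scale (over half a million calls), this rate exceeded 85%, which is what grounds the paper's conclusion that Tor can carry VoIP.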


2020, Vol 29 (6), pp. 1223-1241
Author(s): Alexander van Renen, Lukas Vogel, Viktor Leis, Thomas Neumann, Alfons Kemper

Abstract: I/O latency and throughput are two of the major performance bottlenecks for disk-based database systems. Persistent memory (PMem) technologies, like Intel's Optane DC persistent memory modules, promise to bridge the gap between NAND-based flash (SSD) and DRAM, and thus eliminate the I/O bottleneck. In this paper, we provide the first comprehensive performance evaluation of PMem on real hardware in terms of bandwidth and latency. Based on the results, we develop guidelines for efficient PMem usage and four optimized low-level building blocks for PMem applications: log writing, block flushing, in-place updates, and coroutines for write latency hiding.

