A Flexible Parallel Runtime for Large Scale Block-Based Matrix Multiplication

Enabling Large-Scale Simulations of Quantum Transport with Manycore Computing

Electronics ◽

10.3390/electronics10030253 ◽

2021 ◽

Vol 10 (3) ◽

pp. 253

Author(s):

Yosang Jeong ◽

Hoon Ryu

Keyword(s):

Quantum Transport ◽

Large Scale ◽

Performance Enhancement ◽

Silicon Nanowire ◽

Matrix Multiplication ◽

Tight Binding ◽

Optimization Techniques ◽

Wide Energy Range ◽

Processing Unit ◽

Binding Model

The non-equilibrium Green’s function (NEGF) is widely used in nanoscience to predict the transport behavior of electronic devices. This work explores how much performance improvement can be achieved for quantum transport simulations with manycore computing, where the core numerical operation involves a recursive process of matrix multiplication. The major techniques adopted for performance enhancement are data restructuring, matrix tiling, thread scheduling, and offload computing, and we present technical details on how they are applied to optimize simulation performance on computing hardware including Intel Xeon Phi Knights Landing (KNL) systems and NVIDIA general-purpose graphics processing unit (GPU) devices. With a target structure of a silicon nanowire that consists of 100,000 atoms and is described with an atomistic tight-binding model, the effects of the optimization techniques on simulation performance are rigorously tested on a KNL node equipped with two Quadro GV100 GPU devices, and we observe that computation is accelerated by a factor of up to ∼20 against the unoptimized case. The feasibility of handling large-scale workloads in a huge computing environment is also examined with nanowire simulations in a wide energy range, where good scalability is obtained up to 2048 KNL nodes.
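
Of the techniques named in this abstract, matrix tiling is the easiest to illustrate in isolation. The following is a minimal sketch in pure Python, not the authors' implementation: the function names `matmul_naive` and `matmul_tiled` and the `tile` parameter are hypothetical, and the tile size is arbitrary. It only shows the loop structure, in which the product is computed one small block at a time so that each block can stay in fast memory while it is reused.

```python
def matmul_naive(a, b):
    """Reference product of two matrices stored as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def matmul_tiled(a, b, tile=2):
    """Same product, computed block by block (hypothetical tile size)."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    # The three outer loops walk over tiles; the three inner loops
    # multiply one tile of `a` by one tile of `b` and accumulate into `c`.
    for i0 in range(0, n, tile):
        for p0 in range(0, k, tile):
            for j0 in range(0, m, tile):
                for i in range(i0, min(i0 + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for p in range(p0, min(p0 + tile, k)):
                        a_ip, row_b = row_a[p], b[p]
                        for j in range(j0, min(j0 + tile, m)):
                            row_c[j] += a_ip * row_b[j]
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
print(matmul_tiled(a, b, tile=2))
```

In compiled code the tile size would typically be chosen to match the cache or on-chip memory of the target device; the `min(...)` bounds handle matrices whose dimensions are not multiples of the tile size.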

Block-Checksum-Based Fault Tolerance for Matrix Multiplication on Large-Scale Parallel Systems

2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) ◽

10.1109/hpcc/smartcity/dss.2018.00054 ◽

2018 ◽

Author(s):

Yanchao Zhu ◽

Yi Liu ◽

Mingzhen Li ◽

Depei Qian

Keyword(s):

Fault Tolerance ◽

Large Scale ◽

Matrix Multiplication ◽

Parallel Systems

A novel publicly delegable secure outsourcing algorithm for large-scale matrix multiplication

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-179725 ◽

2020 ◽

Vol 38 (5) ◽

pp. 6445-6455

Author(s):

Malay Kumar ◽

Vaibhav Mishra ◽

Anurag Shukla ◽

Munendra Singh ◽

Manu Vardhan

Keyword(s):

Large Scale ◽

Matrix Multiplication ◽

Secure Outsourcing ◽

Outsourcing Algorithm ◽

Scale Matrix

HIERARCHICAL MAPPING FOR HPC APPLICATIONS

Parallel Processing Letters ◽

10.1142/s0129626411000229 ◽

2011 ◽

Vol 21 (03) ◽

pp. 279-299 ◽

Cited By ~ 1

Author(s):

I-HSIN CHUNG ◽

CHE-RUNG LEE ◽

JIAZHENG ZHOU ◽

YEH-CHING CHUNG

Keyword(s):

High Performance ◽

Large Scale ◽

Scale Up ◽

Matrix Multiplication ◽

Spectral Graph Theory ◽

Communication Patterns ◽

Fine Tuning ◽

Mapping Algorithm ◽

Communication Time ◽

Run Time

As high-performance computing systems scale up, mapping the tasks of a parallel application onto physical processors to allow efficient communication becomes one of the critical performance issues. Existing algorithms were usually designed to map applications with regular communication patterns. Their mapping criterion usually overlooks the size of communicated messages, which is the primary factor in communication time. In addition, most of them have time complexities too high to handle large-scale problems. In this paper, we present a hierarchical mapping algorithm (HMA), which is capable of mapping applications with irregular communication patterns. It first partitions tasks according to their run-time communication information. Tasks that communicate with each other more frequently are regarded as strongly connected. Based on their connectivity strength, the tasks are partitioned into supernodes using algorithms from spectral graph theory. The hierarchical partitioning reduces the complexity of the mapping algorithm to achieve scalability. Finally, the run-time communication information is used again in fine tuning to explore better mappings. Our experiments show that the mapping algorithm reduces the point-to-point communication time of PDGEMM, a ScaLAPACK matrix multiplication computation kernel, by up to 20% and that of AMG2006, a tier 1 application of the Sequoia benchmark, by up to 7%.
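
The point that message size drives communication time can be made concrete with a small cost model. The sketch below is an illustration under stated assumptions, not the HMA algorithm itself: it assumes a 2D mesh of processors numbered row by row, measures distance in mesh hops, and scores a mapping as the sum of bytes times hops over all communicating task pairs. The names `hop_distance`, `mapping_cost`, `traffic`, and both example mappings are hypothetical.

```python
def hop_distance(p, q, width):
    """Manhattan distance between processors p and q on a 2D mesh
    with `width` processors per row (assumed topology)."""
    return abs(p // width - q // width) + abs(p % width - q % width)


def mapping_cost(traffic, mapping, width):
    """Total of (bytes sent) x (hops travelled) over all task pairs.

    traffic: {(src_task, dst_task): bytes}
    mapping: {task: processor}
    """
    return sum(nbytes * hop_distance(mapping[src], mapping[dst], width)
               for (src, dst), nbytes in traffic.items())


# Hypothetical run-time traffic: two heavy pairs and two light pairs.
traffic = {(0, 1): 1000, (2, 3): 1000, (0, 2): 10, (1, 3): 10}

# Heavy pairs placed on neighbouring processors of a 2 x 2 mesh.
near = {0: 0, 1: 1, 2: 2, 3: 3}
# Heavy pairs placed on diagonally opposite processors.
far = {0: 0, 1: 3, 2: 1, 3: 2}

print(mapping_cost(traffic, near, width=2))
print(mapping_cost(traffic, far, width=2))
```

A criterion that counted only the number of communicating pairs would rate these two mappings identically, since both connect the same four pairs; weighting by bytes separates them.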

Factored LT and Factored Raptor Codes for Large-Scale Distributed Matrix Multiplication

2020 IEEE International Symposium on Information Theory (ISIT) ◽

10.1109/isit44484.2020.9174314 ◽

2020 ◽

Author(s):

Asit Kumar Pradhan ◽

Anoosheh Heidarzadeh ◽

Krishna R. Narayanan

Keyword(s):

Large Scale ◽

Matrix Multiplication ◽

Raptor Codes

Function block-based closed-loop adaptive machining for assembly interfaces of large-scale aircraft components

Robotics and Computer-Integrated Manufacturing ◽

10.1016/j.rcim.2020.101994 ◽

2020 ◽

Vol 66 ◽

pp. 101994 ◽

Cited By ~ 4

Author(s):

Wei Fan ◽

Lianyu Zheng ◽

Wei Ji ◽

Xun Xu ◽

Lihui Wang ◽

...

Keyword(s):

Large Scale ◽

Closed Loop ◽

Function Block ◽

Adaptive Machining ◽

Block Based

Towards a Multi-array Architecture for Accelerating Large-scale Matrix Multiplication on FPGAs

2018 IEEE International Symposium on Circuits and Systems (ISCAS) ◽

10.1109/iscas.2018.8351474 ◽

2018 ◽

Cited By ~ 11

Author(s):

Junzhong Shen ◽

Yuran Qiao ◽

You Huang ◽

Mei Wen ◽

Chunyuan Zhang

Keyword(s):

Large Scale ◽

Matrix Multiplication ◽

Array Architecture ◽

Scale Matrix

BMC-SDN: Blockchain-Based Multicontroller Architecture for Secure Software-Defined Networks

Wireless Communications and Mobile Computing ◽

10.1155/2021/9984666 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Abdelouahid Derhab ◽

Mohamed Guerroumi ◽

Mohamed Belaoued ◽

Omar Cheikhrouhou

Keyword(s):

Network Flow ◽

Large Scale ◽

Flow Rule ◽

Security Architecture ◽

Software Defined Networks ◽

Reputation Mechanism ◽

Master Controller ◽

Secure Software ◽

Block Based ◽

Large Scale Networks

Multicontroller software-defined networks have been widely adopted to enable management of large-scale networks. However, they are vulnerable to several attacks including false data injection, which creates topology inconsistency among controllers. To deal with this issue, we propose BMC-SDN, a security architecture that integrates blockchain and multicontroller SDN and divides the network into several domains. Each SDN domain is managed by one master controller that communicates through blockchain with the masters of the other domains. The master controller creates blocks of network flow updates, and its redundant controllers validate the new block based on a proposed reputation mechanism. The reputation mechanism rates the controllers, i.e., block creator and voters, after each voting operation using constant and combined adaptive fading reputation strategies. The evaluation results demonstrate a fast and optimal detection of fraudulent flow rule injection.

Contextual Contracts for Component-Based Resource Abstraction in a Cloud of HPC Services

10.5753/wscad.2019.8670 ◽

2019 ◽

Cited By ~ 1

Author(s):

Wagner Al Alam ◽

Francisco Carvalho Junior

Keyword(s):

Cloud Computing ◽

Parallel Computing ◽

Large Scale ◽

Matrix Multiplication ◽

Small Scale ◽

Computing Systems ◽

Computing Platform ◽

Computing Platforms

Efforts to make cloud computing suitable for the requirements of HPC applications have motivated us to design HPC Shelf, a cloud computing platform of services for building and deploying parallel computing systems for large-scale parallel processing. We introduce Alite, the system of contextual contracts of HPC Shelf, aimed at selecting component implementations according to the requirements of applications, the features of the target parallel computing platforms (e.g. clusters), QoS (Quality-of-Service) properties, and cost restrictions. It is evaluated through a small-scale case study employing a component-based framework for matrix multiplication based on the BLAS library.

Measuring Brand Favorability Using Large-Scale Social Media Data

Information Systems Research ◽

10.1287/isre.2021.1030 ◽

2021 ◽

Author(s):

Kunpeng Zhang ◽

Wendy Moe

Keyword(s):

Social Media ◽

Large Scale ◽

Graphical Model ◽

Sampling Technique ◽

Sampling Bias ◽

Monte Carlo Sampling ◽

Research Attention ◽

Social Media Data ◽

Block Based ◽

Media Data

For decades, brand managers have monitored brand health with the use of consumer surveys, which have been refined to address issues related to sampling bias, response bias, leading questions, etc. However, with the advance of Web 2.0 and the internet, consumers have turned to social media to express their opinions on a variety of topics and, subsequently, have generated an extremely large amount of interaction data with brands. Analyzing these publicly available data to measure brand health has attracted great research attention. In this study, we focus on developing a method to measure brand favorability while accounting for the measure biases exhibited by social media posters. Specifically, we propose a probabilistic graphical model–based collective inference framework and implement a block-based Markov chain Monte Carlo sampling technique to obtain an adjusted brand favorability measure that is correlated with traditional survey-based measures used by brands. To demonstrate the effectiveness of our model, we evaluate it using more than 3,300 brands and about 205 million unique users that interact with those brands collected through Facebook. Our model performs very well, providing brand managers with a new method to more accurately measure consumer opinions toward the brand using social media data.
