GPU Domain Specialization via Composable On-Package Architecture

2022 ◽  
Vol 19 (1) ◽  
pp. 1-23
Author(s):  
Yaosheng Fu ◽  
Evgeny Bolotin ◽  
Niladrish Chatterjee ◽  
David Nellans ◽  
Stephen W. Keckler

As GPUs scale their low-precision matrix math throughput to boost deep learning (DL) performance, they upset the balance between math throughput and memory system capabilities. We demonstrate that a converged GPU design trying to address diverging architectural requirements between FP32 (or larger)-based HPC and FP16 (or smaller)-based DL workloads results in sub-optimal configurations for either of the application domains. We argue that a Composable On-PAckage GPU (COPA-GPU) architecture to provide domain-specialized GPU products is the most practical solution to these diverging requirements. A COPA-GPU leverages multi-chip-module disaggregation to support maximal design reuse, along with memory system specialization per application domain. We show how a COPA-GPU enables DL-specialized products by modular augmentation of the baseline GPU architecture with up to 4× higher off-die bandwidth, 32× larger on-package cache, and 2.3× higher DRAM bandwidth and capacity, while conveniently supporting scaled-down HPC-oriented designs. This work explores the microarchitectural design necessary to enable composable GPUs and evaluates the benefits composability can provide to HPC, DL training, and DL inference. We show that when compared to a converged GPU design, a DL-optimized COPA-GPU featuring a combination of 16× larger cache capacity and 1.6× higher DRAM bandwidth scales per-GPU training and inference performance by 31% and 35%, respectively, and reduces the number of GPU instances by 50% in scale-out training scenarios.

2020 ◽  
Vol 11 (1) ◽  
pp. 148-160 ◽  
Author(s):  
Weicong Kong ◽  
Zhao Yang Dong ◽  
Bo Wang ◽  
Junhua Zhao ◽  
Jie Huang

Author(s):  
M A Isayev ◽  
D A Savelyev

This paper compares the convolutional neural networks at the core of the most current solutions in computer vision. The study benchmarks these state-of-the-art solutions on criteria such as mAP (mean average precision) and FPS (frames per second) to assess their suitability for real-time use. It draws conclusions about the best convolutional neural network model and the deep learning methods used in each solution.
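The two criteria named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' benchmark code: it assumes the plain non-interpolated definition of average precision (detection benchmarks often use interpolated variants), and the function names and the `infer` callable are hypothetical.

```python
import time
from typing import Callable, Sequence, Tuple


def average_precision(ranked_hits: Sequence[bool], num_positives: int) -> float:
    """Non-interpolated AP for one class (an assumed definition).

    ranked_hits[i] is True when the i-th most confident prediction is a
    true positive; num_positives is the number of ground-truth objects.
    """
    if num_positives == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            # Precision measured at the rank of each true positive.
            precision_sum += hits / rank
    return precision_sum / num_positives


def mean_average_precision(per_class: Sequence[Tuple[Sequence[bool], int]]) -> float:
    """mAP: the unweighted mean of the per-class AP values."""
    aps = [average_precision(hits, n) for hits, n in per_class]
    return sum(aps) / len(aps) if aps else 0.0


def frames_per_second(infer: Callable[[object], object], frames: Sequence[object]) -> float:
    """Throughput of a hypothetical `infer` callable over a batch of frames."""
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

A model is a candidate for real-time use only when its measured FPS meets the frame rate of the input stream, which is why accuracy (mAP) and throughput (FPS) are reported together.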


Sensors ◽  
2020 ◽  
Vol 20 (1) ◽  
pp. 322 ◽  
Author(s):  
Faraz Malik Awan ◽  
Yasir Saleem ◽  
Roberto Minerva ◽  
Noel Crespi

Machine/Deep Learning (ML/DL) techniques have been applied to large data sets in order to extract relevant information and for making predictions. The performance and the outcomes of different ML/DL algorithms may vary depending upon the data sets being used, as well as on the suitability of algorithms to the data and the application domain under consideration. Hence, determining which ML/DL algorithm is most suitable for a specific application domain and its related data sets would be a key advantage. To respond to this need, a comparative analysis of well-known ML/DL techniques, including Multilayer Perceptron, K-Nearest Neighbors, Decision Tree, Random Forest, and Voting Classifier (or the Ensemble Learning Approach) for the prediction of parking space availability has been conducted. This comparison utilized Santander’s parking data set, initiated while working on the H2020 WISE-IoT project. The data set was used in order to evaluate the considered algorithms and to determine the one offering the best prediction. The results of this analysis show that, regardless of the data set size, the less complex algorithms like Decision Tree, Random Forest, and KNN outperform complex algorithms such as Multilayer Perceptron, in terms of higher prediction accuracy, while providing comparable information for the prediction of parking space availability. In addition, in this paper, we provide Top-K parking space recommendations based on the distance between the current position of vehicles and free parking spots.
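The Top-K recommendation step mentioned at the end of this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not state the distance metric, so great-circle (haversine) distance is an assumption, and the `Spot` type and function names are hypothetical.

```python
import heapq
import math
from typing import Iterable, List, NamedTuple


class Spot(NamedTuple):
    """Hypothetical record for one parking spot."""
    spot_id: str
    lat: float
    lon: float
    free: bool


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres (assumed metric)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def top_k_free_spots(vehicle_lat: float, vehicle_lon: float,
                     spots: Iterable[Spot], k: int) -> List[Spot]:
    """Return up to k free spots, nearest to the vehicle first."""
    free_spots = (s for s in spots if s.free)
    return heapq.nsmallest(
        k,
        free_spots,
        key=lambda s: haversine_km(vehicle_lat, vehicle_lon, s.lat, s.lon),
    )
```

In a full system the `free` flag would come from the availability predictor rather than from a stored value; `heapq.nsmallest` avoids sorting every spot when k is small.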


Author(s):  
Pablo San Juan ◽  
Rafael Rodríguez-Sánchez ◽  
Francisco D. Igual ◽  
Pedro Alonso-Jordá ◽  
Enrique S. Quintana-Ortí

Author(s):  
JZT Sim ◽  
QW Fong ◽  
WM Huang ◽  
CH Tan

With the advent of artificial intelligence (AI), machines are increasingly being used to complete complicated tasks, yielding remarkable results. Machine learning (ML) is the most relevant subset of AI in medicine, which will soon become an integral part of our everyday practice. Therefore, physicians should acquaint themselves with ML and AI, and their role as an enabler rather than a competitor. Herein, we introduce basic concepts and terms used in AI and ML, and aim to demystify commonly used AI/ML algorithms, including neural networks/deep learning and decision trees, and application domains such as computer vision and natural language processing, through specific examples. We discuss how machines are already being used to augment the physician’s decision-making process, and postulate the potential impact of ML on medical practice and medical research based on its current capabilities and known limitations. Moreover, we discuss the feasibility of full machine autonomy in medicine.

