Mobile Platform Challenges in Interactive Computer Vision

Author(s):  
Miguel Bordallo López

Computer vision can increase the interactivity of existing and new camera-based applications, and it can be used to build novel interaction methods and user interfaces. The computing and sensing needs of such applications require a careful, practical trade-off between quality and performance. This chapter shows the importance of using all available resources to hide application latency and maximize computational throughput. The experience gained while developing interactive applications is used to characterize the constraints imposed by the mobile environment, with a discussion of the most important design goals: high performance and low power consumption. In addition, the chapter discusses the use of heterogeneous computing via asymmetric multiprocessing to improve the throughput and energy efficiency of interactive vision-based applications.
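The latency-hiding idea the chapter describes can be sketched minimally: overlap frame capture with processing so that sensor readout time is hidden behind computation instead of being added to it. This is an illustrative assumption-laden sketch (the `capture`/`process` stages and their timings are hypothetical, not from the chapter), using a bounded queue between a producer and a consumer thread.

```python
import queue
import threading
import time

def capture(frames, out_q):
    # Producer: simulates camera frames arriving at a fixed rate.
    for i in range(frames):
        time.sleep(0.01)            # stand-in for exposure/readout latency
        out_q.put(i)
    out_q.put(None)                 # sentinel: no more frames

def process(in_q, results):
    # Consumer: runs "vision" work concurrently with capture,
    # so capture latency is hidden behind computation.
    while True:
        frame = in_q.get()
        if frame is None:
            break
        time.sleep(0.01)            # stand-in for per-frame processing
        results.append(frame * frame)

frame_q = queue.Queue(maxsize=4)    # bounded queue caps buffering memory
results = []
t_cap = threading.Thread(target=capture, args=(8, frame_q))
t_proc = threading.Thread(target=process, args=(frame_q, results))
t_cap.start(); t_proc.start()
t_cap.join(); t_proc.join()
print(results)
```

With both stages overlapped, total wall time approaches the slower stage alone rather than the sum of the two, which is the throughput argument the abstract makes.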

Author(s):  
Mark Endrei ◽  
Chao Jin ◽  
Minh Ngoc Dinh ◽  
David Abramson ◽  
Heidi Poxon ◽  
...  

Rising power costs and constraints are driving a growing focus on the energy efficiency of high performance computing systems. The unique characteristics of a particular system and workload, and their effect on performance and energy efficiency, are typically difficult for application users to assess and to control. Settings for optimum performance and energy efficiency can also diverge, so we need to identify trade-off options that guide a suitable balance between energy use and performance. We present statistical and machine learning models that require only a small number of runs to make accurate Pareto-optimal trade-off predictions using parameters that users can control. We study model training and validation using several parallel kernels and more complex workloads, including Algebraic Multigrid (AMG), the Large-scale Atomic/Molecular Massively Parallel Simulator, and Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. We demonstrate that we can train the models using as few as 12 runs, with prediction error of less than 10%. Our AMG results identify trade-off options that provide up to 45% improvement in energy efficiency for around 10% performance loss. We reduce the sample measurement time required for AMG by 90%, from 13 h to 74 min.
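The Pareto-optimal trade-off idea underlying this abstract can be illustrated with a small sketch: given measured (runtime, energy) pairs for candidate configurations, keep only those not dominated by another configuration that is at least as good in both objectives. The measurements below are hypothetical, not from the paper.

```python
def pareto_front(points):
    """Return the (runtime, energy) points that are Pareto-optimal:
    no other point is at least as good in both objectives and
    strictly better in one."""
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)

# hypothetical (runtime_s, energy_J) measurements for candidate settings
runs = [(10.0, 500.0), (11.0, 450.0), (12.0, 430.0),
        (11.5, 470.0), (13.0, 440.0)]
print(pareto_front(runs))
```

The points on the returned front are exactly the trade-off options a user would pick between, e.g. accepting a slightly longer runtime in exchange for lower energy.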


Author(s):  
Harold O. Fried ◽  
Loren W. Tauer

This article explores how well an individual manages his or her own talent to achieve high performance in an individual sport. Its setting is the Ladies Professional Golf Association (LPGA). The order-m approach is explained, and the data and empirical findings are presented. The inputs measure fundamental golfing athletic ability; the output measures success on the LPGA tour. The correlation coefficient between earnings per event and the ability to perform under pressure is 0.48. Golfers' careers occur on the front end of the age distribution. There is a classic trade-off between the inevitable deterioration in the mental ability to handle pressure and the experience gained with time. The ability to perform under pressure peaks at age 37.


Author(s):  
Zhihui Chen ◽  
Jianyao Huang ◽  
Weifeng Zhang ◽  
Yankai Zhou ◽  
Xuyang Wei ◽  
...  

N-type semiconducting polymers are important materials for modern electronics but limited in variety and performance. To design a new n-type polymer semiconductor requires a judicious trade-off between structural parameters involving...


2014 ◽  
Vol 22 (4) ◽  
pp. 273-283 ◽  
Author(s):  
Robert Schöne ◽  
Jan Treibig ◽  
Manuel F. Dolz ◽  
Carla Guillen ◽  
Carmen Navarrete ◽  
...  

Energy costs nowadays represent a significant share of the total costs of ownership of High Performance Computing (HPC) systems. In this paper we provide an overview on different aspects of energy efficiency measurement and optimization. This includes metrics that define energy efficiency and a description of common power and energy measurement tools. We discuss performance measurement and analysis suites that use these tools and provide users the possibility to analyze energy efficiency weaknesses in their code. We also demonstrate how the obtained power and performance data can be used to locate inefficient resource usage or to create a model to predict optimal operation points. We further present interfaces in these suites that allow an automated tuning for energy efficiency and how these interfaces are used. We finally discuss how a hard power limit will change our view on energy efficient HPC in the future.
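Two of the metrics such measurement suites commonly report can be sketched directly: total energy obtained by integrating a sampled power trace, and the energy-delay product (EDP) that weighs energy against runtime. The trace values below are hypothetical; real tools read them from hardware counters or external meters.

```python
def energy_joules(power_samples_w, interval_s):
    # Trapezoidal integration of a power trace sampled at fixed intervals.
    e = 0.0
    for a, b in zip(power_samples_w, power_samples_w[1:]):
        e += 0.5 * (a + b) * interval_s
    return e

def edp(energy_j, runtime_s):
    # Energy-delay product: a common metric balancing energy and performance;
    # lower is better, and it penalizes saving energy by running much slower.
    return energy_j * runtime_s

trace = [100.0, 120.0, 110.0, 90.0]   # hypothetical watt readings, 1 s apart
e = energy_joules(trace, 1.0)
print(e, edp(e, 3.0))
```

Comparing EDP across operating points (e.g. CPU frequency settings) is one simple way to locate the "optimal operation points" the abstract mentions.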


Author(s):  
Arun Agarwal ◽  
Sarat Kumar Patra

The new digital radio system DAB (Digital Audio Broadcasting), developed within the Eureka 147 project, is a very innovative and universal multimedia broadcast system that has the potential to replace existing AM and FM audio broadcast services in many parts of the world in the near future. DAB employs coded OFDM technology, which enables mobile reception and makes receivers highly robust against channel multipath fading effects. In this paper, we analyze the bit error rate (BER) performance of a DAB system conforming to the parameters established by the ETSI (EN 300 401), using frequency interleaving and Forward Error Correction (FEC) in different transmission channels. The results show DAB to be a suitable radio broadcasting technology for high performance in mobile environments.
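The Monte Carlo methodology behind such BER curves can be sketched in a few lines. This is a deliberate simplification: DAB actually uses differential QPSK with convolutional coding, whereas the sketch below simulates uncoded BPSK over an AWGN channel and compares the estimate against the closed-form BER, just to show how a simulated point on a BER curve is produced.

```python
import math
import random

def ber_bpsk_awgn(ebn0_db, nbits=200_000, seed=1):
    # Monte Carlo bit error rate for uncoded BPSK over AWGN.
    rng = random.Random(seed)
    ebn0 = 10 ** (ebn0_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * ebn0))   # noise std for unit-energy symbols
    errors = 0
    for _ in range(nbits):
        bit = rng.getrandbits(1)
        tx = 1.0 if bit else -1.0           # BPSK mapping
        rx = tx + rng.gauss(0.0, sigma)     # AWGN channel
        if (rx > 0) != bool(bit):           # hard-decision detection
            errors += 1
    return errors / nbits

# closed-form BER for BPSK/AWGN: Q(sqrt(2*Eb/N0)) = 0.5*erfc(sqrt(Eb/N0))
theory = 0.5 * math.erfc(math.sqrt(10 ** (6 / 10.0)))
sim = ber_bpsk_awgn(6.0)
print(sim, theory)
```

Sweeping `ebn0_db` and repeating the experiment per channel model (AWGN, Rayleigh, etc.) yields the BER-versus-SNR curves reported in papers like this one.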


2020 ◽  
Vol 38 (3-4) ◽  
pp. 1-30
Author(s):  
Rakesh Kumar ◽  
Boris Grot

The front-end bottleneck is a well-established problem in server workloads owing to their deep software stacks and large instruction footprints. Despite years of research into effective L1-I and BTB prefetching, state-of-the-art techniques force a trade-off between metadata storage cost and performance. Temporal Stream prefetchers deliver high performance but require a prohibitive amount of metadata to accommodate the temporal history. Meanwhile, BTB-directed prefetchers incur low cost by using the existing in-core branch prediction structures but fall short on performance due to BTB’s inability to capture the massive control flow working set of server applications. This work overcomes the fundamental limitation of BTB-directed prefetchers, which is capturing a large control flow working set within an affordable BTB storage budget. We re-envision the BTB organization to maximize its control flow coverage by observing that an application’s instruction footprint can be mapped as a combination of its unconditional branch working set and, for each unconditional branch, a spatial encoding of the cache blocks around the branch target. Effectively capturing a map of the application’s instruction footprint in the BTB enables highly effective BTB-directed prefetching that outperforms the state-of-the-art prefetchers by up to 10% for equivalent storage budget.
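The spatial-encoding idea can be illustrated with a toy sketch: alongside each unconditional branch, store a bitmask recording which cache blocks around the branch target were fetched, and decode that mask into prefetch candidates later. The block size, region width, and addresses below are illustrative assumptions, not the paper's actual parameters.

```python
BLOCK = 64        # cache block size in bytes (assumed)
REGION = 8        # blocks tracked on each side of the target block (assumed)

def encode_footprint(target, touched):
    """Encode which cache blocks near `target` were fetched as a bitmask:
    bit i set means block (target_block - REGION + i) was touched."""
    base = target // BLOCK - REGION
    mask = 0
    for addr in touched:
        off = addr // BLOCK - base
        if 0 <= off < 2 * REGION + 1:   # ignore blocks outside the region
            mask |= 1 << off
    return mask

def prefetch_blocks(target, mask):
    # Decode the mask back into block addresses to prefetch.
    base = target // BLOCK - REGION
    return [(base + i) * BLOCK for i in range(2 * REGION + 1)
            if mask >> i & 1]

target = 0x4000
touched = [0x4000, 0x4040, 0x4080, 0x3fc0]   # hypothetical fetched addresses
m = encode_footprint(target, touched)
print(hex(m), [hex(a) for a in prefetch_blocks(target, m)])
```

A compact mask per branch-target entry is how a BTB-sized structure can cover a large instruction footprint without storing a full temporal history.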


2019 ◽  
Author(s):  
Vinícius Klôh ◽  
Matheus Gritz ◽  
Bruno Schulze ◽  
Mariza Ferro

Performance and energy efficiency are now critical concerns in high performance scientific computing. It is expected that the requirements of the scientific problem should guide the orchestration of different energy-saving techniques in order to improve the balance between energy consumption and application performance. To enable this balance, we propose the development of an autonomous framework to perform this orchestration and present the ongoing research toward this development, focusing more specifically on the characterization of scientific applications and on performance modeling tasks using Machine Learning.


Electronics ◽  
2021 ◽  
Vol 10 (19) ◽  
pp. 2386
Author(s):  
Raúl Nozal ◽  
Jose Luis Bosque

Heterogeneous systems are the core architecture of most computing systems, from high-performance computing nodes to embedded devices, due to their excellent performance and energy efficiency. Efficiently programming these systems has become a major challenge due to the complexity of their architectures and the effort required to provide them with co-execution capabilities that applications can fully exploit. There are many proposals to simplify the programming and management of acceleration devices and multi-core CPUs. However, in many cases, portability and ease of use compromise the efficiency of different devices, even more so when co-executing. Intel oneAPI, a new and powerful standards-based unified programming model built on top of SYCL, addresses these issues. In this paper, oneAPI is provided with co-execution strategies to run the same kernel across different devices, enabling the exploitation of static and dynamic policies. This work evaluates the performance and energy efficiency of a well-known set of regular and irregular HPC benchmarks, using two heterogeneous systems composed of an integrated GPU and CPU. Static and dynamic load balancers are integrated and evaluated, highlighting single and co-execution strategies and the most significant key points of this promising technology. Experimental results show that co-execution is worthwhile when using dynamic algorithms and improves efficiency even further when using unified shared memory.
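A dynamic load-balancing policy of the kind evaluated here can be sketched abstractly: each device repeatedly claims the next chunk of the iteration space until it is exhausted, so faster devices naturally end up executing more work. The sketch below uses Python threads as stand-ins for devices and a trivial kernel; it illustrates the scheduling policy only, not the oneAPI/SYCL API itself.

```python
import threading

def co_execute(work_items, devices=("cpu", "gpu"), chunk=4):
    """Dynamic chunked co-execution over a shared iteration space."""
    lock = threading.Lock()
    next_i = 0
    schedule = {dev: [] for dev in devices}   # chunks each device ran
    results = [None] * len(work_items)

    def worker(dev):
        nonlocal next_i
        while True:
            with lock:                         # claim the next chunk
                if next_i >= len(work_items):
                    return
                lo, hi = next_i, min(next_i + chunk, len(work_items))
                next_i = hi
            for i in range(lo, hi):
                results[i] = work_items[i] * 2  # stand-in for the kernel
            schedule[dev].append((lo, hi))

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, schedule

res, sched = co_execute(list(range(16)))
print(res)
```

A static policy would instead split the range once, up front, in proportion to assumed device speeds; the dynamic variant adapts at runtime, which is why the paper finds it pays off for irregular workloads.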

