HIGH LEVEL SIMULATION OF SVP MANY-CORE SYSTEMS

2011 ◽  
Vol 21 (04) ◽  
pp. 413-438 ◽  
Author(s):  
M. IRFAN UDDIN ◽  
MICHIEL W. VAN TOL ◽  
CHRIS R. JESSHOPE

The Microgrid is a many-core architecture comprising multiple clusters of fine-grained multi-threaded cores. The SVP API supported by the cores allows for the asynchronous delegation of work to different clusters of cores, which can be acquired dynamically. We want to explore the execution of complex applications and their interaction with dynamically allocated resources. To date, any evaluation of the Microgrid has used a detailed emulation with a cycle-accurate simulation of the execution time. Although the emulator can be used to evaluate small program kernels, it executes at a rate of only 100K instructions per second, divided over the number of emulated cores. This makes it inefficient for evaluating a complex application that executes on many cores with dynamically allocated clusters. To obtain a more efficient evaluation, we have developed a co-simulation environment that executes high-level SVP control code but abstracts the scheduling of the low-level threads using two different techniques. The co-simulation is evaluated for both performance and simulation accuracy.
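A minimal sketch of the delegation pattern this abstract describes, not the SVP API itself: work is delegated asynchronously to a dynamically acquired cluster of cores and synchronized on later. The names `acquire_cluster`, `delegate`, and `kernel` are hypothetical illustrations, with Python's process pool standing in for a cluster.

```python
from concurrent.futures import ProcessPoolExecutor

def acquire_cluster(num_cores):
    """Stand-in for dynamically acquiring a cluster of cores."""
    return ProcessPoolExecutor(max_workers=num_cores)

def kernel(chunk):
    # Stand-in for the computational work delegated to the cluster.
    return sum(x * x for x in chunk)

def delegate(cluster, work_items):
    # Asynchronous delegation: returns futures immediately, loosely
    # analogous to an SVP 'create' later matched by a 'sync'.
    return [cluster.submit(kernel, item) for item in work_items]

if __name__ == "__main__":
    data = [range(i * 1000, (i + 1) * 1000) for i in range(8)]
    with acquire_cluster(num_cores=4) as cluster:
        futures = delegate(cluster, data)           # 'create': fire and forget
        results = [f.result() for f in futures]     # 'sync': wait for completion
    print(results)
```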

Author(s):  
Irfan Uddin

The microthreaded many-core architecture comprises multiple clusters of fine-grained multi-threaded cores. The management of concurrency is supported in the instruction set architecture of the cores, and computational work in an application is asynchronously delegated to different clusters of cores, where the cluster is allocated dynamically. Computer architects are interested in analyzing the complex interactions amongst dynamically allocated resources. Generally, a detailed simulation with cycle-accurate timing of the execution is used. However, the cycle-accurate simulator for the microthreaded architecture executes at a rate of 100,000 instructions per second, divided over the number of simulated cores. This means that the evaluation of a complex application executing on a contemporary multi-core machine can be very slow. To enable efficient design-space exploration, we present a co-simulation environment in which the detailed execution of instructions in the pipelines of microthreaded cores, and the interactions amongst hardware components, are abstracted. We evaluate the high-level simulation framework against the cycle-accurate simulation framework. The results show that the high-level simulator is faster and less complicated than the cycle-accurate simulator, at the cost of some accuracy.
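A minimal sketch, under assumed parameters, of the speed/accuracy trade-off this abstract describes: a cycle-accurate model steps through per-instruction pipeline events, while a high-level model replaces that detail with an analytic cost estimate. The pipeline depth and CPI constants below are illustrative assumptions, not measured values from the paper.

```python
PIPELINE_STAGES = 6    # assumed pipeline depth
CPI_ESTIMATE = 1.2     # assumed average cycles per instruction

def cycle_accurate_time(instructions, cores):
    """Iterate per instruction (slow but detailed)."""
    cycles = 0
    for _ in range(instructions // cores):
        cycles += PIPELINE_STAGES  # stand-in for per-stage event simulation
    return cycles

def high_level_time(instructions, cores):
    """Abstract the pipeline into one analytic estimate (fast)."""
    return int(instructions / cores * CPI_ESTIMATE)

print(cycle_accurate_time(1_000_000, 64))  # cost scales with every instruction
print(high_level_time(1_000_000, 64))      # constant-time estimate
```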


Author(s):  
Weichun Liu ◽  
Xiaoan Tang ◽  
Chenglin Zhao

Recently, deep trackers based on siamese networks have enjoyed increasing popularity in the tracking community. Generally, those trackers learn a high-level semantic embedding space for feature representation but lose low-level fine-grained details. Meanwhile, the learned high-level semantic features are not updated during online tracking, which results in tracking drift in the presence of target appearance variation and similar distractors. In this paper, we present a novel end-to-end trainable Convolutional Neural Network (CNN) based on the siamese network for distractor-aware tracking. It enhances target appearance representation in both the offline training stage and the online tracking stage. In the offline training stage, the network learns both low-level fine-grained details and high-level coarse-grained semantics simultaneously in a multi-task learning framework. The low-level features, with better resolution, are complementary to the semantic features and able to distinguish the foreground target from background distractors. In the online stage, the learned low-level features are fed into a correlation filter layer and updated in an interpolated manner to encode target appearance variation adaptively, while the learned high-level features are fed into a cross-correlation layer without online update. The proposed tracker therefore benefits from both the adaptability of the fine-grained correlation filter and the generalization capability of the semantic embedding. Extensive experiments are conducted on the public OTB100 and UAV123 benchmark datasets. Our tracker achieves state-of-the-art performance while running at a real-time frame rate.
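A minimal sketch of the interpolated model update the abstract describes: the low-level correlation-filter template is blended with the newest per-frame estimate so it adapts to appearance change, while the high-level semantic template stays fixed. The learning rate and template sizes are assumed values, not ones reported by the paper.

```python
import numpy as np

LEARNING_RATE = 0.01  # assumed; controls how fast the filter adapts

def update_filter(current_filter, new_estimate, lr=LEARNING_RATE):
    """Linear interpolation: keep most of the old model, fold in the new."""
    return (1.0 - lr) * current_filter + lr * new_estimate

# Stand-in low-level filter and per-frame estimate (illustrative sizes).
template = np.random.rand(17, 17)
new_obs = np.random.rand(17, 17)
template = update_filter(template, new_obs)  # updated every frame

# The high-level semantic template, by contrast, is fixed after offline
# training and used for cross-correlation without any online update.
```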


2012 ◽  
Vol 2012 ◽  
pp. 1-15 ◽  
Author(s):  
Ilia Lebedev ◽  
Christopher Fletcher ◽  
Shaoyi Cheng ◽  
James Martin ◽  
Austin Doupnik ◽  
...  

We present a highly productive approach to hardware design based on a many-core microarchitectural template used to implement compute-bound applications expressed in a high-level data-parallel language such as OpenCL. The template is customized on a per-application basis via a range of high-level parameters such as the interconnect topology or processing element architecture. The key benefits of this approach are that it (i) allows programmers to express parallelism through an API defined in a high-level programming language, (ii) supports coarse-grained multithreading and fine-grained threading while permitting bit-level resource control, and (iii) reduces the effort required to repurpose the system for different algorithms or different applications. We compare template-driven design to both full-custom and programmable approaches by studying implementations of a compute-bound data-parallel Bayesian graph inference algorithm across several candidate platforms. Specifically, we examine a range of template-based implementations on both FPGA and ASIC platforms and compare each against full custom designs. Throughout this study, we use a general-purpose graphics processing unit (GPGPU) implementation as a performance and area baseline. We show that our approach, similar in productivity to programmable approaches such as GPGPU applications, yields implementations with performance approaching that of full-custom designs on both FPGA and ASIC platforms.
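A minimal sketch, with invented parameter names, of the per-application template customization the abstract describes: the same many-core template is specialized through a few high-level knobs rather than redesigned from scratch.

```python
from dataclasses import dataclass

@dataclass
class TemplateConfig:
    num_pes: int            # number of processing elements
    interconnect: str       # e.g. "ring", "mesh", "crossbar"
    pe_datapath_bits: int   # bit-level resource control per PE
    threads_per_pe: int     # coarse-grained multithreading depth

# Hypothetical instantiation for a Bayesian graph inference accelerator;
# the values are illustrative, not the paper's design points.
bayes_cfg = TemplateConfig(num_pes=32, interconnect="mesh",
                           pe_datapath_bits=24, threads_per_pe=4)
print(bayes_cfg)
```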


Smart Cities ◽  
2021 ◽  
Vol 4 (1) ◽  
pp. 204-216
Author(s):  
Xinyue Ye ◽  
Lian Duan ◽  
Qiong Peng

Spatiotemporal prediction of crime is crucial for public safety and the operation of smart cities. As crime incidents are distributed sparsely across space and time, existing deep-learning methods, constrained by coarse spatial scales, offer only limited value in predicting crime density. This paper proposes the use of deep inception-residual networks (DIRNet) to conduct fine-grained, theft-related crime prediction based on non-emergency service request data (311 events). Specifically, it outlines the use of inception units comprising asymmetrical convolution layers to extract low-level spatiotemporal dependencies hidden in crime events and complaint records in the 311 dataset. It then details how residual units can be applied to capture high-level spatiotemporal features from these low-level dependencies for the final prediction. The effectiveness of the proposed DIRNet is evaluated on theft-related crime data and 311 data for New York City from 2010 to 2015. The results confirm that DIRNet obtains an average F1 of 71%, outperforming other prediction models.
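A minimal sketch, with assumed channel and grid sizes, of the two building blocks the abstract names: an inception unit with asymmetric (1×3 / 3×1) convolutions for low-level spatiotemporal dependencies, and a residual unit stacked on top for high-level features. This is not the authors' exact DIRNet configuration.

```python
import torch
import torch.nn as nn

class AsymmetricInception(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Two asymmetric branches, fused back to the input width by a 1x1 conv.
        self.branch_h = nn.Conv2d(channels, channels, kernel_size=(1, 3), padding=(0, 1))
        self.branch_v = nn.Conv2d(channels, channels, kernel_size=(3, 1), padding=(1, 0))
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):
        return torch.relu(self.fuse(torch.cat([self.branch_h(x), self.branch_v(x)], dim=1)))

class ResidualUnit(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        # Skip connection lets the unit refine rather than replace features.
        return torch.relu(x + self.conv2(torch.relu(self.conv1(x))))

# One stack of past crime/311 intensity grids in, one feature grid out.
x = torch.randn(1, 16, 32, 32)   # (batch, channels, height, width), assumed sizes
y = ResidualUnit(16)(AsymmetricInception(16)(x))
print(y.shape)
```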


2021 ◽  
Author(s):  
Jack Voldemars Purvis

Live coding focuses on improvising content by coding in textual interfaces, but this reliance on low-level text editing impairs usability by not allowing for high-level manipulation of content. VJing focuses on remixing existing content with graphical user interfaces and hardware controllers, but this focus on high-level manipulation does not allow for fine-grained control where content can be improvised from scratch or manipulated at a low level. This thesis proposes the code jockey practice (CJing), a new hybrid practice that combines aspects of live coding and VJing. In CJing, a performer known as a code jockey (CJ) interacts with code, graphical user interfaces and hardware controllers to create or manipulate real-time visuals. CJing harnesses the strengths of live coding and VJing to enable flexible performances where content can be controlled at both low and high levels: live coding provides fine-grained control where content can be improvised from scratch or manipulated at a low level, while VJing provides high-level manipulation where content can be organised, remixed and interacted with. To illustrate CJing, this thesis contributes Visor, a new environment for live visual performance that embodies the practice. Visor's design is based on the key ideas of CJing and a study of live coders and VJs in practice. To evaluate CJing and Visor, this thesis reflects on the usage of Visor in live performances and on feedback gathered from creative coders, live coders, and VJs who experimented with the environment.
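A minimal sketch, with invented names, of the hybrid control CJing describes: a parameter is defined in live-coded text (low-level control) but also exposed to a GUI slider or hardware knob (high-level control). This Python sketch only illustrates the concept and does not reproduce Visor's actual API.

```python
class Param:
    def __init__(self, name, value, lo=0.0, hi=1.0):
        self.name, self.value, self.lo, self.hi = name, value, lo, hi

    def set_from_controller(self, normalized):
        # High-level control: a GUI slider or MIDI knob sends 0.0-1.0.
        self.value = self.lo + normalized * (self.hi - self.lo)

# Low-level control: the performer can redefine this line mid-performance.
speed = Param("rotation_speed", 0.25)

def draw(t):
    # Both code edits and knob turns take effect in the running visual.
    return t * speed.value

speed.set_from_controller(0.8)  # e.g. a VJ-style hardware knob
print(draw(10.0))
```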


2021 ◽  
Vol 18 (4) ◽  
pp. 1-25
Author(s):  
Paul Metzger ◽  
Volker Seeker ◽  
Christian Fensch ◽  
Murray Cole

Existing OS techniques for homogeneous many-core systems make it simple for single- and multithreaded applications to migrate between cores. Heterogeneous systems do not benefit so fully from this flexibility, and applications that cannot migrate in mid-execution may lose potential performance. The situation is particularly challenging when a switch of language runtime would be desirable in conjunction with a migration. We present a case study in making heterogeneous CPU + GPU systems more flexible in this respect. Our technique for fine-grained application migration allows switches between OpenMP, OpenCL, and CUDA execution, in conjunction with migrations from GPU to CPU and from CPU to GPU. To achieve this, we subdivide iteration spaces into slices and consider migration on a slice-by-slice basis. We show that slice sizes can be learned offline by machine learning models. To further improve performance, memory transfers are made migration-aware. The complexity of the migration capability is hidden from programmers behind a high-level programming model. We present a detailed evaluation of our mid-kernel migration mechanism with the First Come, First Served scheduling policy. We compare our technique in a focused evaluation scenario against idealized kernel-by-kernel scheduling, which is typical of current systems and makes perfect kernel-to-device scheduling decisions but cannot migrate kernels mid-execution. Models show that up to 1.33× speedup can be achieved over these systems by adding fine-grained migration. Our experimental results with all nine applicable SHOC and Rodinia benchmarks achieve speedups of up to 1.30× (1.08× on average) over an implementation of a perfect but migration-incapable scheduler when kernels are migrated to a faster device. Our mechanism and slice size choices introduce an average slowdown of only 2.44% if kernels never migrate. Lastly, our programming model reduces code size by at least 88% compared to manual implementations of migratable kernels.
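A minimal sketch, with invented names, of the slice-by-slice migration the abstract describes: the iteration space is subdivided, and between slices the scheduler may move the remaining work to another device. In the paper the slice size is learned offline per kernel; here it is an assumed constant, and the device methods are stand-ins rather than real OpenMP/OpenCL/CUDA calls.

```python
SLICE_SIZE = 4096  # assumed constant; the paper learns slice sizes offline

class Device:
    def __init__(self, name, speed):
        self.name, self.speed = name, speed
    def execute(self, work):
        pass  # stand-in for launching one slice on this device
    def sync_memory_from(self, other):
        pass  # stand-in for a migration-aware memory transfer

def pick_device(devices):
    # Stand-in scheduling policy: always pick the fastest device.
    return max(devices, key=lambda d: d.speed)

def run_kernel(iterations, devices):
    device = pick_device(devices)                   # initial placement
    for start in range(0, iterations, SLICE_SIZE):
        device.execute(range(start, min(start + SLICE_SIZE, iterations)))
        nxt = pick_device(devices)                  # re-decide between slices
        if nxt is not device:
            nxt.sync_memory_from(device)            # migrate mid-kernel
            device = nxt

run_kernel(1_000_000, [Device("cpu", 1.0), Device("gpu", 4.0)])
```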


2019 ◽  
Vol 1 (1) ◽  
pp. 31-39
Author(s):  
Ilham Safitra Damanik ◽  
Sundari Retno Andani ◽  
Dedi Sehendro

Milk is an important dietary intake for meeting nutritional needs, consumed by both children and adults. Indonesia has many producers of fresh milk, but production is not sufficient to meet national demand. Data mining is a field of computer science that is widely used in research; one of its techniques is clustering, a method of grouping data that becomes more effective as more data is used. The data used here are provincial data for Indonesia from 2000 to 2017, obtained from the Central Statistics Agency. The study clusters provinces into two milk-producing groups, namely high-producing and low-producing regions. From the 27 records of fresh milk production in Indonesia, two provinces emerge at the high level, namely West Java and East Java, while the remaining provinces, including seven that could not be included in the K-Means clustering calculation, fall into the low-level cluster.
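A minimal sketch of the k=2 K-Means grouping the abstract describes. The production figures below are illustrative placeholders only, not the study's data, which comes from Indonesia's Central Statistics Agency (2000 to 2017).

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative per-province fresh-milk production values (not real data).
production = np.array([[842.0], [912.5], [31.2], [18.7], [25.4], [12.1]])

# Two clusters: high producers vs. low producers.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(production)

# Provinces sharing a label form one cluster.
print(labels)
```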


Author(s):  
Margarita Khomyakova

The author analyzes definitions of the concept of determinants of crime given by various scholars and offers her own definition. In this study, determinants of crime are understood as a set of its causes, the circumstances that contribute to its commission, as well as the dynamics of crime. It is noted that the Russian legislator, in Article 244 of the Criminal Code, defines the object of this criminal assault as public morality. Despite the use of evaluative concepts both in the disposition of this norm and in determining the specific object of the given crime, the position of criminologists is unequivocal: crimes of this kind are immoral and are in irreconcilable conflict with generally accepted moral and legal norms. The paper also considers some views on making value judgments that could hardly apply to legal norms. According to the author, the reasons for abuse of the bodies of the dead include the economic problems of the subject of the crime and a low level of culture and legal awareness; this list is not exhaustive. The main circumstances that contribute to the abuse of the bodies of the dead and their burial places are the following: low income and unemployment, a low level of criminological prevention, and the poor maintenance and protection of medical institutions and cemeteries due to the underperformance of state and municipal bodies. This list of circumstances is also open-ended. Due to several factors, including a high level of latency, it is not possible to reflect the dynamics of such crimes objectively. At the same time, identifying the determinants of abuse of the bodies of the dead will reduce the number of such crimes.


2021 ◽  
pp. 002224372199837
Author(s):  
Walter Herzog ◽  
Johannes D. Hattula ◽  
Darren W. Dahl

This research explores how marketing managers can avoid the so-called false consensus effect—the egocentric tendency to project personal preferences onto consumers. Two pilot studies were conducted to provide evidence for the managerial importance of this research question and to explore how marketing managers attempt to avoid false consensus effects in practice. The results suggest that the debiasing tactic most frequently used by marketers is to suppress their personal preferences when predicting consumer preferences. Four subsequent studies show that, ironically, this debiasing tactic can backfire and increase managers’ susceptibility to the false consensus effect. Specifically, the results suggest that these backfire effects are most likely to occur for managers with a low level of preference certainty. In contrast, the results imply that preference suppression does not backfire but instead decreases false consensus effects for managers with a high level of preference certainty. Finally, the studies explore the mechanism behind these results and show how managers can ultimately avoid false consensus effects—regardless of their level of preference certainty and without risking backfire effects.

