coarse grained
Recently Published Documents





2022 ◽  
Vol 15 (1) ◽  
pp. 1-30
Seyedramin Rasoulinezhad ◽  
Esther Roorda ◽  
Steve Wilton ◽  
Philip H. W. Leong ◽  
David Boland

The underlying goal of FPGA architecture research is to devise flexible substrates that implement a wide variety of circuits efficiently. Contemporary FPGA architectures have been optimized to support networking, signal processing, and image processing applications through high-precision digital signal processing (DSP) blocks. The recent emergence of machine learning has created a new set of demands characterized by: (1) higher computational density and (2) low-precision arithmetic requirements. With the goal of exploring this new design space in a methodical manner, we first propose a problem formulation involving computing nested loops over multiply-accumulate (MAC) operations, which covers many basic linear algebra primitives and standard deep neural network (DNN) kernels. A quantitative methodology for deriving efficient coarse-grained compute block architectures from benchmarks is then proposed together with a family of new embedded blocks, called MLBlocks. An MLBlock instance includes several multiply-accumulate units connected via flexible routing, where each configuration performs a few parallel dot-products in a systolic array fashion. This architecture is parameterized with support for different data movements, reuse, and precisions, utilizing a columnar arrangement that is compatible with existing FPGA architectures. On synthetic benchmarks, we demonstrate that for 8-bit arithmetic, MLBlocks offer 6× improved performance over the commercial Xilinx DSP48E2 architecture with smaller area and delay; and for time-multiplexed 16-bit arithmetic, they achieve 2× higher performance per area with the same area and frequency. All source code and data, along with documents to reproduce all the results in this article, are available at .
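
As a rough illustration of the "nested loops over MAC operations" formulation mentioned in the abstract above, a matrix product can be written as three nested loops around a single multiply-accumulate step. This is only a minimal software sketch; the function names `mac` and `matmul_mac` are hypothetical and are not taken from the article or its released code.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: returns acc + a * b."""
    return acc + a * b


def matmul_mac(A, B):
    """Matrix product expressed as three nested loops over MAC operations.

    A is a list of rows (rows x inner), B is a list of rows (inner x cols).
    Each output element is one dot product, i.e. a chain of MAC steps.
    """
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):          # loop over output rows
        for j in range(cols):      # loop over output columns
            acc = 0
            for k in range(inner): # reduction loop: one MAC per iteration
                acc = mac(acc, A[i][k], B[k][j])
            C[i][j] = acc
    return C


if __name__ == "__main__":
    print(matmul_mac([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Hardware blocks of the kind the abstract describes would execute several of these inner dot products in parallel; the sketch shows only the loop structure, not any particular mapping.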

Particuology ◽  
2022 ◽  
Vol 68 ◽  
pp. 44-56
Zhaoyang Li ◽  
Kaiwei Chu ◽  
Zongqing Zhou ◽  
Yixiong Feng ◽  
Aimin Wang

2022 ◽  
Vol 13 (2) ◽  
pp. 1-25
Guangliang Gao ◽  
Zhifeng Bao ◽  
Jie Cao ◽  
A. K. Qin ◽  
Timos Sellis

Accurate house price prediction is of great significance to various real estate stakeholders such as house owners, buyers, and investors. We propose a location-centered prediction framework that differs from existing work in terms of data profiling and prediction model. Regarding data profiling, we make an important observation: besides in-house features such as floor area, location plays a critical role in house price prediction. Unfortunately, existing work either overlooks it or uses a coarse-grained measurement of locations. We therefore define and capture a fine-grained location profile powered by a diverse range of location data sources, including a transportation profile, an education profile, a suburb profile based on census data, and a facility profile. Regarding the choice of prediction model, we observe that existing approaches either consider the entire data for modeling, or split the entire house data and model each partition independently. However, such modeling ignores the relatedness among partitions, and for the latter approach there may not be sufficient training samples per partition in every prediction scenario. We address this problem through a careful study of how to exploit the Multi-Task Learning (MTL) model. Specifically, we map the strategies for splitting the entire house data to the ways the tasks are defined in MTL, and select specific MTL-based methods with different regularization terms to capture and exploit the relatedness among tasks. Based on real-world house transaction data collected in Melbourne, Australia, we design extensive experimental evaluations, and the results indicate a significant superiority of MTL-based methods over state-of-the-art approaches. Meanwhile, we conduct an in-depth analysis of the impact of task definitions and method selections in MTL on the prediction performance, and demonstrate that the impact of task definitions on prediction performance far exceeds that of method selections.
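
To make the idea of a regularization term that couples tasks more concrete, the sketch below fits one slope per partition while penalizing each slope's distance from the mean slope. This mean-regularized form is a generic textbook variant chosen for illustration, not necessarily one of the methods evaluated in the article; the function name `fit_mtl_slopes`, the toy data, and the hyperparameters are all assumptions.

```python
def fit_mtl_slopes(tasks, lam=1.0, lr=0.01, steps=5000):
    """Fit one slope per task, pulling every slope toward the mean slope.

    tasks: list of (xs, ys) pairs, one pair per data partition (task).
    Objective: sum_t mean((w_t * x - y)^2) + lam * sum_t (w_t - w_bar)^2,
    minimized by plain gradient descent.
    """
    w = [0.0] * len(tasks)
    for _ in range(steps):
        w_bar = sum(w) / len(w)
        new_w = []
        for w_t, (xs, ys) in zip(w, tasks):
            # Gradient of this task's own mean squared error.
            g_fit = sum(2 * x * (w_t * x - y) for x, y in zip(xs, ys)) / len(xs)
            # Gradient of the coupling penalty; the deviations from the
            # mean sum to zero, so only the task's own term remains.
            g_reg = 2 * lam * (w_t - w_bar)
            new_w.append(w_t - lr * (g_fit + g_reg))
        w = new_w
    return w


if __name__ == "__main__":
    # Hypothetical toy data: two partitions with true slopes 2 and 4.
    tasks = [([1, 2, 3], [2, 4, 6]), ([1, 2], [4, 8])]
    print(fit_mtl_slopes(tasks, lam=0.0))  # independent fits
    print(fit_mtl_slopes(tasks, lam=5.0))  # slopes pulled toward each other
```

With `lam=0.0` each partition is modeled independently; increasing `lam` shares information across partitions, which is the kind of relatedness the abstract says independent per-partition models ignore.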

2022 ◽  
Vol 40 (1) ◽  
pp. 1-23
Jiaxing Shen ◽  
Jiannong Cao ◽  
Oren Lederman ◽  
Shaojie Tang ◽  
Alex “Sandy” Pentland

User profiling refers to inferring people’s attributes of interest (AoIs) like gender and occupation, which enables various applications ranging from personalized services to collective analyses. Massive nonlinguistic audio data brings a novel opportunity for user profiling due to the prevalence of studying spontaneous face-to-face communication. Nonlinguistic audio is coarse-grained audio data without linguistic content. It is collected due to privacy concerns in private situations like doctor-patient dialogues. The opportunity facilitates optimized organizational management and personalized healthcare, especially for chronic diseases. In this article, we are the first to build a user profiling system to infer gender and personality based on nonlinguistic audio. Because linguistic and acoustic features cannot be extracted from such data, we focus on conversational features that could reflect AoIs. First, we develop an adaptive voice activity detection algorithm that addresses individual differences in voice and false-positive voice activities caused by people nearby. Second, we propose a gender-assisted multi-task learning method to combat dynamics in human behavior by integrating gender differences and the correlation of personality traits. In an experimental evaluation of 100 people in 273 meetings, we achieved F1-scores of 0.759 and 0.652 for gender identification and personality recognition, respectively.

Daniel Varela ◽  
José Santos

Protein folding is the dynamic process by which a protein folds into its final native structure. This is different from the traditional problem of predicting the final protein structure, since it requires modeling how protein components interact over time to obtain the final folded structure. In this study we test whether a model of the folding process can be obtained exclusively through machine learning. To this end, protein folding is considered as an emergent process and the cellular automata tool is used to model the folding process. A neural cellular automaton is defined, using a connectionist model that acts as a cellular automaton through the protein chain to define the dynamic folding. Differential evolution is used to automatically obtain the optimized neural cellular automata that provide protein folding. We tested the methods with the Rosetta coarse-grained atomic model of protein representation, using different proteins to analyze the modeling of folding and the structure refinement that the modeling can provide, showing the potential advantages that such methods offer, but also difficulties that arise.

Metals ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 159
Nicholas Olynik ◽  
Bin Cheng ◽  
David J. Sprouster ◽  
Chad M. Parish ◽  
Jason R. Trelewicz

Exploiting grain boundary engineering in the design of alloys for extreme environments provides a promising pathway for enhancing performance relative to coarse-grained counterparts. Due to its attractive properties as a plasma facing material for fusion devices, tungsten presents an opportunity to exploit this approach in addressing the significant materials challenges imposed by the fusion environment. Here, we employ a ternary alloy design approach for stabilizing W against recrystallization and grain growth while simultaneously enhancing its manufacturability through powder metallurgical processing. Mechanical alloying and grain refinement in W-10 at.% Ti-(10,20) at.% Cr alloys are accomplished through high-energy ball milling with transitions in the microstructure mapped as a function of milling time. We demonstrate the multi-modal nature of the resulting nanocrystalline grain structure and its stability up to 1300 °C with the coarser grain size population correlated to transitions in crystallographic texture that result from the preferred slip systems in BCC W. Field-assisted sintering is employed to consolidate the alloy powders into bulk samples, which, due to the deliberately designed compositional features, are shown to retain ultrafine grain structures despite the presence of minor carbides formed during sintering due to carbon impurities in the ball-milled powders.

Electronics ◽  
2022 ◽  
Vol 11 (2) ◽  
pp. 273
Zeyu Li ◽  
Zhao Huang ◽  
Quan Wang ◽  
Junjie Wang

With the rapid reduction of CMOS process size, FPGAs with high-silicon accumulation technology are becoming more sensitive to aging effects. This reduces the reliability and service life of the device. Offline aging-aware layout planning based on balanced stress is an effective solution. However, existing methods take a long time to solve the floorplanning problem, and the corresponding layout solutions occupy many on-chip resources. To this end, we propose an efficient Aging Mitigation and Resource Optimization Floorplanner (AMROFloor) for FPGAs. First, the layout solution is implemented on the Virtual Coarse-Grained Runtime Reconfigurable Architecture, which helps avoid rule constraints for placement and routing. Second, the Maximize Reconfigurable Regions Algorithm (MRRA) is proposed to quickly determine the number and size of reconfigurable regions (RRs), which shortens the solving time while ensuring an effective solution. Furthermore, the Resource Combination Algorithm (RCA) is proposed to optimize the on-chip resources, reducing the on-Chip Resource Utilization (CRU) while achieving the same aging relief effect. Experiments were simulated and implemented on Xilinx FPGA. The results demonstrate that the AMROFloor method designed in this paper can extend the Mean Time to Failure (MTTF) by 13.8% and optimize the resource overhead by 19.2% on average compared to the existing aging-aware layout solutions.
