simulation speed
Recently Published Documents


TOTAL DOCUMENTS

124
(FIVE YEARS 43)

H-INDEX

9
(FIVE YEARS 2)

2022 ◽  
Vol 32 (1) ◽  
pp. 1-21
Author(s):  
Jan Moritz Joseph ◽  
Lennart Bamberg ◽  
Imad Hajjar ◽  
Behnam Razi Perjikolaei ◽  
Alberto García-Ortiz ◽  
...  

We introduce Ratatoskr, an open-source framework for in-depth power, performance, and area (PPA) analysis in Networks-on-Chips (NoCs) for 3D-integrated and heterogeneous System-on-Chips (SoCs). It covers all layers of abstraction by providing an NoC hardware implementation at the Register Transfer Level (RTL), a cycle-accurate NoC simulator, and a transaction-level application model. Through this comprehensive approach, Ratatoskr can provide the following specific PPA analyses: Dynamic power of links can be measured within 2.4% accuracy of bit-level simulations while maintaining cycle-accurate simulation speed. Router power is determined from RTL-to-gate-level synthesis combined with cycle-accurate simulations. The performance of the whole NoC can be measured via both cycle-accurate and RTL simulations. The performance (i.e., timing) of individual routers and the NoC area are obtained from RTL synthesis results. Despite these manifold features, Ratatoskr offers easy two-step user interaction: (1) A single point of entry allows setting design parameters. (2) PPA reports are generated automatically. For both the input and the output, different levels of abstraction can be chosen for high-level rapid network analysis or low-level improvement of architectural details. The synthesizable NoC-RTL model shows improved total router power and area in comparison to a conventional standard router. As a forward-thinking and unique feature not found in other NoC PPA-measurement tools, Ratatoskr supports heterogeneous 3D integration, which is one of the most promising integration paradigms for upcoming SoCs. Thereby, Ratatoskr lays the groundwork for designing their communication architectures. The framework is publicly available at https://github.com/ratatoskr-project.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yongho Kim ◽  
Gilnam Ryu ◽  
Yongho Choi

Simulation speed depends on code structure, so how an algorithm is built is crucial to how fast it runs. We solve the Allen–Cahn equation with an explicit finite difference method, which requires grid calculations that are typically implemented with many for-loops in the simulation code. In terms of programming, these for-loops slow the simulation down. We propose a model architecture that applies a pad and a convolution operation to the Allen–Cahn equation for fast computation while maintaining accuracy. GPU operation is also used to increase the speed further. The simulation of other differential equations can be improved in the same way. In this paper, various numerical simulations are conducted to confirm that the Allen–Cahn equation follows motion by mean curvature and phase separation in two-dimensional and three-dimensional spaces. Finally, we demonstrate that our algorithm is much faster than unoptimized code and than CPU operation.
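The pad-and-convolution idea in this abstract can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the common form u_t = Δu + (u − u³)/ε² of the Allen–Cahn equation, a 5-point Laplacian, zero-flux boundaries imposed by edge padding, and NumPy on the CPU rather than a GPU framework. The function names and parameter values are hypothetical.

```python
import numpy as np

# 5-point Laplacian stencil written as a 3x3 convolution kernel.
LAPLACIAN = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])


def conv3x3_valid(padded, kernel):
    """Correlate a padded 2D array with a 3x3 kernel ('valid' output).

    The loops run over the 9 kernel entries only; all grid points are
    handled at once by whole-array slicing. The kernel used here is
    symmetric, so correlation and convolution coincide.
    """
    n, m = padded.shape[0] - 2, padded.shape[1] - 2
    out = np.zeros((n, m))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + n, j:j + m]
    return out


def allen_cahn_step(u, dt, h, eps):
    """One explicit Euler step using pad + convolution (assumed model form)."""
    padded = np.pad(u, 1, mode="edge")  # zero-flux boundary (assumption)
    lap = conv3x3_valid(padded, LAPLACIAN) / h**2
    return u + dt * (lap + (u - u**3) / eps**2)


def allen_cahn_step_loops(u, dt, h, eps):
    """Reference version with explicit grid loops, for comparison only."""
    n, m = u.shape
    new = np.empty_like(u)
    for i in range(n):
        for j in range(m):
            up = u[max(i - 1, 0), j]
            down = u[min(i + 1, n - 1), j]
            left = u[i, max(j - 1, 0)]
            right = u[i, min(j + 1, m - 1)]
            lap = (up + down + left + right - 4.0 * u[i, j]) / h**2
            new[i, j] = u[i, j] + dt * (lap + (u[i, j] - u[i, j]**3) / eps**2)
    return new
```

Both functions compute the same update; the first replaces the per-gridpoint loops with array operations, which is the kind of restructuring the abstract credits for the speed-up. The time step must still satisfy the usual stability restriction of the explicit scheme.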


2021 ◽  
Vol 11 (22) ◽  
pp. 10885
Author(s):  
Natalia Koteleva ◽  
Valentin Kuznetsov ◽  
Natalia Vasilyeva

Digital technology is being introduced into all areas of human activity. However, there are a number of challenges in implementing these technologies. These include the delayed return on investment, the lack of visibility for decision-makers and, most importantly, the lack of human capacity to develop and implement digital technologies. Therefore, creating a digital training simulator for the industry is a pressing task. This paper focuses on the first step in creating such a simulator: developing a dynamic process model. The process chosen is flotation, as it is one of the most common mineral processing methods. The simulation was performed in AVEVA Dynamic Simulation software. The model is based on a determination of reaction rate constants, for which experiments were conducted on a laboratory pneumomechanical flotation machine with a bottom drive. The resulting model was scaled up to industrial size and its dynamic properties were investigated. In addition, the basic scheme of a computer simulator was considered, and the communication channels between the dynamic model and the systems, equipment and software used for digitalization were tested. The developed model showed acceptable results for its intended purpose: it matches the technological process in time, which makes it possible to account for process inertia; it responds quickly on all tested communication channels; and the solver runs at an acceptable real-time simulation speed.


2021 ◽  
Vol 2111 (1) ◽  
pp. 012054
Author(s):  
M.A. Hamid ◽  
S.A. Rahman ◽  
I.A. Darmawan ◽  
M. Fatkhurrokhman ◽  
M. Nurtanto

This study tests the performance efficiency of virtual laboratory media based on Unity 3D and Blender, used during the COVID-19 pandemic at the Electrical Engineering Vocational Laboratory. The aspects tested are access speed, process speed, and simulation speed at run time. Processor and memory consumption were measured through real-time monitoring using MSI Afterburner. Testing was divided into two stages: time behavior and resource utilization. Time behavior concerns how long the media or software takes to respond when a given function performs an action. Resource utilization is the degree to which the software uses resources when performing a task under certain conditions.
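The two test stages named in this abstract can be illustrated with a small sketch. It is not the study's procedure, which relied on MSI Afterburner to monitor processor and memory consumption of a Unity 3D application. It assumes a Python callable as a stand-in workload and measures only wall-clock response time and peak Python-level memory; `measure` and `simulate_workload` are hypothetical names.

```python
import time
import tracemalloc


def measure(func, *args, repeats=5):
    """Return the best response time and peak traced memory of func(*args)."""
    # Time behavior: wall-clock duration of a single call, best of several runs.
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)

    # Resource utilization: peak memory allocated by Python during one call.
    tracemalloc.start()
    func(*args)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {"response_time_s": min(timings), "peak_memory_bytes": peak_bytes}


def simulate_workload(n):
    """Hypothetical stand-in for one action of the software under test."""
    values = [i * i for i in range(n)]
    return sum(values)


if __name__ == "__main__":
    print(measure(simulate_workload, 100_000))
```

Timing and memory tracing are run in separate calls because tracing allocations slows the code down and would distort the response time.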


2021 ◽  
Vol 20 (5s) ◽  
pp. 1-24
Author(s):  
Gokul Krishnan ◽  
Sumit K. Mandal ◽  
Manvitha Pannala ◽  
Chaitali Chakrabarti ◽  
Jae-Sun Seo ◽  
...  

In-memory computing (IMC) on a monolithic chip for deep learning faces dramatic challenges on area, yield, and on-chip interconnection cost due to the ever-increasing model sizes. 2.5D integration or chiplet-based architectures interconnect multiple small chips (i.e., chiplets) to form a large computing system, presenting a feasible solution beyond a monolithic IMC architecture to accelerate large deep learning models. This paper presents a new benchmarking simulator, SIAM, to evaluate the performance of chiplet-based IMC architectures and explore the potential of such a paradigm shift in IMC architecture design. SIAM integrates device, circuit, architecture, network-on-chip (NoC), network-on-package (NoP), and DRAM access models to realize an end-to-end system. SIAM is scalable in its support of a wide range of deep neural networks (DNNs), customizable to various network structures and configurations, and capable of efficient design space exploration. We demonstrate the flexibility, scalability, and simulation speed of SIAM by benchmarking different state-of-the-art DNNs with CIFAR-10, CIFAR-100, and ImageNet datasets. We further calibrate the simulation results with a published silicon result, SIMBA. The chiplet-based IMC architecture obtained through SIAM shows 130× and 72× improvement in energy-efficiency for ResNet-50 on the ImageNet dataset compared to Nvidia V100 and T4 GPUs, respectively.


2021 ◽  
Author(s):  
Jiamin Jiang

It is very challenging to simulate unconventional reservoirs efficiently and accurately. Transient flow can last for a long time, and sharp gradients in the solution (pressure, saturation, compositions) are induced by the severe permeability contrast between fracture and matrix. Although high-resolution models for wells and fractures are required to achieve adequate resolution, they are computationally too demanding for practical field models with many hydraulic fracture stages. This paper develops localization strategies that take advantage of locality at the timestep and Newton iteration levels. The strategies readily accommodate complicated flow mechanisms and multiscale fracture networks in unconventional reservoirs. A large simulation speed-up can be obtained by performing localized computations only for the solution regions that will change. We develop an a priori method to exploit the locality, based on the diffusive character of the Newton updates of pressure. The method estimates the set of active computational gridblocks for the next iterate. The active set marks the gridblocks that need to be solved, and the solution of the corresponding local linear system is then computed. A fully implicit scheme is used for time discretization. We study several challenging multiphase and compositional model cases with explicit fractures. The test results demonstrate that significant solution locality of the variables exists at the timestep and iteration levels. A nonlinear solution update is usually sparse, and the nonlinear convergence is restricted by a limited fraction of the simulation model. Through aggressive localization, the proposed methods avoid overly conservative estimates and thus achieve significant computational speed-up. In comparison to a standard Newton method, the novel solver techniques achieve greatly improved solving efficiency. Furthermore, the Newton convergence exhibits no degradation, and there is no impact on the solution accuracy. Previous works in the literature largely relate to the meshing aspect that accommodates horizontal wells and hydraulic fractures. We instead develop new nonlinear strategies to perform localization. In particular, the adaptive DD method produces proper domain partitions according to the fluid flow and nonlinear updates. This results in an effective strategy that maintains solution accuracy and convergence behavior.


2021 ◽  
Vol 9 ◽  
Author(s):  
Haoshu Shao ◽  
Xu Cai ◽  
Heming Yan ◽  
Jiapei Zhou ◽  
Yao Qin ◽  
...  

With the continuing increase in the power scale of offshore wind farms, there is an urgent need for a simplified wind farm model that aggregates the entire wind farm into one or several aggregated wind turbine generators (WTGs), in order to save computing resources and improve simulation speed. A novel aggregation algorithm that considers the power loss of the offshore submarine cable is proposed; it differs from the traditional wind farm modeling method, which adopts an amplifying transformer as the aggregation medium. Moreover, a multi-machine aggregation (MMA) algorithm is further proposed to improve the aggregation accuracy. Simulation results verify that the proposed aggregation method reproduces the dynamic characteristics of the wind farm with high accuracy and can be extended to other types of wind farm.


Molecules ◽  
2021 ◽  
Vol 26 (19) ◽  
pp. 5839
Author(s):  
Alexander Zlobin ◽  
Igor Diankin ◽  
Sergey Pushkarev ◽  
Andrey Golovin

Organophosphate hydrolases are promising as potential biotherapeutic agents to treat poisoning with pesticides or nerve gases. However, these enzymes often need to be further engineered in order to become useful in practice. One example of such enhancement is the alteration of enantioselectivity of diisopropyl fluorophosphatase (DFPase). Molecular modeling techniques offer a unique opportunity to address this task rationally by providing a physical description of the substrate-binding process. However, DFPase is a metalloenzyme, and correct modeling of metal cations is a challenging task generally coming with a tradeoff between simulation speed and accuracy. Here, we probe several molecular mechanical parameter combinations for their ability to empower long simulations needed to achieve a quantitative description of substrate binding. We demonstrate that a combination of the Amber19sb force field with the recently developed 12-6 Ca2+ models allows us to both correctly model DFPase and obtain new insights into the DFP binding process.


2021 ◽  
pp. 193229682110322
Author(s):  
Jana Schmitzer ◽  
Carolin Strobel ◽  
Ronald Blechschmidt ◽  
Adrian Tappe ◽  
Heiko Peuscher

Background: Numerical simulations, also referred to as in silico trials, are nowadays the first step toward approval of new artificial pancreas (AP) systems. One suitable tool to run such simulations is the UVA/Padova Type 1 Diabetes Metabolic Simulator (T1DMS). It was used by Toffanin et al. to provide data about safety and efficacy of AndroidAPS, one of the most wide-spread do-it-yourself AP systems. However, the setup suffered from slow simulation speed. The objective of this work is to speed up simulation by implementing the algorithm directly in MATLAB®/Simulink®. Method: Firstly, AndroidAPS is re-implemented in MATLAB® and verified. Then, the function is incorporated into T1DMS. To evaluate the new setup, a scenario covering 2 days in real time is run for 30 virtual patients. The results are compared to those presented in the literature. Results: Unit tests and integration tests proved the equivalence of the new implementation and the original AndroidAPS code. Simulation of the scenario required approximately 15 minutes, corresponding to a speed-up factor of roughly 1000 with respect to real time. The results closely resemble those presented by Toffanin et al. Discrepancies were to be expected because a different virtual population was considered. Also, some parameters could not be extracted from and harmonized with the original setup. Conclusions: The new implementation facilitates extensive in silico trials of AndroidAPS due to the significant reduction of runtime. This provides a cheap and fast means to test new versions of the algorithm before they are shared with the community.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Stephan Fischer ◽  
Marc Dinh ◽  
Vincent Henry ◽  
Philippe Robert ◽  
Anne Goelzer ◽  
...  

Detailed whole-cell modeling requires integrating heterogeneous cell processes that have different modeling formalisms, while keeping whole-cell simulation tractable. Here, we introduce BiPSim, an open-source stochastic simulator of template-based polymerization processes, such as replication, transcription and translation. BiPSim combines an efficient abstract representation of reactions with an implementation of Gillespie's Stochastic Simulation Algorithm (SSA) that is constant-time with respect to the number of reactions, which makes it highly efficient for simulating large-scale polymerization processes stochastically. Moreover, multi-level descriptions of polymerization processes can be handled simultaneously, allowing the user to tune a trade-off between simulation speed and model granularity. We evaluated the performance of BiPSim by simulating genome-wide gene expression in bacteria for multiple levels of granularity. Finally, since no cell-type-specific information is hard-coded in the simulator, models can easily be adapted to other organisms. We expect that BiPSim will open new perspectives for the genome-wide simulation of stochastic phenomena in biology.
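For readers unfamiliar with the SSA, the sketch below shows Gillespie's direct method on a toy template-based polymerization. It is an illustration under stated assumptions, not BiPSim's implementation or API: the model (one template, constant initiation rate, no collisions between polymerases), the function name `simulate_polymerization`, and all parameter values are hypothetical, and it makes no attempt at BiPSim's constant-time reaction selection beyond grouping identical elongation reactions.

```python
import random


def simulate_polymerization(template_length, k_init, k_elong, t_end, seed=0):
    """Gillespie direct method for a toy polymerization model (k_init > 0).

    Reactions:
      initiation  - a new polymerase binds at position 0 (rate k_init)
      elongation  - one bound polymerase advances by one unit
                    (rate k_elong per bound polymerase)
    A polymerase that reaches template_length releases a finished product.
    """
    rng = random.Random(seed)
    t = 0.0
    active = []      # position of each polymerase currently on the template
    completed = 0    # number of finished polymers
    events = 0       # number of reaction events fired

    while True:
        a_init = k_init
        a_elong = k_elong * len(active)
        a_total = a_init + a_elong

        # Waiting time to the next event is exponential with rate a_total.
        dt = rng.expovariate(a_total)
        if t + dt > t_end:
            break
        t += dt
        events += 1

        # Choose the reaction class with probability proportional to its rate.
        if not active or rng.random() * a_total < a_init:
            active.append(0)
        else:
            # All elongation reactions share one rate, so pick one uniformly.
            idx = rng.randrange(len(active))
            active[idx] += 1
            if active[idx] == template_length:
                active[idx] = active[-1]  # swap-remove the finished polymerase
                active.pop()
                completed += 1

    return {"time": t, "active": active, "completed": completed, "events": events}


if __name__ == "__main__":
    print(simulate_polymerization(20, 0.5, 10.0, 50.0, seed=1))
```

Grouping all elongation steps into a single aggregate propensity keeps the cost of each event independent of how many polymerases are bound, which hints at why an abstract representation of reactions matters for speed.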

