Accelerating Correlated Quantum Chemistry Calculations Using Graphical Processing Units and a Mixed Precision Matrix Multiplication Library

2009 ◽  
Vol 6 (1) ◽  
pp. 135-144 ◽  
Author(s):  
Roberto Olivares-Amaya ◽  
Mark A. Watson ◽  
Richard G. Edgar ◽  
Leslie Vogt ◽  
Yihan Shao ◽  
...  

2008 ◽  
Vol 112 (10) ◽  
pp. 2049-2057 ◽  
Author(s):  
Leslie Vogt ◽  
Roberto Olivares-Amaya ◽  
Sean Kermes ◽  
Yihan Shao ◽  
Carlos Amador-Bedolla ◽  
...  


2010 ◽  
Vol 12 (4) ◽  
pp. 40-51 ◽  
Author(s):  
Mark Watson ◽  
Roberto Olivares-Amaya ◽  
Richard G. Edgar ◽  
Alan Aspuru-Guzik


2013 ◽  
Vol 9 (3) ◽  
pp. 396-401 ◽  
Author(s):  
Yohsuke Hagiwara ◽  
Kazuki Ohno ◽  
Masaya Orita ◽  
Ryota Koga ◽  
Toshio Endo ◽  
...  






2021 ◽  
Author(s):  
Ariel Gale ◽  
Eugen Hruska ◽  
Fang Liu

Pressure plays essential roles in chemistry by altering structures and controlling chemical reactions. The extreme-pressure polarizable continuum model (XP-PCM) is an emerging method that gives an efficient quantum mechanical description of small and medium-size molecules at high pressure (on the order of GPa). However, its application to large molecular systems was previously hampered by a CPU computation bottleneck: the Pauli repulsion potential unique to XP-PCM requires the evaluation of a large number of electric field integrals, resulting in significant computational overhead compared to gas-phase or standard-pressure polarizable continuum model calculations. Here, we exploit advances in Graphical Processing Units (GPUs) to accelerate the XP-PCM integral evaluations. This enables high-pressure quantum chemistry simulation of proteins that used to be computationally intractable. We benchmarked the performance using 18 small proteins in aqueous solutions. Using a single GPU, our method evaluates the XP-PCM free energy of a protein with over 500 atoms and 4000 basis functions within half an hour. The time taken by the XP-PCM integral evaluation is typically 1% of the time taken for a gas-phase density functional theory (DFT) calculation on the same system. The overall XP-PCM calculations require less computational effort than their gas-phase counterparts due to the improved convergence of self-consistent field iterations. Therefore, the description of high-pressure effects with our GPU-accelerated XP-PCM is feasible for any molecule tractable for gas-phase DFT calculation. We have also validated the accuracy of our method on small molecules whose properties under high pressure are known from experiments or previous theoretical studies.



2021 ◽  
Vol 47 (2) ◽  
pp. 1-26
Author(s):  
Field G. Van Zee ◽  
Devangi N. Parikh ◽  
Robert A. Van De Geijn

We approach the problem of implementing mixed-datatype support within the general matrix multiplication (gemm) operation of the BLAS-like Library Instantiation Software (BLIS) framework, whereby each matrix operand A, B, and C may be stored as single- or double-precision real or complex values. Another factor of complexity, whereby the matrix product and accumulation are allowed to take place in a precision different from the storage precisions of either A or B, is also discussed. We first break the problem into orthogonal dimensions, considering the mixing of domains separately from mixing precisions. Support for all combinations of matrix operands stored in either the real or complex domain is mapped out by enumerating the cases and describing an implementation approach for each. Supporting all combinations of storage and computation precisions is handled by typecasting the matrices at key stages of the computation (during packing and/or accumulation, as needed). Several optional optimizations are also documented. Performance results gathered on a 56-core Marvell ThunderX2 and a 52-core Intel Xeon Platinum demonstrate that high performance is mostly preserved, with modest slowdowns incurred from unavoidable typecast instructions. The mixed-datatype implementation confirms that combinatorial intractability is avoided, with the framework relying on only two assembly microkernels to implement 128 datatype combinations.
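The idea of typecasting at packing and accumulation time can be illustrated with a small sketch. This is not the framework's code or API: the function names (`to_f32`, `pack`, `gemm_mixed`) are hypothetical, only real-domain values are handled, and single precision is emulated in pure Python by rounding through a 4-byte float.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def pack(matrix, cast):
    """Copy a matrix into a new buffer, typecasting every element on the way."""
    return [[cast(v) for v in row] for row in matrix]


def gemm_mixed(a, b, c, comp_cast, store_cast):
    """Return C + A*B for row-major lists of lists.

    A and B are cast to the computation precision while being packed;
    the update is cast to C's storage precision when it is accumulated.
    """
    a_packed = pack(a, comp_cast)
    b_packed = pack(b, comp_cast)
    m, k, n = len(a), len(b), len(b[0])
    out = [row[:] for row in c]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                # Round after each multiply and add to mimic the computation precision.
                prod = comp_cast(a_packed[i][p] * b_packed[p][j])
                acc = comp_cast(acc + prod)
            out[i][j] = store_cast(c[i][j] + acc)
    return out


# Hypothetical usage: A and B stored in double, C stored in single.
A = [[0.1, 0.2], [0.3, 0.4]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]

# Compute in double, cast only when writing to C.
c_double_comp = gemm_mixed(A, IDENTITY, ZERO, comp_cast=float, store_cast=to_f32)

# Compute in single (cast during packing), store C in double.
c_single_comp = gemm_mixed(A, IDENTITY, ZERO, comp_cast=to_f32, store_cast=float)
```

Because the casts are passed in as parameters, one loop nest serves every precision combination. That is loosely analogous to the abstract's point that a small number of microkernels can cover many datatype combinations, although the real implementation works on packed panels and assembly kernels rather than Python lists.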


