<p>This work presents a
Gaussian process regression (GPR) model, built on a novel graph representation
of chemical molecules, that predicts thermodynamic properties of pure substances
in single, double, and triple phases. A transferable molecular graph
representation is proposed as the input for a marginalized graph kernel, which
is the major component of the covariance function in our GPR models. Radial
basis function kernels of temperature and pressure are also incorporated into
the covariance function when necessary. We predict three
representative types of properties of pure substances:
critical temperature, vapor-liquid equilibrium (VLE) density, and
pressure-temperature density. The data is collected from Knovel
Data Analysis Beta: NIST ThermoDynamics Pure Compounds. The accuracy of the
models is nearly identical to the precision of the experimental measurements.
Moreover, the reliability of our predictions can be quantified on a per-sample
basis using the posterior uncertainty of the GPR model. We compare our model
against Morgan fingerprints and a graph neural network to further demonstrate
the advantage of the proposed method. The
marginalized graph kernel is computed using the GraphDot package, available at <a href="https://github.com/yhtang/GraphDot">https://github.com/yhtang/GraphDot</a>. All code used in this work can be found at <a href="https://github.com/Xiangyan93/Chem-Graph-Kernel-Machine">https://github.com/Xiangyan93/Chem-Graph-Kernel-Machine</a>.</p>
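<p>The covariance structure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and does not call GraphDot: the matrix <code>K_GRAPH</code> is a hand-written stand-in for a marginalized graph kernel between three hypothetical molecules, the density values are invented toy numbers, and the function names, the length scale, and the noise level are all assumptions. The sketch multiplies the molecule-level kernel by a radial basis function kernel of temperature, then computes the GPR posterior mean and the per-sample posterior variance.</p>

```python
import numpy as np

# Toy stand-in for a marginalized graph kernel matrix between three
# hypothetical molecules (symmetric, positive definite). These numbers are
# invented for illustration and are NOT produced by GraphDot.
K_GRAPH = np.array([[1.0, 0.8, 0.3],
                    [0.8, 1.0, 0.5],
                    [0.3, 0.5, 1.0]])


def rbf(a, b, length_scale):
    """Radial basis function kernel between two 1-D arrays of temperatures."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def covariance(mol_a, temp_a, mol_b, temp_b, length_scale=50.0):
    """Product kernel: graph kernel of the molecules times RBF of temperature."""
    return K_GRAPH[np.ix_(mol_a, mol_b)] * rbf(temp_a, temp_b, length_scale)


def gpr_predict(mol_tr, temp_tr, y_tr, mol_te, temp_te, noise=1e-6):
    """Return the GPR posterior mean and variance at the test samples."""
    y_mean = y_tr.mean()  # constant mean, removed before fitting
    K = covariance(mol_tr, temp_tr, mol_tr, temp_tr) + noise * np.eye(len(y_tr))
    K_s = covariance(mol_te, temp_te, mol_tr, temp_tr)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_tr - y_mean))
    mean = K_s @ alpha + y_mean
    v = np.linalg.solve(L, K_s.T)
    prior_var = np.diag(covariance(mol_te, temp_te, mol_te, temp_te))
    var = np.maximum(prior_var - np.sum(v ** 2, axis=0), 0.0)
    return mean, var


# Training samples: (molecule index, temperature in K) -> toy density values.
mol_train = np.array([0, 0, 1, 1, 2])
temp_train = np.array([300.0, 350.0, 300.0, 350.0, 300.0])
y_train = np.array([0.70, 0.65, 0.78, 0.74, 0.85])

# Test samples: molecule 2 at an unseen temperature, and at a training point.
mol_test = np.array([2, 2])
temp_test = np.array([350.0, 300.0])

mean, var = gpr_predict(mol_train, temp_train, y_train, mol_test, temp_test)
print("posterior mean:", mean)
print("posterior variance:", var)
```

<p>In this toy setting the posterior variance is close to zero at the sample that coincides with a training point and larger at the unseen temperature, which is the sense in which the uncertainty can be read per sample. A pressure kernel could be multiplied in the same way.</p>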