Abstract. Bayesian inference of microbial soil respiration models is often based on the
assumptions that the residuals are independent (i.e., no temporal or spatial
correlation), normally distributed (i.e., Gaussian noise), and have
constant variance (i.e., homoscedastic). Because no model is perfect, model
discrepancy is always present, and this study shows that these assumptions
are generally invalid in soil respiration modeling: residuals have
high temporal correlation, a variance that increases with the magnitude
of CO2 efflux, and a non-Gaussian distribution. Relaxing these three
assumptions stepwise results in eight data models. Data models are the basis
for formulating the likelihood functions used in Bayesian inference. This study
presents a systematic and comprehensive investigation of the impacts of data
model selection on Bayesian inference and predictive performance. We use
three mechanistic soil respiration models with different levels of model
fidelity (i.e., model discrepancy) with respect to the number of carbon pools
and the explicit representation of soil moisture controls on carbon
degradation; the models therefore also differ in complexity with
respect to the number of model parameters. The study shows that data models
have substantial impacts on Bayesian inference and predictive performance of
the soil respiration models. Specifically, (i) the
level of complexity of the best model is generally justified by the
cross-validation results for different data models; (ii) not accounting for
heteroscedasticity and autocorrelation does not necessarily result in biased
parameter estimates or predictions, but definitely leads to underestimated
uncertainty; (iii) using a non-Gaussian data model improves the parameter
estimates and the predictive performance; and (iv) accounting for autocorrelation
only, or joint inversion of autocorrelation and heteroscedasticity, can be problematic
and requires special treatment. Although the conclusions of this study are
empirical, the analysis may provide insights
for selecting appropriate data models for soil respiration modeling.
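
As a minimal illustration of how relaxing the three assumptions yields eight data models, the Python sketch below enumerates the combinations and evaluates one possible residual log-likelihood. The linear variance model (sigma0, sigma1), the first-order autoregressive structure (phi), the Laplace density used as a non-Gaussian stand-in, and all function and variable names are assumptions made for illustration; they are not necessarily the formulations used in this study.

```python
import itertools
import numpy as np

# Each assumption is either kept (False) or relaxed (True):
# (autocorrelated, heteroscedastic, non_gaussian) -> 2**3 = 8 data models.
DATA_MODELS = list(itertools.product([False, True], repeat=3))


def log_likelihood(obs, sim, sigma0, sigma1=0.0, phi=0.0, non_gaussian=False):
    """Log-likelihood of the residuals under one illustrative data model.

    sigma0       -- baseline residual standard deviation (> 0)
    sigma1       -- growth of the standard deviation with simulated efflux
                    (0 gives the homoscedastic case)
    phi          -- lag-1 autocorrelation, |phi| < 1 (0 gives independence)
    non_gaussian -- if True, use a unit-variance Laplace density instead of
                    the standard normal (hypothetical stand-in)
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    resid = obs - sim

    # Heteroscedasticity: standard deviation grows linearly with |sim|.
    sigma = sigma0 + sigma1 * np.abs(sim)
    eta = resid / sigma  # standardized residuals

    # Autocorrelation: AR(1) whitening to unit-variance innovations.
    scale = np.sqrt(1.0 - phi**2)
    z = eta.copy()
    z[1:] = (eta[1:] - phi * eta[:-1]) / scale

    if non_gaussian:
        # Laplace density with unit variance (scale b = 1/sqrt(2)).
        # Treating the first residual this way is an approximation.
        log_density = -0.5 * np.log(2.0) - np.sqrt(2.0) * np.abs(z)
    else:
        log_density = -0.5 * np.log(2.0 * np.pi) - 0.5 * z**2

    # Jacobian of the transformation from raw residuals to innovations.
    log_jacobian = -np.sum(np.log(sigma)) - (len(z) - 1) * np.log(scale)
    return float(np.sum(log_density) + log_jacobian)
```

With sigma1 = 0, phi = 0, and non_gaussian = False, the function reduces to the standard independent, homoscedastic Gaussian likelihood; switching the three options on and off covers the eight combinations listed in DATA_MODELS.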