Abstract. This work complements the overview analysis of the modelling systems participating in the third phase of the Air Quality Model Evaluation International Initiative (AQMEII3) by focusing on the performance for hourly surface ozone of two modelling systems, Chimere for Europe and CMAQ for North America. The evaluation strategy outlined over the three phases of the AQMEII activity, aimed at building a diagnostic methodology for model evaluation, is pursued here, and novel diagnostic methods are proposed. In addition to evaluating the base case simulation, in which all model components are configured in their standard mode, the analysis also makes use of sensitivity simulations in which the models have been applied by altering and/or zeroing lateral boundary conditions, emissions of anthropogenic precursors, and ozone dry deposition. To help understand the causes of model deficiencies, the error components (bias, variance, and covariance) of the base case and of the sensitivity runs are analysed in conjunction with time-scale considerations and error modelling using the available error fields of temperature, wind speed, and NOx concentration. The results reveal the effectiveness and diagnostic power of the methods devised (which remains the main scope of this study), allowing the detection of the time scales and the fields to which the two models are most sensitive. The representation of planetary boundary layer (PBL) dynamics is pivotal to both models.
In particular: i) Fluctuations slower than ~1.5 days account for 70–85 % of the total ozone quadratic error; ii) A recurring, systematic error with daily periodicity is detected, responsible for 10–20 % of the total quadratic error; iii) Errors in representing the timing of the daily transition between stability regimes in the PBL are responsible for a covariance error as large as 9 ppb (as much as the standard deviation of the network-average ozone observations in summer in both Europe and North America); iv) The CMAQ ozone error has a weak/negligible dependence on the errors in NO2 and wind speed, while the error in NO2 significantly impacts the ozone error produced by Chimere; v) On a continent-wide monitoring-network average, zeroing out anthropogenic emissions produces an error increase of 45 % (25 %) during summer and of 56 % (null) during winter for Chimere (CMAQ), while zeroing out lateral boundary conditions results in an ozone error increase of 30 % during summer and of 180 % during winter (CMAQ).
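The split of the quadratic error into bias, variance, and covariance components can be illustrated with a minimal sketch. It assumes the standard decomposition MSE = (mean_mod - mean_obs)^2 + (sigma_mod - sigma_obs)^2 + 2 sigma_mod sigma_obs (1 - r), which may differ in detail from the formulation used in the study. The function name `mse_decomposition` and the synthetic hourly series are hypothetical and serve only as an example; they are not AQMEII3 data.

```python
import math
from statistics import fmean, pstdev


def mse_decomposition(mod, obs):
    """Split the mean square error of `mod` against `obs` into
    bias, variance, and covariance terms (population statistics)."""
    if len(mod) != len(obs) or len(mod) < 2:
        raise ValueError("series must have equal length of at least 2")
    m_mean, o_mean = fmean(mod), fmean(obs)
    s_mod, s_obs = pstdev(mod), pstdev(obs)
    # Population covariance between modelled and observed values.
    cov = fmean([(m - m_mean) * (o - o_mean) for m, o in zip(mod, obs)])
    return {
        "bias2": (m_mean - o_mean) ** 2,      # squared mean bias
        "variance": (s_mod - s_obs) ** 2,     # amplitude mismatch
        # Equals 2*s_mod*s_obs*(1 - r); written without dividing by the
        # standard deviations so that constant series do not fail.
        "covariance": 2.0 * (s_mod * s_obs - cov),
        "mse": fmean([(m - o) ** 2 for m, o in zip(mod, obs)]),
    }


# Hypothetical example: two days of hourly values with a daily cycle.
# The "model" is offset by 5 units, has a damped amplitude, and lags
# the "observations" by 2 hours (a timing error).
hours = range(48)
obs = [40.0 + 10.0 * math.sin(2.0 * math.pi * h / 24.0) for h in hours]
mod = [45.0 + 8.0 * math.sin(2.0 * math.pi * (h - 2) / 24.0) for h in hours]

dec = mse_decomposition(mod, obs)
for name, value in dec.items():
    print(f"{name}: {value:.3f}")
```

In this sketch the offset feeds the bias term, the damped amplitude feeds the variance term, and the 2-hour lag feeds the covariance term, which is the component the abstract associates with timing errors in the daily PBL transition.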