<p>We use large deviation theory to study persistent extreme events of temperature, such as heat waves and cold spells. We consider the mid-latitudes of a simplified yet Earth-like general circulation model of the atmosphere and numerically estimate large deviation rate functions of near-surface temperature averaged over different spatial scales. We find that representing persistent extreme events with large deviation theory requires temporal averages of spatially averaged observables. The spatial averaging scale is crucial and must correspond to the scale of the event of interest. Accordingly, the computed rate functions indicate that temperature averages over intermediate spatial scales (larger than, but still of the order of, the typical scale) have substantially different statistical properties from those at any other scale. Thus, heat waves (or cold spells) can be interpreted as large deviations of temperature averaged over intermediate spatial scales. Furthermore, we find universal characteristics of the rate functions: the temporal, spatial, and spatio-temporal rate functions become equivalent once they are re-normalised by the integrated auto-correlation.</p>
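<p>To make the idea of numerically estimating a rate function concrete, the following is a minimal sketch and not the procedure used in this work. It assumes a synthetic AR(1) series as a stand-in for a temperature anomaly record, estimates the rate function from a histogram of block averages, and computes an integrated auto-correlation under one common convention (one plus twice the sum of the lagged auto-correlations). All function names, the AR(1) model, and the parameter values are hypothetical choices made for illustration.</p>

```python
import numpy as np


def ar1_series(n_steps, phi=0.8, seed=0):
    """Synthetic AR(1) series; a hypothetical stand-in for model temperature output."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_steps)
    x = np.empty(n_steps)
    x[0] = noise[0]
    for t in range(1, n_steps):
        x[t] = phi * x[t - 1] + noise[t]
    return x


def block_averages(x, block_len):
    """Average the series over consecutive, non-overlapping blocks of length block_len."""
    n_blocks = len(x) // block_len
    return x[: n_blocks * block_len].reshape(n_blocks, block_len).mean(axis=1)


def empirical_rate_function(averages, block_len, bins=30):
    """Estimate I(a) ~ -(1/n) ln p(A_n = a) from a histogram of block averages.

    The estimate is shifted so that its minimum is zero.
    """
    density, edges = np.histogram(averages, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = density > 0  # empty bins carry no estimate
    rate = -np.log(density[keep]) / block_len
    return centers[keep], rate - rate.min()


def integrated_autocorrelation(x, max_lag):
    """Assumed convention: 1 + 2 * sum of auto-correlations at lags 1..max_lag."""
    x = x - x.mean()
    var = np.mean(x * x)
    acf = [np.mean(x[:-k] * x[k:]) / var for k in range(1, max_lag + 1)]
    return 1.0 + 2.0 * float(np.sum(acf))


if __name__ == "__main__":
    series = ar1_series(200_000)
    averages = block_averages(series, block_len=50)
    centers, rate = empirical_rate_function(averages, block_len=50)
    tau = integrated_autocorrelation(series, max_lag=100)
    # One plausible re-normalisation (an assumption): scale the rate by tau.
    rescaled_rate = rate * tau
    print(f"integrated auto-correlation estimate: {tau:.2f}")
    print(f"rate function minimum located near a = {centers[np.argmin(rate)]:.2f}")
```

<p>In this sketch the block length plays the role of the temporal averaging window. A spatial or spatio-temporal analogue would average over grid points as well, which is not shown here.</p>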