Abstract. The Earth's land surface and the atmosphere are strongly interlinked through the exchange of energy and matter (e.g. water and carbon). This coupled behaviour causes various land-atmosphere feedbacks, and an insufficient understanding of these feedbacks contributes to uncertainty in global climate model projections. For example, a crucial role of the land surface in exacerbating summer heatwaves in mid-latitude regions has been identified empirically for high-impact events, but individual climate models differ widely in their representation of land-atmosphere coupling. Here, we combine an ensemble of observations-based and simulated temperature (T) and evapotranspiration (ET) datasets and investigate coincidences of T anomalies with ET anomalies as a proxy for land-atmosphere interactions during periods of anomalously warm temperatures. We demonstrate that a relatively large fraction of state-of-the-art climate models from the Coupled Model Intercomparison Project Phase 5 (CMIP5) archive produces systematically too frequent coincidences of high T anomalies with negative ET anomalies in mid-latitude regions during the warm season and in several tropical regions year-round. Further, we show that these coincidences (high T, low ET), as diagnosed by the land-coupling coincidence metrics, are closely related to the variability and extremes of simulated temperatures across a multi-model ensemble. Thus, our approach offers a physically consistent, diagnostic-based avenue to evaluate multi-model ensembles and subsequently reduce model biases in simulated and predicted extreme temperatures. Following this idea, we derive a land-coupling constraint based on the spread of 54 combinations of T-ET benchmarking datasets and consequently retain only the subset of CMIP5 models that produce a land-coupling behaviour compatible with these observations-based benchmark estimates.
The constrained multi-model projections exhibit lower temperature extremes in regions where models show substantial spread in T-ET coupling, and, in addition, biases in the climate model ensemble are consistently reduced.
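
To make the idea of a coincidence-based constraint concrete, the following is a minimal sketch and not the implementation used in this study. It assumes that anomalies are simple deviations from the series mean, that "high T" means the hottest fraction of time steps (the hypothetical parameter `hot_fraction`), and that a model is retained if its coincidence rate lies within the range spanned by the benchmark estimates. The function names `coincidence_rate` and `constrain` are illustrative only.

```python
from statistics import mean


def anomalies(series):
    """Deviations from the series mean (assumed stand-in for proper anomalies)."""
    m = mean(series)
    return [x - m for x in series]


def coincidence_rate(t, et, hot_fraction=0.1):
    """Fraction of the hottest time steps that coincide with a negative ET anomaly.

    `hot_fraction` is an assumed threshold, not a value taken from the study.
    """
    if len(t) != len(et):
        raise ValueError("T and ET series must have the same length")
    t_anom = anomalies(t)
    et_anom = anomalies(et)
    # Number of time steps treated as "anomalously warm" (at least one).
    n_hot = max(1, round(hot_fraction * len(t_anom)))
    # Indices of the n_hot largest T anomalies.
    hot_idx = sorted(range(len(t_anom)), key=lambda i: t_anom[i], reverse=True)[:n_hot]
    # Count hot time steps with below-average ET.
    return sum(1 for i in hot_idx if et_anom[i] < 0) / n_hot


def constrain(model_rates, benchmark_rates):
    """Keep models whose rate lies within the spread of the benchmark rates."""
    lo, hi = min(benchmark_rates), max(benchmark_rates)
    return [name for name, rate in model_rates.items() if lo <= rate <= hi]


if __name__ == "__main__":
    # Synthetic toy series, for illustration only.
    t = [20, 21, 22, 23, 30, 31, 19, 20, 21, 22]
    et = [3, 3, 3, 3, 1, 4, 3, 3, 3, 3]
    print(coincidence_rate(t, et, hot_fraction=0.2))
    print(constrain({"A": 0.5, "B": 0.9, "C": 0.3}, [0.2, 0.4, 0.55]))
```

In this sketch, the benchmark spread would be obtained by applying `coincidence_rate` to each T-ET benchmark combination; a model whose rate falls outside that range would be excluded from the constrained ensemble.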