Abstract. Single-footprint retrievals of carbon monoxide (CO) from the Atmospheric Infrared Sounder (AIRS) are evaluated using aircraft in situ observations. The aircraft data are from the HIAPER Pole-to-Pole Observations (HIPPO, 2009–2011) campaigns, the first three Atmospheric Tomography Mission (ATom, 2016–2017) campaigns, and the National Oceanic and Atmospheric Administration (NOAA) Global Monitoring Laboratory (GML) Global Greenhouse Gas Reference Network Aircraft Program from 2006–2017. The retrievals are obtained using an optimal estimation approach within the MUlti-SpEctra, MUlti-SpEcies, MUlti-Sensors (MUSES) algorithm. Retrieval biases and estimated errors are evaluated across a range of latitudes from the sub-polar to tropical regions over both ocean and land. AIRS MUSES CO profiles were compared with HIPPO, ATom, and NOAA GML aircraft observations, using coincidence criteria of 9 hours and 50 km, to estimate retrieval biases and standard deviations. Comparisons were made at different pressure levels and for column averages, and were separated by latitude, day and night, and land and ocean. We find mean biases of +6.6 % ± 4.6 %, +0.6 % ± 3.2 %, −6.1 % ± 3.0 %, and +1.4 % ± 3.6 % for 750 hPa, 510 hPa, 287 hPa, and the column averages, respectively. The mean standard deviations are 15 %, 11 %, 12 %, and 9 % for the same pressure levels and column averages, respectively. Observation errors (theoretical errors) from the retrievals were found to be broadly consistent in magnitude with those estimated empirically from ensembles of satellite-aircraft comparisons. The GML Aircraft Program comparisons generally had higher standard deviations and biases than the HIPPO and ATom comparisons. Because the GML aircraft flights do not reach altitudes as high as the HIPPO and ATom flights, results from the GML comparisons are more sensitive to the method chosen to extrapolate the aircraft profile above the uppermost measurement altitude.
The AIRS retrieval performance shows little sensitivity to surface type (land or ocean) or to time of day (day or night), but some sensitivity to latitude. Comparisons with the NOAA GML data set spanning 2006–2017 show that the AIRS retrievals capture the distinct seasonal cycles but have a high bias of ~20 % in the lower troposphere during the summer, when observed CO mixing ratios are at their annual minimum. The drift in retrieval bias was also examined and found to be small, at less than 0.5 % over the 2006–2017 period.
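
As an illustration of the coincidence matching and bias statistics described above, the following is a minimal Python sketch, not the procedure used in this study. It assumes each observation is a dictionary with hypothetical keys `time`, `lat`, and `lon`, that distance is taken as great-circle distance on a spherical Earth, and that bias is expressed as a percent difference relative to the aircraft value. Any vertical smoothing or extrapolation of the aircraft profile that a full comparison may require is omitted.

```python
import math
from datetime import datetime
from statistics import mean, stdev

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with mean radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_coincident(sat, air, max_hours=9.0, max_km=50.0):
    """True if a satellite and an aircraft observation fall within the
    time and distance windows (defaults: 9 hours and 50 km)."""
    dt_hours = abs((sat["time"] - air["time"]).total_seconds()) / 3600.0
    dist_km = haversine_km(sat["lat"], sat["lon"], air["lat"], air["lon"])
    return dt_hours <= max_hours and dist_km <= max_km


def bias_and_std(pairs):
    """Mean and sample standard deviation of percent differences.

    `pairs` is a list of (satellite_co, aircraft_co) values at one
    pressure level; the percent difference is taken relative to the
    aircraft value (an assumption of this sketch).
    """
    diffs = [100.0 * (s - a) / a for s, a in pairs]
    return mean(diffs), stdev(diffs)


# Hypothetical example values, for illustration only.
sat_obs = {"time": datetime(2016, 8, 1, 13, 30), "lat": 40.0, "lon": -105.0}
air_obs = {"time": datetime(2016, 8, 1, 18, 0), "lat": 40.1, "lon": -105.0}
matched = is_coincident(sat_obs, air_obs)
bias, spread = bias_and_std([(110.0, 100.0), (90.0, 100.0)])
```

Here `matched` is true because the two hypothetical observations are 4.5 hours and roughly 11 km apart. In practice the statistics would be computed separately for each pressure level and for the column average.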