Understanding Variance in Pilot Performance Ratings
Two studies investigated how pilots of different rank evaluate flight-deck performance. In each study, pilots assessed sets of three videotaped scenarios featuring pilots in a simulator exhibiting poor, average, and good performance. Study 1, which included 92 airline pilots of differing rank, compared how individuals rate performance. Participants used a standardized assessment form comprising six criteria, each rated on a 5-point scale. The analysis revealed considerable variance in performance ratings among flight examiners, captains, and first officers. Study 2 was designed to better understand this variance: eighteen pilots (six flight examiners, six captains, and six first officers), working in pairs, evaluated the performances using a modified think-aloud protocol. The results showed that there were good reasons for the observed variance. The findings are discussed in relation to inter-rater reliability.