Study of Existing Metrics Used in Measurement of Ideation Effectiveness
In recent years, many new idea generation methods have been developed to produce innovative concepts. The effectiveness of those methods is evaluated by applying a set of metrics to the resulting concepts. Several metrics have been proposed for this purpose, including quality, novelty, and variety metrics, but their inter-rater reliability has not been investigated extensively. In this paper, the inter-rater reliability of three existing metrics is analyzed by applying them to the results of a representative idea generation study. Two factors are investigated for their effect on inter-rater agreement: whether concepts are analyzed at the overall concept level or at the feature level, and whether alternative scales are used for specific metrics. In general, the inter-rater reliability of the metrics is found to be relatively low, with the most reliable results obtained at the feature level. The use of different scales also affects inter-rater reliability, but the effect is less significant. In addition to their low levels of repeatability, the metrics differ in how novelty is appraised.
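For readers unfamiliar with inter-rater reliability, the sketch below shows one common way to quantify agreement between two raters who score the same set of concepts. The abstract does not name the statistic used in the study, so Cohen's kappa is an assumption chosen purely for illustration, and the function name `cohens_kappa` and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters assigning categorical scores.

    Illustrative only: the statistic actually used in the study is not
    stated in the abstract, so this choice is an assumption.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")

    n = len(ratings_a)

    # Observed agreement: fraction of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: probability that both raters pick the same category
    # if each rated independently according to their own score frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is
        # undefined here, and full agreement is reported by convention.
        return 1.0

    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical scores (1 = low, 3 = high) for eight concepts.
    rater_1 = [1, 2, 3, 3, 2, 1, 3, 2]
    rater_2 = [1, 2, 3, 2, 2, 1, 3, 3]
    print(round(cohens_kappa(rater_1, rater_2), 3))
```

A value near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance. Scoring the same concepts at the concept level and at the feature level, then comparing the resulting values, would be one way to express the kind of comparison the abstract describes.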