consensual assessment technique
Recently Published Documents


TOTAL DOCUMENTS

46
(FIVE YEARS 13)

H-INDEX

13
(FIVE YEARS 1)

2021 ◽  
pp. 1-14
Author(s):  
Kristen Edwards ◽  
Aoran Peng ◽  
Scarlett Miller ◽  
Faez Ahmed

Abstract A picture is worth a thousand words, and in design metric estimation, a word may be worth a thousand features. Pictures are awarded this worth because they encode a plethora of information. When evaluating designs, we aim to capture a range of information, including usefulness, uniqueness, and novelty of a design. The subjective nature of these concepts makes their evaluation difficult. Still, many attempts have been made and metrics developed to do so, because design evaluation is integral to the creation of novel solutions. The most common metrics used are the consensual assessment technique (CAT) and the Shah, Vargas-Hernandez, and Smith (SVS) method. While CAT is accurate and often regarded as the “gold standard,” it relies on using expert ratings, making CAT expensive and time-consuming. Comparatively, SVS is less resource-demanding, but often criticized as lacking sensitivity and accuracy. We utilize the complementary strengths of both methods through machine learning. This study investigates the potential of machine learning to predict expert creativity assessments from non-expert survey results. The SVS method results in a text-rich dataset about a design. We utilize these textual design representations and the deep semantic relationships that natural language encodes to predict more desirable design metrics, including CAT metrics. We demonstrate the ability of machine learning models to predict design metrics from the design itself and SVS survey information. We show that incorporating natural language processing improves prediction results across design metrics, and that clear distinctions in the predictability of certain metrics exist.


2021 ◽  
Author(s):  
Selina Weiss ◽  
Oliver Wilhelm ◽  
Patrick Kyllonen

The assessment of creativity presents major challenges. The many competing and complementary ideas on measuring creativity have resulted in a wide diversity of measures, making it difficult for potential users to decide on their appropriateness. Prior research has proposed creativity assessment taxonomies, but we argue that these have shortcomings because they (a) often were not designed to assess the essential assessment features and (b) are insufficiently specified for reliably categorizing extant measures. Based on prior categorization approaches, we propose a new framework for categorizing creativity measures including the following attributes: (a) measurement approach (self-report, other-report, ability tests), (b) construct (e.g., creative interests and attitudes, creative achievements, divergent thinking), (c) data type generated (e.g., questionnaire data vs. accomplishment counts), (d) prototypical scoring method (e.g., consensual assessment technique; CAT), and (e) psychometric problems. We identified 228 creativity measures appearing in the literature since 1900 and had two independent raters classify each measure according to its task attributes (rater agreement Cohen’s kappa .83 to 1.00 for construct). We provide a summary of convergent validity evidence and psychometric shortcomings. We conclude with recommendations for using the taxonomy and some psychometric desiderata for future research.
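
Where rater agreement is reported as Cohen's kappa, a minimal sketch of how such a chance-corrected agreement figure can be computed for two independent raters' categorical codings may be helpful. The category labels and codings below are hypothetical illustrations, not the authors' data or taxonomy entries.

```python
# Minimal sketch: Cohen's kappa for two raters' categorical codings.
# The construct labels and codings below are hypothetical, for illustration only.
from sklearn.metrics import cohen_kappa_score

# Construct category assigned to each of ten measures by two independent raters
rater_a = ["divergent thinking", "creative achievement", "creative interests",
           "divergent thinking", "creative achievement", "divergent thinking",
           "creative interests", "creative achievement", "divergent thinking",
           "creative interests"]
rater_b = ["divergent thinking", "creative achievement", "creative interests",
           "divergent thinking", "creative interests", "divergent thinking",
           "creative interests", "creative achievement", "divergent thinking",
           "creative interests"]

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")  # chance-corrected agreement between the raters
```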


2021 ◽  
Author(s):  
Kristen M. Edwards ◽  
Aoran Peng ◽  
Scarlett R. Miller ◽  
Faez Ahmed

Abstract A picture is worth a thousand words, and in design metric estimation, a word may be worth a thousand features. Pictures are awarded this worth because of their ability to encode a plethora of information. When evaluating designs, we aim to capture a range of information as well, including usefulness, uniqueness, and novelty of a design. The subjective nature of these concepts makes their evaluation difficult. Despite this, many attempts have been made and metrics developed to do so, because design evaluation is integral to innovation and the creation of novel solutions. The most common metrics used are the consensual assessment technique (CAT) and the Shah, Vargas-Hernandez, and Smith (SVS) method. While CAT is accurate and often regarded as the “gold standard,” it relies heavily on expert ratings as a basis for judgement, making CAT expensive and time-consuming. Comparatively, SVS is less resource-demanding, but it is often criticized as lacking sensitivity and accuracy. We aim to take advantage of the distinct strengths of both methods through machine learning. More specifically, this study investigates the possibility of using machine learning to facilitate automated creativity assessment. The SVS method results in a text-rich dataset about a design. In this paper, we utilize these textual design representations and the deep semantic relationships that words and sentences encode to predict more desirable design metrics, including CAT metrics. We demonstrate the ability of machine learning models to predict design metrics from the design itself and SVS survey information. We demonstrate that incorporating natural language processing (NLP) improves prediction results across all of our design metrics, and that clear distinctions in the predictability of certain metrics exist. Our code and additional information about our work are available at http://decode.mit.edu/projects/nlp-design-eval/.
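
The abstract does not specify the models used; the project page linked above has the authors' code. Purely as an illustration of the general idea of predicting expert-style design ratings from free-text survey responses, the following sketch uses TF-IDF features and ridge regression as stand-ins, with hypothetical design descriptions and ratings.

```python
# Minimal sketch: predict an expert-style design rating from free-text survey
# responses. TF-IDF + ridge regression stand in for the NLP models in the paper;
# the texts and ratings below are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

survey_texts = [
    "A collapsible milk frother with a hand crank and foldable whisk head",
    "A battery-powered frother shaped like a standard whisk",
    "A frother that clips onto any mug and runs on a wind-up spring",
    "A conventional electric frother with a rubber grip",
]
expert_ratings = [5.2, 2.8, 6.1, 2.1]  # e.g., CAT-style creativity scores

model = make_pipeline(TfidfVectorizer(), Ridge(alpha=1.0))
model.fit(survey_texts, expert_ratings)

new_text = ["A frother driven by a pull-string, stored flat inside its own cup"]
print(model.predict(new_text))  # predicted rating for an unseen design description
```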


2021 ◽  
Vol 1 ◽  
pp. 263-272
Author(s):  
Yuan Yin ◽  
Ji Han ◽  
Shu Huang ◽  
Haoyu Zuo ◽  
Peter Childs

Abstract This paper asked participants to assess four selected expert-rated Taiwan International Student Design Competition (TISDC) products using four methods: Consensual Assessment Technique (CAT), Creative Product Semantic Scale (CPSS), Product Creativity Measurement Instrument (PCMI), and revised Creative Solution Diagnosis Scale (rCSDS). The results revealed that, between experts and non-experts, the ranking results of the CAT and CPSS were the same, while the ranking results of the rCSDS were different. The CAT, CPSS, and TISDC methods provided the same results, indicating that raters may return the same creativity assessments and that the results are not affected by the selected method. If it is necessary to use non-experts to assess creativity and the results are expected to match those of experts, asking non-expert raters to use the CPSS and then ranking the creativity scores is more reliable. The study offers a contribution to the creativity domain on deciding which methods may be more reliable from a comparison perspective.
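
One way to check the kind of ranking agreement described above is a rank correlation between expert and non-expert mean scores. The sketch below uses hypothetical mean creativity scores for the four products, not the study's data, and Spearman's rho as an assumed comparison statistic.

```python
# Minimal sketch: do expert and non-expert mean creativity scores rank the
# four products in the same order? Scores below are hypothetical.
from scipy.stats import spearmanr

products = ["P1", "P2", "P3", "P4"]
expert_means = [6.3, 4.1, 5.5, 3.2]      # e.g., expert CAT means
nonexpert_means = [6.0, 4.4, 5.1, 3.5]   # e.g., non-expert CPSS means

rho, p_value = spearmanr(expert_means, nonexpert_means)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
# rho = 1.0 means both groups produce exactly the same ranking
```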


2021 ◽  
pp. 030573562098878
Author(s):  
Maud Hickey ◽  
Daniel Healy ◽  
Casey Schmidt

The purpose of this study was to determine how inter-rater reliability scores for iPad improvisations and clarinet improvisations would compare between two different creativity assessment measures: the Consensual Assessment Technique (CAT) and the Test of Ability to Improvise (TAI). In addition, we examined how the overall and subscore ratings for each measure related to each other. Improvisation files were collected from 43 students who had 2 to 3 years’ experience on the clarinet. Two independent panels of judges rated the improvisations using either the CAT or the TAI. Results showed no relationships between the composite or subscores of the two measures. Inter-rater reliability ratings were moderate, and slightly higher for the TAI than the CAT, except for the creativity subscore, where the CAT reliability scores were higher. Further research is needed to understand the more nuanced differences between these two measures, as well as to find a valid and reliable tool for the measurement of creativity and improvisation for school-aged children.
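
CAT inter-rater reliability is often summarized with a coefficient such as Cronbach's alpha computed across the judge panel; the abstract does not state which coefficient was used here, so the following is only a sketch under that assumption, with a hypothetical improvisations-by-judges rating matrix.

```python
# Minimal sketch: Cronbach's alpha across a panel of judges, a common way to
# summarize CAT inter-rater reliability. The rating matrix below is hypothetical:
# rows are improvisations, columns are judges.
import numpy as np

ratings = np.array([
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 6],
    [3, 2, 3],
    [6, 6, 5],
], dtype=float)

k = ratings.shape[1]                         # number of judges
judge_var = ratings.var(axis=0, ddof=1)      # variance of each judge's ratings
total_var = ratings.sum(axis=1).var(ddof=1)  # variance of summed ratings per improvisation
alpha = (k / (k - 1)) * (1 - judge_var.sum() / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")
```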


2020 ◽  
pp. 1-12
Author(s):  
Daniela Zahn ◽  
Ursula Canton ◽  
Victoria Boyd ◽  
Laura Hamilton ◽  
Josianne Mamo ◽  
...  

Author(s):  
Georgios Koronis ◽  
Rianne Wally Meurzec ◽  
Arlindo Silva ◽  
Marco Leite ◽  
Elsa Henriques ◽  
...  

Abstract The purpose of this work is to compare the creative outcome, in the educational context, of students belonging to two different cultures, namely Singaporean and Portuguese, and to determine whether they respond differently to the same design brief. The participants from both samples total 121 student designers, aged 18–25. Students were randomly distributed to ensure a uniform standard of student performance, which allowed for fair comparison between groups. Expert judges were employed to rate the creativity of concept sketches generated during a Collaborative Sketching exercise. To evaluate the creative outcome, we employed the Consensual Assessment Technique with a rubric-based system developed in our earlier work. The analysis of variance procedure revealed no statistically significant difference between the averaged total scores of the two groups on the appropriateness measure. However, the student designers from the two samples showed statistically significant differences on the novelty measure when provided with a baseline brief. Considering the overall creativity scores, relatively equivalent performance was observed across the two universities.
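
A minimal sketch of the kind of analysis-of-variance comparison described above, using a two-group one-way ANOVA on hypothetical averaged total scores for the two cohorts (not the study's data):

```python
# Minimal sketch: one-way ANOVA comparing two cohorts' averaged creativity
# scores, analogous to the comparison described above. Scores are hypothetical.
from scipy.stats import f_oneway

singapore_scores = [4.2, 5.1, 3.8, 4.9, 4.4, 5.0, 3.9, 4.6]
portugal_scores = [4.0, 4.8, 4.1, 5.2, 4.3, 4.7, 3.7, 4.5]

f_stat, p_value = f_oneway(singapore_scores, portugal_scores)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
# p > 0.05 would indicate no statistically significant difference between groups
```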

