From Standards to Rubrics: Comparing Full-Range to At-Level Applications of an Item-Level Scoring Rubric on an Oral Proficiency Assessment

Author(s): Troy L. Cox, Randall S. Davies

2021, pp. 026765832098805
Author(s): Kazuya Saito, Yuwei Liu

There is emerging evidence that collocation use plays a primary role in determining various dimensions of L2 oral proficiency, both in assessment and in development. The current study presents the results of three experiments that examined the relationship between the degree of association in collocation use (operationalized as t-scores and mutual information scores) and intuitive judgements of L2 comprehensibility (i.e. ease of understanding). The topic was approached from the angles of different task conditions (Study 1), rater background (first language, L1, vs. second language, L2) (Study 2), and cross-sectional vs. longitudinal analyses (Study 3). The findings showed that: (1) collocation emerged as a medium-to-strong determinant of L2 comprehensibility in structured (picture description) tasks, compared to free (oral interview) production tasks; (2) with sufficient immersion experience, L2 raters can demonstrate as much sensitivity to collocation as L1 raters; and (3) conversational experience is associated with more coherent and mutually exclusive combinations of words in L2 speech, resulting in greater L2 comprehensibility development.
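For readers unfamiliar with these association measures: the t-score and mutual information (MI) are standard corpus statistics that compare a word pair's observed co-occurrence frequency with the frequency expected by chance. The minimal Python sketch below computes both from raw counts; the corpus figures for the pair "strong tea" are invented for illustration.

```python
import math

def collocation_scores(f_node, f_coll, f_pair, n_tokens):
    """Compute t-score and mutual information (MI) for a word pair.

    f_node   -- corpus frequency of the node word (e.g. 'strong')
    f_coll   -- corpus frequency of the collocate (e.g. 'tea')
    f_pair   -- observed co-occurrence frequency of the pair
    n_tokens -- total number of tokens in the corpus
    """
    # Co-occurrences expected if the two words were independent
    expected = (f_node * f_coll) / n_tokens
    # t-score: difference between observed and expected, scaled by sqrt(observed);
    # favours frequent collocations
    t_score = (f_pair - expected) / math.sqrt(f_pair)
    # MI: log2 ratio of observed to expected; favours exclusive pairings
    mi = math.log2(f_pair / expected)
    return t_score, mi

# Hypothetical counts in a 1-million-token corpus
t, mi = collocation_scores(f_node=2_000, f_coll=500, f_pair=40, n_tokens=1_000_000)
print(f"t-score = {t:.2f}, MI = {mi:.2f}")
```

The two measures are complementary: the t-score rewards high-frequency collocations, while MI rewards exclusive pairings that may nonetheless be rare, which is why collocation studies commonly report both.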


2019, Vol. 36(4), pp. 481-504
Author(s): Okim Kang, Don Rubin, Alyssa Kermad

Because judgments of non-native speech are closely tied to social biases, oral proficiency ratings are susceptible to error stemming from rater background and social attitudes. In the present study we seek first to estimate the variance attributable to rater background and attitudinal variables in novice raters' assessments of L2 spoken English. Second, we examine the effects of minimal training in reducing the potency of those trait-irrelevant rater factors. Accordingly, we examined the relative impact of rater differences on TOEFL iBT® speaking scores. Eighty-two untrained raters judged 112 speech samples produced by TOEFL® examinees. Findings revealed that approximately 20% of untrained raters' score variance was attributable to their background and attitudinal factors, with the raters' own native-speaker status as the strongest predictor. However, minimal online training dramatically reduced the impact of rater background and attitudinal variables for a subsample of high- and low-severity raters. These findings suggest that brief, user-friendly rater-training sessions can mitigate rater bias, at least in the short run; the procedure can be adopted in assessment and other related fields of applied linguistics.
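The abstract reports the share of score variance traced to rater background and attitudes but does not spell out the model used. One common way to estimate such a share is the R² from a regression of rater scores on background variables. The sketch below is a minimal illustration with simulated data; the variable names (native_speaker, teaching_experience, accent_attitude) and effect sizes are hypothetical, not the study's.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Hypothetical data: one row per rater, columns are background/attitude variables
n_raters = 82
native_speaker = rng.integers(0, 2, n_raters)        # 1 = native speaker of English
teaching_experience = rng.normal(5, 2, n_raters)     # years of teaching
accent_attitude = rng.normal(0, 1, n_raters)         # standardized attitude scale

# Simulated mean speaking score assigned by each rater (invented effect sizes)
scores = (3.0 + 0.4 * native_speaker - 0.05 * teaching_experience
          + 0.2 * accent_attitude + rng.normal(0, 0.5, n_raters))

X = np.column_stack([native_speaker, teaching_experience, accent_attitude])
model = LinearRegression().fit(X, scores)

# R^2 = share of rater score variance explained by background/attitude variables
print(f"Variance explained: {model.score(X, scores):.1%}")
```

With real data, each row might instead hold a rater's severity estimate (e.g., from a many-facet Rasch model), and the R² would quantify how much rater-to-rater variation the trait-irrelevant variables account for.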

