Language Testing
Latest Publications

Total documents: 1094 (five years: 104)
H-index: 66 (five years: 3)
Published by: SAGE Publications
ISSN: 0265-5322

2022, pp. 026553222110637
Author(s): Carien Wilsenach, Maxine Schaefer

Multilingualism in education is encouraged in South Africa, and children are expected to become bilingual and biliterate during the early primary grades. Much focus has been placed on measuring literacy in children’s first language, often the medium of instruction (MOI), and in English, the language typically used as MOI from the fourth grade. However, vocabulary development in African contexts is underexplored, owing to the cost of existing standardized English tests and the scarcity of linguistically and contextually appropriate vocabulary assessments in African languages. To address this gap, we document the development of corpus-informed, contextually appropriate tests of productive vocabulary in isiZulu, Siswati, and English, which were used for a project evaluation. The initial validation phase included 412 children. Both tests were reliable and were concurrently validated against reading comprehension tests in each language and oral language skills in English. This study contributes to our understanding of the factors that affect variation in vocabulary knowledge in an African context, including age, grade repetition, and vocabulary in the other language. Only English vocabulary was affected by a school’s remote rural location. We recommend some modifications to the tests before they are validated further in other populations.
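
The concurrent-validation step mentioned above amounts to correlating scores on the new vocabulary test with scores on an established criterion measure taken at the same time. A minimal sketch in Python, using synthetic data (all numbers and variable names here are invented, not the study's):

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)

# Hypothetical scores for n children (synthetic data, for illustration only).
n = 412
vocab_isizulu = rng.normal(50, 10, n)
# Simulate a reading score that shares variance with the vocabulary score.
reading_isizulu = 0.6 * vocab_isizulu + rng.normal(0, 8, n)

# Concurrent validity: correlation between the new test and a
# criterion measure administered at (roughly) the same time.
r, p = pearsonr(vocab_isizulu, reading_isizulu)
print(f"r = {r:.2f}, p = {p:.4f}")
```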


2022, pp. 026553222110570
Author(s): Zia Tajeddin, Mohammad Khatib, Mohsen Mahdavi

Critical language assessment (CLA) has been addressed in numerous studies. However, most of these studies have overlooked the need for a practical framework to measure the CLA dimension of teachers’ language assessment literacy (LAL). This gap prompted us to develop and validate a critical language assessment literacy (CLAL) scale, to further underscore the role of CLA principles and their practice as an essential part of teachers’ LAL. In the first phase, a pool of items was generated through a comprehensive review of related studies. In the quantitative phase, the scale was administered to 255 English-as-a-foreign-language (EFL) teachers selected through convenience and snowball sampling. The data were analyzed through exploratory factor analysis for construct validity and Cronbach’s alpha for internal consistency. The results showed that the items loaded on five factors: (a) teachers’ knowledge of assessment objectives, scopes, and types; (b) consequences of assessment use; (c) fairness; (d) assessment policies; and (e) national policy and ideology. The scale had a high level of internal consistency and construct validity, which suggests that it has the potential to be useful in assessing language teachers’ CLAL and in raising their awareness of CLAL constructs.
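
For readers unfamiliar with the internal-consistency estimate used here, Cronbach's alpha can be computed directly from an item-response matrix. A minimal sketch with synthetic Likert-scale data (the data and dimensions are invented for illustration):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
# Synthetic 5-point Likert responses: a shared trait plus item noise.
trait = rng.normal(0, 1, (255, 1))
responses = np.clip(np.round(3 + trait + rng.normal(0, 0.8, (255, 20))), 1, 5)
print(f"alpha = {cronbach_alpha(responses):.2f}")
```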


2022, Vol. 39(1), pp. 3-6
Author(s): Luke Harding, Paula Winke

2021, pp. 026553222110490
Author(s): Bart Deygers, Marieke Vanbuel

The Peabody Picture Vocabulary Test (PPVT) is a widely used test of receptive vocabulary, but no researchers to date have examined the performance of low-educated, low-literate L2 adults or compared these individuals’ performance with that of their more highly educated peers. In this study, we used many-facet Rasch analysis and mixed-effects linear regression to determine the impact of educational background and other demographic variables on PPVT performance. The analyses rely on the performance data of 1014 adult learners of Dutch as a second language on the Dutch version of the PPVT (PPVT-III-NL). The results show that a substantial proportion of score variance can be attributed to educational background variables and to the educational tracks the participants followed. These tracks, which cater to the needs of different L2 learner profiles, appear to exacerbate rather than mitigate performance differences. Although this study provides evidence of performance differences and differential item functioning resulting from linguistic, demographic, and educational variables, it offers no data to invalidate the use of the PPVT with low-educated L2 adults.
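
As a rough illustration of the mixed-effects approach, the sketch below fits a linear mixed model with learner-background fixed effects and random intercepts for educational track, using statsmodels. The column names, group structure, and data are hypothetical stand-ins, not the study's variables:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 1014
df = pd.DataFrame({
    "years_education": rng.integers(0, 18, n),             # educational background
    "age": rng.integers(18, 65, n),
    "track": rng.choice([f"track{i}" for i in range(1, 9)], n),
})
# Synthetic PPVT score that rises with years of schooling.
df["score"] = 70 + 1.5 * df["years_education"] + rng.normal(0, 10, n)

# Fixed effects for learner background; random intercepts per educational track.
model = smf.mixedlm("score ~ years_education + age", df, groups=df["track"])
print(model.fit().summary())
```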


2021, pp. 026553222110506
Author(s): Chi Lai Tsang, Talia Isaacs

This sequential mixed-methods study investigates washback on learning from a high-stakes school exit examination by examining learner perceptions and reported behaviours in relation to learners’ beliefs and language learning experience, the role of other stakeholders in the washback mechanism, and socio-educational forces. The focus is the graded approach of the Hong Kong Diploma of Secondary Education English Language Examination (HKDSE-English), incorporated in 2012, which allows test-takers to choose between easier and more difficult sections on the reading and listening-integrated-skills papers. Inductive coding of focus groups involving 12 secondary students fed into the development of the Washback on Students’ Learning questionnaire, which was administered to another 150 learners. Exploratory factor analyses of the identified washback effects revealed four major types straddling different settings (classrooms, tutorial schools, learners’ personal environment) and seven categories of mediating variables pertaining to learners themselves, other stakeholders, and societal influences. Simultaneous multiple regressions identified influential clusters of mediating variables and showed that the strongest predictors varied for each macro-level washback type. At least one intrinsic and one extrinsic factor category contributed significantly to all types, reaffirming learner washback as a socially situated, negotiated construct. Implications related to the consequences, use, and fairness of the graded approach are discussed.
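
"Simultaneous" multiple regression means all predictors are entered into the model at once rather than stepwise. A minimal sketch, with invented predictor names standing in for the questionnaire's mediating-variable categories:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 150
df = pd.DataFrame({
    "self_beliefs": rng.normal(0, 1, n),        # intrinsic: learners themselves
    "teacher_influence": rng.normal(0, 1, n),   # extrinsic: other stakeholders
    "societal_pressure": rng.normal(0, 1, n),   # extrinsic: societal influences
})
# One macro-level washback type as the outcome (synthetic).
df["classroom_washback"] = (0.5 * df["self_beliefs"]
                            + 0.3 * df["teacher_influence"]
                            + rng.normal(0, 1, n))

# Simultaneous entry: all predictors appear in a single model.
fit = smf.ols("classroom_washback ~ self_beliefs + teacher_influence"
              " + societal_pressure", data=df).fit()
print(fit.summary())
```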


2021, pp. 026553222110452
Author(s): Shangchao Min, Kyoungwon Bishop, Howard Gary Cook

This study explored the interplay between content knowledge and reading ability in a large-scale multistage adaptive English for academic purposes (EAP) reading assessment across a range of ability levels in Grades 1–12. The datasets were item-level responses to the reading tests of ACCESS for ELLs Online 2.0. Samples of 10,000 test takers were randomly drawn from the test-taking population at each of five grade clusters, first without and then with manipulation of proficiency levels. The results indicated that although the bi-factor multidimensional item response theory (MIRT) model fit the data significantly better than the unidimensional two-parameter logistic (2PL) model for Grade 1, no clear evidence was found regarding the dimensionality of the test for Grades 2–12. However, content knowledge was consistently found to contribute substantially to test performance for low-ability test takers across all grade clusters. The findings indicate that EAP reading ability is a multidimensional construct at the onset of EAP reading ability development, but the presence of multidimensionality decreases as proficiency level and grade level increase. This study provides insights into the developmental pattern of the interplay between language and content in EAP reading contexts.
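
Because the 2PL model is nested within the bi-factor MIRT model, the fit comparison reported here is typically carried out as a likelihood-ratio test. The sketch below assumes the log-likelihoods and parameter counts have already been obtained from fitted IRT models; the numbers are invented:

```python
from scipy.stats import chi2

def lrt(loglik_restricted: float, loglik_full: float, df_diff: int):
    """Likelihood-ratio test for nested models (here, 2PL nested in bi-factor)."""
    stat = 2 * (loglik_full - loglik_restricted)
    return stat, chi2.sf(stat, df_diff)

# Hypothetical values: in practice these come from the fitted IRT models.
ll_2pl = -51234.8        # unidimensional 2PL log-likelihood
ll_bifactor = -51102.3   # bi-factor MIRT log-likelihood
extra_params = 36        # e.g., one extra loading per item on a group factor

stat, p = lrt(ll_2pl, ll_bifactor, extra_params)
print(f"LRT = {stat:.1f}, df = {extra_params}, p = {p:.4g}")
```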


2021, pp. 026553222110400
Author(s): Wenyue Ma

Second-language (L2) testing researchers have explored the relationship between speakers’ overall speaking ability, reflected in holistic scores, and their performance on speaking subcomponents, reflected in analytic scores (e.g., McNamara, 1990; Sato, 2011). These studies have advanced applied linguists’ understanding of how raters view the components of effective speaking skills, but their authors either used analytic composite scores instead of true holistic ratings or ran regression analyses with highly correlated subscores, which is problematic. To address these issues, 10 experienced raters scored the speaking of 127 international teaching assistant (ITA) candidates using a four-component analytic rubric. In addition, holistic ratings were provided for the 127 test takers from a separate (earlier) scoring by two experienced ITA raters. The two types of scores differentiated examinees in similar ways: the variability observed in students’ holistic scores was reflected in their analytic scores. However, among the four analytic subscales, examinees’ scores on Lexical and Grammatical Competence had the greatest differentiating power, indicating with a high level of accuracy who passed the test and who did not. The paper discusses the components contributing to ITAs’ L2 speaking proficiency and reviews pedagogical implications.
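
The multicollinearity problem flagged above (regressing on highly correlated subscores) is commonly diagnosed with variance inflation factors. A minimal sketch with statsmodels, using invented subscale names and synthetic data:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
n = 127
base = rng.normal(0, 1, n)  # shared speaking ability drives all subscores
subscores = pd.DataFrame({
    "lexical_grammatical": base + rng.normal(0, 0.4, n),
    "pronunciation": base + rng.normal(0, 0.4, n),
    "fluency": base + rng.normal(0, 0.4, n),
    "comprehensibility": base + rng.normal(0, 0.4, n),
})

# VIF values above roughly 5-10 signal that regression coefficients
# on these predictors will be unstable.
X = sm.add_constant(subscores)
for i, name in enumerate(X.columns):
    if name != "const":
        print(f"{name}: VIF = {variance_inflation_factor(X.values, i):.1f}")
```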


2021, pp. 026553222110389
Author(s): Chao Han, Xiaoyan Xiao

The quality of sign language interpreting (SLI) is a construct of central interest to practitioners, educators, and researchers, and it calls for reliable and valid assessment. The extant literature offers a diverse array of methods for measuring SLI quality, ranging from traditional error analysis to recent rubric scoring. In this study, we expand the terrain of SLI assessment by exploring and evaluating a novel method, comparative judgment (CJ). Briefly, CJ asks judges to compare two like objects/items and choose the one of higher quality. The binary outcomes of repeated comparisons by a group of judges are then modelled statistically to produce a standardized estimate of perceived quality for each object/item. We recruited 12 expert judges to operationalize CJ via a computerized system and assess the quality of Chinese Sign Language interpreting produced by 36 trainee interpreters. Overall, our analysis of quantitative and qualitative data provides preliminary evidential support for the validity and utility of CJ in SLI assessment. We discuss these results in relation to the previous SLI literature and suggest future research to cast light on CJ’s usefulness in applied assessment contexts.
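
The statistical modelling behind CJ is typically a Bradley-Terry-style model, which converts the binary win/loss outcomes into a quality estimate per item. A minimal maximum-likelihood sketch (Zermelo's iterative algorithm) on invented comparison data; the study's exact model may differ:

```python
import numpy as np

def bradley_terry(wins: np.ndarray, n_iter: int = 500) -> np.ndarray:
    """Maximum-likelihood Bradley-Terry strengths (Zermelo's algorithm).

    wins[i, j] = times item i was judged better than item j (zero diagonal;
    assumes every item wins at least once). Returns log-strengths centred
    at zero: higher = better perceived quality.
    """
    n = wins.shape[0]
    comparisons = wins + wins.T          # total i-vs-j pairings
    total_wins = wins.sum(axis=1)
    p = np.ones(n)
    for _ in range(n_iter):
        for i in range(n):
            p[i] = total_wins[i] / np.sum(comparisons[i] / (p[i] + p))
        p /= np.exp(np.log(p).mean())    # fix the scale each iteration
    return np.log(p)

# Invented outcomes: 6 interpreting performances, repeated paired judgments.
rng = np.random.default_rng(5)
true_quality = np.array([2.0, 1.5, 1.0, 0.8, 0.5, 0.2])
wins = np.zeros((6, 6))
for _ in range(300):                     # 300 random paired comparisons
    i, j = rng.choice(6, size=2, replace=False)
    prob_i = true_quality[i] / (true_quality[i] + true_quality[j])
    if rng.random() < prob_i:
        wins[i, j] += 1
    else:
        wins[j, i] += 1

print(np.round(bradley_terry(wins), 2))  # recovers the quality ordering
```

With enough repeated pairings the estimates recover the intended ordering; in practice the judge-level data would also support consistency and misfit checks.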


2021, pp. 026553222110361
Author(s): Chao Han

Over the past decade, testing and assessing spoken-language interpreting has garnered increasing attention from stakeholders in interpreter education, professional certification, and interpreting research, because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the selection of prospective students, the certification of interpreters, and the confirmation or refutation of research hypotheses. However, few reviews exist that comprehensively map the relevant practice and research. The present article therefore offers a state-of-the-art review, summarizing the existing literature and identifying potential lacunae. The article first provides an overview of interpreting ability/competence and relevant research, followed by the main testing and assessment practices (e.g., assessment tasks, assessment criteria, scoring methods, specificities of scoring operationalization), with a focus on operational diversity and psychometric properties. Second, it describes a limited yet steadily growing body of empirical research that examines rater-mediated interpreting assessment, and casts light on automatic assessment as an emerging research topic. Third, it discusses epistemological, psychometric, and practical challenges facing interpreting testers. Finally, it identifies future directions for addressing the challenges arising from fast-changing pedagogical, educational, and professional landscapes.


2021, pp. 026553222110328
Author(s): Ji-young Shin, Rodrigo A. Rodríguez-Fuentes, Aleksandra M. Swatek, April Ginther