A Comparability Study of Text Difficulty and Task Characteristics of Parallel Academic IELTS Reading Tests

2019 ◽  
Vol 13 (1) ◽  
pp. 31
Author(s):  
Linyu Liao

As a high-stakes standardized test, IELTS is expected to have comparable test forms so that test takers from different test administrations on different dates receive comparable test scores. Therefore, this study examined the text difficulty and task characteristics of four parallel academic IELTS reading tests to reveal the extent to which the four tests were comparable in terms of text difficulty, construct coverage, response format, item scope, and task scope. The Coh-Metrix-TEA software was used for the text difficulty analyses, and expert judgments were used for the task characteristics analyses. The results show that the four reading tests were partly comparable in text difficulty, comparable in terms of construct coverage and item scope, but not comparable in terms of response format and task scope. Based on the findings, implications for test development and future research are discussed.

2009 ◽  
Vol 17 ◽  
pp. 22
Author(s):  
Sharon E. Paulson ◽  
Gregory J. Marchant

This article examines the role of student demographic characteristics in standardized achievement test scores at both the individual level and aggregated at the state, district, and school levels. For several data sets, the majority of the variance among states, districts, and schools was related to demographic characteristics. Where these background variables, which lie outside the control of schools, significantly affect averaged scores and test scores result in high-stakes consequences, benefits and sanctions may be inappropriately applied. Furthermore, disaggregating the data by race, SES, limited English, or other groupings ignores the significant confounding and cumulative effects of belonging to more than one disadvantaged group. With these approaches to evaluation being fundamental to the No Child Left Behind mandates, the danger of misinterpretation and inappropriate application of sanctions is substantial.


2019 ◽  
Vol 2 (10) ◽  
pp. 22-34
Author(s):  
Muhammad Madi Abdullah ◽  
Hui Lin Lau

This study examines five perceived task characteristics variables (decision-making autonomy, task variety, task significance, task identity, feedback from job) and their effect on employee engagement among employees at six private colleges in Sarawak. Despite extensive research on employee engagement, very little empirical research has examined this area specifically in the context of Sarawak private colleges. A survey questionnaire was employed to collect the data. A total of 284 responses were analysed using SPSS version 22.0. The results revealed that only feedback from job, task identity and decision-making autonomy are significantly and positively associated with employee engagement. Task significance and task variety were not significantly related to employee engagement. Most importantly, decision-making autonomy was perceived as a dominant task characteristic for employee engagement; it was associated with significant improvements in private colleges’ employee engagement. This study contributes to a better understanding of the effect of perceived task characteristics on employee engagement among private colleges in Sarawak. The implications for practice and suggestions for future research are discussed.


2021 ◽  
Vol 12 ◽  
Author(s):  
Don Yao ◽  
Matthew P. Wallace

It is not uncommon for immigration-seekers to be actively involved in taking various language tests for immigration purposes. Given the large-scale and high-stakes nature of those language tests, the validity issues (e.g., appropriate score-based interpretations and decisions) associated with them are of great importance, as test scores may play a gate-keeping role in immigration. Though interest in investigating the validity of language tests for immigration purposes is becoming prevalent, there has yet to be a systematic review of the research foci and results of this body of research. To address this need, the current paper critically reviewed 11 validation studies on language assessment for immigration over the last two decades to identify what has been focused on and what has been overlooked in the empirical research and to discuss current research interests and future research trends. The Assessment Use Argument (AUA) framework of Bachman and Palmer (2010), comprising four inferences (i.e., assessment records, interpretations, decisions, and consequences), was adopted to collect and examine evidence of test validity. Results showed that the consequences inference received the most investigation, focusing on immigration-seekers’ and policymakers’ perceptions of test consequences, while the decisions inference was the least probed, with studies stressing immigration-seekers’ attitudes towards the impartiality of decision-making. It is recommended that further studies explore more kinds of stakeholders (e.g., test developers) in terms of their perceptions of the test and investigate the fairness of decision-making based on test scores in more depth. Additionally, the current AUA framework includes only positive and negative consequences that an assessment may engender but does not take compounded consequences into account. It is suggested that further research could enrich the framework. The paper sheds some light on the field of language assessment for immigration and brings about theoretical, practical, and political implications for different kinds of stakeholders (e.g., researchers, test developers, and policymakers).


2021 ◽  
Vol 12 ◽  
Author(s):  
Georg Hosoya ◽  
Sigrid Blömeke ◽  
Katja Eilerts ◽  
Lars Jenßen ◽  
Michael Eid

This study examined absolute and relative judgment accuracies of German early childhood (EC) teachers with respect to the mathematical skills of the children under their supervision. The two types of judgment accuracies are crucial prerequisites for pacing activities in EC education and offering differentiated educational activities adapted to individual skill levels of children. Data from 39 EC teachers and 268 children were analyzed using multilevel modeling. Teachers rated the skills of children on a structured observation instrument (“Kinder Diagnose Tool,” KiDiT). Children were assessed on their mathematical skills with a standardized test (“Mathematische Basiskompetenzen im Kindesalter,” MBK-0). On average, 65% of the variation in judgments of teachers on the KiDiT could be explained by MBK-0 scores of children, which suggests that teachers are, on average, able to rank children within their groups. Teachers were also able to judge the mathematical level of skills of children as assessed by the MBK-0. Neither mathematical content knowledge (MCK) of teachers nor their mathematics pedagogical content knowledge (MPCK) or general pedagogical knowledge (GPK) moderated the relationship between judgments of teachers and test scores of children or the relationship between the level of the judgments and the level of test scores. Conclusions for future research and practice are drawn.


2020 ◽  
Vol 16 (1) ◽  
pp. 87-121
Author(s):  
Bárbara Eizaga-Rebollar ◽  
Cristina Heras-Ramírez

Abstract The study of pragmatic competence has gained increasing importance within second language assessment over the last three decades. However, its study in L2 language testing is still scarce. The aim of this paper is to research the extent to which pragmatic competence as defined by the Common European Framework of Reference for Languages (CEFR) has been accommodated in the task descriptions and rating scales of two of the most popular Oral Proficiency Interviews (OPIs) at a C1 level: Cambridge’s Certificate in Advanced English (CAE) and Trinity’s Integrated Skills in English (ISE) III. To carry out this research, OPI tests are first defined, highlighting their differences from L2 pragmatic tests. After pragmatic competence in the CEFR is examined, focusing on the updates in the new descriptors, CAE and ISE III formats, structure and task characteristics are compared, showing that, while the formats and some characteristics are found to differ, the structures and task types are comparable. Finally, we systematically analyse CEFR pragmatic competence in the task skills and rating scale descriptors of both OPIs. The findings show that the task descriptions incorporate mostly aspects of discourse and design competence. Additionally, we find that each OPI prioritises different aspects of pragmatic competence within its rating scale, with CAE focusing mostly on discourse competence and fluency, and ISE III on functional competence. Our study shows that the tests fail to fully accommodate all aspects of pragmatic competence in the task skills and rating scales, although the aspects they do incorporate follow the CEFR descriptors on pragmatic competence. It also reveals a mismatch between the task competences being tested and the rating scale. To conclude, some research lines are proposed.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Dmitri Rozgonjuk ◽  
Karin Täht ◽  
Kristjan Vassil

Abstract Background: The excessive use of Internet-based technologies has received considerable attention over the past years. Despite this, there is relatively little research on how general Internet usage patterns at and outside of school as well as on weekends may be associated with mathematics achievement. Moreover, only a handful of studies have implemented a longitudinal or repeated-measures approach to this research question. The aim of the current study was to fill that gap. Specifically, we investigated the potential associations of Internet use at and outside of school as well as on weekends with mathematics test performance in both high- and low-stakes testing conditions over a period of 3 years in a representative sample of Estonian teenagers. Methods: PISA 2015 survey data in conjunction with national educational registry data were used for the current study. Specifically, Internet use at and outside of school as well as on weekends was queried during the PISA 2015 survey. In addition, the data set included PISA mathematics test results from 4113 Estonian 9th-grade students. Furthermore, 3758 of these students also had a 9th-grade national mathematics exam score from a couple of months after the PISA survey. Finally, of these students, the results of 12th-grade mathematics national exam scores were available for 1612 and 1174 students for “wide” (comprehensive) and “narrow” (less comprehensive) mathematics exams, respectively. Results: The results showed that the rather low-stakes PISA mathematics test scores correlated well with the high-stakes national mathematics exam scores obtained from the 9th (completed a couple of months after the PISA survey) and 12th grade (completed approximately 3 years after the PISA survey), with correlation values ranging from r = .438 to .557. Furthermore, socioeconomic status index was positively correlated with all mathematics scores (ranging from r = .162 to .305). Controlled for age and gender, the results also showed that students who reported using the Internet the longest tended to have, on average, the lowest mathematics scores in all tests across 3 years. Although effect sizes were generally small, they seemed to be more pronounced in Internet use at school. Conclusions: Based on these results, one may notice that significantly longer time spent on Internet use at and outside of school as well as on weekends may be associated with poorer mathematics performance. These results are somewhat in line with research outlining the potentially negative associations between longer time spent on digital technology use and daily life outcomes.


2021 ◽  
pp. 1-24
Author(s):  
Avidit Acharya ◽  
Kirk Bansak ◽  
Jens Hainmueller

Abstract We introduce a constrained priority mechanism that combines outcome-based matching from machine learning with preference-based allocation schemes common in market design. Using real-world data, we illustrate how our mechanism could be applied to the assignment of refugee families to host country locations, and kindergarteners to schools. Our mechanism allows a planner to first specify a threshold $\bar g$ for the minimum acceptable average outcome score that should be achieved by the assignment. In the refugee matching context, this score corresponds to the probability of employment, whereas in the student assignment context, it corresponds to standardized test scores. The mechanism is a priority mechanism that considers both outcomes and preferences by assigning agents (refugee families and students) based on their preferences, but subject to meeting the planner’s specified threshold. The mechanism is both strategy-proof and constrained efficient in that it always generates a matching that is not Pareto dominated by any other matching that respects the planner’s threshold.
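The idea of assigning by preference subject to an average-outcome threshold can be illustrated with a toy sketch. This is not the authors' mechanism: it assumes unit-capacity locations, a fixed priority order equal to the agent index, and a brute-force feasibility check that only suits tiny inputs. The names `constrained_priority`, `best_completion_total`, `scores`, `prefs`, and `g_bar` are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def best_completion_total(scores, agents, locations):
    """Highest total score from placing `agents` into distinct `locations`.

    Brute force over all placements; returns None if there are too few
    locations. With no agents left, the total is 0.
    """
    best = None
    for placement in permutations(locations, len(agents)):
        total = sum(scores[a][loc] for a, loc in zip(agents, placement))
        if best is None or total > best:
            best = total
    return best


def constrained_priority(scores, prefs, g_bar):
    """Toy priority assignment under an average-score threshold (assumed setup).

    scores[i][j]: predicted outcome score of agent i at location j.
    prefs[i]: agent i's locations, most preferred first.
    g_bar: minimum acceptable average score over all agents.
    Agents are processed in index order; each gets the most preferred free
    location that still lets the remaining agents reach the threshold.
    """
    n = len(scores)
    free = set(range(len(scores[0])))
    assignment = {}
    fixed_total = 0.0
    for agent in range(n):
        later = list(range(agent + 1, n))
        for loc in prefs[agent]:
            if loc not in free:
                continue
            rest = best_completion_total(scores, later, sorted(free - {loc}))
            if rest is None:
                continue
            # Small tolerance guards against floating-point rounding.
            if (fixed_total + scores[agent][loc] + rest) / n >= g_bar - 1e-12:
                assignment[agent] = loc
                free.remove(loc)
                fixed_total += scores[agent][loc]
                break
        else:
            raise ValueError("no assignment meets the threshold")
    return assignment


if __name__ == "__main__":
    # Made-up illustrative numbers, not data from the paper.
    scores = [[0.9, 0.2, 0.4], [0.3, 0.8, 0.5], [0.6, 0.4, 0.7]]
    prefs = [[1, 0, 2], [0, 1, 2], [2, 1, 0]]
    print(constrained_priority(scores, prefs, 0.0))
    print(constrained_priority(scores, prefs, 0.7))
```

With a threshold of 0 the sketch reduces to a plain priority assignment by preference; raising `g_bar` forces some agents away from their first choice so that the average score stays above the threshold.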


2012 ◽  
Vol 65 (11) ◽  
pp. 1487-1508 ◽  
Author(s):  
Yan Zhang ◽  
LiWen Hou

The authors examine cross-cultural differences in the effects of gender diversity on group identification and performance using workgroups from American and Chinese firms. Nationality is found to moderate the relationship between gender diversity and group identification in that gender diversity associates more positively with group identification in Chinese workgroups than in American workgroups. Nationality does not moderate the gender diversity–group performance relationship: although the Chinese sample shows a positive association between gender diversity and group performance, the American sample shows no association. A second longitudinal study explores the mechanisms of relationship conflict and task conflict by which gender diversity benefits group performance in China. Results show that gender-diverse groups perform better than homogeneous groups by decreasing relationship conflict and task conflict. Future research directions and practical implications are discussed.

