Language Assessment for Immigration: A Review of Validation Research Over the Last Two Decades

2021 ◽  
Vol 12 ◽  
Author(s):  
Don Yao ◽  
Matthew P. Wallace

It is not uncommon for immigration-seekers to take various language tests for immigration purposes. Given the large-scale and high-stakes nature of these tests, the validity issues associated with them (e.g., appropriate score-based interpretations and decisions) are of great importance, as test scores may play a gate-keeping role in immigration. Though interest in investigating the validity of language tests for immigration purposes is growing, there has yet to be a systematic review of the research foci and results of this body of research. To address this need, the current paper critically reviewed 11 validation studies on language assessment for immigration published over the last two decades, to identify what has been emphasized and what has been overlooked in the empirical research, and to discuss current research interests and future research trends. The Assessment Use Argument (AUA) framework of Bachman and Palmer (2010), comprising four inferences (i.e., assessment records, interpretations, decisions, and consequences), was adopted to collect and examine evidence of test validity. Results showed that the consequences inference received the most investigation, focusing on immigration-seekers' and policymakers' perceptions of test consequences, while the decisions inference was the least probed, addressing immigration-seekers' attitudes towards the impartiality of decision-making. It is recommended that further studies explore the perceptions of more kinds of stakeholders (e.g., test developers) and investigate further the fairness of decision-making based on test scores. Additionally, the current AUA framework includes only the positive and negative consequences that an assessment may engender but does not take compounded consequences into account; further research could enrich the framework. The paper sheds light on the field of language assessment for immigration and carries theoretical, practical, and political implications for different kinds of stakeholders (e.g., researchers, test developers, and policymakers).

2016 ◽  
Vol 138 (5) ◽  
Author(s):  
Bryony DuPont ◽  
Ridwan Azam ◽  
Scott Proper ◽  
Eduardo Cotilla-Sanchez ◽  
Christopher Hoyle ◽  
...  

As demand for electricity in the U.S. continues to increase, it is necessary to explore the means through which the modern power supply system can accommodate both increasing affluence (which is accompanied by increased per-capita consumption) and a continually growing global population. Though there has been a great deal of research into the theoretical optimization of large-scale power systems, the use of an existing power system as a foundation for this growth has yet to be fully explored. Current successful and robust power generation systems that have significant renewable energy penetration—despite not having been optimized a priori—can be used to inform the advancement of modern power systems to accommodate the increasing demand for electricity. This work explores how an accurate, state-of-the-art computational model of a large regional energy system can be employed as part of an overarching power systems optimization scheme that informs the decision-making process for next-generation power supply systems. Research scenarios exploring an introductory multi-objective power flow analysis for a case study of a regional portion of a large grid are presented, along with a discussion of future research directions.
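
A minimal sketch of the weighted-sum, multi-objective dispatch trade-off such an analysis weighs, reduced to a hypothetical two-generator system; the demand, generator limits, and cost and emission coefficients below are invented for illustration and are not taken from the authors' regional grid model.

```python
# Toy weighted-sum economic/environmental dispatch (all numbers hypothetical).
import numpy as np
from scipy.optimize import minimize

demand = 400.0  # MW to be supplied

# Quadratic cost ($/h) and emission (kg/h) curves: a*P^2 + b*P + c.
cost_coef = np.array([[0.004, 8.0, 100.0],   # generator 1
                      [0.006, 7.0, 150.0]])  # generator 2
emis_coef = np.array([[0.002, 1.5, 20.0],
                      [0.001, 3.0, 10.0]])

def quad(coef, p):
    return coef[:, 0] * p**2 + coef[:, 1] * p + coef[:, 2]

def objective(p, w):
    # Weighted sum of the two objectives (unnormalized for simplicity);
    # sweeping w over [0, 1] traces an approximation of the Pareto front.
    return w * quad(cost_coef, p).sum() + (1 - w) * quad(emis_coef, p).sum()

bounds = [(50.0, 300.0)] * 2                                 # generator limits (MW)
balance = {"type": "eq", "fun": lambda p: p.sum() - demand}  # supply = demand

for w in (0.0, 0.5, 1.0):
    res = minimize(objective, x0=np.array([200.0, 200.0]), args=(w,),
                   bounds=bounds, constraints=[balance], method="SLSQP")
    print(f"w={w:.1f}  dispatch={res.x.round(1)} MW  objective={res.fun:.1f}")
```

A full power flow study would add network constraints (line limits, voltages) and renewable availability; the weighted-sum scalarization here is just the simplest way to expose the cost-versus-emissions trade-off.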


2019 ◽  
Vol 13 (1) ◽  
pp. 31
Author(s):  
Linyu Liao

As a high-stakes standardized test, IELTS is expected to have comparable forms of test papers, so that test takers from different test administrations on different dates receive comparable test scores. This study therefore examined the text difficulty and task characteristics of four parallel academic IELTS reading tests to reveal to what extent the four tests were comparable in terms of text difficulty, construct coverage, response format, item scope, and task scope. The Coh-Metrix-TEA software was used for the text difficulty analyses, and expert judgments were used for the task characteristics analyses. The results show that the four reading tests were partly comparable in text difficulty, comparable in construct coverage and item scope, but not comparable in response format and task scope. Based on the findings, implications for test development and future research are discussed.
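
Coh-Metrix-TEA reports a wide range of cohesion and difficulty indices; as a much cruder stand-in, the sketch below computes a single surface-level difficulty measure, the classic Flesch Reading Ease score, with a rough vowel-group syllable heuristic. It is only meant to illustrate the idea of quantifying text difficulty, not to reproduce the tool used in the study.

```python
import re

def count_syllables(word: str) -> int:
    # Heuristic: count runs of consecutive vowels; drop a trailing silent 'e'.
    word = word.lower()
    if word.endswith("e") and len(word) > 2:
        word = word[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", word)))

def flesch_reading_ease(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Higher scores mean easier text; dense academic prose often lands below 30.
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

passage = ("The examination measures reading comprehension of academic texts. "
           "Parallel forms should yield scores of comparable difficulty.")
print(f"Flesch Reading Ease: {flesch_reading_ease(passage):.1f}")
```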


2019 ◽  
Vol 37 (2) ◽  
pp. 235-253
Author(s):  
Jonathan Trace

Originally designed to measure reading and passage comprehension in L1 readers, cloze tests continue to be used for L2 assessment purposes. However, disputes remain about whether cloze items can measure beyond local comprehension, and about whether they are purely a test of reading or whether performance can be generalized to broader claims about proficiency. The current study addresses both of these issues by drawing on a large pool of cloze items (k = 449) taken from 15 cloze passages that were administered to 675 L1 and 2,246 L2 examinees. In conjunction with the test scores, a large-scale L1 experiment was conducted using Amazon's Mechanical Turk to determine the minimum context required to answer each item. Using Rasch analysis, item function was compared across both groups, with results indicating that cloze items can draw on information at both the sentence and passage level. This suggests that cloze tests generally measure reading in both L1 and L2 examinees. These findings have important implications for the continued use of cloze tests, particularly in classroom and high-stakes contexts where they are commonly found.
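
As a rough illustration of the measurement model behind this comparison, the sketch below fits the dichotomous Rasch model, P(correct) = sigmoid(theta_person - b_item), to simulated 0/1 responses by joint maximum likelihood; the data are invented, whereas the study fit real L1 and L2 response matrices and compared item estimates across groups.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 500, 40
theta_true = rng.normal(0, 1, n_persons)      # person abilities
b_true = rng.normal(0, 1, n_items)            # item difficulties
p_true = 1 / (1 + np.exp(-(theta_true[:, None] - b_true[None, :])))
X = (rng.random((n_persons, n_items)) < p_true).astype(float)  # 0/1 responses

# Drop zero and perfect scorers, whose ability estimates are infinite.
X = X[(X.sum(axis=1) > 0) & (X.sum(axis=1) < n_items)]

# Joint maximum likelihood via alternating Newton steps.
theta = np.zeros(X.shape[0])
b = np.zeros(n_items)
for _ in range(50):
    p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
    info = p * (1 - p)                                   # Fisher information
    theta += (X - p).sum(axis=1) / info.sum(axis=1)      # ability step
    b -= (X - p).sum(axis=0) / info.sum(axis=0)          # difficulty step
    b -= b.mean()                                        # fix the scale origin

print("recovery of item difficulties, r =",
      round(float(np.corrcoef(b, b_true)[0, 1]), 3))
```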


2021 ◽  
Vol 4 (3) ◽  
Author(s):  
Tuçe Öztürk Karataş

In the 21st century, with the rising popularity of standardized, large-scale tests, their high-stakes nature has become increasingly apparent. High-stakes tests are not new, but in most cases their current use as a social practice tends to shape individuals' futures. The current trend in discussions of test quality is to evaluate tests critically by focusing on their functions, use, and power within their testing discourse, whereas traditionally such discussions covered only psychometric features. Regarding tests as social practices, examining their functions, consequences, and use within their own discourses is at the heart of this new perspective. Driven by the tenets of such a critical perspective, this study aims first to provide a better understanding of the 'discourse of test' and then to describe the social dimensions comprising the discourse of language tests. Finally, the study concludes with suggestions for adopting a critical perspective to improve the discourse of tests and enhance their quality.


2021 ◽  
Vol 14 (12) ◽  
pp. 55
Author(s):  
Jing Zhang

This paper reviews a total of 20 empirical research studies concerning parents' behavior in the context of high-stakes language assessment, aiming to reveal the impact of such assessment on parents' behavior. The results show that (1) parents are typically involved in the high-stakes language assessment process; (2) their involvement ranges from general practices (such as hiring tutors for their children) to extreme behavior (such as participating in movements against high-stakes testing); (3) no unanimous conclusion has been reached concerning the effectiveness of parents' involvement in high-stakes language assessment; and (4) multiple factors affecting parents' involvement in high-stakes language assessment are identified, including parents' perceptions of tests, their educational background, and the time they spend with their children. This study concludes that tests might influence the ways parents are involved in their children's education. However, not all parents might be influenced by testing, and testing might have a positive impact on some parents but a negative impact on others. This synthesis has several practical implications. Firstly, it indicates that parents' involvement behavior in the context of high-stakes language assessment deserves further investigation. Secondly, it points out that intervention programs should be provided for parents to help them better support their children's learning and test preparation. The paper also offers several suggestions for future research.


2019 ◽  
Vol 35 (1) ◽  
Author(s):  
Dinh Minh Thu

Validity, along with reliability, has long played a fundamental role in research on language testing and assessment (Bachman & Palmer, 1996). This paper analyses basic theories and empirical research on language test validity in order to clarify the notion and classification of language test validity, the working frames for validation, and the trends in empirical research. Four key findings emerge from the analysis. Firstly, language test validity refers to an evaluative judgment of the quality of a language test on the grounds of evidence about the integrated components of test content, criterion, and consequences, through the interpretation of the meaning and utility of test scores. Secondly, construct validity is the dominant term in modern validity classification, and the long-standing division of construct validity into a priori and a posteriori validity can help researchers choose a clearer validation option. Thirdly, test validation can be grounded in the frameworks of Messick (1989), Bachman (1996), and Weir (2005). Finally, almost all of the empirical research on test validity reviewed here concerns international and national high-stakes proficiency tests. These results reveal gaps for future test validation research.


2013 ◽  
Vol 27 (6) ◽  
pp. 532-544 ◽  
Author(s):  
Babatunde Ogunfowora ◽  
Joshua S. Bourdage ◽  
Brenda Nguyen

The majority of research on self-monitoring has focused on the positive aspects of this personality trait. The goal of the present research was to shed some light on the potential negative side of self-monitoring and its resulting consequences in two independent studies. Study 1 demonstrated that, in addition to being higher on Extraversion, high self-monitors are also more likely to be low on Honesty-Humility, which is characterized by a tendency to be dishonest and driven by self-gain. Study 2 was designed to investigate the consequences of this dishonest side of self-monitoring using two previously unexamined outcomes: moral disengagement and unethical business decision making. Results showed that high self-monitors are more likely to engage in unethical business decision making and that this relationship is mediated by the propensity to engage in moral disengagement. In addition, these negative effects of self-monitoring were found to be due to its low Honesty-Humility aspect, rather than its high Extraversion side. Further investigation showed similar effects for the Other-Directedness and Acting (but not Extraversion) self-monitoring subscales. These findings provide valuable insight into previously unexamined negative consequences of self-monitoring and suggest important directions for future research on self-monitoring.
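
A minimal sketch of the mediation logic reported in Study 2, using simulated data rather than the authors' dataset: the indirect effect of self-monitoring on unethical decision making through moral disengagement is estimated as the product of OLS coefficients and tested with a percentile bootstrap. Variable names and effect sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
self_mon = rng.normal(size=n)                            # predictor X
disengage = 0.4 * self_mon + rng.normal(size=n)          # mediator M
unethical = 0.5 * disengage + 0.1 * self_mon + rng.normal(size=n)  # outcome Y

def slopes(y, X):
    # OLS slopes for y ~ X with an intercept.
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

def indirect(idx):
    a = slopes(disengage[idx], self_mon[idx])[0]          # path X -> M
    b = slopes(unethical[idx],                            # path M -> Y, controlling X
               np.column_stack([disengage[idx], self_mon[idx]]))[0]
    return a * b

boot = np.array([indirect(rng.integers(0, n, n)) for _ in range(2000)])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect = {indirect(np.arange(n)):.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

A bootstrap confidence interval excluding zero is the usual evidence for mediation; it is preferred to the older causal-steps approach because the product term a*b is tested directly.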


2014 ◽  
Vol 45 (4) ◽  
pp. 337-350 ◽  
Author(s):  
Kerry Danahy Ebert ◽  
Cheryl M. Scott

Purpose: Both narrative language samples and norm-referenced language tests can be important components of language assessment for school-age children. The present study explored the relationship between these two tools within a group of children referred for language assessment.
Method: The study is a retrospective analysis of clinical records from 73 school-age children. Participants had completed an oral narrative language sample and at least one norm-referenced language test. Correlations between microstructural language sample measures and norm-referenced test scores were compared for younger (6- to 8-year-old) and older (9- to 12-year-old) children. Contingency tables were constructed to compare the two types of tools, at two different cutpoints, in terms of which children were identified as having a language disorder.
Results: Correlations between narrative language sample measures and norm-referenced tests were stronger for the younger group than the older group. Within the younger group, the level of language assessed by each measure contributed to associations among measures. Contingency analyses revealed moderate overlap in the children identified by each tool, with agreement affected by the cutpoint used.
Conclusions: Narrative language samples may complement norm-referenced tests well, but age combined with narrative task can be expected to influence the nature of the relationship.
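
The sketch below mirrors the study's two analyses on invented scores: a correlation between a narrative microstructure measure and a norm-referenced standard score, and a contingency table showing which children each tool flags at a given cutpoint (here -1 SD, i.e., a standard score of 85). None of the numbers come from the clinical records themselves.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 73                                                   # matches the sample size
norm_score = rng.normal(100, 15, n)                      # norm-referenced standard scores
# Hypothetical narrative measure, moderately correlated with the test (z-scaled).
narrative = 0.6 * (norm_score - 100) / 15 + rng.normal(0, 0.8, n)

print(f"Pearson r = {np.corrcoef(narrative, norm_score)[0, 1]:.2f}")

# Flag a possible language disorder below a cutpoint on each tool (-1 SD).
flag_test = norm_score < 85
flag_narr = narrative < -1.0
table = np.array([[np.sum(flag_test & flag_narr), np.sum(flag_test & ~flag_narr)],
                  [np.sum(~flag_test & flag_narr), np.sum(~flag_test & ~flag_narr)]])
print("rows: test flag yes/no, cols: narrative flag yes/no")
print(table)
print(f"overall agreement = {np.trace(table) / n:.2f}")
```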


HOW ◽  
2020 ◽  
Vol 27 (2) ◽  
pp. 135-155
Author(s):  
Frank Giraldo

Large-scale language testing uses statistical information to account for the quality of an assessment system. In this reflection article, I explain how basic statistics can also be used meaningfully in the context of classroom language assessment. The paper explores a series of statistical calculations that can be used to examine test scores and assessment decisions in the language classroom, so interpretations for criterion-referenced assessment underlie the paper. Finally, I discuss limitations and offer recommendations for teachers on using statistics.
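
As a concrete taste of the kind of calculations the article has in mind, the sketch below computes a class mean, standard deviation, and standard error of measurement (SEM), and uses the SEM to flag borderline pass/fail decisions around a criterion-referenced cut score. The scores, cut score, and reliability estimate are all invented.

```python
import numpy as np

scores = np.array([12, 15, 18, 9, 14, 17, 11, 16, 13, 19], dtype=float)
cutoff = 14          # criterion-referenced mastery cut score (hypothetical)
reliability = 0.80   # assumed reliability estimate for the test

mean, sd = scores.mean(), scores.std(ddof=1)
sem = sd * np.sqrt(1 - reliability)   # standard error of measurement
print(f"mean = {mean:.1f}, SD = {sd:.2f}, SEM = {sem:.2f}")

# Decisions are least trustworthy for examinees scoring within one SEM of the cut.
borderline = np.abs(scores - cutoff) <= sem
print(f"mastery rate = {np.mean(scores >= cutoff):.0%}, "
      f"borderline examinees = {int(borderline.sum())}")
```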

