Cross-cultural student assessment: Issues facing white teachers and black students

1988 ◽  
Vol 7 (2) ◽  
pp. 3-6 ◽  
Author(s):  
Nadine Brummer


2019 ◽  
Vol 24 (3) ◽  
pp. 231-242 ◽  
Author(s):  
Herbert W. Marsh ◽  
Philip D. Parker ◽  
Reinhard Pekrun

Abstract. We simultaneously resolve three paradoxes in academic self-concept research with a single unifying meta-theoretical model based on frame-of-reference effects across 68 countries, 18,292 schools, and 485,490 15-year-old students. Paradoxically, but consistent with predictions, effects on math self-concept were negative for:

• being from countries where country-average achievement was high, explaining the paradoxical cross-cultural self-concept effect;
• attending schools where school-average achievement was high, demonstrating big-fish-little-pond effects (BFLPE) that generalized across all 68 countries, Organisation for Economic Co-operation and Development (OECD) and non-OECD countries, high- and low-achieving schools, and high- and low-achieving students;
• being further along in school relative to age, unifying separate research literatures on the negative effects of starting school at a younger age and of acceleration/grade skipping, and the positive effects of starting school at an older age ("academic redshirting") and, paradoxically, even of repeating a grade.

Contextual effects matter, producing significant and meaningful effects on self-beliefs not only at the student level (year in school) and the local school level (BFLPE) but, remarkably, even at the macro-contextual country level. Finally, we juxtapose the cross-cultural generalizability based on the Programme for International Student Assessment (PISA) data used here with generalizability based on meta-analyses, arguing that although the two approaches are similar in many ways, the generalizability shown here provides stronger support for the universality of frame-of-reference effects.
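The frame-of-reference logic behind the BFLPE can be illustrated with simulated data: individual achievement predicts self-concept positively, while school-average achievement, entered as a contextual predictor, carries a negative coefficient. This is only a sketch; all numbers are hypothetical, and plain OLS stands in for the multilevel models used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_schools, per_school = 50, 40

# Simulate achievement: school means vary, students vary within schools
school_ability = rng.normal(0, 1, n_schools)
ach = np.repeat(school_ability, per_school) + rng.normal(0, 1, n_schools * per_school)

# Observed school-average achievement, assigned to every student in the school
school_avg = np.repeat(ach.reshape(n_schools, per_school).mean(axis=1), per_school)

# Hypothetical data-generating model: own achievement helps self-concept (+0.5),
# equally able schoolmates hurt it (-0.3) -- the BFLPE contextual effect
self_concept = 0.5 * ach - 0.3 * school_avg + rng.normal(0, 0.5, ach.size)

# Recover both effects with ordinary least squares
X = np.column_stack([np.ones_like(ach), ach, school_avg])
beta, *_ = np.linalg.lstsq(X, self_concept, rcond=None)
print(np.round(beta, 2))  # approximately [0, 0.5, -0.3]
```

The negative sign on the school-average coefficient, despite the positive individual-level effect, is the paradox the abstract describes: the same logic extends upward to country-average achievement.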


2020 ◽  
Vol 51 (10) ◽  
pp. 794-813
Author(s):  
Jia He ◽  
Joanne M. Chung ◽  
Fons J. R. van de Vijver

This study aims to examine different scale usage correction procedures that are meant to enhance the cross-cultural comparability of Likert scale data. Specifically, we examined a priori study design (i.e., anchoring vignettes and overclaiming) and post hoc statistical procedures (i.e., ipsatization and extreme response style correction) in data from the 2012 Programme for International Student Assessment across 64 countries. We analyzed both original item responses and corrected item scores from two targeted scales in an integrative fashion by using multilevel confirmatory factor analysis and multilevel regressions. Results indicate that mean levels and structural relations varied across the correction procedures, although the psychological meaning of the constructs examined did not change. Furthermore, scores were least affected by these procedures for females who did not repeat a grade and students with higher math achievement. We discuss the implications of our findings and offer recommendations for researchers who are considering scale usage correction procedures.
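Among the post hoc corrections examined, ipsatization is the easiest to illustrate: each respondent's item scores are standardized against that respondent's own mean and standard deviation, removing individual differences in scale usage. A minimal sketch with hypothetical Likert data:

```python
import numpy as np

# Hypothetical Likert responses (rows = respondents, columns = items, 1-5 scale)
responses = np.array([
    [5, 5, 4, 5],   # acquiescent respondent: high on everything
    [2, 2, 1, 2],   # low scale user with the same relative pattern
    [3, 4, 2, 3],
], dtype=float)

# Ipsatize: center and scale each row by that respondent's own mean and SD
row_mean = responses.mean(axis=1, keepdims=True)
row_sd = responses.std(axis=1, keepdims=True)
ipsatized = (responses - row_mean) / row_sd

print(np.round(ipsatized, 2))
# The first two respondents now have identical profiles: scale-usage
# differences are removed, and only the relative ordering of items remains.
```

The trade-off the study highlights follows directly from this construction: absolute mean levels are no longer comparable after ipsatization, so only within-person structure can be analyzed.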


2018 ◽  
Vol 49 (8) ◽  
pp. 1247-1268 ◽  
Author(s):  
Hana Vonkova ◽  
Ondrej Papajoanu ◽  
Jiri Stipek

The overclaiming technique (OCT) is a novel way of measuring how socially desirable responding influences survey responses, and it has the potential to enhance the cross-cultural comparability of survey data. It allows respondents' knowledge accuracy and exaggeration to be identified by comparing their assessments of familiarity with existing and nonexistent concepts in a particular field of knowledge. We aim to compare the response patterns of countries and world regions based on their OCT accuracy and exaggeration index values and to validate these OCT scores using external variables. We also introduce a general model for categorizing respondents based on their OCT index values. We use the Programme for International Student Assessment (PISA) 2012 data from 64 countries (N = 275,904). We found considerable differences in response patterns across world regions: high accuracy values in East Asia, low accuracy values in Southern and Central America, high exaggeration values in Southern Europe, and low exaggeration values in Western Europe. Furthermore, we show that familiarity with math concepts changes substantially after adjustment using the OCT. The correlation between unadjusted math familiarity and math test scores is weak and nonsignificant (.13), whereas after adjustment the correlation becomes strong and significant (.66). For other indicators such as gross domestic product (GDP), public expenditure on education, and the Corruption Perceptions Index (CPI), the correlations before adjustment are negative, and even significant for the CPI (–.11, –.22, and –.45), whereas after adjustment they become positive, though nonsignificant (.05, .19, and .07). We also discuss the OCT in the context of other methods indicative of culturally preferred scale usage.
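The OCT indices contrast claimed familiarity with real concepts against claimed familiarity with foils. The signal-detection-style scoring below is one common, simple operationalization, shown only as a sketch with hypothetical responses; it is not necessarily the exact index construction used in the study.

```python
import numpy as np

# Hypothetical familiarity claims (1 = claims to know the concept, 0 = does not)
real = np.array([[1, 1, 1, 0],       # four real math concepts
                 [1, 1, 1, 1],
                 [0, 1, 0, 0]], dtype=float)
foils = np.array([[0, 0],            # two nonexistent "foil" concepts
                  [1, 1],
                  [0, 0]], dtype=float)

hit_rate = real.mean(axis=1)         # claimed familiarity with real concepts
false_alarm = foils.mean(axis=1)     # claimed familiarity with nonexistent ones

# Accuracy separates real from fake; exaggeration reflects overall claiming
# regardless of whether the concept exists
accuracy = hit_rate - false_alarm
exaggeration = (hit_rate + false_alarm) / 2

print(np.round(accuracy, 2))      # respondent 2 claims everything: accuracy 0
print(np.round(exaggeration, 2))  # ...but maximal exaggeration
```

The adjustment the abstract describes amounts to interpreting raw familiarity (the hit rate) in light of the exaggeration index, which is why the familiarity-achievement correlation changes so sharply after correction.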


1986 ◽  
Vol 59 (2) ◽  
pp. 671-674
Author(s):  
Debra W. Townsend ◽  
Gary Fontaine

The validity of the Depression Adjective Check List and a specially developed check list were assessed with 48 black and 53 white juvenile offenders (70 boys, 31 girls) against self-ratings and ratings of 2 black and 2 white teachers. Both check lists correlated significantly with self- (−.29, −.30) and teachers' ratings (.36, .26) for white offenders, although teachers' ratings were correlated unexpectedly negatively with self-ratings (−.41). Only the special check list scores correlated with the two ratings for black offenders. The necessity of limiting use of tests to populations for which they have been validated is discussed.


Author(s):  
Betül Alatlı

This study aimed to investigate the cross-cultural measurement invariance of the PISA (Programme for International Student Assessment) 2015 science literacy test and its items, and to carry out a bias study on the items that violate measurement invariance. The study used a descriptive review model. The sample consisted of 2224 students taking the S12 test booklet in Australia, France, Singapore, and Turkey. Measurement invariance analyses for the test were done using Multi-Group Confirmatory Factor Analysis (MGCFA). Differential Item Functioning (DIF), in other words the measurement invariance of the test items, was analyzed using the item response theory log-likelihood ratio (IRTLR), Hierarchical Generalized Linear Model (HGLM), and Simultaneous Item Bias Test (SIBTEST) methods. According to the findings, the test exhibited structural invariance across cultures. The highest rate of items showing DIF, 35%, was observed in the Australia-Singapore and Australia-France comparisons. The rate of items showing DIF in pairwise comparisons involving Turkey, the only country in the sample taking a translated form, was 24% and did not differ significantly from the other comparisons. The lowest rate of items showing DIF, 12%, was obtained from the Singapore-France comparison, while the rate in the France-Turkey comparison was 18%. On the other hand, 35% of the items showed cross-cultural measurement invariance.
An item bias study, based on expert opinions, was carried out on items identified and released as showing DIF in the comparisons of Turkey with Australia and Singapore. According to the findings, translation-related differences in the items, a cultural group's familiarity with item content, polysemy in the words or expressions used in the items, and the format or stylistic characteristics of the items were determined to be causes of bias in the skills measured by the items.
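The study detects DIF with IRTLR, HGLM, and SIBTEST. As a simpler illustration of the same idea, the classical Mantel-Haenszel procedure (not one of the methods used here) compares a focal and a reference group on one item after matching examinees on ability. A minimal sketch on simulated data:

```python
import numpy as np

def mantel_haenszel_dif(item, group, score):
    """Mantel-Haenszel common odds ratio for one 0/1 item.

    group: 0 = reference, 1 = focal; score: matching ability stratum.
    A value near 1 suggests no DIF; values far from 1 flag DIF.
    """
    num = den = 0.0
    for s in np.unique(score):
        m = score == s
        a = np.sum((group[m] == 0) & (item[m] == 1))  # reference, correct
        b = np.sum((group[m] == 0) & (item[m] == 0))  # reference, incorrect
        c = np.sum((group[m] == 1) & (item[m] == 1))  # focal, correct
        d = np.sum((group[m] == 1) & (item[m] == 0))  # focal, incorrect
        n = a + b + c + d
        if n:
            num += a * d / n
            den += b * c / n
    return num / den if den else float("nan")

# Simulated responses where the item is uniformly harder for the focal group
rng = np.random.default_rng(1)
n = 2000
group = rng.integers(0, 2, n)
score = rng.integers(0, 5, n)                    # matched ability stratum
p_correct = 0.2 + 0.15 * score - 0.2 * group     # focal disadvantage = DIF
item = (rng.random(n) < p_correct).astype(int)

alpha = mantel_haenszel_dif(item, group, score)
print(round(alpha, 2))  # substantially above 1: the item favors the reference group
```

Matching on ability before comparing groups is what distinguishes DIF from a simple difference in pass rates: an item may legitimately be harder for a lower-achieving group without being biased.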

