Linking With External Covariates: Examining Accuracy by Anchor Type, Test Length, Ability Difference, and Sample Size

2019 ◽  
Vol 43 (8) ◽  
pp. 597-610 ◽  
Author(s):  
Anthony D. Albano ◽  
Marie Wiberg

Research has recently demonstrated the use of multiple anchor tests and external covariates to supplement or substitute for common anchor items when linking and equating with nonequivalent groups. This study examines the conditions under which external covariates improve linking and equating accuracy, with internal and external anchor tests of varying lengths and groups of differing abilities. Pseudo forms of a state science test were equated within a resampling study where sample size ranged from 1,000 to 10,000 examinees and anchor tests ranged in length from eight to 20 items, with reading and math scores included as covariates. Frequency estimation linking with an anchor test and external covariate was found to produce the most accurate results under the majority of conditions studied. Practical applications of linking with anchor tests and covariates are discussed.

2021 ◽  
pp. 001312452110045
Author(s):  
Jie Min

The current study investigated the effects of school mobility on the academic achievement of different racial/ethnic groups in four cohorts of students from a very large urban school district. In this study, I compared within-year and between-year mobility and, most importantly, accounted for all the schools students attended over the study period. Using a multiple membership model (MMM), the findings confirmed that, for all student groups, academic achievement was affected more by within-year school mobility than by between-year school mobility. Black students had the highest mobility rates, both for between- and within-year mobility. Although Asian-American students achieved higher reading and math scores on average, they were more negatively impacted by within-year school mobility than other groups. The current study was able to pinpoint the students most at risk for negative outcomes following within-year mobility. The findings are discussed in the context of policy recommendations that can be adopted by school districts.


1994 ◽  
Vol 21 (6) ◽  
pp. 1074-1080 ◽  
Author(s):  
J. Llamas ◽  
C. Diaz Delgado ◽  
M.-L. Lavertu

In this paper, an improved probabilistic method for flood analysis using the probable maximum flood, the beta function, and orthogonal Jacobi polynomials is proposed. The shape of the beta function depends on the sample's characteristics and the bounds of the phenomenon. In addition, a series of Jacobi polynomials is used to improve the beta function and to increase its degree of convergence toward the real flood probability density function. This mathematical model has been tested using a sample of 1000 generated beta-distributed random values. Finally, some practical applications with real data series from important Quebec rivers have been performed; the model solutions for these rivers showed the accuracy of this new method in flood frequency estimation. Key words: probable maximum flood, beta function, orthogonal polynomials, distribution function, flood frequency estimation, data generation, convergence.


2009 ◽  
Vol 4 (2) ◽  
pp. 46-61
Author(s):  
Geoffrey K. Leigh ◽  
Cynthia Robinson ◽  
Steven Bernard Hollingsworth

Building on the increasing number of programs designed to enhance brain development, a program developed in Korea, Brain Respiration, was adapted to a school in Nevada. Classes were offered twice weekly to a class of fourth and fifth grade students with control group classes assessed in the same school. Self-report surveys, teacher observations, and standardized reading and math scores were used to determine effects of the program on the students. Some differences were found in the pretest for the survey and the observation, with control groups scoring higher. There were differences in some post-test scores, with treatment group children scoring higher when differences did occur. There also were differences in the reading and math scores, with control groups scoring higher than the overall treatment group, but not higher when compared to those actively participating in the program. Such differences are discussed as well as other issues possibly influencing the effects.


2017 ◽  
Vol 28 (4) ◽  
pp. 1019-1043 ◽  
Author(s):  
Shi-Fang Qiu ◽  
Xiao-Song Zeng ◽  
Man-Lai Tang ◽  
Wai-Yin Poon

Double sampling is usually applied to collect necessary information for situations in which an infallible classifier is available for validating a subset of the sample that has already been classified by a fallible classifier. Inference procedures have previously been developed based on the partially validated data obtained by the double-sampling process. However, it could happen in practice that such an infallible classifier or gold standard does not exist. In this article, we consider the case in which both classifiers are fallible and propose asymptotic and approximate unconditional test procedures based on six test statistics for a population proportion and five approximate sample size formulas based on the recommended test procedures under two models. Our results suggest that both asymptotic and approximate unconditional procedures based on the score statistic perform satisfactorily for small to large sample sizes and are highly recommended. When sample size is moderate or large, asymptotic procedures based on the Wald statistic with the variance being estimated under the null hypothesis, the likelihood ratio statistic, and the log- and logit-transformation statistics based on both models generally perform well and are hence recommended. The approximate unconditional procedures based on the log-transformation statistic under Model I, and on the Wald statistic with the variance being estimated under the null hypothesis and the log- and logit-transformation statistics under Model II, are recommended when sample size is small. In general, sample size formulae based on the Wald statistic with the variance being estimated under the null hypothesis, the likelihood ratio statistic, and the score statistic are recommended in practical applications. The applicability of the proposed methods is illustrated by a real-data example.


2001 ◽  
Vol 1 (3) ◽  
pp. 295-317 ◽  
Author(s):  
Valentina A. Bali

Proposition 227, passed by California voters in 1998, aimed to dismantle bilingual programs in public schools and to replace them with English-only programs. Bilingual education, a long-standing program in California, involved mostly Hispanic students of limited English skills who were taught initially in their native language, and then were gradually transitioned into English-only classes. Using individual-level data from one southern California school district, I find that in 1998, before Proposition 227, limited-English-proficient (LEP) students enrolled in bilingual classes had lower scores in reading than LEP students who were not enrolled in bilingual classes, and who were, in general, more proficient in English. In math, bilingual students had test scores as good as those of non-bilingual LEPs. But in 1999, after Proposition 227, the same set of bilingual students had reading and math scores that were no worse than those of non-bilingual LEPs. Proposition 227, which interrupted bilingual programs and emphasized English instruction, did not set bilingual LEP students back relative to non-bilingual LEPs, and it may have even benefited them.


2016 ◽  
Vol 106 (10) ◽  
pp. 2783-2816 ◽  
Author(s):  
David Card ◽  
Laura Giuliano

We evaluate a tracking program in a large urban district where schools with at least one gifted fourth grader create a separate “gifted/high achiever” classroom. Most seats are filled by non-gifted high achievers, ranked by previous-year test scores. We study the program's effects on the high achievers using (i) a rank-based regression discontinuity design, and (ii) a between-school/cohort analysis. We find significant effects that are concentrated among black and Hispanic participants. Minorities gain 0.5 standard deviation units in fourth-grade reading and math scores, with persistent gains through sixth grade. We find no evidence of negative or positive spillovers on nonparticipants. (JEL I21, J21, J24)


1986 ◽  
Vol 9 (3) ◽  
pp. 208-213 ◽  
Author(s):  
William D. Dundon ◽  
Trevor E. Sewell ◽  
John L. Manni ◽  
David Goldstein

The WISC-R subtest scores of 159 black LD children of low socioeconomic status were recategorized into Spatial (Sp), Conceptual (C), and Sequential (Sq) scales as recommended by Bannatyne (1974). As a group, the sample displayed the classic Sp > C > Sq pattern. However, only 18 of the subjects (11.3%) were identified in accordance with the requirement that the differences between categories be statistically reliable for each individual. This subgroup was matched with LD controls not demonstrating the Bannatyne pattern. Analyses of longitudinal reading and math scores revealed no differences between groups. It was concluded that the diagnostic utility of the Bannatyne pattern is questionable.


2009 ◽  
Vol 49 (3) ◽  
pp. 291-322 ◽  
Author(s):  
Lynn M. Sargeant

Although music has long had a place in the school, its position has often been precarious, relegated to odd hours and odd locations, and starved of both funds and attention. While at times music and the arts have enjoyed considerable support, these subjects are often the last ones added and the first ones cut from the curriculum. Yet, the arts have passionate advocates as well, including parents and pedagogues who support a holistic model of education that emphasizes humanistic values and aesthetics as well as utilitarian training. Still, music educators have struggled to justify their subject, often relying on extrinsic arguments to support its inclusion in the curriculum. Music, one is told, helps students raise their reading and math scores, improves their self-discipline, and builds community. Such arguments are rarely persuasive to voters concerned with eliminating expensive “frills” or to officials trying to balance tight budgets and raise test scores. Local newspapers bear witness to this struggle, as music and art programs fight to stay alive in American schools. This story, so potent today, has a long history. It dates back to the nineteenth century and the very birth of school music programs. It crosses continents, having as much currency in Europe as it does in North America. Debates over music in the schools are nothing less than debates over the meaning and purpose of education. Music is not one of the “three ‘R's.” Yet, precisely because of music's peripheral


Author(s):  
Riswan Riswan

Item response theory (IRT) models contain one or more parameters. These parameters are unknown, so they must be estimated. This paper aims to determine (1) the effect of sample size (N) on the stability of item parameter estimates; (2) the effect of test length (n) on the stability of examinee parameter estimates; (3) the effect of the model on the stability of item and examinee parameter estimates; (4) the effect of sample size and test length on the stability of item and examinee parameter estimates; and (5) the effect of sample size, test length, and model on the stability of item and examinee parameter estimates. This is a simulation study in which the latent trait (θ) is sampled from a standard normal population, θ ~ N(0, 1), with specific sample sizes (N) and test lengths (n) under the 1PL, 2PL, and 3PL models, using WinGen. Item analysis was carried out using both classical test theory and modern test theory (item response theory), and the data were analyzed in R with the ltm package. The results showed that the larger the sample size (N), the more stable the parameter estimates. Likewise, the greater the test length (n), the more stable the estimated examinee parameter (θ).
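
The data-generation step of a simulation like the one described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the study used WinGen and R's ltm package, whereas this sketch uses only the Python standard library. The function name `simulate_responses` and the distributions chosen for the item parameters (difficulty from N(0, 1), discrimination from U(0.5, 2.0), guessing from U(0.1, 0.25)) are assumptions made for illustration. Only the standard normal latent trait (written θ here) and the 1PL/2PL/3PL model forms come from the abstract.

```python
import math
import random


def simulate_responses(n_examinees, n_items, model="3PL", seed=0):
    """Simulate 0/1 item responses under the 1PL, 2PL, or 3PL model.

    Hypothetical helper: the item-parameter distributions are assumed,
    not taken from the study.
    """
    if model not in ("1PL", "2PL", "3PL"):
        raise ValueError("model must be '1PL', '2PL', or '3PL'")
    rng = random.Random(seed)

    # Latent trait theta ~ N(0, 1), as stated in the abstract.
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_examinees)]

    # Item difficulty b (assumed N(0, 1)).
    b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
    # Discrimination a: fixed at 1 for the 1PL, otherwise assumed U(0.5, 2.0).
    a = [1.0 if model == "1PL" else rng.uniform(0.5, 2.0) for _ in range(n_items)]
    # Guessing c: nonzero only for the 3PL (assumed U(0.1, 0.25)).
    c = [rng.uniform(0.1, 0.25) if model == "3PL" else 0.0 for _ in range(n_items)]

    responses = []
    for t in theta:
        row = []
        for j in range(n_items):
            # P(correct) = c + (1 - c) / (1 + exp(-a * (theta - b)))
            p = c[j] + (1.0 - c[j]) / (1.0 + math.exp(-a[j] * (t - b[j])))
            row.append(1 if rng.random() < p else 0)
        responses.append(row)
    return theta, responses


if __name__ == "__main__":
    theta, resp = simulate_responses(n_examinees=500, n_items=20, model="2PL", seed=1)
    print(len(resp), len(resp[0]))
```

Repeating this for several values of N and n, fitting each model to every simulated data set, and comparing the spread of the resulting estimates would reproduce the kind of stability comparison the abstract reports. The fitting step is omitted here.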

