Randomization Inference for Accounting Researchers

Large Sample and Jackknife Procedures for Small Sample Orthogonal Least Squares Inference

Communications in Statistics - Simulation and Computation ◽

10.1080/03610917508548348 ◽

1975 ◽

Vol 4 (2) ◽

pp. 193-202

Author(s):

Peter Anderson

Keyword(s):

Least Squares ◽

Small Sample ◽

Large Sample ◽

Orthogonal Least Squares


High heterogeneity undermines generalization of differential expression results in RNA-Seq analysis

Human Genomics ◽

10.1186/s40246-021-00308-5 ◽

2021 ◽

Vol 15 (1) ◽

Author(s):

Weitong Cui ◽

Huaru Xue ◽

Lei Wei ◽

Jinghua Jin ◽

Xuewen Tian ◽

...

Keyword(s):

Gene Expression ◽

Differential Expression ◽

Small Sample ◽

Differentially Expressed ◽

Cancer Type ◽

Rna Seq ◽

Sample Sizes ◽

Large Sample ◽

Expression Levels ◽

Gene Expression Levels

Background: RNA sequencing (RNA-Seq) has been widely applied in oncology for monitoring transcriptome changes. However, the emerging problem that high variation of gene expression levels caused by tumor heterogeneity may affect the reproducibility of differential expression (DE) results has rarely been studied. Here, we investigated the reproducibility of DE results for any given number of biological replicates between 3 and 24 and explored why a great many differentially expressed genes (DEGs) were not reproducible. Results: Our findings demonstrate that poor reproducibility of DE results exists not only for small sample sizes, but also for relatively large sample sizes. Quite a few of the DEGs detected are specific to the samples in use, rather than genuinely differentially expressed under different conditions. Poor reproducibility of DE results is mainly caused by high variation of gene expression levels for the same gene in different samples. Even though biological variation may account for much of the high variation of gene expression levels, the effect of outlier count data also needs to be treated seriously, as outlier data severely interfere with DE analysis. Conclusions: High heterogeneity exists not only in tumor tissue samples of each cancer type studied, but also in normal samples. High heterogeneity leads to poor reproducibility of DEGs, undermining generalization of differential expression results. Therefore, it is necessary to use large sample sizes (at least 10 if possible) in RNA-Seq experimental designs to reduce the impact of biological variability, and DE results should be interpreted cautiously unless soundly validated.


Updating Existing Travel Simulation Models with Small-Sample Survey Data Using Parameter Scaling Methods

Transportation Research Record Journal of the Transportation Research Board ◽

10.3141/1607-08 ◽

1997 ◽

Vol 1607 (1) ◽

pp. 55-61

Author(s):

W. Thomas Walker ◽

Scott H. Brady ◽

Charles Taylor

Keyword(s):

Survey Data ◽

Primary Source ◽

Simulation Models ◽

Small Sample ◽

Sample Survey ◽

Secondary Source ◽

Sample Surveys ◽

Large Sample ◽

Scaling Methods ◽

Home Interview

The travel simulation models for many metropolitan areas were originally developed and calibrated with older large-sample travel surveys that can no longer be undertaken given today’s funding constraints. Small-sample travel surveys have been collected as part of model update activities required by the Intermodal Surface Transportation Efficiency Act and the Clean Air Act Amendments. Although providing useful information, these surveys are inadequate for calibrating elaborate simulation models by traditional techniques. Parameter transfer scaling based on small-sample surveys and other secondary source data can be a cost-effective alternative to large-sample surveys when existing models are being updated, particularly when the models tend to be robust and the required changes are relatively small. The use of parameter scaling methods to update the Delaware Valley Planning Commission’s existing travel simulation models is demonstrated. All available sources of data are incorporated into the update process including current survey data, census work trips from the Census Transportation Planning Package (CTPP), transit ridership checks, highway screenline counts, and Highway Performance Monitoring System travel estimates. A synopsis of experience with parameter scaling techniques including the model changes and resulting accuracy is provided. Overall, small-sample-based parameter scaling techniques were judged to be effective. The census CTPP data were evaluated versus the home interview and were found to be useful in the model recalibration effort as a source of small-area employment data by place of work and as a supplement to home interview data for model validation. However, a home interview survey is required as the primary source of travel data for both work and nonwork trips.


Small Sample Dating in China

Radiocarbon ◽

10.1017/s0033822200014314 ◽

1994 ◽

Vol 36 (1) ◽

pp. 47-49 ◽

Cited By ~ 6

Author(s):

Weijian Zhou ◽

M. J. Head ◽

Lauri Kaihola

Keyword(s):

Liquid Scintillation ◽

Small Sample ◽

Quaternary Geology ◽

Liquid Scintillation Spectrometer ◽

Large Sample ◽

National University ◽

Dating Method ◽

Age Limit

The Xi'an Laboratory of Loess and Quaternary Geology has developed a small sample 14C dating facility consisting of a Wallac 1220 Quantulus™ liquid scintillation spectrometer, and a miniature benzene synthesis line based on the synthesis procedures used at the Australian National University (ANU). This line can produce ca. 0.3-ml benzene samples, which are then measured for 14C activity using 0.3-ml Teflon vials developed by Wallac Oy. The counting performance of the Quantulus™ spectrometer using 0.3-ml vials has been evaluated, and a potential age limit of ca. 45,000 BP has been obtained for samples containing up to 250 mg carbon. This dating facility fills the gap between large sample (2.4–6 g carbon) and microsample (<1 mg carbon) handling to form a 14C dating method sequence.


On Some Estimates of Poverty Measures

Calcutta Statistical Association Bulletin ◽

10.1177/0008068319880108 ◽

1988 ◽

Vol 37 (1-2) ◽

pp. 81-90

Author(s):

P. Maiti ◽

M. Pal

Keyword(s):

Small Sample ◽

Large Sample ◽

Poverty Measures

There are now a number of poverty measures available in the literature. Some of the measures are alternatives to each other, and some are claimed to be superior in some sense to many others. While significant work has been done in developing the alternative measures, not much attention has been paid to the problem of estimating these indices. Estimation does not pose very serious problems in large samples, but when one deals with a small sample, which may typically be the case in reality, the situation becomes quite different. In fact, the usual estimators become biased for some of the indices. In this paper, alternative estimators for these cases are proposed. Other properties of the estimators and some other relevant issues are also examined.


Alternative Bias Approximations in Regressions with a Lagged-Dependent Variable

Econometric Theory ◽

10.1017/s0266466600007337 ◽

1993 ◽

Vol 9 (1) ◽

pp. 62-80 ◽

Cited By ~ 41

Author(s):

Jan F. Kiviet ◽

Garry D.A. Phillips

Keyword(s):

Least Squares ◽

Mean Squared Error ◽

Multiple Linear Regression Model ◽

Small Sample ◽

Large Sample ◽

Squared Error ◽

Coefficient Vector ◽

Lagged Dependent Variable ◽

Biased Estimators ◽

The One

The small sample bias of the least-squares coefficient estimator is examined in the dynamic multiple linear regression model with normally distributed white-noise disturbances and an arbitrary number of regressors, all of which are exogenous except for the one-period lagged-dependent variable. We employ large sample ($T \to \infty$) and small disturbance ($\sigma \to 0$) asymptotic theory and derive and compare expressions to $O(T^{-1})$ and to $O(\sigma^2)$, respectively, for the bias in the least-squares coefficient vector. In some simulations and for an empirical example, we examine the mean (squared) error of these expressions and of corrected estimation procedures that yield estimates that are unbiased to $O(T^{-1})$ and to $O(\sigma^2)$, respectively. The large sample approach proves to be superior, easily applicable, and capable of generating more efficient and less biased estimators.
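The small sample bias described in this abstract can be illustrated with a short Monte Carlo sketch. It is a deliberate simplification and not the authors' procedure: it assumes a pure first-order autoregression with no exogenous regressors and no intercept, and the function names `ols_ar1` and `simulate_mean_estimate`, the seed, and the parameter values are all hypothetical choices made for illustration. None of the bias approximations or corrected estimators from the paper are implemented.

```python
import random


def ols_ar1(y):
    """Least-squares slope from regressing y[t] on y[t-1] (no intercept)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den


def simulate_mean_estimate(rho, T, reps, seed=0):
    """Average the least-squares estimate of rho over `reps` simulated series.

    Assumed model (illustrative only): y[t] = rho * y[t-1] + e[t],
    with e[t] standard normal white noise and T usable observations.
    """
    rng = random.Random(seed)
    total = 0.0
    for _rep in range(reps):
        # Draw the starting value from the stationary distribution.
        y = [rng.gauss(0.0, 1.0) / (1.0 - rho ** 2) ** 0.5]
        for _t in range(T):
            y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
        total += ols_ar1(y)
    return total / reps


if __name__ == "__main__":
    rho, T = 0.8, 20
    mean_hat = simulate_mean_estimate(rho, T, reps=2000)
    print(f"true rho = {rho}, mean estimate at T = {T}: {mean_hat:.3f}")
```

With a short series, the average estimate falls below the true coefficient, which is the kind of small sample bias that motivates the approximations compared in the paper.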


On the distribution and moments of the strength of a bundle of filaments

Journal of Applied Probability ◽

10.2307/3211948 ◽

1970 ◽

Vol 7 (3) ◽

pp. 712-720 ◽

Cited By ~ 32

Author(s):

M. W. Suh ◽

B. B. Bhattacharyya ◽

A. Grandage

Keyword(s):

Small Sample ◽

Probabilistic Argument ◽

Large Sample

Summary: Small sample and large sample properties of the bundle strength of parallel filaments, studied earlier by Daniels (1945) and Sen, Bhattacharyya, and Suh (1969), have been developed here by probabilistic argument. The statistics belong to a family or class of statistics, each of which forms a reverse semi-martingale sequence. Certain moment properties are also discussed.


On the Subrange and Its Application to the R-Chart

Applied Sciences ◽

10.3390/app112411632 ◽

2021 ◽

Vol 11 (24) ◽

pp. 11632

Author(s):

En Xie ◽

Yizhong Ma ◽

Linhan Ouyang ◽

Chanseok Park

Keyword(s):

Standard Deviation ◽

Sample Size ◽

Correction Factor ◽

Relative Efficiency ◽

Least Squares Method ◽

Small Sample ◽

Large Sample Size ◽

Large Sample ◽

Sample Range ◽

Conventional Sample

The conventional sample range is widely used for the construction of an R-chart. In an R-chart, the sample range estimates the standard deviation, especially in the case of a small sample size. It is well known that the performance of the sample range degrades in the case of a large sample size. In this paper, we investigate the sample subrange as an alternative to the range. This subrange includes the range as a special case. We show that the subrange can improve the estimation of the standard deviation, especially in the case of a large sample size. Note that the original sample range is biased, so a correction factor is used to make it unbiased. Likewise, the original subrange is also biased, and in this paper we provide the correction factor for the subrange. To compare the sample subranges with different trims to the conventional sample range or the sample standard deviation, we provide the theoretical relative efficiency and its values, which can be used to select the best trim of the subrange in the sense of maximizing the relative efficiency. As a practical guideline, we also provide a simple formula for the best trim amount, which is obtained by the least-squares method. It is worth noting that the breakdown point of the conventional sample range is always zero, while that of the sample subrange increases in proportion to the trim amount. As an application of the proposed method, we illustrate how to incorporate it into the construction of the R-chart.
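The contrast between the range and a trimmed subrange can be made concrete with a small sketch. It assumes a symmetric trim, so that the subrange is the difference between the order statistics that remain after dropping `trim` observations from each end. That definition, the function name `subrange`, and the sample values are assumptions made for illustration; the abstract does not spell out the definition. The paper's correction factor and best-trim formula are not reproduced, so the values below are raw (biased) spreads rather than estimates of the standard deviation.

```python
def subrange(sample, trim=0):
    """Spread between order statistics after dropping `trim` points per end.

    trim=0 gives the conventional sample range (max - min).
    Assumed symmetric-trim definition, for illustration only.
    """
    xs = sorted(sample)
    n = len(xs)
    if trim < 0 or n - 2 * trim < 2:
        raise ValueError("trim must leave at least two observations")
    return xs[n - 1 - trim] - xs[trim]


if __name__ == "__main__":
    clean = [9.8, 10.1, 10.0, 9.9, 10.2]
    contaminated = [9.8, 10.1, 10.0, 9.9, 25.0]  # one gross outlier

    for name, data in (("clean", clean), ("contaminated", contaminated)):
        print(name, "range:", subrange(data, 0), "subrange(trim=1):", subrange(data, 1))
```

A single outlier inflates the range without bound, while the subrange with one point trimmed from each end is unaffected by it, which is the zero breakdown point of the range that the abstract mentions.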


Performance of the Beta-Binomial Model for Clustered Binary Responses: Comparison with Generalized Estimating Equations

Journal of Modern Applied Statistical Methods ◽

10.22237/jmasm/1619482380 ◽

2021 ◽

Vol 19 (1) ◽

pp. 2-25

Author(s):

Seongah Im

Keyword(s):

Monte Carlo ◽

Monte Carlo Simulations ◽

Generalized Estimating Equations ◽

Estimating Equations ◽

Small Sample ◽

Binomial Model ◽

Sample Sizes ◽

Binary Responses ◽

Large Sample ◽

Generalized Estimating

This study examined the performance of the beta-binomial model in comparison with generalized estimating equations (GEE) using clustered binary responses resulting in non-normal outcomes. Monte Carlo simulations were performed under varying intracluster correlations and sample sizes. The results showed that the beta-binomial model performed better for small samples, while GEE performed well for large samples.


Maiasaura, a model organism for extinct vertebrate population biology: a large sample statistical assessment of growth dynamics and survivorship

Paleobiology ◽

10.1017/pab.2015.19 ◽

2015 ◽

Vol 41 (4) ◽

pp. 503-527 ◽

Cited By ~ 45

Author(s):

Holly N. Woodward ◽

Elizabeth A. Freedman Fowler ◽

James O. Farlow ◽

John R. Horner

Keyword(s):

Mortality Rate ◽

Population Biology ◽

Skeletal Maturity ◽

Model Organism ◽

Growth Dynamics ◽

Small Sample ◽

Peak Performance ◽

First Year ◽

Large Sample ◽

Fossil Records

Abstract: Fossil bone microanalyses reveal the ontogenetic histories of extinct tetrapods, but incomplete fossil records often result in small sample sets lacking statistical strength. In contrast, a histological sample of 50 tibiae of the hadrosaurid dinosaur Maiasaura peeblesorum allows predictions of annual growth and ecological interpretations based on more histologic data than any previous large sample study. Tibia length correlates well (R² > 0.9) with diaphyseal circumference, cortical area, and bone wall thickness, thereby allowing longitudinal predictions of annual body size increases based on growth mark circumference measurements. With an avian level apposition rate of 86.4 μm/day, Maiasaura achieved over half of asymptotic tibia diaphyseal circumference within its first year. Mortality rate for the first year was 89.9% but a seven year period of peak performance followed, when survivorship (mean mortality rate = 12.7%) was highest. During the third year of life, Maiasaura attained 36% (x=1260 kg) of asymptotic body mass, growth rate was decelerating (18.2 μm/day), cortical vascular orientation changed, and mortality rate briefly increased. These transitions may indicate onset of sexual maturity and corresponding reallocation of resources to reproduction. Skeletal maturity and senescence occurred after 8 years, at which point the mean mortality rate increased to 44.4%. Compared with Alligator, an extant relative, Maiasaura exhibits rapid cortical increase early in ontogeny, while Alligator cortical growth is much lower and protracted throughout ontogeny. Our life history synthesis of Maiasaura utilizes the largest histological sample size for any extinct tetrapod species thus far, demonstrating how large sample microanalyses strengthen paleobiological interpretations.
