Parameters Estimation for Wear-out Failure Period of Three-Parameter Weibull Distribution

2021 ◽  
Vol 11 (1) ◽  
pp. 40
Author(s):  
Takatoshi Sugiyama ◽  
Toru Ogura

Shape parameter estimation using the minimum-variance linear estimator with hyperparameter (MVLE-H) method is believed to be effective for the wear-out failure period in small samples. In the estimation process, the MVLE-H method uses a hyperparameter to estimate the shape parameter. Obtaining the optimal hyperparameter c is time-consuming, however, even for small samples. The main purpose of this paper is to remove the restriction to small samples. We observed that, for a fixed shape parameter, the optimal c can be inferred from the sample size n using a regression equation. We therefore searched sample sizes in increments of five and interpolated the hyperparameter for the remaining sample sizes with a linear regression line. We used Monte Carlo simulations (MCSs) to determine the optimal hyperparameter for various sample sizes and shape parameters of the MVLE-H method. In essence, we showed that the MVLE-H method performs well once the hyperparameter is determined. Further, we showed that the location and scale parameter estimates are improved by using the shape parameter estimated with the MVLE-H method. We verified the validity of the MVLE-H method using MCSs and a numerical example.
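The interpolation step can be sketched as follows. The (n, c) pairs below are made up for illustration, not taken from the paper: a coarse search in steps of five supplies a few optimal hyperparameters, and a linear regression line fills in the remaining sample sizes.

```python
import numpy as np

# Hypothetical (sample size, optimal hyperparameter) pairs found by a
# coarse Monte Carlo search in steps of five; values are illustrative only.
n_grid = np.array([10, 15, 20, 25, 30, 35, 40])
c_grid = np.array([0.42, 0.47, 0.51, 0.56, 0.60, 0.65, 0.69])

# Fit a linear regression line c ~ a*n + b over the searched grid.
a, b = np.polyfit(n_grid, c_grid, deg=1)

def optimal_c(n):
    """Interpolate the hyperparameter for a sample size not in the grid."""
    return a * n + b

print(round(optimal_c(27), 3))
```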

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Florent Le Borgne ◽  
Arthur Chatton ◽  
Maxime Léger ◽  
Rémi Lenain ◽  
Yohann Foucher

In clinical research, there is growing interest in the use of propensity score-based methods to estimate causal effects. G-computation is an alternative because of its high statistical power. Machine learning is also increasingly used because of its possible robustness to model misspecification. In this paper, we aimed to propose an approach that combines machine learning and G-computation when both the outcome and the exposure status are binary, and that is able to deal with small samples. We evaluated the performance of several methods through simulations, including penalized logistic regressions, a neural network, a support vector machine, boosted classification and regression trees, and a super learner. We proposed six different scenarios characterised by various sample sizes, numbers of covariates, and relationships between covariates, exposure statuses, and outcomes. We also illustrated the application of these methods by using them to estimate the efficacy of barbiturates prescribed during the first 24 h of an episode of intracranial hypertension. In the context of G-computation, for estimating the individual outcome probabilities in the two counterfactual worlds, we found that the super learner tended to outperform the other approaches in terms of both bias and variance, especially for small sample sizes. The support vector machine also performed well, but its mean bias was slightly higher than that of the super learner. In the investigated scenarios, G-computation combined with the super learner was a performant method for drawing causal inferences, even from small sample sizes.
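The core of G-computation, predicting each patient's outcome under both counterfactual exposures and contrasting the averages, can be sketched with a plain logistic outcome model standing in for the paper's super learner. The data below are simulated; nothing here comes from the study itself.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulated data: confounder L, binary exposure A, binary outcome Y.
n = 2000
L = rng.normal(size=n)
A = rng.binomial(1, 1 / (1 + np.exp(-L)))               # exposure depends on L
Y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * A + L))))   # true conditional log-OR = 0.5

X = np.column_stack([np.ones(n), A, L])

def nll(beta):
    """Logistic negative log-likelihood (numerically stable form)."""
    eta = X @ beta
    return np.sum(np.logaddexp(0, eta) - Y * eta)

beta = minimize(nll, np.zeros(3), method="BFGS").x

def counterfactual_risk(a):
    """Average predicted outcome probability if everyone had exposure a."""
    Xa = np.column_stack([np.ones(n), np.full(n, a), L])
    return np.mean(1 / (1 + np.exp(-(Xa @ beta))))

# G-computation: contrast the two counterfactual worlds.
risk_difference = counterfactual_risk(1) - counterfactual_risk(0)
print(round(risk_difference, 3))
```

In the paper's approach, the outcome model fitted here would be replaced by the super learner, and confidence intervals would come from resampling.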


2016 ◽  
Vol 41 (5) ◽  
pp. 472-505 ◽  
Author(s):  
Elizabeth Tipton ◽  
Kelly Hallberg ◽  
Larry V. Hedges ◽  
Wendy Chan

Background: Policy makers and researchers are frequently interested in understanding how effective a particular intervention may be for a specific population. One approach is to assess the degree of similarity between the sample in an experiment and the population. Another is to combine information from the experiment and the population to estimate the population average treatment effect (PATE). Method: Several methods for assessing the similarity between a sample and a population currently exist, as do methods for estimating the PATE. In this article, we investigate the properties of six of these methods and statistics at the small sample sizes common in education research (i.e., 10–70 sites), evaluating the utility of rules of thumb developed from observational studies in the generalization case. Result: In small random samples, large differences between the sample and the population can arise simply by chance, and many of the statistics commonly used in generalization are a function of both the sample size and the number of covariates being compared. The rules of thumb developed in observational studies (which are commonly applied in generalization) are much too conservative given the small sample sizes found in generalization. Conclusion: This article implies that sharp inferences to large populations from small experiments are difficult even with probability sampling. Features of random samples should be kept in mind when evaluating the extent to which results from experiments conducted on nonrandom samples might generalize.
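One commonly used similarity statistic is the standardized mean difference (SMD) between sample and population covariate means. The sketch below (simulated data, not the article's) shows how it is computed and why even a true random sample of 30 sites can drift noticeably from its population by chance alone:

```python
import numpy as np

rng = np.random.default_rng(1)

# A "population" of 10,000 sites with two covariates, and a genuinely
# random sample of 30 sites drawn from it.
population = rng.normal(size=(10_000, 2))
sample = population[rng.choice(10_000, size=30, replace=False)]

def smd(sample, population):
    """Absolute standardized mean difference for each covariate."""
    return np.abs(sample.mean(axis=0) - population.mean(axis=0)) / population.std(axis=0)

# Nonzero SMDs arise purely from sampling error (SE is about 1/sqrt(30) = 0.18),
# so observational-study cutoffs like 0.25 are easy to hit by chance.
print(smd(sample, population))
```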


2005 ◽  
Vol 28 (3) ◽  
pp. 283-294 ◽  
Author(s):  
Jin-Shei Lai ◽  
Jeanne Teresi ◽  
Richard Gershon

An item with differential item functioning (DIF) displays different statistical properties, conditional on a matching variable. The presence of DIF in measures can invalidate the conclusions of medical outcome studies. Numerous approaches have been developed to examine DIF in many areas, including education and health-related quality of life. There is little consensus in the research community regarding selection of one best method, and most methods require large sample sizes. This article describes some approaches to examine DIF with small samples (e.g., less than 200).
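As a concrete illustration of conditioning on a matching variable, one widely used DIF statistic (not necessarily among the small-sample methods the article describes) is the Mantel-Haenszel common odds ratio, computed over 2x2 tables stratified by the matching score. The counts below are hypothetical:

```python
import numpy as np

# Hypothetical 2x2 tables, one per stratum of the matching variable
# (e.g., total test score). Each row: [a, b, c, d] =
# reference-correct, reference-incorrect, focal-correct, focal-incorrect.
tables = np.array([
    [30, 10, 25, 15],
    [40, 20, 35, 25],
    [20, 25, 15, 30],
])

def mantel_haenszel_or(tables):
    """Mantel-Haenszel common odds ratio across strata; 1.0 means no DIF."""
    a, b, c, d = tables.T
    n = tables.sum(axis=1)
    return np.sum(a * d / n) / np.sum(b * c / n)

print(round(mantel_haenszel_or(tables), 3))
```

An odds ratio well away from 1.0 suggests the item favors one group even after matching on ability; small samples make the stratum counts, and hence this statistic, unstable.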


2019 ◽  
Author(s):  
Andrea Cardini ◽  
Paul O’Higgins ◽  
F. James Rohlf

Using sampling experiments, we found that, when there are fewer groups than variables, between-groups PCA (bgPCA) may suggest surprisingly distinct differences among groups for data in which none exist. While apparently not noticed before, the reasons for this problem are easy to understand. A bgPCA captures the g-1 dimensions of variation among the g group means, but only a fraction of the ∑ni − g dimensions of within-group variation (ni are the sample sizes), when the number of variables, p, is greater than g-1. This introduces a distortion in the appearance of the bgPCA plots because the within-group variation will be underrepresented, unless the variables are sufficiently correlated so that the total variation can be accounted for with just g-1 dimensions. The effect is most obvious when sample sizes are small relative to the number of variables, because smaller samples spread out less, but the distortion is present even for large samples. Strong covariance among variables largely reduces the magnitude of the problem, because it effectively reduces the dimensionality of the data and thus enables a larger proportion of the within-group variation to be accounted for within the g-1-dimensional space of a bgPCA. The distortion will still be relevant, though its strength will vary from case to case depending on the structure of the data (p, g, covariances etc.). These are important problems for a method mainly designed for the analysis of variation among groups when there are very large numbers of variables and relatively small samples. In such cases, users are likely to conclude that the groups they are comparing are much more distinct than they really are. Having many variables but just small sample sizes is a common problem in fields ranging from morphometrics (as in our examples) to molecular analyses.
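The effect is easy to reproduce in a few lines. The sketch below (a minimal bgPCA on pure noise, with p much larger than g) shows groups that appear clearly separated even though no real differences exist:

```python
import numpy as np

rng = np.random.default_rng(2)

g, n, p = 3, 10, 100                 # 3 groups, 10 cases each, 100 variables
X = rng.normal(size=(g * n, p))      # pure noise: NO real group differences
labels = np.repeat(np.arange(g), n)

# bgPCA: principal axes of the g group means only (g-1 dimensions).
means = np.array([X[labels == k].mean(axis=0) for k in range(g)])
centered = means - means.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
axes = vt[: g - 1]

# Project ALL cases onto the between-group axes.
scores = X @ axes.T

# Apparent separation on bgPC1, despite the data being noise: the axis was
# chosen to spread the means, while within-group spread is underrepresented.
group_means = np.array([scores[labels == k, 0].mean() for k in range(g)])
within_sd = np.mean([scores[labels == k, 0].std() for k in range(g)])
separation = np.ptp(group_means) / within_sd
print(separation)   # substantially greater than 1 for noise data
```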


2006 ◽  
Vol 361 (1475) ◽  
pp. 2023-2037 ◽  
Author(s):  
Thomas P Curtis ◽  
Ian M Head ◽  
Mary Lunn ◽  
Stephen Woodcock ◽  
Patrick D Schloss ◽  
...  

The extent of microbial diversity is an intrinsically fascinating subject of profound practical importance. The term ‘diversity’ may allude to the number of taxa or species richness as well as their relative abundance. There is uncertainty about both, primarily because sample sizes are too small. Non-parametric diversity estimators make gross underestimates if used with small sample sizes on unevenly distributed communities. One can make richness estimates over many scales using small samples by assuming a species/taxa-abundance distribution. However, no one knows what the underlying taxa-abundance distributions are for bacterial communities. Latterly, diversity has been estimated by fitting data from gene clone libraries and extrapolating from this to taxa-abundance curves to estimate richness. However, since sample sizes are small, we cannot be sure that such samples are representative of the community from which they were drawn. It is however possible to formulate, and calibrate, models that predict the diversity of local communities and of samples drawn from that local community. The calibration of such models suggests that migration rates are small and decrease as the community gets larger. The preliminary predictions of the model are qualitatively consistent with the patterns seen in clone libraries in ‘real life’. The validation of this model is also confounded by small sample sizes. However, if such models were properly validated, they could form invaluable tools for the prediction of microbial diversity and a basis for the systematic exploration of microbial diversity on the planet.
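One standard nonparametric richness estimator, Chao1, illustrates the underestimation described above. The sketch below (a simulated, highly uneven community; not the authors' data) draws a small sample and compares the Chao1 estimate with the true richness:

```python
import numpy as np

rng = np.random.default_rng(3)

# A highly uneven community: 1,000 taxa with lognormal abundances.
true_richness = 1000
abundances = rng.lognormal(mean=0, sigma=2, size=true_richness)
probs = abundances / abundances.sum()

def chao1(counts):
    """Chao1 richness estimate (bias-corrected form) from taxon counts."""
    observed = np.count_nonzero(counts)
    f1 = np.sum(counts == 1)   # singletons
    f2 = np.sum(counts == 2)   # doubletons
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))

# A small sample of 500 individuals drawn from the community.
sample = rng.multinomial(500, probs)
estimate = chao1(sample)
print(estimate, "vs true richness", true_richness)   # gross underestimate
```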


2019 ◽  
Vol 80 (3) ◽  
pp. 499-521
Author(s):  
Ben Babcock ◽  
Kari J. Hodge

Equating and scaling in the context of small sample exams, such as credentialing exams for highly specialized professions, has received increased attention in recent research. Investigators have proposed a variety of both classical and Rasch-based approaches to the problem. This study attempts to extend past research by (1) directly comparing classical and Rasch techniques of equating exam scores when sample sizes are small (N ≤ 100 per exam form) and (2) attempting to pool multiple forms' worth of data to improve estimation in the Rasch framework. We simulated multiple years of a small-sample exam program by resampling from a larger certification exam program's real data. Results showed that combining multiple administrations' worth of data via the Rasch model can lead to more accurate equating compared to classical methods designed to work well in small samples. WINSTEPS-based Rasch methods that used multiple exam forms' data worked better than Bayesian Markov Chain Monte Carlo methods, as the prior distribution used to estimate the item difficulty parameters biased predicted scores when there were difficulty differences between exam forms.
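For orientation, the simplest classical baseline in this family is mean equating under a random-groups design: shift form-Y scores by the difference in form means. The sketch below uses simulated scores (not the study's resampled certification data):

```python
import numpy as np

rng = np.random.default_rng(4)

# Random-groups design: two equivalent examinee groups take forms X and Y.
# Same ability distribution, but form Y is about 5 points harder.
x = rng.normal(60, 8, size=90)   # form X total scores (N <= 100, as in the study)
y = rng.normal(55, 8, size=90)   # form Y total scores

def mean_equate(score, x_scores, y_scores):
    """Classical mean equating: shift a form-Y score onto the form-X scale."""
    return score + x_scores.mean() - y_scores.mean()

# A form-Y raw score of 55 maps to roughly 60 on the form-X scale.
print(round(mean_equate(55.0, x, y), 1))
```

With samples this small, the equating constant itself is noisy, which is the motivation for pooling data across administrations in the Rasch framework.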


2016 ◽  
Author(s):  
Narkis S. Morales ◽  
Ignacio C. Fernández ◽  
Victoria Baca-González

Environmental niche modeling (ENM) is commonly used to develop probabilistic maps of species distribution. Among available ENM techniques, MaxEnt has become one of the most popular tools for modeling species distribution, with hundreds of peer-reviewed articles published each year. MaxEnt's popularity is mainly due to its graphical interface and automatic parameter configuration capabilities. However, recent studies have shown that the default automatic configuration may not always be appropriate, because it can produce non-optimal models, particularly when dealing with a small number of species presence points. The recommendation is therefore to evaluate the best potential combination of parameters (feature classes and regularization multiplier) and select the most appropriate model. In this work we reviewed 244 articles from 142 journals published between 2013 and 2015 to assess whether researchers are following recommendations to avoid using the default parameter configuration when dealing with small sample sizes, or whether they are using MaxEnt as a "black box" tool. Our results show that in only 16% of the analyzed articles did the authors evaluate the best feature classes, in 6.9% the best regularization multipliers, and in a meager 3.7% both parameters simultaneously before producing the definitive distribution model. These results are worrying, because publications may be reporting over-complex or over-simplistic models that undermine the applicability of their results. Of particular concern are studies used to inform policy making. Researchers, practitioners, reviewers, and editors therefore need to be very judicious when dealing with MaxEnt, particularly when the modeling process is based on small sample sizes.
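The recommended tuning amounts to a grid search over candidate settings rather than trusting one default. The schematic sketch below evaluates regularization multipliers for a penalized logistic model standing in for MaxEnt (real analyses would use MaxEnt itself, e.g. via tools such as the ENMeval R package); the data are simulated presence/background points:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)

# Toy presence/background data with two environmental covariates;
# a small sample, where default settings are least safe.
n = 120
X = rng.normal(size=(n, 2))
y = rng.binomial(1, 1 / (1 + np.exp(-1.5 * X[:, 0])))

train, test = np.arange(80), np.arange(80, n)

def fit(lam):
    """Fit an L2-penalized logistic model with regularization multiplier lam."""
    def penalized_nll(b):
        eta = X[train] @ b
        return np.sum(np.logaddexp(0, eta) - y[train] * eta) + lam * b @ b
    return minimize(penalized_nll, np.zeros(2), method="BFGS").x

def heldout_loglik(b):
    """Held-out log-likelihood used as the model-selection criterion."""
    eta = X[test] @ b
    return np.sum(y[test] * eta - np.logaddexp(0, eta))

# Evaluate candidate regularization multipliers instead of a single default.
grid = [0.01, 0.1, 0.5, 1.0, 2.0, 5.0]
best = max(grid, key=lambda lam: heldout_loglik(fit(lam)))
print("selected regularization multiplier:", best)
```

In a full MaxEnt workflow the grid would also span feature classes (linear, quadratic, hinge, product), the combination the reviewed articles almost never evaluated.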


2019 ◽  
Vol 3 ◽  
Author(s):  
Nicolas Haverkamp ◽  
André Beauducel

To derive recommendations on how to analyze longitudinal data, we examined Type I error rates of multilevel linear models (MLM) and repeated measures analysis of variance (rANOVA) using SAS and SPSS. We performed a simulation with the following specifications: to explore the effects of high numbers of measurement occasions and small sample sizes on Type I error, measurement occasions of m = 9 and 12 were investigated, as well as sample sizes of n = 15, 20, 25, and 30. Effects of non-sphericity in the population on Type I error were also inspected: 5,000 random samples were drawn from two populations containing neither a within-subject nor a between-group effect. They were analyzed using the most common options for correcting rANOVA and MLM results: the Huynh-Feldt correction for rANOVA (rANOVA-HF) and the Kenward-Roger correction for MLM (MLM-KR), which could help correct the progressive bias of MLM with an unstructured covariance matrix (MLM-UN). Uncorrected rANOVA and MLM assuming a compound symmetry covariance structure (MLM-CS) were also taken into account. The results showed a progressive bias for MLM-UN with small samples that was stronger in SPSS than in SAS. Moreover, an appropriate bias correction of Type I error via rANOVA-HF and an insufficient correction by MLM-UN-KR for n < 30 were found. These findings suggest MLM-CS or rANOVA if sphericity holds, and correction of a violation via rANOVA-HF. If an analysis requires MLM, SPSS yields more accurate Type I error rates for MLM-CS, and SAS yields more accurate Type I error rates for MLM-UN.
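The simulation logic, drawing many samples under a true null and counting how often the test rejects at alpha = .05, can be sketched with a simple two-sample t-test standing in for the rANOVA/MLM analyses (which require dedicated mixed-model software):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Empirical Type I error rate under a true null hypothesis, mirroring the
# design above: many simulated samples, count rejections at p < .05.
n, reps, alpha = 15, 5000, 0.05      # small sample size, as in the study
rejections = 0
for _ in range(reps):
    g1 = rng.normal(size=n)
    g2 = rng.normal(size=n)          # no true effect anywhere
    if stats.ttest_ind(g1, g2).pvalue < alpha:
        rejections += 1

type_i_rate = rejections / reps
print(type_i_rate)                   # should sit close to the nominal 0.05
```

A well-calibrated procedure yields a rate near .05; the "progressive bias" reported for MLM-UN means this rate climbs well above the nominal level as conditions worsen.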


2020 ◽  
Vol 9 (6) ◽  
pp. 39
Author(s):  
Toru Ogura ◽  
Takatoshi Sugiyama ◽  
Nariaki Sugiura

We propose a method to estimate the shape parameter of a three-parameter Weibull distribution. The proposed method first derives an unbiased estimator for the shape parameter, independent of the location and scale parameters, and then estimates the shape parameter using a minimum-variance linear unbiased estimator. Since the proposed method is expressed using a hyperparameter, the optimal hyperparameter is searched for using Monte Carlo simulations. The recommended hyperparameter for estimating the shape parameter depends on the sample size; this causes no problems, since the sample size is known when the data are obtained. The proposed method is evaluated using bias and root mean squared error, and the results are very promising when the population shape parameter is 2 or more, the range of the Weibull distribution representing the wear-out failure period. A numerical dataset is analyzed to demonstrate the practical use of the proposed method.
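For context, the generic alternative the paper improves upon is joint maximum-likelihood fitting of all three parameters. A minimal sketch with SciPy's built-in MLE on simulated wear-out data (this is the baseline approach, not the paper's estimator):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Simulated wear-out failure times: shape 2.5 (i.e., >= 2, the wear-out
# regime), location 10, scale 5.
data = stats.weibull_min.rvs(2.5, loc=10, scale=5, size=500, random_state=rng)

# Generic three-parameter maximum-likelihood fit via SciPy; crude starting
# values are supplied because the three-parameter likelihood can be awkward.
shape, loc, scale = stats.weibull_min.fit(data, 2.0, loc=9.0, scale=4.0)
print(round(shape, 2), round(loc, 2), round(scale, 2))
```

With small samples this joint MLE becomes unstable, which is precisely the setting where a dedicated small-sample shape estimator is attractive.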


2017 ◽  
Vol 313 (5) ◽  
pp. L873-L877 ◽  
Author(s):  
Charity J. Morgan

In this review I discuss the appropriateness of various statistical methods for use with small sample sizes. I review the assumptions and limitations of these methods and provide recommendations for figures and statistical tests.

