two phase sampling
Recently Published Documents


Total documents: 219 (five years: 22)
H-index: 14 (five years: 0)

Author(s):  
Manoj Kumar Chaudhary ◽  
Amit Kumar ◽  
Gautam K. Vishwakarma

In this paper, we propose improved estimators of the population mean that use information on two auxiliary variables under two-phase sampling with non-response. We assume that the study variable and the first auxiliary variable are subject to non-response, while the second (additional) auxiliary variable is free from it. We derive expressions for the biases and mean square errors of the proposed estimators and compare them with those of the usual estimator and some well-known existing estimators of the population mean. The theoretical results are also illustrated with empirical data.
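
To make the setting concrete, here is a minimal Python sketch of two textbook building blocks: the Hansen-Hurwitz mean, which combines respondents with a followed-up subsample of non-respondents, and the classical two-phase ratio estimator. It is not one of the estimators proposed in the paper. The function names and the numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean


def hansen_hurwitz_mean(respondents, followed_up, n_nonrespondents):
    """Hansen-Hurwitz mean under non-response.

    respondents      -- values observed from units that responded
    followed_up      -- values from a subsample of the non-respondents
    n_nonrespondents -- total number of non-respondents in the sample
    """
    n_resp = len(respondents)
    n_total = n_resp + n_nonrespondents
    return (n_resp * mean(respondents)
            + n_nonrespondents * mean(followed_up)) / n_total


def two_phase_ratio_estimate(y_bar, x_bar, x_bar_first_phase):
    """Classical two-phase ratio estimator: y_bar * (x_bar' / x_bar).

    x_bar_first_phase is the auxiliary mean from the large first-phase
    sample; y_bar and x_bar come from the smaller second-phase sample.
    """
    return y_bar * x_bar_first_phase / x_bar


# Hypothetical data: 3 respondents, 3 non-respondents, 2 of them followed up.
y_adjusted = hansen_hurwitz_mean([10, 12, 14], [20, 22], n_nonrespondents=3)
estimate = two_phase_ratio_estimate(y_adjusted, x_bar=5.0, x_bar_first_phase=6.0)
```

In the paper's setting the first auxiliary variable is also subject to non-response, so its second-phase mean would receive the same kind of adjustment before entering the ratio.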



2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Sunwoo Han ◽  
Brian D. Williamson ◽  
Youyi Fong

Background: While random forests are one of the most successful machine learning methods, it is necessary to optimize their performance for use with datasets resulting from a two-phase sampling design with a small number of cases, a common situation in biomedical studies, which often have rare outcomes and covariates whose measurement is resource-intensive.

Methods: Using an immunologic marker dataset from a phase III HIV vaccine efficacy trial, we seek to optimize random forest prediction performance using combinations of variable screening, class balancing, weighting, and hyperparameter tuning.

Results: Our experiments show that while class balancing helps improve random forest prediction performance when variable screening is not applied, class balancing has a negative impact on performance in the presence of variable screening. The impact of the weighting similarly depends on whether variable screening is applied. Hyperparameter tuning is ineffective in situations with small sample sizes. We further show that random forests under-perform generalized linear models for some subsets of markers, that prediction performance on this dataset can be improved by stacking random forests and generalized linear models trained on different subsets of predictors, and that the extent of improvement depends critically on the dissimilarities between candidate learner predictions.

Conclusion: In small datasets from a two-phase sampling design, variable screening and inverse sampling probability weighting are important for achieving good prediction performance of random forests. In addition, stacking random forests and simple linear models can offer improvements over random forests.
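
The inverse sampling probability weighting mentioned in the conclusion can be sketched as follows. This is a minimal illustration, assuming a design in which each stratum (here "case" and "control") is subsampled at its own rate for phase two. The counts and function names are hypothetical and are not taken from the trial data.

```python
def inverse_probability_weights(phase_one_counts, phase_two_counts):
    """Weight for each stratum = 1 / sampling probability = N / n."""
    weights = {}
    for stratum, n_phase_one in phase_one_counts.items():
        n_phase_two = phase_two_counts[stratum]
        if n_phase_two == 0:
            raise ValueError(f"no phase-two units in stratum {stratum!r}")
        weights[stratum] = n_phase_one / n_phase_two
    return weights


def weighted_mean(values, strata, weights):
    """Weighted mean of phase-two values using the stratum weights."""
    total_weight = sum(weights[s] for s in strata)
    weighted_sum = sum(weights[s] * v for v, s in zip(values, strata))
    return weighted_sum / total_weight


# Hypothetical design: all 20 cases and 98 of 980 controls enter phase two.
weights = inverse_probability_weights(
    {"case": 20, "control": 980},
    {"case": 20, "control": 98},
)
# weights == {"case": 1.0, "control": 10.0}
```

Weights of this kind could be supplied as per-observation weights to a learner that accepts them, so that the phase-two sample stands in for the full phase-one cohort.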



Author(s):  
Aamir Raza ◽  
Muhammad Noor-ul-Amin

Estimating the population mean with the ordinary least squares method is not meaningful when the data contain outliers. In the current study, we propose efficient estimators of the population mean using robust regression in two-phase sampling. An extensive simulation study is conducted to examine the efficiency of the proposed estimators in terms of mean square error (MSE). A real-life example and the simulation study demonstrate the performance of the proposed estimators. The theoretical example and simulation studies show that the suggested estimators are more efficient than the considered estimators in the presence of outliers.
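
The idea of replacing the least squares slope with a robust one can be sketched with the two-phase regression estimator, y_bar + b * (x_bar' - x_bar). The abstract does not say which robust regression methods the authors use, so the sketch below substitutes the Theil-Sen slope (the median of pairwise slopes) as a simple stand-in. All names and data are hypothetical.

```python
from itertools import combinations
from statistics import mean, median


def theil_sen_slope(x, y):
    """Robust slope: median of the slopes over all pairs of points."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two distinct x values")
    return median(slopes)


def two_phase_regression_estimate(x_first_phase, x_second, y_second, slope):
    """Regression estimator: second-phase mean of y, adjusted by the gap
    between the first-phase and second-phase means of x."""
    return mean(y_second) + slope * (mean(x_first_phase) - mean(x_second))


# Hypothetical second-phase sample with one gross outlier in y.
x2 = [1, 2, 3, 4, 5]
y2 = [2, 4, 6, 8, 100]
slope = theil_sen_slope(x2, y2)  # stays at 2.0 despite the outlier
estimate = two_phase_regression_estimate([1, 2, 3, 4, 5, 6, 7], x2, y2, slope)
```

Only the slope is made robust here. The second-phase mean of y is still pulled by the outlier, which a fuller treatment would also need to address.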



2021 ◽  
pp. 457-482
Author(s):  
Sharon L. Lohr




Author(s):  
B. K. Singh

Abstract: In this paper, the authors propose a class of exponential dual-to-ratio-type compromised imputation techniques and the corresponding point estimator in a two-phase sampling design. Two different sampling designs in two-phase sampling are compared under imputed data. The bias and M.S.E. of the suggested estimator are derived in terms of population parameters using a large-sample approximation. A numerical study is performed on two populations using the expressions for the bias and M.S.E., and the efficiency is compared with that of existing estimators. Keywords: Missing data, Bias, Mean squared error (M.S.E.), Two-phase sampling, SRSWOR, Compromised Imputation (C.I.).



2021 ◽  
Vol 19 (1) ◽  
pp. 2-16
Author(s):  
Gajendra Kumar Vishwakarma ◽  
Sayed Mohammed Zeeshan

A method is developed to lower the MSE of a proposed estimator relative to the MSE of the linear regression estimator under a two-phase sampling scheme. Estimators are developed to estimate the mean of the variate under study with the help of an auxiliary variate (whose values are unknown but can be accessed conveniently and economically). Mean square error expressions are obtained for the proposed estimators. In addition, optimal sample sizes are obtained under a given cost function. A comparison study establishes the conditions under which the developed estimators are more effective than other estimators. An empirical study is also performed to support the claim that the developed estimators are more efficient.
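
As an illustration of how optimal sample sizes follow from a cost function, the sketch below uses the textbook approximation for the two-phase linear regression estimator, V ≈ S²(1 − ρ²)/n + S²ρ²/n′ (finite population correction ignored), together with a linear cost C = c′·n′ + c·n. These are assumptions for illustration, not the paper's own MSE expressions or cost function, and the function name and numbers are hypothetical.

```python
from math import sqrt


def optimal_two_phase_sizes(total_cost, cost_first, cost_second, rho):
    """Sample sizes (n_first, n_second) minimizing
    V = S^2 (1 - rho^2) / n_second + S^2 rho^2 / n_first
    subject to cost_first * n_first + cost_second * n_second = total_cost.
    """
    if not 0 < abs(rho) < 1:
        raise ValueError("rho must satisfy 0 < |rho| < 1")
    second_term = sqrt((1 - rho ** 2) * cost_second)
    first_term = sqrt(rho ** 2 * cost_first)
    denominator = first_term + second_term
    n_first = total_cost * sqrt(rho ** 2 / cost_first) / denominator
    n_second = total_cost * sqrt((1 - rho ** 2) / cost_second) / denominator
    return n_first, n_second


# Hypothetical costs: first-phase units cost 1, second-phase units cost 16.
n_first, n_second = optimal_two_phase_sizes(
    total_cost=1000, cost_first=1, cost_second=16, rho=0.8
)
# Roughly 250 first-phase and 47 second-phase units before rounding.
```

The allocation comes from minimizing the variance under the budget constraint, which makes each sample size proportional to the square root of its variance component divided by its unit cost. A cheaper auxiliary measurement or a stronger correlation shifts the budget toward the first phase.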




