phase sampling
Recently Published Documents



Author(s):  
Manoj Kumar Chaudhary ◽  
Amit Kumar ◽  
Gautam K. Vishwakarma

In this paper, we propose improved estimators of the population mean that use information on two auxiliary variables under two-phase sampling in the presence of non-response. We assume that the study variable and the first auxiliary variable are subject to non-response, while the second (additional) auxiliary variable is free from it. We derive expressions for the biases and mean square errors of the proposed estimators and compare them with those of the usual estimator and some well-known existing estimators of the population mean. The theoretical results are illustrated with empirical data.
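For orientation, the baseline that such proposals usually build on can be sketched as follows. This is the classical textbook double-sampling ratio estimator, not the estimators proposed in the paper, and it assumes full response. The function name and the toy numbers are hypothetical, chosen only for illustration.

```python
from statistics import fmean


def two_phase_ratio_estimate(x_phase1, x_phase2, y_phase2):
    """Classical double-sampling ratio estimate of the population mean of y.

    x_phase1: auxiliary values from the large first-phase sample
    x_phase2: auxiliary values from the second-phase subsample
    y_phase2: study-variable values from the same second-phase units
    (Hypothetical helper; assumes no non-response.)
    """
    if len(x_phase2) != len(y_phase2):
        raise ValueError("second-phase x and y must have the same length")
    x1_bar = fmean(x_phase1)  # cheap auxiliary mean from phase 1
    x2_bar = fmean(x_phase2)  # auxiliary mean in the subsample
    y2_bar = fmean(y_phase2)  # study-variable mean in the subsample
    # Scale the subsample mean of y by how far the subsample's
    # auxiliary mean is from the first-phase auxiliary mean.
    return y2_bar * x1_bar / x2_bar


# Toy data: phase 1 observes x on three units, phase 2 observes (x, y) on two.
estimate = two_phase_ratio_estimate([2.0, 4.0, 6.0], [2.0, 4.0], [4.0, 8.0])
print(estimate)
```

The ratio form pays off when y is roughly proportional to x; handling non-response on y and x, as the abstract describes, would need further adjustment that this sketch does not attempt.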


2021 ◽  
Author(s):  
Haojie Xu ◽  
Gaofeng Jin ◽  
Jianan Wu ◽  
Huanan Guo ◽  
Xun Luo ◽  
...  

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Sunwoo Han ◽  
Brian D. Williamson ◽  
Youyi Fong

Abstract
Background: While random forests are one of the most successful machine learning methods, it is necessary to optimize their performance for use with datasets resulting from a two-phase sampling design with a small number of cases, a common situation in biomedical studies, which often have rare outcomes and covariates whose measurement is resource-intensive.
Methods: Using an immunologic marker dataset from a phase III HIV vaccine efficacy trial, we seek to optimize random forest prediction performance using combinations of variable screening, class balancing, weighting, and hyperparameter tuning.
Results: Our experiments show that while class balancing helps improve random forest prediction performance when variable screening is not applied, class balancing has a negative impact on performance in the presence of variable screening. The impact of the weighting similarly depends on whether variable screening is applied. Hyperparameter tuning is ineffective in situations with small sample sizes. We further show that random forests under-perform generalized linear models for some subsets of markers, and prediction performance on this dataset can be improved by stacking random forests and generalized linear models trained on different subsets of predictors, and that the extent of improvement depends critically on the dissimilarities between candidate learner predictions.
Conclusion: In small datasets from two-phase sampling design, variable screening and inverse sampling probability weighting are important for achieving good prediction performance of random forests. In addition, stacking random forests and simple linear models can offer improvements over random forests.
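The inverse sampling probability weighting mentioned in the conclusion can be illustrated with a minimal sketch. It assumes a simple case-control style second phase in which every case is retained and controls are subsampled with a known probability. The sampling probabilities, function names, and marker values below are hypothetical and are not taken from the trial data.

```python
def inverse_probability_weights(is_case, p_case=1.0, p_control=0.1):
    """Weight each phase-2 unit by 1 / (its probability of being sampled).

    p_case and p_control are assumed, illustrative sampling probabilities.
    """
    return [1.0 / (p_case if case else p_control) for case in is_case]


def weighted_mean(values, weights):
    """Weighted mean, so each sampled unit stands in for 1/p phase-1 units."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical phase-2 sample: two cases (all kept), three subsampled controls.
is_case = [True, True, False, False, False]
marker = [1.0, 3.0, 5.0, 6.0, 7.0]

weights = inverse_probability_weights(is_case)
ipw_mean = weighted_mean(marker, weights)
naive_mean = sum(marker) / len(marker)
print(weights, ipw_mean, naive_mean)
```

The unweighted mean over-represents cases because they were sampled with certainty; the weighted version restores the balance of the first-phase cohort. Weights of this kind are what one would typically supply as observation weights when fitting a learner, though how the paper's analysis passes them to the random forest is not stated in the abstract.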


Author(s):  
Aamir Raza ◽  
Muhammad Noor-ul-Amin

Estimating the population mean with the ordinary least squares method is unreliable when the data contain outliers. In the current study, we propose efficient estimators of the population mean using robust regression in two-phase sampling. An extensive simulation study is conducted to examine the efficiency of the proposed estimators in terms of mean square error (MSE), and a real-life example is used to demonstrate their performance. The example and the simulation studies show that the suggested estimators are more efficient than the estimators considered for comparison in the presence of outliers.
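The idea of swapping a robust slope into the two-phase regression estimator can be sketched as below. The abstract does not say which robust regression methods the authors use, so this sketch substitutes the Theil-Sen slope (median of pairwise slopes) purely as a stand-in. All function names and data are hypothetical.

```python
from itertools import combinations
from statistics import fmean, median


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def theil_sen_slope(x, y):
    """Theil-Sen slope: median of slopes over all pairs with distinct x."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two distinct x values")
    return median(slopes)


def two_phase_regression_estimate(x_phase1, x_phase2, y_phase2, slope):
    """Two-phase regression estimate: y2_bar + slope * (x1_bar - x2_bar)."""
    return fmean(y_phase2) + slope * (fmean(x_phase1) - fmean(x_phase2))


# Hypothetical data: y = 2x in the subsample except for one gross outlier.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]  # first-phase auxiliary values
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]            # second-phase auxiliary values
y2 = [2.0, 4.0, 6.0, 8.0, 100.0]          # last value is the outlier

b_ols = ols_slope(x2, y2)
b_robust = theil_sen_slope(x2, y2)
est_ols = two_phase_regression_estimate(x1, x2, y2, b_ols)
est_robust = two_phase_regression_estimate(x1, x2, y2, b_robust)
print(b_ols, b_robust, est_ols, est_robust)
```

In this toy case the single outlier drags the least squares slope far from 2, while the median-based slope is unaffected. Only the slope is made robust here; the subsample mean of y is still pulled by the outlier, which a fuller treatment would also need to address.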


2021 ◽  
pp. 457-482
Author(s):  
Sharon L. Lohr

Author(s):  
B. K. Singh

Abstract: In this paper, the authors propose a class of exponential dual-to-ratio-type compromised imputation techniques and the corresponding point estimator under a two-phase sampling design. Two different two-phase sampling designs are compared under imputed data. The bias and M.S.E. of the suggested estimator are derived in terms of population parameters using a large-sample approximation. A numerical study is performed on two populations using the expressions for bias and M.S.E., and the efficiency is compared with that of existing estimators.
Keywords: Missing data, Bias, Mean squared error (M.S.E.), Two-phase sampling, SRSWOR, Compromised Imputation (C.I.).


Author(s):  
Nadia Mushtaq

Variability in a population can be quantified through variance estimation. In this study, we consider a variance estimation procedure for a sensitive variable using scrambled randomized response and multiple auxiliary variables in multi-phase sampling. Under the randomized response technique (RRT) model of Noor-ul-Amin et al. (2018), generalized exponential regression-type estimators are derived for case-1 and case-2. A simulation study is presented to illustrate the application and computational details. The proposed model showed better results under both cases.

