Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative

2020 ◽  
pp. 096228022095283
Author(s):  
Francesco Innocenti ◽  
Math JJM Candel ◽  
Frans ES Tan ◽  
Gerard JP van Breukelen

To estimate the mean of a quantitative variable in a hierarchical population, it is logistically convenient to sample in two stages (two-stage sampling), i.e. selecting first clusters, and then individuals from the sampled clusters. Allowing cluster size to vary in the population and to be related to the mean of the outcome variable of interest (informative cluster size), the following competing sampling designs are considered: sampling clusters with probability proportional to cluster size, and then the same number of individuals per cluster; drawing clusters with equal probability, and then the same percentage of individuals per cluster; and selecting clusters with equal probability, and then the same number of individuals per cluster. For each design, optimal sample sizes are derived under a budget constraint. The three optimal two-stage sampling designs are compared, in terms of efficiency, with each other and with simple random sampling of individuals. Sampling clusters with probability proportional to size is recommended. To overcome the dependency of the optimal design on unknown nuisance parameters, maximin designs are derived. The results are illustrated, assuming probability proportional to size sampling of clusters, with the planning of a hypothetical survey to compare adolescent alcohol consumption between France and Italy.
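The first design in the abstract above (clusters drawn with probability proportional to size, then the same number of individuals per cluster) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `pps_two_stage_mean`, the with-replacement draw of clusters, and the toy population are all assumptions, and no budget constraint or optimal sample size is computed.

```python
import random


def pps_two_stage_mean(clusters, n_clusters, n_per_cluster, rng):
    """Estimate a population mean from a two-stage sample.

    clusters: list of lists of outcome values, one inner list per cluster.
    Stage 1 (assumption): clusters are drawn WITH replacement, with
    probability proportional to cluster size.
    Stage 2: the same number of individuals is drawn without replacement
    from each sampled cluster (the whole cluster if it is smaller).
    """
    sizes = [len(c) for c in clusters]
    drawn = rng.choices(range(len(clusters)), weights=sizes, k=n_clusters)
    cluster_means = []
    for idx in drawn:
        take = min(n_per_cluster, sizes[idx])
        sample = rng.sample(clusters[idx], take)
        cluster_means.append(sum(sample) / take)
    # With size-proportional selection and a fixed take per cluster, the
    # plain average of the cluster sample means estimates the population mean.
    return sum(cluster_means) / len(cluster_means)


# Hypothetical population with informative cluster size:
# the larger cluster has the higher outcome values.
population = [[1.0] * 2, [3.0] * 8]
estimate = pps_two_stage_mean(population, 500, 2, random.Random(0))
```

In this toy population the true mean is 2.6, whereas averaging the two cluster means with equal weight would give 2.0, which is the kind of discrepancy informative cluster size produces.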

2003 ◽  
Vol 46 (3) ◽  
pp. 113-115 ◽  
Author(s):  
Karel Karpaš ◽  
Pavel Šponer

The aim of this study is to present our experience with two-stage reimplantation in the management of the infected hip arthroplasty. Between January 1993 and December 2001, two-stage replacement of a total hip arthroplasty was performed in 18 patients. There were 7 male and 11 female patients, and the average age was 62 years. The mean follow-up after revision was 3.5 years. The mean postoperative Harris Hip Score was 78 (50–96) points. None of the 18 patients had a recurrence of the infection. Two-stage reconstruction of the infected hip is preferred to one-stage exchange arthroplasty at our department because of the higher rate of eradication of the infection.


2020 ◽  
Vol 29 (11) ◽  
pp. 3396-3408 ◽  
Author(s):  
Mary Gregg ◽  
Somnath Datta ◽  
Doug Lorenz

In the analysis of clustered data, inverse cluster size weighting has been shown to be resistant to the potentially biasing effects of informative cluster size, where the number of observations within a cluster is associated with the outcome variable of interest. The method of inverse cluster size reweighting has been implemented to establish clustered data analogues of common tests for independent data, but the method has yet to be extended to tests of categorical data. Many variance estimators have been implemented across established cluster-weighted tests, but the potential effects of differing methods on test performance have not previously been explored. Here, we develop cluster-weighted estimators of marginal proportions that remain unbiased under informativeness, and derive analogues of three popular tests for clustered categorical data: the one-sample proportion, goodness-of-fit, and independence chi-square tests. We construct these tests using several variance estimators and show substantial differences in the performance of cluster-weighted tests based on variance estimation technique, with variance estimators constructed under the null hypothesis maintaining size closest to nominal. We illustrate the proposed tests through an application to a data set of functional measures from patients with spinal cord injuries participating in a rehabilitation program.
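The idea of inverse cluster size weighting can be illustrated with a point estimate of a marginal proportion. The sketch below is an assumption-laden toy: the function names and the data are hypothetical, and it covers only the weighted estimate, not the tests or variance estimators the abstract describes.

```python
def cluster_weighted_proportion(clusters):
    """Proportion with each observation weighted by 1 / (its cluster size).

    clusters: list of non-empty lists of 0/1 outcomes.
    Weighting by inverse cluster size is equivalent to averaging the
    per-cluster proportions, so every cluster counts equally.
    """
    per_cluster = [sum(c) / len(c) for c in clusters]
    return sum(per_cluster) / len(per_cluster)


def pooled_proportion(clusters):
    """Unweighted proportion over all observations, ignoring clustering."""
    total = sum(len(c) for c in clusters)
    return sum(sum(c) for c in clusters) / total


# Hypothetical data in which small clusters tend to have outcome 1.
data = [[1], [1], [0, 0, 0, 0]]
weighted = cluster_weighted_proportion(data)  # about 0.667
pooled = pooled_proportion(data)              # about 0.333
```

The gap between the two values arises because the pooled estimate lets the large cluster dominate, whereas the weighted estimate treats each cluster as one unit.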


2018 ◽  
Vol 38 (10) ◽  
pp. 1817-1834
Author(s):  
Francesco Innocenti ◽  
Math J.J.M. Candel ◽  
Frans E.S. Tan ◽  
Gerard J.P. Breukelen

Educatio ◽  
2022 ◽  
Vol 16 (2) ◽  
pp. 156-161
Author(s):  
Rudini Rudini ◽  
Hanofi Harianto ◽  
Ridwan Ridwan ◽  
Zulfa Azizaturrohmi ◽  
...  

The research aimed to find out whether the use of a diary improves the writing ability of the English students of Hamzanwadi University. The problems formulated in this research were (1) Is the use of a diary effective in teaching writing to the English students of Hamzanwadi University? (2) How effective is using a diary in teaching writing to the English students of Hamzanwadi University? The research design of this study was a one-group pretest and posttest. The population of this research was the second-semester students of Hamzanwadi University, which consisted of 105 students in 4 classes. The present researcher took class D as the sample, which consisted of 20 students. The present researcher drew a simple random sample by obtaining an exhaustive list of the population and then randomly selecting a certain number of individuals to comprise the sample. A pretest and a posttest were given to the students to collect the data. The result of the data analysis indicated that the mean score of the pretest was 34.86, while that of the posttest was 48.00. In testing the hypothesis, the result of the t-test was -9.706. The null hypothesis was rejected, and the alternative hypothesis was accepted. So, it can be said that using a diary was significantly effective in teaching writing.


1991 ◽  
Vol 21 (3) ◽  
pp. 265-269 ◽  
Author(s):  
J. Cuvellier ◽  
P. Meynadier ◽  
P. Pujo ◽  
O. Sublemontier ◽  
J-P Visticot ◽  
...  

Author(s):  
Fuhmei Wang ◽  
Jung-Der Wang

Health services provided through the telecommunications system aim to improve the population’s health and well-being. This research aims to explore what digital, economic, and health factors are associated with the provision of telehealth services, especially in ageing communities. Drawing on the experiences of Organization for Economic Cooperation and Development (OECD) countries, this research attempts to construct a logistic regression model relating a binary outcome variable, whether or not a telehealth system is adopted, to a group of potential explanatory variables. Estimation results showed that there were thresholds for telehealth provision: the demand for telehealth service usually began when the provision of telecommunication accessibility reached 50%, the proportion of elders exceeded 10%, or the proportion of health spending occupied more than 3–5% of the gross domestic product (GDP); the slope of each variable seemed to correspond with an increase in demand for such a provision. A growing number of individuals in OECD countries are now readily served by telehealth systems during the COVID-19 pandemic. These findings could be regarded as a model for other countries for implementing the necessary infrastructure early on when any of these parameters reaches its threshold. Moreover, telehealth in developing countries could be expanded so that wider populations can access basic health services and receive health care remotely. A rational decision could be made to appropriately use additional resources in telehealth provision. With accessible e-health services, the population’s health could be improved, which in turn would possibly increase productivity and social welfare.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Menelaos Pavlou ◽  
Gareth Ambler ◽  
Rumana Z. Omar

Background: Clustered data arise in research when patients are clustered within larger units. Generalised Estimating Equations (GEE) and Generalised Linear Mixed Models (GLMM) can be used to provide marginal and cluster-specific inference and predictions, respectively. Methods: Confounding by Cluster (CBC) and Informative Cluster Size (ICS) are two complications that may arise when modelling clustered data. CBC can arise when the distribution of a predictor variable (termed ‘exposure’) varies between clusters, causing confounding of the exposure-outcome relationship. ICS means that the cluster size conditional on covariates is not independent of the outcome. In both situations, standard GEE and GLMM may provide biased or misleading inference, and modifications have been proposed. However, both CBC and ICS are routinely overlooked in the context of risk prediction, and their impact on the predictive ability of the models has been little explored. We study the effect of CBC and ICS on the predictive ability of risk models for binary outcomes when GEE and GLMM are used. We examine whether two simple approaches to handle CBC and ICS, which involve adjusting for the cluster mean of the exposure and the cluster size, respectively, can improve the accuracy of predictions. Results: Both CBC and ICS can be viewed as violations of the assumptions in the standard GLMM; the random effects are correlated with exposure for CBC and cluster size for ICS. Based on these principles, we simulated data subject to CBC/ICS. The simulation studies suggested that the predictive ability of models derived from using standard GLMM and GEE ignoring CBC/ICS was affected. Marginal predictions were found to be mis-calibrated. Adjusting for the cluster mean of the exposure or the cluster size improved calibration, discrimination and the overall predictive accuracy of marginal predictions, by explaining part of the between-cluster variability. The presence of CBC/ICS did not affect the accuracy of conditional predictions. We illustrate these concepts using real data from a multicentre study with potential CBC. Conclusion: Ignoring CBC and ICS when developing prediction models for clustered data can affect the accuracy of marginal predictions. Adjusting for the cluster mean of the exposure or the cluster size can improve the predictive accuracy of marginal predictions.


2013 ◽  
Vol 740-742 ◽  
pp. 393-396
Author(s):  
Maxim N. Lubov ◽  
Jörg Pezoldt ◽  
Yuri V. Trushin

The influence of attractive and repulsive impurities on the nucleation of SiC clusters on the Si(100) surface was investigated. Kinetic Monte Carlo simulations of SiC cluster growth show that an increase in the impurity concentration (both attractive and repulsive) leads to a decrease in the mean cluster size and a rise in the nucleation density of the clusters.


2021 ◽  
pp. 1-11
Author(s):  
Tianhong Dai ◽  
Shijie Cong ◽  
Jianping Huang ◽  
Yanwen Zhang ◽  
Xinwang Huang ◽  
...  

In agricultural production, weed removal is an important part of crop cultivation because other plants inevitably compete with crops for nutrients. Only by identifying and removing weeds can the quality of the harvest be guaranteed. Therefore, the distinction between weeds and crops is particularly important. Recently, deep learning technology has also been applied to the field of botany and has achieved good results. Convolutional neural networks are widely used in deep learning because of their excellent classification performance. The purpose of this article is to find a new method of plant seedling classification. This method includes two stages: image segmentation and image classification. The first stage uses the improved U-Net to segment the dataset, and the second stage uses six classification networks to classify the seedlings of the segmented dataset. The dataset used for the experiment contained 12 different types of plants, namely, 3 crops and 9 weeds. The model was evaluated by multi-class statistical analysis of accuracy, recall, precision, and F1-score. The results show that the two-stage classification method combining the improved U-Net segmentation network and the classification network was more conducive to the classification of plant seedlings, with a classification accuracy of 97.7%.
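The four evaluation measures named in the abstract above can be computed from true and predicted labels as in the sketch below. The abstract does not say how the per-class values were combined, so macro-averaging (an unweighted mean over classes) is an assumption here, as are the function name and the toy labels.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1-score.

    y_true, y_pred: equal-length sequences of class labels.
    Macro-averaging is an assumption; a class with no predictions
    (or no true instances) contributes 0.0 to the affected measure.
    """
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n_labels = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n_labels,
        "recall": sum(recalls) / n_labels,
        "f1": sum(f1_scores) / n_labels,
    }


# Hypothetical two-class example; the study itself used 12 plant classes.
truth = ["crop", "crop", "weed", "weed"]
predicted = ["crop", "weed", "weed", "weed"]
metrics = classification_metrics(truth, predicted)
```

For the toy labels, three of the four predictions are correct, so the accuracy is 0.75.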

