The Effect of Cluster Size, Dimensionality, and Number of Clusters on Recovery of True Cluster Structure Through Chernoff-Type Faces

Clustering in some two- and three-dimensional lattices is investigated using an algorithm similar to that of Hoshen and Kopelman. The total number of clusters shows a maximum at an occupation probability, pmax, where the average cluster size, 2.03 ± 0.07, is found to be independent of the size, dimension, coordination number, and type of lattice. We discuss the fact that clustering effectively begins at pmax. The percolation threshold, pc, and pmax are found to get closer to each other as the coordination number increases. PACS Nos.: 64.60.Ht, 64.60.Qb, 82.30.Nr
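The cluster counting described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a square lattice with open boundaries and nearest-neighbour connectivity, and it labels clusters with a union-find structure in the spirit of Hoshen-Kopelman. The names `cluster_sizes`, `random_lattice`, and `scan` are hypothetical.

```python
import random


def cluster_sizes(grid):
    """Return the sizes of all clusters of occupied sites, largest first.

    Sites are joined to their occupied up/left neighbours (4-connectivity,
    open boundaries) while the lattice is swept once, row by row.
    """
    rows, cols = len(grid), len(grid[0])
    parent = list(range(rows * cols))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            idx = r * cols + c
            if r > 0 and grid[r - 1][c]:
                union(idx, idx - cols)
            if c > 0 and grid[r][c - 1]:
                union(idx, idx - 1)

    sizes = {}
    for r in range(rows):
        for c in range(cols):
            if grid[r][c]:
                root = find(r * cols + c)
                sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)


def random_lattice(n, p, rng):
    """Occupy each site of an n x n lattice independently with probability p."""
    return [[rng.random() < p for _ in range(n)] for _ in range(n)]


def scan(n=64, steps=20, seed=0):
    """Return (p, number of clusters, mean cluster size) for a grid of p values."""
    rng = random.Random(seed)
    results = []
    for k in range(1, steps):
        p = k / steps
        sizes = cluster_sizes(random_lattice(n, p, rng))
        mean = sum(sizes) / len(sizes) if sizes else 0.0
        results.append((p, len(sizes), mean))
    return results
```

Taking the row of `scan()` with the largest cluster count gives a rough estimate of pmax for this one lattice type; a single small lattice and a coarse grid of p values are not enough to reproduce the reported average cluster size.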


Sample Size Issues for Cluster Randomized Trials With Discrete-Time Survival Endpoints

Methodology ◽

10.1027/1614-2241/a000047 ◽

2012 ◽

Vol 8 (4) ◽

pp. 146-158 ◽

Cited By ~ 4

Author(s):

Mirjam Moerbeek

Keyword(s):

Discrete Time ◽

Cluster Size ◽

Randomized Trials ◽

Cluster Randomized Trial ◽

Smoking Initiation ◽

Fixed Number ◽

Cluster Randomized Trials ◽

Number Of Clusters ◽

Sufficient Power ◽

Cluster Randomized

In cluster randomized trials, complete groups of subjects are randomized to treatment conditions. An important question might be whether and when the subjects experience a particular event, such as smoking initiation or recovery from disease. In the social sciences the timing of such events is often measured in discrete time by using time intervals. At the planning phase of a cluster randomized trial one should decide on the number of clusters and the cluster size such that parameters are estimated accurately and sufficient power for the test of the treatment effect is achieved. On the basis of a simulation study it is concluded that regression coefficients are estimated more accurately than the variance of the random cluster effect. In addition, it is shown that power increases with cluster size and number of clusters, and that sufficient power cannot always be achieved by using larger cluster sizes at a fixed number of clusters.
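One way to see why larger clusters alone may not deliver sufficient power is the standard design effect, 1 + (m - 1) × ICC, for equal cluster sizes m. The sketch below is an illustration under that assumption and is not the simulation used in the article, which concerns discrete-time survival endpoints; the function names are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation relative to simple random sampling (equal cluster sizes)."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # With 10 clusters and ICC = 0.05, growing the clusters gives diminishing
    # returns: the effective sample size stays below n_clusters / icc.
    for m in (5, 20, 100, 1000):
        print(m, round(effective_sample_size(10, m, 0.05), 1))
```

For a positive ICC the effective sample size approaches n_clusters / ICC as the cluster size grows, so adding clusters is the only way past that ceiling.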


The effect of number of clusters and cluster size on statistical power and Type I error rates when testing random effects variance components in multilevel linear and logistic regression models

Journal of Statistical Computation and Simulation ◽

10.1080/00949655.2018.1504945 ◽

2018 ◽

Vol 88 (16) ◽

pp. 3151-3163 ◽

Cited By ~ 8

Author(s):

Peter C. Austin ◽

George Leckie

Keyword(s):

Variance Components ◽

Cluster Size ◽

Regression Models ◽

Statistical Power ◽

Type I Error ◽

Error Rates ◽

Type I ◽

Number Of Clusters ◽

Logistic Regression Models ◽

Type I Error Rates


Ignoring a Multilevel Structure in Mixture Item Response Models: Impact on Parameter Recovery and Model Selection

Applied Psychological Measurement ◽

10.1177/0146621617711999 ◽

2017 ◽

Vol 42 (2) ◽

pp. 136-154 ◽

Cited By ~ 2

Author(s):

Woo-yeol Lee ◽

Sun-Joo Cho ◽

Sonya K. Sterba

Keyword(s):

Model Selection ◽

Item Response ◽

Cluster Size ◽

Response Model ◽

Item Response Model ◽

Number Of Clusters ◽

Parameter Recovery ◽

Single Level ◽

Multilevel Structure ◽

Level Data

The current study investigated the consequences of ignoring a multilevel structure for a mixture item response model to show when a multilevel mixture item response model is needed. Study 1 focused on examining the consequence of ignoring dependency for within-level latent classes. Simulation conditions that may affect model selection and parameter recovery in the context of a multilevel data structure were manipulated: class-specific ICC, cluster size, and number of clusters. The accuracy of model selection (based on information criteria) and quality of parameter recovery were used to evaluate the impact of ignoring a multilevel structure. Simulation results indicated that, for the range of class-specific ICCs examined here (.1 to .3), mixture item response models which ignored a higher level nesting structure resulted in less accurate estimates and standard errors (SEs) of item discrimination parameters when the number of clusters was larger than 24 and the cluster size was larger than six. Class-varying ICCs can have compensatory effects on bias. Also, the results suggested that a mixture item response model which ignored multilevel structure was not selected over the multilevel mixture item response model based on the Bayesian information criterion (BIC) if the number of clusters and the cluster size were each at least 50. In Study 2, the consequences of unnecessarily fitting a multilevel mixture item response model to single-level data were examined. Reassuringly, in the context of single-level data, a multilevel mixture item response model was not selected by BIC, and its use would not distort the within-level item parameter estimates or SEs when the cluster size was at least 20.
Based on these findings, it is concluded that, for class-specific ICC conditions examined here, a multilevel mixture item response model is recommended over a single-level item response model for a clustered dataset having cluster size [Formula: see text] and the number of clusters [Formula: see text].


Molecular Dynamics Study on Cluster Structure of Water. Dependence of Cluster Size and Its Probability Distribution on Temperature and Density.

TRANSACTIONS OF THE JAPAN SOCIETY OF MECHANICAL ENGINEERS Series B ◽

10.1299/kikaib.60.496 ◽

1994 ◽

Vol 60 (570) ◽

pp. 496-503 ◽

Cited By ~ 1

Author(s):

Taku Ohara ◽

Toshio Aihara

Keyword(s):

Molecular Dynamics ◽

Probability Distribution ◽

Cluster Size ◽

Cluster Structure ◽

Structure Of Water


STRUCTURE EVOLUTION OF AN ER FLUID UNDER ELECTRIC AND SHEAR FIELDS

International Journal of Modern Physics B ◽

10.1142/s0217979203017357 ◽

2003 ◽

Vol 17 (01n02) ◽

pp. 209-212

Author(s):

J. G. CAO ◽

L. LU ◽

L. W. ZHOU

Keyword(s):

Electric Field ◽

Cluster Size ◽

Cluster Structure ◽

Structure Evolution ◽

Shear Field ◽

Combined Application ◽

The Mean ◽

Er Fluids ◽

Er Fluid ◽

Shear Fields

With the combined application of an electric field and a shear field, the particles in an ER fluid form a cluster structure in the direction of shear. The morphology of the cluster structure was described, and it is found in the experiments that the mean cluster size, S_mean, is proportional to [Formula: see text] and E^0.64.


Type I Error Control for Cluster Randomized Trials Under Varying Small Sample Structures

10.21203/rs.2.17855/v1 ◽

2019 ◽

Author(s):

Joshua Nugent ◽

Ken Kleinman

Keyword(s):

Cluster Size ◽

Error Control ◽

Type I Error ◽

Error Rates ◽

Small Sample ◽

Type I ◽

Number Of Clusters ◽

Type I Error Rates ◽

Wald Tests ◽

Cluster Randomized

Abstract Background: Linear mixed models (LMM) are a common approach to analyzing data from cluster randomized trials (CRTs). Inference on parameters can be performed via Wald tests or likelihood ratio tests (LRT), but both approaches may give incorrect Type I error rates in common finite sample settings. The impact of interactions of cluster size, number of clusters, intraclass correlation coefficient (ICC), and analysis approach on Type I error rates has not been well studied. Reviews of published CRTs find that small sample sizes are not uncommon, so the performance of different inferential approaches in these settings can guide data analysts to the best choices. Methods: Using a random-intercept LMM structure, we use simulations to study Type I error rates with the LRT and Wald test with different degrees of freedom (DF) choices across different combinations of cluster size, number of clusters, and ICC. Results: Our simulations show that the LRT can be anti-conservative when the ICC is large and the number of clusters is small, with the effect most pronounced when the cluster size is relatively large. Wald tests with the Between-Within DF method or the Satterthwaite DF approximation maintain Type I error control at the stated level, though they are conservative when the number of clusters, the cluster size, and the ICC are small. Conclusions: Depending on the structure of the CRT, analysts should choose a hypothesis testing approach that will maintain the appropriate Type I error rate for their data. Wald tests with the Satterthwaite DF approximation work well in many circumstances, but in other cases the LRT may have Type I error rates closer to the nominal level.


Type I Error Control for Cluster Randomized Trials Under Varying Small Sample Structures

10.21203/rs.2.17855/v2 ◽

2020 ◽

Author(s):

Joshua Nugent ◽

Ken Kleinman

Keyword(s):

Cluster Size ◽

Error Control ◽

Type I Error ◽

Error Rates ◽

Small Sample ◽

Type I ◽

Number Of Clusters ◽

Type I Error Rates ◽

Wald Tests ◽

Cluster Randomized

Abstract Background: Linear mixed models (LMM) are a common approach to analyzing data from cluster randomized trials (CRTs). Inference on parameters can be performed via Wald tests or likelihood ratio tests (LRT), but both approaches may give incorrect Type I error rates in common finite sample settings. The impact of different combinations of cluster size, number of clusters, intraclass correlation coefficient (ICC), and analysis approach on Type I error rates has not been well studied. Reviews of published CRTs find that small sample sizes are not uncommon, so the performance of different inferential approaches in these settings can guide data analysts to the best choices. Methods: Using a random-intercept LMM structure, we use simulations to study Type I error rates with the LRT and Wald test with different degrees of freedom (DF) choices across different combinations of cluster size, number of clusters, and ICC. Results: Our simulations show that the LRT can be anti-conservative when the ICC is large and the number of clusters is small, with the effect most pronounced when the cluster size is relatively large. Wald tests with the between-within DF method or the Satterthwaite DF approximation maintain Type I error control at the stated level, though they are conservative when the number of clusters, the cluster size, and the ICC are small. Conclusions: Depending on the structure of the CRT, analysts should choose a hypothesis testing approach that will maintain the appropriate Type I error rate for their data. Wald tests with the Satterthwaite DF approximation work well in many circumstances, but in other cases the LRT may have Type I error rates closer to the nominal level.


On the Shapes of the Polish Word: Phonotactic Complexity and Diversity

Studia Anglica Posnaniensia ◽

10.2478/stap-2021-0006 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Paulina Zydorowicz ◽

Michał Jankowski ◽

Katarzyna Dziubalska-Kołaczyk

Keyword(s):

Cluster Size ◽

Number Of Clusters ◽

Skeletal Pattern ◽

Corpus Data ◽

Cluster Length ◽

Word Shape

Abstract The aim of this contribution is to identify the dominant shapes of the Polish word with reference to three criteria: cluster complexity (i.e., cluster size), saturation (the number of clusters in a word), and diversity (in terms of features of consonant description). The dominant word shape is understood as the most frequent or typical skeletal pattern, expressed by means of alternations or groupings of Cs (consonants) and Vs (vowels), e.g., CVCCV etc., or by means of specific features (of place, manner, voice, and the sonorant/obstruent distinction). Our work focuses on two aspects of Polish phonotactics: (1) the relation between cluster complexity and saturation of words with clusters, (2) the degrees of diversity in features of place, manner, and voice within clusters. Using corpus data, we have established that only 4.17% of word shapes have no clusters. The dominant word shape for a one-cluster word is CVCCVCV. The most frequent scenario for a word shape is to contain two clusters, of which 67% are a combination of a word-initial and a word-medial cluster. We have found that: (1) cluster length is inversely proportional to the number of clusters in a word; (2) nearly 73% of word types contain clusters of the same size, e.g., two CCs or two CCCs (Polish words prefer saturation over complexity); (3) manner of articulation (MOA) is more diversified than place of articulation (POA) across clusters and words.
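The two skeletal measures defined in this abstract, complexity (cluster size) and saturation (number of clusters in a word), can be read directly off a CV skeleton. The sketch below is only an illustration under that reading: it assumes the word has already been converted to a string of C and V symbols, which for Polish requires a phonemic transcription that is not attempted here, and the function names are hypothetical.

```python
import re


def consonant_clusters(skeleton):
    """Return every run of two or more consecutive Cs in a CV skeleton."""
    return re.findall(r"C{2,}", skeleton)


def describe(skeleton):
    """Summarise a word shape by its saturation and cluster complexity."""
    clusters = consonant_clusters(skeleton)
    return {
        "saturation": len(clusters),               # number of clusters in the word
        "complexity": [len(c) for c in clusters],  # size of each cluster
    }


if __name__ == "__main__":
    # The dominant one-cluster shape reported in the abstract.
    print(describe("CVCCVCV"))
```

Tallying `describe` over the skeletons of a corpus would give the kind of frequency counts the abstract reports, although the figures depend entirely on how words are transcribed and segmented.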
