Dataset Information

Relative efficiencies of two-stage sampling schemes for mean estimation in multilevel populations when cluster size is informative.


ABSTRACT: In multilevel populations, there are two types of population means of an outcome variable, i.e., the average of all individual outcomes ignoring cluster membership and the average of cluster-specific means. To estimate the first mean, individuals can be sampled directly with simple random sampling or with two-stage sampling (TSS), that is, sampling clusters first and then individuals within the sampled clusters. When cluster size varies in the population, three TSS schemes can be considered: sampling clusters with probability proportional to cluster size and then sampling the same number of individuals per cluster; sampling clusters with equal probability and then sampling the same percentage of individuals per cluster; and sampling clusters with equal probability and then sampling the same number of individuals per cluster. Unbiased estimation of the average of all individual outcomes is discussed under each sampling scheme, assuming cluster size to be informative. Furthermore, the three TSS schemes are compared in terms of efficiency with each other and with simple random sampling under the constraint of a fixed total sample size. The relative efficiency of the sampling schemes is shown to vary across cluster size distributions. However, sampling clusters with probability proportional to size is the most efficient TSS scheme for many cluster size distributions. Model-based and design-based inference are compared and shown to give similar results. The results are applied to the distribution of high school size in Italy and the distribution of patient list size for general practices in England.
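The three TSS schemes named in the abstract can be illustrated with a minimal simulation sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy population (`make_population`, where cluster means rise with cluster size so that size is informative), the function names, and the choice of textbook design-based estimators (the plain average of cluster sample means under probability-proportional-to-size sampling with replacement, and a Horvitz-Thompson-style expansion estimator under equal-probability sampling). The sketch also assumes all cluster sizes are known and does not enforce the fixed total sample size used in the paper's efficiency comparison.

```python
import random


def make_population(num_clusters=200, seed=1):
    """Hypothetical population: cluster means increase with cluster size."""
    rng = random.Random(seed)
    population = []
    for _ in range(num_clusters):
        size = rng.randint(5, 50)
        cluster_mean = 10 + 0.1 * size + rng.gauss(0, 1)  # informative size
        population.append([rng.gauss(cluster_mean, 1) for _ in range(size)])
    return population


def _mean(values):
    return sum(values) / len(values)


def true_mean(population):
    """Average of all individual outcomes, ignoring cluster membership."""
    return sum(sum(c) for c in population) / sum(len(c) for c in population)


def tss_pps_equal_n(population, k, n, rng):
    """Clusters drawn with probability proportional to size (with
    replacement), then n individuals per cluster. The unweighted average
    of the sampled cluster means is unbiased for the individual mean."""
    sizes = [len(c) for c in population]
    chosen = rng.choices(population, weights=sizes, k=k)
    return _mean([_mean(rng.sample(c, min(n, len(c)))) for c in chosen])


def _equal_prob_estimate(population, k, rng, within_size):
    """Equal-probability cluster sampling (without replacement): expand
    each sampled cluster mean by its size, scale up to all clusters, and
    divide by the known population size."""
    chosen = rng.sample(population, k)
    est_total = (len(population) / k) * sum(
        len(c) * _mean(rng.sample(c, within_size(len(c)))) for c in chosen
    )
    return est_total / sum(len(c) for c in population)


def tss_equal_prob_same_fraction(population, k, fraction, rng):
    """Equal-probability clusters, same percentage of individuals each."""
    return _equal_prob_estimate(
        population, k, rng, lambda size: max(1, round(fraction * size))
    )


def tss_equal_prob_equal_n(population, k, n, rng):
    """Equal-probability clusters, same number of individuals each."""
    return _equal_prob_estimate(population, k, rng, lambda size: min(n, size))
```

Repeating each function many times with a shared `random.Random` instance and comparing the spread of the estimates around `true_mean(population)` gives a rough, simulation-based feel for relative efficiency in this toy population; it is not a substitute for the analytical comparison the abstract describes.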

SUBMITTER: Innocenti F 

PROVIDER: S-EPMC6590157 | biostudies-literature | 2019 May

REPOSITORIES: biostudies-literature

Publications

Relative efficiencies of two-stage sampling schemes for mean estimation in multilevel populations when cluster size is informative.

Innocenti F, Candel MJJM, Tan FES, van Breukelen GJP

Statistics in Medicine, 2018-12-21, issue 10


Similar Datasets

| S-EPMC8172256 | biostudies-literature
| S-EPMC11220780 | biostudies-literature
| S-EPMC4312901 | biostudies-other
| S-EPMC6026085 | biostudies-literature
| S-EPMC10638852 | biostudies-literature
| S-EPMC5461221 | biostudies-literature
| S-EPMC6838778 | biostudies-literature
| S-EPMC8254921 | biostudies-literature
| S-EPMC4963003 | biostudies-literature
| S-EPMC6392194 | biostudies-literature