Dataset Information

Methods for observed-cluster inference when cluster size is informative: a review and clarifications.


ABSTRACT: Clustered data commonly arise in epidemiology. We assume each cluster member has an outcome Y and covariates X. When there are missing data in Y, the distribution of Y given X in all cluster members ("complete clusters") may be different from the distribution just in members with observed Y ("observed clusters"). Often the former is of interest, but when data are missing because in a fundamental sense Y does not exist (e.g., quality of life for a person who has died), the latter may be more meaningful (quality of life conditional on being alive). Weighted and doubly weighted generalized estimating equations and shared random-effects models have been proposed for observed-cluster inference when cluster size is informative, that is, the distribution of Y given X in observed clusters depends on observed cluster size. We show these methods can be seen as actually giving inference for complete clusters and may not also give observed-cluster inference. This is true even if observed clusters are complete in themselves rather than being the observed part of larger complete clusters: here methods may describe imaginary complete clusters rather than the observed clusters. We show under which conditions shared random-effects models proposed for observed-cluster inference do actually describe members with observed Y. A psoriatic arthritis dataset is used to illustrate the danger of misinterpreting estimates from shared random-effects models.
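
To make the notion of informative cluster size concrete, the following is a minimal sketch on made-up data, not the authors' method or the psoriatic arthritis dataset. It assumes an intercept-only setting and uses inverse-cluster-size weights, a common choice in the weighted estimating equation literature that the abstract itself does not specify. The names `clusters`, `member_mean`, and `cluster_weighted_mean` are hypothetical.

```python
from statistics import mean

# Hypothetical toy data: each inner list holds the observed outcomes Y
# for one cluster. Larger clusters have lower outcomes, so the observed
# cluster size is informative about Y.
clusters = [
    [10.0],
    [8.0, 8.0],
    [4.0, 4.0, 4.0, 4.0],
]


def member_mean(clusters):
    """Unweighted mean over all observed members (each member counts equally)."""
    values = [y for cluster in clusters for y in cluster]
    return mean(values)


def cluster_weighted_mean(clusters):
    """Mean with each member weighted by 1 / (observed cluster size),
    so every cluster contributes the same total weight.
    Assumes every cluster has at least one observed member."""
    per_cluster = [sum(y / len(cluster) for y in cluster) for cluster in clusters]
    return sum(per_cluster) / len(per_cluster)


print(member_mean(clusters))            # mean over the 7 observed members
print(cluster_weighted_mean(clusters))  # mean of the 3 cluster means
```

The two summaries differ here only because outcome and cluster size are associated: the first describes a typical observed member, the second a typical member of a typical cluster. Which population either quantity describes once Y is missing for some members is the question the abstract addresses; this sketch does not settle it.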

SUBMITTER: Seaman SR 

PROVIDER: S-EPMC4312901 | biostudies-other | 2014 Jun

REPOSITORIES: biostudies-other

Publications

Methods for observed-cluster inference when cluster size is informative: a review and clarifications.

Seaman SR, Pavlou M, Copas AJ

Biometrics, 2014-01-30, issue 2


Similar Datasets

| S-EPMC4963003 | biostudies-literature
| S-EPMC6838778 | biostudies-literature
| S-EPMC5461221 | biostudies-literature
| S-EPMC8254921 | biostudies-literature
| S-EPMC5844500 | biostudies-literature
| S-EPMC8172256 | biostudies-literature
| S-EPMC4521133 | biostudies-literature
| S-EPMC4581538 | biostudies-literature
| S-EPMC7593362 | biostudies-literature
| S-EPMC7597089 | biostudies-literature