Inspect multiple imputation model
checkmi.Rd
Check whether multiple imputation is valid under the proposed imputation model and directed acyclic graph (DAG). Validity means that the proposed approach will allow unbiased estimation of the estimand(s) of interest, including regression parameters, associations, and causal effects. The imputation model should include all other analysis model variables as predictors, as well as any auxiliary variables. The DAG should include all observed and unobserved variables related to the analysis model variables and their missingness, as well as all required missingness indicators.
Arguments
- dep
The partially observed variable to be imputed, specified as a string
- preds
The imputation model predictor(s), specified as a single space-delimited string (e.g. "matage mated pregsize")
- r_dep
The partially observed variable's missingness indicator, specified as a string
- mdag
The DAG, specified as a string using dagitty syntax
Value
A message indicating whether multiple imputation is valid under the proposed DAG and imputation model
Details
In principle, multiple imputation is valid if each partially observed variable is conditionally independent of its own missingness indicator, given its imputation model predictors.
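The condition above can be illustrated directly on a DAG. The following is a minimal sketch, assuming the dagitty package is installed; it uses dagitty's `dseparated()` to ask whether the incomplete variable and its missingness indicator are d-separated given the predictors. The DAG and variable names are taken from the Examples section, while the object names `dag`, `valid_preds` and `collider_preds` are illustrative only. Whether `checkMI` performs exactly this check internally is an assumption, not something stated on this page.

```r
# Sketch only: assumes the dagitty package is installed
library(dagitty)

# Same DAG as in the Examples section, wrapped in dagitty's "dag { ... }" syntax
dag <- dagitty("dag {
  matage -> bmi7
  mated -> matage
  mated -> bmi7
  sep_unmeas -> mated
  sep_unmeas -> r
  pregsize -> bmi7
  pregsize -> bwt
  sep_unmeas -> bwt
}")

# Is bmi7 d-separated from its missingness indicator r, given these predictors?
valid_preds <- dseparated(dag, "bmi7", "r", c("matage", "mated", "pregsize"))

# Using the collider bwt instead of pregsize opens the path
# bmi7 <- pregsize -> bwt <- sep_unmeas -> r
collider_preds <- dseparated(dag, "bmi7", "r", c("matage", "mated", "bwt"))

print(valid_preds)
print(collider_preds)
```

Under this DAG the first check should hold and the second should fail, which is consistent with the two messages shown in the Examples section.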
References
Curnow E, Tilling K, Heron JE, Cornish RP, Carpenter JR. 2023. Multiple imputation of missing data under missing at random: including a collider as an auxiliary variable in the imputation model can induce bias. Frontiers in Epidemiology. doi:10.3389/fepid.2023.1237447
Examples
# Example DAG for which multiple imputation is valid
checkMI(dep="bmi7", preds="matage mated pregsize", r_dep="r",
mdag="matage -> bmi7 mated -> matage mated -> bmi7
sep_unmeas -> mated sep_unmeas -> r pregsize -> bmi7
pregsize -> bwt sep_unmeas -> bwt")
#> Based on the proposed directed acyclic graph (DAG), the incomplete
#> variable and its missingness indicator are independent given imputation
#> model predictors. Hence, multiple imputation methods which assume data
#> are missing at random are valid in principle.
# Example DAG for which multiple imputation is not valid, because the
# predictor bwt is a collider (pregsize -> bwt <- sep_unmeas)
checkMI(dep="bmi7", preds="matage mated bwt", r_dep="r",
mdag="matage -> bmi7 mated -> matage mated -> bmi7
sep_unmeas -> mated sep_unmeas -> r pregsize -> bmi7
pregsize -> bwt sep_unmeas -> bwt")
#> Based on the proposed directed acyclic graph (DAG), the incomplete
#> variable and its missingness indicator are not independent given
#> imputation model predictors. Hence, multiple imputation methods which
#> assume data are missing at random are not valid.
#>
#> Consider using a different imputation model and/or strategy (e.g.
#> not-at-random fully conditional specification). For example, the
#> incomplete variable and its missingness indicator are independent if,
#> in addition to the specified predictors, the following sets of
#> variables are included as predictors in the imputation model (note that
#> this list is not necessarily exhaustive, particularly if your DAG is
#> complex):
#>
#> pregsize
#>
#> c("pregsize", "sep_unmeas")