Use a fitted model to create replicate datasets, typically as a way of checking a model.

Usage

replicate_data(x, condition_on = NULL, n = 19)

Arguments

x

A fitted model, typically created by calling mod_pois(), mod_binom(), or mod_norm(), and then fit().

condition_on

Parameters to condition on. Either "expected" or "fitted". See details.

n

Number of replicate datasets to create. Default is 19.

Value

A tibble with the following structure:

.replicate         data
"Original"         Original data supplied to mod_pois(), mod_binom(), or mod_norm().
"Replicate 1"      Simulated data.
"Replicate 2"      Simulated data.
...                ...
"Replicate <n>"    Simulated data.

Details

Use n draws from the posterior distribution for model parameters to generate n simulated datasets. If the model is working well, these simulated datasets should look similar to the actual dataset.

The condition_on argument

With Poisson and binomial models that include dispersion terms (which is the default), there are two options for constructing replicate data.

  • When condition_on is "fitted", the replicate data is created by (i) drawing values from the posterior distribution for rates or probabilities (the \(\gamma_i\) defined in mod_pois() and mod_binom()), and (ii) conditional on these rates or probabilities, drawing values for the outcome variable.

  • When condition_on is "expected", the replicate data is created by (i) drawing values from the posterior distribution for the hyper-parameters governing the rates or probabilities (the \(\mu_i\) and \(\xi\) defined in mod_pois() and mod_binom()), then (ii) conditional on these hyper-parameters, drawing new values for the rates or probabilities, and finally (iii) conditional on these rates or probabilities, drawing values for the outcome variable.

The default for condition_on is "expected". The "expected" option provides a more severe test for a model than the "fitted" option, since "fitted" values are weighted averages of the "expected" values and the original data.
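
The two options can be sketched in base R for a single posterior draw. This is an illustration only, not the package's implementation: all numeric values are made up, the variable names are hypothetical, and the gamma parameterisation (shape \(\xi^{-1}\), rate \((\xi \mu_i)^{-1}\)) is an assumption about the Poisson model described in mod_pois().

```r
set.seed(1)

## Made-up inputs standing in for one posterior draw (all values hypothetical)
n_cell <- 5
w <- c(100, 200, 150, 300, 250)              # exposure for each cell
mu <- c(0.02, 0.03, 0.05, 0.04, 0.01)        # expected rates (mu_i)
xi <- 0.1                                    # dispersion (xi)
gamma_post <- c(0.021, 0.028, 0.055, 0.037, 0.012)  # fitted rates (gamma_i)

## condition_on = "fitted": keep the drawn rates, simulate outcomes only
y_fitted <- rpois(n_cell, lambda = gamma_post * w)

## condition_on = "expected": (ii) redraw rates given the hyper-parameters
## (assumed gamma distribution with mean mu), then (iii) simulate outcomes
gamma_new <- rgamma(n_cell, shape = 1 / xi, rate = 1 / (xi * mu))
y_expected <- rpois(n_cell, lambda = gamma_new * w)
```

Under "expected" the simulated rates are not tied to the observed outcomes through the fitted values, which is why the replicates have more room to differ from the original data.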

As described in mod_norm(), normal models have a different structure from Poisson and binomial models, and the distinction between "fitted" and "expected" does not apply.

Data models for outcomes

If a data model has been provided for the outcome variable, then creation of replicate data will include a step where errors are added to outcomes. For instance, if an rr3 data model is used, then replicate_data() randomly rounds the simulated outcomes to base 3.
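
A minimal sketch of random rounding to base 3, to show the kind of error step involved. The helper rr3_sketch() is hypothetical and is not the function used inside replicate_data(); it assumes the common rr3 rule, in which multiples of 3 are left unchanged and other values move to the nearer multiple of 3 with probability 2/3 and to the farther one with probability 1/3.

```r
## Hypothetical helper: random rounding to base 3 (assumed rr3 rule)
rr3_sketch <- function(x) {
  remainder <- x %% 3
  lower <- x - remainder                 # largest multiple of 3 at or below x
  u <- runif(length(x))
  ## remainder 1: round up with probability 1/3
  ## remainder 2: round up with probability 2/3
  ## remainder 0: already a multiple of 3, never round up
  round_up <- ifelse(remainder == 1, u < 1 / 3,
                     ifelse(remainder == 2, u < 2 / 3, FALSE))
  lower + 3 * round_up
}

set.seed(1)
y <- c(0, 1, 2, 3, 4, 5, 10)
y_rounded <- rr3_sketch(y)
```

Every element of y_rounded is a multiple of 3 and lies within 2 of the corresponding element of y.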

Examples

mod <- mod_pois(injuries ~ age:sex + ethnicity + year,
                data = injuries,
                exposure = 1) |>
  fit()

rep_data <- mod |>
  replicate_data()

library(dplyr)
rep_data |>
  group_by(.replicate) |>
  count(wt = injuries)
#> # A tibble: 20 × 2
#> # Groups:   .replicate [20]
#>    .replicate       n
#>    <fct>        <dbl>
#>  1 Original     21588
#>  2 Replicate 1  21342
#>  3 Replicate 2  21361
#>  4 Replicate 3  21564
#>  5 Replicate 4  20486
#>  6 Replicate 5  21356
#>  7 Replicate 6  21508
#>  8 Replicate 7  21067
#>  9 Replicate 8  22037
#> 10 Replicate 9  21617
#> 11 Replicate 10 20854
#> 12 Replicate 11 21131
#> 13 Replicate 12 22050
#> 14 Replicate 13 21681
#> 15 Replicate 14 21164
#> 16 Replicate 15 21448
#> 17 Replicate 16 21439
#> 18 Replicate 17 22109
#> 19 Replicate 18 21167
#> 20 Replicate 19 21727

## when the overall model includes an rr3 data model,
## replicate data are rounded to base 3
mod_pois(injuries ~ age:sex + ethnicity + year,
         data = injuries,
         exposure = popn) |>
  set_datamod_outcome_rr3() |>
  fit() |>
  replicate_data()
#> # A tibble: 18,240 × 7
#>    .replicate age   sex    ethnicity  year injuries  popn
#>    <fct>      <fct> <chr>  <chr>     <int>    <dbl> <int>
#>  1 Original   0-4   Female Maori      2000       12 35830
#>  2 Original   5-9   Female Maori      2000        6 35120
#>  3 Original   10-14 Female Maori      2000        3 32830
#>  4 Original   15-19 Female Maori      2000        6 27130
#>  5 Original   20-24 Female Maori      2000        6 24380
#>  6 Original   25-29 Female Maori      2000        6 24160
#>  7 Original   30-34 Female Maori      2000       12 22560
#>  8 Original   35-39 Female Maori      2000        3 22230
#>  9 Original   40-44 Female Maori      2000        6 18130
#> 10 Original   45-49 Female Maori      2000        6 13770
#> # ℹ 18,230 more rows