
Extract the original data, together with modelled rates, probabilities, or means, from a model object. The return value consists of the original data plus one or more columns of modelled values.

Usage

# S3 method for class 'bage_mod'
augment(x, quiet = FALSE, ...)

Arguments

x

Object of class "bage_mod", typically created with mod_pois(), mod_binom(), or mod_norm().

quiet

Whether to suppress messages. Default is FALSE.

...

Unused. Included for generic consistency only.

Value

A tibble, with the original data plus one or more of the following columns:

  • .<outcome> Corrected or extended version of the outcome variable, in applications where the outcome variable has missing values, or a data model is being used.

  • .observed 'Direct' estimates of rates or probabilities, i.e. counts divided by exposure or size (in Poisson and binomial models).

  • .fitted Draws of rates, probabilities, or means.

  • .expected Draws of expected values for rates or probabilities (in Poisson models that include exposure, or in binomial models).
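As a rough illustration of the 'direct' estimate in .observed, the sketch below divides counts by exposure in base R. It assumes that .observed is exactly this ratio, as the description above suggests. The data frame toy and the column observed are made up for the illustration; the three rows copy the divorces and population values from the first rows of the example output.

```r
## toy data: counts and exposure taken from the first three
## rows of the 'divorces' example output (illustrative only)
toy <- data.frame(divorces = c(0, 6, 3),
                  population = c(154460, 153060, 152250))

## direct estimate of the rate: events divided by exposure
toy$observed <- toy$divorces / toy$population
toy
```

The results agree, to three significant figures, with the .observed values shown for those rows in the fitted-model example.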

Uncertain quantities are represented using rvecs.
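Because columns such as .fitted hold draws rather than single numbers, they usually need to be summarised before plotting or tabulating. The sketch below is one possible approach, assuming the rvec package is installed and using its draws_mean() and draws_median() helpers. The column names fitted_mean and fitted_median are invented for the example.

```r
library(bage)
library(rvec)

## fit a small model (100 draws, to keep the example fast)
mod <- mod_pois(divorces ~ age + sex + time,
                data = divorces,
                exposure = population) |>
  set_n_draw(n_draw = 100) |>
  fit()

aug <- augment(mod, quiet = TRUE)

## collapse each rvec element to an ordinary number
aug$fitted_mean <- draws_mean(aug$.fitted)
aug$fitted_median <- draws_median(aug$.fitted)

aug[c("age", "sex", "time", "fitted_mean", "fitted_median")]
```

The summaries are plain numeric vectors, so they can be passed to functions that do not understand rvecs.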

Fitted vs unfitted models

augment() is typically called on a fitted model. In this case, the modelled values are draws from the joint posterior distribution for rates, probabilities, or means.

augment() can, however, be called on an unfitted model. In this case, the modelled values are draws from the joint prior distribution. In other words, the modelled values are informed by model priors, and by values for exposure, size, or weights, but not by observed outcomes.
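When augment() is called on an unfitted model, it emits a message noting that values are drawn from the prior. The sketch below assumes that quiet = TRUE suppresses this message, which is what the argument description implies. The object names mod_unfitted and prior_draws are illustrative.

```r
library(bage)

## unfitted model: note there is no call to fit()
mod_unfitted <- mod_pois(divorces ~ age + sex + time,
                         data = divorces,
                         exposure = population) |>
  set_n_draw(n_draw = 100)

## quiet = TRUE is assumed to silence the 'Model not fitted' message
prior_draws <- augment(mod_unfitted, quiet = TRUE)
names(prior_draws)
```

This can be convenient in scripts that carry out prior predictive checks and would otherwise repeat the same message many times.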

Imputed values for outcome variable

augment() automatically imputes any missing values for the outcome variable. If outcome variable var has one or more NAs, then augment() creates a variable .var holding the original and imputed values.

Data model for outcome variable

If the overall model includes a data model for the outcome variable var, then augment() creates a new variable .var containing estimates of the true value for the outcome.

Examples

## specify model
mod <- mod_pois(divorces ~ age + sex + time,
                data = divorces,
                exposure = population) |>
  set_n_draw(n_draw = 100) ## fewer draws, so 'augment' runs faster

## draw from the prior distribution
mod |> augment()
#>  Model not fitted, so values drawn straight from prior distribution.
#> # A tibble: 242 × 8
#>    age   sex     time              divorces population           .observed
#>    <fct> <chr>  <int>           <rdbl<100>>      <dbl>         <rdbl<100>>
#>  1 15-19 Female  2011 72440 (1079, 9419752)     154460    0.47 (0.007, 61)
#>  2 15-19 Female  2012   78540 (73, 6576945)     153060  0.51 (0.00048, 43)
#>  3 15-19 Female  2013  99597 (128, 1.1e+07)     152250  0.65 (0.00084, 74)
#>  4 15-19 Female  2014   77107 (20, 1.6e+07)     152020 0.51 (0.00013, 108)
#>  5 15-19 Female  2015    70802 (368, 1e+08)     152970  0.46 (0.0024, 659)
#>  6 15-19 Female  2016   70648 (45, 4.6e+07)     154170 0.46 (0.00029, 297)
#>  7 15-19 Female  2017   84262 (42, 2.3e+07)     154450 0.55 (0.00027, 148)
#>  8 15-19 Female  2018   60320 (14, 1.2e+08)     154170 0.39 (9.4e-05, 795)
#>  9 15-19 Female  2019   96664 (15, 6.9e+07)     154760 0.62 (9.9e-05, 447)
#> 10 15-19 Female  2020   1e+05 (16, 6.2e+07)     154480 0.67 (0.00011, 402)
#> # ℹ 232 more rows
#> # ℹ 2 more variables: .fitted <rdbl<100>>, .expected <rdbl<100>>

## fit model
mod <- mod |>
  fit()

## draw from the posterior distribution
mod |> augment()
#> # A tibble: 242 × 8
#>    age   sex     time divorces population .observed                    .fitted
#>    <fct> <chr>  <int>    <dbl>      <dbl>     <dbl>                <rdbl<100>>
#>  1 15-19 Female  2011        0     154460 0           1.1e-05 (6e-06, 1.8e-05)
#>  2 15-19 Female  2012        6     153060 0.0000392   1.4e-05 (6.2e-06, 2e-05)
#>  3 15-19 Female  2013        3     152250 0.0000197 1.1e-05 (7.6e-06, 1.8e-05)
#>  4 15-19 Female  2014        3     152020 0.0000197 1.1e-05 (6.2e-06, 1.7e-05)
#>  5 15-19 Female  2015        3     152970 0.0000196 1.1e-05 (7.5e-06, 1.8e-05)
#>  6 15-19 Female  2016        3     154170 0.0000195 1.1e-05 (6.5e-06, 1.8e-05)
#>  7 15-19 Female  2017        6     154450 0.0000388 1.1e-05 (7.3e-06, 1.8e-05)
#>  8 15-19 Female  2018        0     154170 0         8.8e-06 (4.8e-06, 1.5e-05)
#>  9 15-19 Female  2019        3     154760 0.0000194 9.9e-06 (6.3e-06, 1.6e-05)
#> 10 15-19 Female  2020        0     154480 0         7.9e-06 (4.3e-06, 1.3e-05)
#> # ℹ 232 more rows
#> # ℹ 1 more variable: .expected <rdbl<100>>

## insert a missing value into outcome variable
divorces_missing <- divorces
divorces_missing$divorces[1] <- NA

## fitting the model and calling 'augment'
## creates a new variable called '.divorces'
## holding observed and imputed values
mod_pois(divorces ~ age + sex + time,
         data = divorces_missing,
         exposure = population) |>
  fit() |>
  augment()
#> # A tibble: 242 × 9
#>    age   sex     time divorces    .divorces population  .observed
#>    <fct> <chr>  <int>    <dbl> <rdbl<1000>>      <dbl>      <dbl>
#>  1 15-19 Female  2011       NA     2 (0, 6)     154460 NA        
#>  2 15-19 Female  2012        6     6 (6, 6)     153060  0.0000392
#>  3 15-19 Female  2013        3     3 (3, 3)     152250  0.0000197
#>  4 15-19 Female  2014        3     3 (3, 3)     152020  0.0000197
#>  5 15-19 Female  2015        3     3 (3, 3)     152970  0.0000196
#>  6 15-19 Female  2016        3     3 (3, 3)     154170  0.0000195
#>  7 15-19 Female  2017        6     6 (6, 6)     154450  0.0000388
#>  8 15-19 Female  2018        0     0 (0, 0)     154170  0        
#>  9 15-19 Female  2019        3     3 (3, 3)     154760  0.0000194
#> 10 15-19 Female  2020        0     0 (0, 0)     154480  0        
#> # ℹ 232 more rows
#> # ℹ 2 more variables: .fitted <rdbl<1000>>, .expected <rdbl<1000>>

## specifying a data model for the
## original data also leads to a new
## variable called '.divorces'
mod_pois(divorces ~ age + sex + time,
         data = divorces,
         exposure = population) |>
  set_datamod_outcome_rr3() |>
  fit() |>
  augment()
#> # A tibble: 242 × 9
#>    age   sex     time divorces    .divorces population .observed
#>    <fct> <chr>  <int>    <dbl> <rdbl<1000>>      <dbl>     <dbl>
#>  1 15-19 Female  2011        0     1 (0, 2)     154460 0        
#>  2 15-19 Female  2012        6     6 (4, 8)     153060 0.0000392
#>  3 15-19 Female  2013        3     3 (1, 5)     152250 0.0000197
#>  4 15-19 Female  2014        3     3 (1, 5)     152020 0.0000197
#>  5 15-19 Female  2015        3     3 (1, 5)     152970 0.0000196
#>  6 15-19 Female  2016        3     3 (1, 5)     154170 0.0000195
#>  7 15-19 Female  2017        6     5 (4, 7)     154450 0.0000388
#>  8 15-19 Female  2018        0     1 (0, 2)     154170 0        
#>  9 15-19 Female  2019        3     3 (1, 5)     154760 0.0000194
#> 10 15-19 Female  2020        0     1 (0, 2)     154480 0        
#> # ℹ 232 more rows
#> # ℹ 2 more variables: .fitted <rdbl<1000>>, .expected <rdbl<1000>>