16.6 Running a Bayesian model with rstanarm

Any time we use a convenient wrapper in R, we sacrifice a little control, but for getting exposure to Bayesian methods I think this is a worthwhile sacrifice. Truth be told, I write all of my models out by hand because I often need to use non-standard models, such as non-linear growth models, and I need that control. The convenience and flexibility offered by the model-fitting functions in rstanarm will keep you more than busy until you reach the deeper level of understanding needed to write your own models, should you aspire to do so.

Let’s go ahead and estimate a Gaussian GLM using logmass as our response and loglength as our explanatory variable.

For the sake of simplicity, we’ll create new columns in the cray data for loglength and logmass:

cray <- cray %>%
  mutate(loglength = log(length),
         logmass = log(mass)
         )

The next thing we’ll do is set a couple of options that help Stan run more efficiently: auto_write = TRUE saves the compiled version of the model so it does not need to be recompiled every time we use it, and mc.cores lets the chains run in parallel across the cores on your computer. We only need to do this once per R session.

rstan::rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

Now we can fit our first Bayesian model. We will do this using basically the same approach that we use to fit a GLM with glm() in R:

cray_model <- stan_glm(logmass ~ loglength,
         family = gaussian(),
         data = cray
         )
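
By default, stan_glm() runs four Markov chains of 2,000 iterations each, discards the first half of each chain as warm-up, and places weakly informative default priors on the parameters. Four chains times 1,000 post-warm-up draws is where the 4,000 posterior samples reported in the summary below come from. If you want more control, or reproducible draws, you can set those arguments yourself. Here is a minimal sketch; the values for chains, iter, and seed are purely illustrative:

cray_model <- stan_glm(logmass ~ loglength,
         family = gaussian(),
         data = cray,
         chains = 4,   # number of Markov chains to run
         iter = 2000,  # iterations per chain (the first half is warm-up)
         seed = 123    # set a seed so the draws are reproducible
         )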

This is the Bayesian analog of the model we would use to test the null hypothesis that the log length of crayfish has no influence on log mass (slope = 0).

We can have a look at a quick summary() of our model in the usual way:

summary(cray_model)
## 
## Model Info:
##  function:     stan_glm
##  family:       gaussian [identity]
##  formula:      logmass ~ loglength
##  algorithm:    sampling
##  sample:       4000 (posterior sample size)
##  priors:       see help('prior_summary')
##  observations: 964
##  predictors:   2
## 
## Estimates:
##               mean   sd   10%   50%   90%
## (Intercept) -8.5    0.1 -8.5  -8.5  -8.4 
## loglength    3.0    0.0  3.0   3.0   3.1 
## sigma        0.2    0.0  0.2   0.2   0.2 
## 
## Fit Diagnostics:
##            mean   sd   10%   50%   90%
## mean_PPD 1.2    0.0  1.2   1.2   1.2  
## 
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
## 
## MCMC diagnostics
##               mcse Rhat n_eff
## (Intercept)   0.0  1.0  4153 
## loglength     0.0  1.0  4146 
## sigma         0.0  1.0  2745 
## mean_PPD      0.0  1.0  3609 
## log-posterior 0.0  1.0  1879 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

If you look closely at the output above, you should recognize that, just as with the glm() function, we get a brief summary of our coefficients in the Estimates portion of the output. The difference is that these are summaries (means, standard deviations, and quantiles) of the posterior distribution for each parameter rather than point estimates with standard errors and p-values.
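
If you want to work with these estimates directly instead of reading them off the printout, rstanarm has some convenient extractor functions. As a quick sketch (the 95% interval width below is just one common choice), coef() returns the posterior medians of the coefficients, posterior_interval() returns credible intervals, and prior_summary() shows the default priors that were used:

# Posterior medians of the intercept and slope
coef(cray_model)

# 95% credible intervals for the coefficients and sigma
posterior_interval(cray_model, prob = 0.95)

# The default (weakly informative) priors used for this model
prior_summary(cray_model)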

16.6.1 Interpreting the model summary

As for the rest of this summary, for now we will focus on a few specific things:

1. Do our estimates make logical sense? Not whether they are right or wrong, but is our estimate crap or not? If the 95% CRI (credible interval) goes from negative infinity to infinity on the real scale, then we probably have some issues with estimation, because this means that we learned nothing new from the data (crap).

2. We need to look at the value of Rhat (\(\hat{R}\), the potential scale-reduction factor). This statistic is a diagnostic that helps us determine whether or not the model has even converged on an estimate for our parameter(s) of interest. It assesses the degree of mixing (agreement) between the chains that we used to get our estimates. Here, our values are about 1, so we are satisfied. Values much larger than 1 (a common rule of thumb is anything greater than about 1.1) would indicate a problem. We will examine some graphical diagnostics of mixing below.

3. We need to pay attention to n_eff. This quantity is the ‘number of effective samples’ that we have taken from the posterior. As discussed in class, the draws we take from the posterior can be auto-correlated to varying degrees depending on the sampler used. n_eff tells us how many independent samples we can actually consider ourselves to have drawn from the posterior. To have have some degree of confidence in our parameter estimates, we want this to be at least several hundred large. More is better, but these models can take a long time to run as they build in complexity so there is a balance to be struck. If we are running long chains and we still are not achieving large n_eff, it is a pretty good indication that we need to increase our thinning rate or (more likely) consider an alternative model parameterization. We are good to go according to the output from the cray_model above.