The variational method of a CmdStanModel object runs Stan's variational Bayes (ADVI) algorithms.

Details

CmdStan can fit a variational approximation to the posterior. The approximation is a Gaussian in the unconstrained variable space. Stan implements two variational algorithms. The algorithm="meanfield" option uses a fully factorized Gaussian for the approximation. The algorithm="fullrank" option uses a Gaussian with a full-rank covariance matrix for the approximation.

-- CmdStan Interface User's Guide

Usage

$variational(
  data = NULL,
  seed = NULL,
  refresh = NULL,
  init = NULL,
  algorithm = NULL,
  iter = NULL,
  grad_samples = NULL,
  elbo_samples = NULL,
  eta = NULL,
  adapt_engaged = NULL,
  adapt_iter = NULL,
  tol_rel_obj = NULL,
  eval_elbo = NULL,
  output_samples = NULL
)

Arguments shared by all fitting methods

The following arguments can be specified for any of the fitting methods (sample, optimize, variational). Arguments left at NULL default to the default used by the installed version of CmdStan.

  • data (multiple options): The data to use:

    • A named list of R objects like for RStan;

    • A path to a data file compatible with CmdStan (R dump or JSON). See the appendices in the CmdStan manual for details on using these formats.

  • seed: (positive integer) A seed for the (P)RNG to pass to CmdStan.

  • refresh: (non-negative integer) The number of iterations between screen updates.

  • init: (multiple options) The initialization method:

    • A real number x>0 initializes randomly between [-x,x] (on the unconstrained parameter space);

    • 0 initializes to 0;

    • A character vector of data file paths (one per chain) to initialization files.

Arguments unique to the variational method

In addition to the arguments above, the variational method also has its own set of arguments. These arguments are described briefly here and in greater detail in the CmdStan manual. Arguments left at NULL default to the default used by the installed version of CmdStan.

  • algorithm: (string) The algorithm. Either "meanfield" or "fullrank".

  • iter: (positive integer) The maximum number of iterations.

  • grad_samples: (positive integer) The number of samples for Monte Carlo estimate of gradients.

  • elbo_samples: (positive integer) The number of samples for Monte Carlo estimate of ELBO (objective function).

  • eta: (positive real) The stepsize weighting parameter for adaptive stepsize sequence.

  • adapt_engaged: (logical) Do warmup adaptation?

  • adapt_iter: (positive integer) The maximum number of adaptation iterations.

  • tol_rel_obj: (positive real) Convergence tolerance on the relative norm of the objective.

  • eval_elbo: (positive integer) Evaluate ELBO every Nth iteration.

  • output_samples: (positive integer) Number of posterior samples to draw and save.

Value

The variational method returns a CmdStanVB object.

See also

The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.

The Stan and CmdStan documentation:

Other CmdStanModel methods: CmdStanModel-method-compile, CmdStanModel-method-optimize, CmdStanModel-method-sample

Examples

# \dontrun{ # Set path to cmdstan # Note: if you installed CmdStan via install_cmdstan() with default settings # then default below should work. Otherwise use the `path` argument to # specify the location of your CmdStan installation. set_cmdstan_path(path = NULL)
#> CmdStan path set to: /Users/jgabry/.cmdstanr/cmdstan
# Create a CmdStan model object from a Stan program, # here using the example model that comes with CmdStan stan_program <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan") mod <- cmdstan_model(stan_program) mod$print()
#> data { #> int<lower=0> N; #> int<lower=0,upper=1> y[N]; #> } #> parameters { #> real<lower=0,upper=1> theta; #> } #> model { #> theta ~ beta(1,1); #> for (n in 1:N) #> y[n] ~ bernoulli(theta); #> }
# Compile to create executable mod$compile()
#> Running make /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli #> make: `/Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli' is up to date.
# Run sample method (MCMC via Stan's dynamic HMC/NUTS), # specifying data as a named list (like RStan) standata <- list(N = 10, y =c(0,1,0,0,0,0,0,0,0,1)) fit_mcmc <- mod$sample(data = standata, seed = 123, num_chains = 2)
#> method = sample (Default) #> sample #> num_samples = 1000 (Default) #> num_warmup = 1000 (Default) #> save_warmup = 0 (Default) #> thin = 1 (Default) #> adapt #> engaged = 1 (Default) #> gamma = 0.050000000000000003 (Default) #> delta = 0.80000000000000004 (Default) #> kappa = 0.75 (Default) #> t0 = 10 (Default) #> init_buffer = 75 (Default) #> term_buffer = 50 (Default) #> window = 25 (Default) #> algorithm = hmc (Default) #> hmc #> engine = nuts (Default) #> nuts #> max_depth = 10 (Default) #> metric = diag_e (Default) #> metric_file = (Default) #> stepsize = 1 (Default) #> stepsize_jitter = 0 (Default) #> id = 1 #> data #> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022add5243.data.R #> init = 2 (Default) #> random #> seed = 123 #> output #> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv #> diagnostic_file = (Default) #> refresh = 100 (Default) #> #> #> Gradient evaluation took 2e-05 seconds #> 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds. #> Adjust your expectations accordingly! #> #> #> Iteration: 1 / 2000 [ 0%] (Warmup) #> Iteration: 100 / 2000 [ 5%] (Warmup) #> Iteration: 200 / 2000 [ 10%] (Warmup) #> Iteration: 300 / 2000 [ 15%] (Warmup) #> Iteration: 400 / 2000 [ 20%] (Warmup) #> Iteration: 500 / 2000 [ 25%] (Warmup) #> Iteration: 600 / 2000 [ 30%] (Warmup) #> Iteration: 700 / 2000 [ 35%] (Warmup) #> Iteration: 800 / 2000 [ 40%] (Warmup) #> Iteration: 900 / 2000 [ 45%] (Warmup) #> Iteration: 1000 / 2000 [ 50%] (Warmup) #> Iteration: 1001 / 2000 [ 50%] (Sampling) #> Iteration: 1100 / 2000 [ 55%] (Sampling) #> Iteration: 1200 / 2000 [ 60%] (Sampling) #> Iteration: 1300 / 2000 [ 65%] (Sampling) #> Iteration: 1400 / 2000 [ 70%] (Sampling) #> Iteration: 1500 / 2000 [ 75%] (Sampling) #> Iteration: 1600 / 2000 [ 80%] (Sampling) #> Iteration: 1700 / 2000 [ 85%] (Sampling) #> Iteration: 1800 / 2000 [ 90%] (Sampling) #> Iteration: 1900 / 2000 [ 95%] (Sampling) #> Iteration: 2000 / 2000 [100%] (Sampling) #> #> Elapsed Time: 0.014348 seconds (Warm-up) #> 0.021322 seconds (Sampling) #> 0.03567 seconds (Total) #> #> method = sample (Default) #> sample #> num_samples = 1000 (Default) #> num_warmup = 1000 (Default) #> save_warmup = 0 (Default) #> thin = 1 (Default) #> adapt #> engaged = 1 (Default) #> gamma = 0.050000000000000003 (Default) #> delta = 0.80000000000000004 (Default) #> kappa = 0.75 (Default) #> t0 = 10 (Default) #> init_buffer = 75 (Default) #> term_buffer = 50 (Default) #> window = 25 (Default) #> algorithm = hmc (Default) #> hmc #> engine = nuts (Default) #> nuts #> max_depth = 10 (Default) #> metric = diag_e (Default) #> metric_file = (Default) #> stepsize = 1 (Default) #> stepsize_jitter = 0 (Default) #> id = 2 #> data #> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022add5243.data.R #> init = 2 (Default) #> random #> seed = 124 #> output #> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv #> diagnostic_file = (Default) #> refresh = 100 (Default) #> #> #> Gradient evaluation took 1.9e-05 seconds #> 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds. #> Adjust your expectations accordingly! #> #> #> Iteration: 1 / 2000 [ 0%] (Warmup) #> Iteration: 100 / 2000 [ 5%] (Warmup) #> Iteration: 200 / 2000 [ 10%] (Warmup) #> Iteration: 300 / 2000 [ 15%] (Warmup) #> Iteration: 400 / 2000 [ 20%] (Warmup) #> Iteration: 500 / 2000 [ 25%] (Warmup) #> Iteration: 600 / 2000 [ 30%] (Warmup) #> Iteration: 700 / 2000 [ 35%] (Warmup) #> Iteration: 800 / 2000 [ 40%] (Warmup) #> Iteration: 900 / 2000 [ 45%] (Warmup) #> Iteration: 1000 / 2000 [ 50%] (Warmup) #> Iteration: 1001 / 2000 [ 50%] (Sampling) #> Iteration: 1100 / 2000 [ 55%] (Sampling) #> Iteration: 1200 / 2000 [ 60%] (Sampling) #> Iteration: 1300 / 2000 [ 65%] (Sampling) #> Iteration: 1400 / 2000 [ 70%] (Sampling) #> Iteration: 1500 / 2000 [ 75%] (Sampling) #> Iteration: 1600 / 2000 [ 80%] (Sampling) #> Iteration: 1700 / 2000 [ 85%] (Sampling) #> Iteration: 1800 / 2000 [ 90%] (Sampling) #> Iteration: 1900 / 2000 [ 95%] (Sampling) #> Iteration: 2000 / 2000 [100%] (Sampling) #> #> Elapsed Time: 0.01225 seconds (Warm-up) #> 0.019663 seconds (Sampling) #> 0.031913 seconds (Total) #>
# Call CmdStan's bin/summary fit_mcmc$summary()
#> Running bin/stansummary \ #> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv \ #> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv #> Inference for Stan model: bernoulli_model #> 2 chains: each with iter=(1000,1000); warmup=(0,0); thin=(1,1); 2000 iterations saved. #> #> Warmup took (0.014, 0.012) seconds, 0.027 seconds total #> Sampling took (0.021, 0.020) seconds, 0.041 seconds total #> #> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat #> lp__ -7.3 2.7e-02 7.3e-01 -8.9 -7.0 -6.8 738 18018 1.0e+00 #> accept_stat__ 0.92 3.1e-03 1.3e-01 0.64 0.97 1.0 1670 40735 1.0e+00 #> stepsize__ 0.92 1.7e-03 1.7e-03 0.92 0.92 0.92 1.0 24 1.4e+12 #> treedepth__ 1.3 1.1e-02 4.7e-01 1.0 1.0 2.0 1968 48012 1.0e+00 #> n_leapfrog__ 2.4 2.5e-02 1.0e+00 1.0 3.0 3.0 1640 40018 1.0e+00 #> divergent__ 0.00 0.0e+00 0.0e+00 0.00 0.00 0.00 1000 24399 nan #> energy__ 7.8 4.0e-02 1.0e+00 6.8 7.5 9.8 647 15780 1.0e+00 #> theta 0.24 4.6e-03 1.2e-01 0.077 0.22 0.47 720 17570 1.0e+00 #> #> Samples were drawn using hmc with nuts. #> For each parameter, N_Eff is a crude measure of effective sample size, #> and R_hat is the potential scale reduction factor on split chains (at #> convergence, R_hat=1). #>
# Run optimization method (default is Stan's LBFGS algorithm) # and also demonstrate specifying data as a path to a file (readable by CmdStan) my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.R") fit_optim <- mod$optimize(data = my_data_file, seed = 123)
#> Warning: Optimization method is experimental and the structure of returned object may change.
#> method = optimize #> optimize #> algorithm = lbfgs (Default) #> lbfgs #> init_alpha = 0.001 (Default) #> tol_obj = 9.9999999999999998e-13 (Default) #> tol_rel_obj = 10000 (Default) #> tol_grad = 1e-08 (Default) #> tol_rel_grad = 10000000 (Default) #> tol_param = 1e-08 (Default) #> history_size = 5 (Default) #> iter = 2000 (Default) #> save_iterations = 0 (Default) #> id = 1 #> data #> file = /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli.data.R #> init = 2 (Default) #> random #> seed = 123 #> output #> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-optimize-1.csv #> diagnostic_file = (Default) #> refresh = 100 (Default) #> #> Initial log joint probability = -9.51104 #> Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes #> 6 -5.00402 0.000103557 2.55661e-07 1 1 9 #> Optimization terminated normally: #> Convergence detected: relative gradient magnitude is below tolerance
#' Print estimates fit_optim$summary()
#> Estimates from optimization:
#> theta lp__ #> 0.20000 -5.00402
# Run variational Bayes method (default is meanfield ADVI) fit_vb <- mod$variational(data = standata, seed = 123)
#> Warning: Variational inference method is experimental and the structure of returned object may change.
#> method = variational #> variational #> algorithm = meanfield (Default) #> meanfield #> iter = 10000 (Default) #> grad_samples = 1 (Default) #> elbo_samples = 100 (Default) #> eta = 1 (Default) #> adapt #> engaged = 1 (Default) #> iter = 50 (Default) #> tol_rel_obj = 0.01 (Default) #> eval_elbo = 100 (Default) #> output_samples = 1000 (Default) #> id = 1 #> data #> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022843c2b1.data.R #> init = 2 (Default) #> random #> seed = 123 #> output #> file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv #> diagnostic_file = (Default) #> refresh = 100 (Default) #> #> ------------------------------------------------------------ #> EXPERIMENTAL ALGORITHM: #> This procedure has not been thoroughly tested and may be unstable #> or buggy. The interface is subject to change. #> ------------------------------------------------------------ #> #> #> #> Gradient evaluation took 2.1e-05 seconds #> 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds. #> Adjust your expectations accordingly! #> #> #> Begin eta adaptation. #> Iteration: 1 / 250 [ 0%] (Adaptation) #> Iteration: 50 / 250 [ 20%] (Adaptation) #> Iteration: 100 / 250 [ 40%] (Adaptation) #> Iteration: 150 / 250 [ 60%] (Adaptation) #> Iteration: 200 / 250 [ 80%] (Adaptation) #> Success! Found best value [eta = 1] earlier than expected. #> #> Begin stochastic gradient ascent. #> iter ELBO delta_ELBO_mean delta_ELBO_med notes #> 100 -6.258 1.000 1.000 #> 200 -6.475 0.517 1.000 #> 300 -6.228 0.358 0.040 #> 400 -6.220 0.269 0.040 #> 500 -6.379 0.220 0.034 #> 600 -6.195 0.188 0.034 #> 700 -6.262 0.163 0.030 #> 800 -6.345 0.144 0.030 #> 900 -6.201 0.131 0.025 #> 1000 -6.307 0.119 0.025 #> 1100 -6.290 0.020 0.023 #> 1200 -6.238 0.017 0.017 #> 1300 -6.182 0.014 0.013 #> 1400 -6.167 0.014 0.013 #> 1500 -6.219 0.012 0.011 #> 1600 -6.164 0.010 0.009 MEDIAN ELBO CONVERGED #> #> Drawing a sample of size 1000 from the approximate posterior... #> COMPLETED.
# Call CmdStan's bin/summary fit_vb$summary()
#> Running bin/stansummary \ #> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv #> Warning: non-fatal error reading adapation data #> Inference for Stan model: bernoulli_model #> 1 chains: each with iter=(1001); warmup=(0); thin=(0); 1001 iterations saved. #> #> Warmup took (0.00) seconds, 0.00 seconds total #> Sampling took (0.00) seconds, 0.00 seconds total #> #> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat #> lp__ 0.00 0.0e+00 0.00 0.00 0.00 0.0e+00 500 inf nan #> log_p__ -7.2 2.5e-02 0.72 -8.6 -7.0 -6.8e+00 789 inf 1.0e+00 #> log_g__ -0.54 2.9e-02 0.76 -2.1 -0.27 -1.5e-03 679 inf 1.0e+00 #> theta 0.26 4.2e-03 0.12 0.091 0.23 4.9e-01 823 inf 1.0e+00 #> #> Samples were drawn using meanfield with . #> For each parameter, N_Eff is a crude measure of effective sample size, #> and R_hat is the potential scale reduction factor on split chains (at #> convergence, R_hat=1). #>
# For models fit using MCMC, if you like working with RStan's stanfit objects # then you can create one with rstan::read_stan_csv() if (require(rstan, quietly = TRUE)) { stanfit <- rstan::read_stan_csv(fit_mcmc$output_files()) print(stanfit) }
#> Inference for Stan model: bernoulli-stan-sample-1. #> 2 chains, each with iter=2000; warmup=1000; thin=1; #> post-warmup draws per chain=1000, total post-warmup draws=2000. #> #> mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat #> theta 0.24 0.00 0.12 0.06 0.14 0.22 0.32 0.52 720 1 #> lp__ -7.32 0.03 0.73 -9.37 -7.53 -7.05 -6.81 -6.75 737 1 #> #> Samples were drawn using NUTS(diag_e) at Mon Oct 14 21:41:32 2019. #> For each parameter, n_eff is a crude measure of effective sample size, #> and Rhat is the potential scale reduction factor on split chains (at #> convergence, Rhat=1).
# }