Skip to contents

After having built a stock-and-flow model, you may want to explore how different parameter values affect the model’s behaviour. Running multiple simulations with varying parameters is also called an ensemble, which provides insight into the range of possible outcomes and uncertainty associated with your model. In this vignette, we will explore how to set up and run ensemble simulations using the sdbuildR package.

Setting up the model

For this example, we will use Crielaard et al.’s (2022) model of eating behaviour, including the stocks hunger, eating, and compensatory behaviour (i.e. disordered eating behaviour such as purging and overexercising). For more details, see Crielaard et al. (2022). We can load this example from the model library and look what is inside:

sfm <- xmile("Crielaard2022")
summary(sfm)
#> Your model contains:
#> * 3 Stocks: Food_intake, Hunger, Compensatory_behaviour
#> * 8 Flows: Losing_energy_by_compensatory_behavior, Feeling_hunger, Satiety, Food_intake_reduces_hunger, Compensating_for_having_eaten, Satisfaction_with_hungry_feeling, Effect_of_eating_triggers, Effect_of_compensatory_behavior
#> * 3 Constants: a0, a1, a2
#> * 0 Auxiliaries
#> * 0 Graphical Functions
#> * 0 Custom model units
#> * 0 Macros
#> 
#> Simulation time: 0.0 to 100.0 days (dt = 0.01)
#> Simulation settings: solver euler in R

Ensemble simulations are only supported in Julia, so we need to load the Julia environment and change the simulation language to Julia:

use_julia()
#> Starting Julia ...
#> Connecting to Julia TCP server at localhost:11980 ...
#> Setting up Julia environment for sdbuildR...
sfm <- sim_specs(sfm, language = "Julia")

Without changing the parameters, we can run a single simulation to see how the model behaves:

sim <- simulate(sfm)
plot(sim)

As the model has random initial conditions, another run will be different:

sim <- simulate(sfm)
plot(sim)

To explore this more systematically, we can run an ensemble simulation using the ensemble() function.

Running ensemble simulations

Ensemble simulations create multiple runs of the model, which only makes sense if the model either has some random elements or if parameters are being varied. Our model already has random initial conditions, but if it had not, we could create these:

sfm <- build(sfm, c("Food_intake", "Hunger", "Compensatory_behaviour"),
  eqn = "runif(1)")

With random initial conditions, multiple runs of the same model will be different. As running ensemble simulations can be quite memory intensive, it is highly recommended to reduce the size of the returned timeseries. This will save memory and speed up the simulation. For example, we can only save the timeseries every 1 time units, and only save from time 10:

sfm <- sim_specs(sfm, save_at = 1, save_from = 10)

The model is now ready for running ensemble simulations. We complete 100 runs using the ensemble() function:

sims <- ensemble(sfm, n = 100)
#> Running a total of 100 simulations
#> Simulation took 4.6642 seconds
plot(sims)

The plot shows the mean and confidence interval of the stocks (mean with 95% confidence interval). We can also plot the individual runs, for which we first have to rerun the simulation and set return_sims = TRUE:

sims <- ensemble(sfm, n = 30, return_sims = TRUE)
#> Running a total of 30 simulations
#> Simulation took 0.641 seconds
plot(sims, type = "sims")

This automatically only plots the first ten simulations, as plotting a large number of simulations can be quite slow. We can change which simulations we plot by specifying the i argument:

plot(sims, type = "sims", i = 15:30)

By default, only the stocks are saved. To save all variables, set only_stocks = FALSE:

sims <- ensemble(sfm, n = 100, only_stocks = FALSE)
#> Running a total of 100 simulations
#> Simulation took 2.1264 seconds
plot(sims)

Parallel simulations

To run simulations in parallel, set the number of threads you would like to use:

use_threads(n = 4)
#> Warning in use_threads(n = 4): n is set to 4, which is higher than the number
#> of available cores minus 1. Setting it to 3.

To stop using parallel simulations, run use_threads(stop = TRUE).

Specifying ranges

Instead of generating an ensemble with random initial conditions, we can also specify ensembles with exact parameter values. For example, we could vary the a2a_2 parameter, which determines how strongly having eaten increases compensatory behaviour.

sims <- ensemble(sfm,
  n = 100,
  range = list("a2" = c(0.2, 0.4, 0.6, 0.8))
)
#> Running a total of 400 simulations for 4 conditions (100 simulations per condition)
#> Simulation took 2.2523 seconds
plot(sims)

We can also vary multiple parameters at once. For example, we can vary both a2a_2 and a1a_1, where the latter influences how strongly food intake leads to more food intake. n now specifies the number of simulations per condition. By default, cross = TRUE, which means that all possible combinations of parameters are simulated.

sims <- ensemble(sfm,
  range = list(
    "a2" = c(0.2, 0.8),
    "a1" = c(1.3, 1.5)
  ),
  n = 100
)
#> Running a total of 400 simulations for 4 conditions (100 simulations per condition)
#> Simulation took 0.9372 seconds
plot(sims)

The plot shows similarity within columns but differences between columns. As a1a_1 differs between columns, it appears that a1a_1 has a larger effect than a2a_2. To view the parameter combination corresponding to each condition jj, view conditions in sims:

sims$conditions
#>      j  a1  a2
#> [1,] 1 1.3 0.2
#> [2,] 2 1.5 0.2
#> [3,] 3 1.3 0.8
#> [4,] 4 1.5 0.8

To generate a non-crossed designed, set cross = FALSE. In this case, the length of each range needs to be the same.

sims <- ensemble(sfm,
  range = list(
    "a2" = c(0.4, 0.5, 0.6),
    "a1" = c(1.3, 1.4, 1.5)
  ),
  n = 100, cross = FALSE, return_sims = TRUE
)
#> Running a total of 300 simulations for 3 conditions (100 simulations per condition)
#> Simulation took 0.7381 seconds
plot(sims, nrows = 1)

We can select specific conditions to compare, where here we plot the first 5 simulations of the first two conditions:

plot(sims, i = 1:15, j = 1:2, type = "sims", nrows = 1)

Accessing simulation results

The results of the ensemble simulation are stored in the sims object. You can access the summary statistics per condition jj and per timepoint, such as the mean and confidence intervals, using:

head(sims$summary)
#>   j time               variable       mean       median     variance
#> 1 1   10 Compensatory_behaviour 0.49752342 0.4925661167 0.0050657405
#> 2 1   10            Food_intake 0.01127093 0.0006806108 0.0009954441
#> 3 1   10                 Hunger 0.88038502 0.8753712340 0.0027987819
#> 4 1   11 Compensatory_behaviour 0.47184687 0.4676526175 0.0030850559
#> 5 1   11            Food_intake 0.01032006 0.0004710780 0.0008999066
#> 6 1   11                 Hunger 0.89699247 0.8917484667 0.0018608474
#>   missing_count         q025       q975
#> 1             0 3.969197e-01 0.65575622
#> 2             0 8.343609e-06 0.09740325
#> 3             0 8.022650e-01 0.98394813
#> 4             0 3.956835e-01 0.58339579
#> 5             0 6.436691e-06 0.09190464
#> 6             0 8.340815e-01 0.98500971

If you have set have set return_sims = TRUE, you can find the individual simulation runs in sims$df. These are in long format, containing the value of each variable for each timepoint in each simulation ii per condition jj.

head(sims$df)
#>   j i time               variable       value
#> 1 1 1   10 Compensatory_behaviour 0.484402052
#> 2 1 1   10            Food_intake 0.010087221
#> 3 1 1   10                 Hunger 0.851484951
#> 4 1 1   11 Compensatory_behaviour 0.467650293
#> 5 1 1   11            Food_intake 0.007535658
#> 6 1 1   11                 Hunger 0.870943886

To view the initial values of each simulation ii per condition jj, run:

head(sims$init$df)
#>   j i               variable      value
#> 1 1 1 Compensatory_behaviour 0.30478256
#> 2 1 1            Food_intake 0.55435115
#> 3 1 1                 Hunger 0.83209859
#> 4 1 2 Compensatory_behaviour 0.01156185
#> 5 1 2            Food_intake 0.89676911
#> 6 1 2                 Hunger 0.19775285

To view their summary statistics, run:

head(sims$init$summary)
#>   j               variable      mean    median   variance missing_count
#> 1 1 Compensatory_behaviour 0.4790821 0.4860786 0.08572806             0
#> 2 1            Food_intake 0.5213652 0.5095480 0.07865034             0
#> 3 1                 Hunger 0.5321854 0.5321732 0.08626045             0
#> 4 2 Compensatory_behaviour 0.4854778 0.4664117 0.08341138             0
#> 5 2            Food_intake 0.5067007 0.5286892 0.08246959             0
#> 6 2                 Hunger 0.4842567 0.4493411 0.07354595             0
#>         q025      q975
#> 1 0.02679849 0.9649624
#> 2 0.02461372 0.9739386
#> 3 0.02029469 0.9871011
#> 4 0.03805743 0.9849021
#> 5 0.04430578 0.9625319
#> 6 0.06109075 0.9834187

Finally, to access the parameters (i.e. constants) of each simulation ii per condition jj, run:

head(sims$constants$df)
#>   j i variable value
#> 1 1 1       a0  1.31
#> 2 1 1       a1  1.30
#> 3 1 1       a2  0.40
#> 4 1 2       a0  1.31
#> 5 1 2       a1  1.30
#> 6 1 2       a2  0.40

To view their summary statistics, run:

head(sims$constants$summary)
#>   j variable mean median     variance missing_count q025 q975
#> 1 1       a0 1.31   1.31 0.000000e+00             0 1.31 1.31
#> 2 1       a1 1.30   1.30 4.980182e-32             0 1.30 1.30
#> 3 1       a2 0.40   0.40 0.000000e+00             0 0.40 0.40
#> 4 2       a0 1.31   1.31 0.000000e+00             0 1.31 1.31
#> 5 2       a1 1.40   1.40 0.000000e+00             0 1.40 1.40
#> 6 2       a2 0.50   0.50 0.000000e+00             0 0.50 0.50