This page is a short, practical reference for the meta-analytic models we run in the SYPRES living reviews. For each model type, we show the example code block used in our analyses (see analysis/) and walk through every argument and the reasoning behind it.

Most models are fit with the runMetaAnalysis() wrapper from metapsyTools, which is itself a convenience layer on top of {meta} and {metafor}. For the broader theory and worked examples behind these choices, we recommend Doing Meta-Analysis with R (Harrer, Cuijpers, Furukawa & Ebert).


1. Continuous outcomes (random-effects)

The primary efficacy model. Used to pool standardized mean differences (e.g., psilocybin vs. control on a depression rating scale) across studies.

library(metapsyTools)   # provides runMetaAnalysis(); wraps {meta}/{metafor}

main_results <- runMetaAnalysis(data_main,

  # Specify models to run
  which.run       = c("overall", "outliers"),
  which.influence = "overall",
  which.outliers  = "overall",

  # Specify statistical parameters
  es.measure    = "g",         # Hedges' g
  method.tau    = "REML",
  method.tau.ci = "Q-Profile",
  hakn          = TRUE,        # Knapp-Hartung adjustment

  # Specify variables
  study.var   = "study",
  arm.var.1   = "condition_arm1",
  arm.var.2   = "condition_arm2",
  measure.var = "instrument",
  w1.var      = "n_arm1",
  w2.var      = "n_arm2",
  time.var    = "time_weeks",
  round.digits = 2
)

Argument-by-argument:

  • which.run = c("overall", "outliers"): fit the main pooled ("overall") model plus a sensitivity model with statistical outliers removed.
  • which.influence = "overall" / which.outliers = "overall": base the influence diagnostics and outlier detection on the overall model.
  • es.measure = "g": pool Hedges' g, the small-sample-corrected standardized mean difference.
  • method.tau = "REML": estimate the between-study heterogeneity variance (τ²) by restricted maximum likelihood.
  • method.tau.ci = "Q-Profile": Q-profile method for the confidence interval around τ².
  • hakn = TRUE: Knapp–Hartung adjustment for the confidence interval and test of the pooled effect; more conservative when the number of studies is small.
  • study.var, arm.var.1, arm.var.2, measure.var, w1.var, w2.var, time.var: column names in the input data identifying the study, the two trial arms, the outcome instrument, the two arm sample sizes, and the assessment timepoint (here in weeks).
  • round.digits = 2: decimal places used when reporting results.

Why these defaults? Hedges’ g + REML + Knapp–Hartung is the recommended combination in the metapsyTools documentation.
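
Printing the returned object shows the pooled results; the fitted models themselves are stored in named slots (cf. time_results$model.threelevel.che in section 3). A minimal sketch of how we typically inspect it; model.overall is an assumed slot name, inferred from that naming pattern:

main_results                          # pooled results for the "overall" and "outliers" runs

# Assumed slot name (mirrors model.threelevel.che in section 3):
m_overall <- main_results$model.overall
summary(m_overall)                    # the underlying {meta} model object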


2. Dichotomous outcomes (response/remission)

Used for binary clinical outcomes such as response (typically defined per study as a ≥50% reduction in symptom scores) or remission (a score below a clinical cutoff, i.e., no longer meeting diagnostic criteria).

response_results <- runMetaAnalysis(data_response,
  which.run     = "overall",
  es.measure    = "RR",        # risk ratio
  es.type       = "raw",
  method.tau    = "PM",
  method.tau.ci = "Q-Profile",
  hakn          = TRUE,        # Knapp-Hartung adjustment

  # Specify variables
  study.var   = "study",
  arm.var.1   = "condition_arm1",
  arm.var.2   = "condition_arm2",
  measure.var = "instrument",
  w1.var      = "n_arm1",
  w2.var      = "n_arm2",
  time.var    = "time_weeks",
  round.digits = 2
)

What changes vs. the continuous model:

  • es.measure = "RR": pool risk ratios instead of standardized mean differences.
  • es.type = "raw": effect sizes are calculated from the raw event counts in each arm rather than taken from precalculated values.
  • method.tau = "PM": the Paule–Mandel τ² estimator, a common choice for binary outcomes.
  • which.run = "overall": only the main pooled model is requested; the outlier and influence analyses from the continuous call are dropped.
  • The variable-mapping block is identical to the continuous call.

Why no es.measure = "OR"? RR is what we report in the paper for clinical interpretability. Switching to OR would only require changing this single argument.


3. Three-level (CHE) models

Used when a study contributes multiple effect sizes — typically several post-treatment timepoints from the same trial. A standard random-effects model assumes effect sizes are independent; ignoring within-study dependence underestimates standard errors. The correlated and hierarchical effects (CHE) model of Pustejovsky & Tipton (2021) handles both sources of non-independence.

time_results <- runMetaAnalysis(data_time,
  which.run     = "threelevel.che",

  # Specify statistical parameters
  es.measure    = "g",
  method.tau    = "REML",
  method.tau.ci = "Q-Profile",   # N/A for three-level models
  hakn          = TRUE,

  # Specify variables
  study.var   = "study",
  arm.var.1   = "condition_arm1",
  arm.var.2   = "condition_arm2",
  measure.var = "instrument",
  w1.var      = "n_arm1",
  w2.var      = "n_arm2",
  time.var    = "time_days",
  round.digits = 2
)

What’s different here:

  • which.run = "threelevel.che" — fits the three-level CHE model via {metafor}’s rma.mv() with cluster-robust (“RVE”) inference. Effect sizes are nested within studies, with two heterogeneity components: within-study (level 2) and between-study (level 3). See Doing Meta-Analysis ch. 10 (“Multilevel” Meta-Analysis); a rough sketch of the corresponding {metafor} code appears at the end of this section.
  • time.var = "time_days" — the variable that distinguishes effect sizes within the same study. Each timepoint becomes a separate row in the long-format input.
  • method.tau.ci = "Q-Profile" is ignored for multilevel models (we leave it for symmetry with other calls; the inline comment flags this). Heterogeneity CIs for three-level models require parametric bootstrapping (i2.ci.boot = TRUE), which we run separately when needed (see the sketch after this list).
  • hakn = TRUE — when combined with rma.mv(), this triggers the small-sample (Tipton-Pustejovsky) adjustment to the cluster-robust standard errors.
  • An additional argument rho.within.study (default 0.6) sets the assumed correlation between effect sizes within a study. Because the true ρ is rarely known, we sweep it from 0 → 1 as a sensitivity check; see the rho-sweep block in the analysis Rmd files and the sketch after this list.
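
Both sensitivity analyses reuse the call above with one extra argument. A minimal sketch, where fit_che() is a hypothetical local helper that simply forwards extra arguments to the call shown earlier (the exact code in our Rmds may differ):

# Hypothetical helper: re-run the three-level CHE call above with extra arguments
fit_che <- function(...) {
  runMetaAnalysis(data_time,
    which.run    = "threelevel.che",
    es.measure   = "g",
    method.tau   = "REML",
    hakn         = TRUE,
    study.var    = "study",
    arm.var.1    = "condition_arm1",
    arm.var.2    = "condition_arm2",
    measure.var  = "instrument",
    w1.var       = "n_arm1",
    w2.var       = "n_arm2",
    time.var     = "time_days",
    round.digits = 2,
    ...)
}

# Parametric-bootstrap CIs for the heterogeneity estimates (slow, hence run separately)
che_boot <- fit_che(i2.ci.boot = TRUE)

# Rho sweep: refit across a grid of assumed within-study correlations.
# Values at the upper boundary can make the fit unstable; nudge below 1 if it fails.
rho_grid  <- seq(0, 1, by = 0.2)
rho_sweep <- lapply(rho_grid, function(r) fit_che(rho.within.study = r))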

When to use this instead of the standard model: any time you have more than one effect size per study (multiple timepoints, multiple outcomes, multi-arm trials with shared controls). Don’t average them away — let the three-level model use all the data.
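
For orientation, here is roughly what which.run = "threelevel.che" fits under the hood, written directly in {metafor}. This is a sketch under assumptions rather than the wrapper's actual internals: dat, yi, vi, and es_id are hypothetical names for a long-format dataset with one row per effect size, its sampling variance, and an effect-size ID.

library(metafor)

# Working variance-covariance matrix: effect sizes from the same study are
# assumed to correlate at rho (cf. rho.within.study above)
V <- vcalc(vi, cluster = study, obs = es_id, rho = 0.6, data = dat)

# Three-level model: between-study heterogeneity (study, level 3) and
# within-study heterogeneity (effect sizes nested in studies, level 2)
che <- rma.mv(yi, V, random = ~ 1 | study / es_id, data = dat, method = "REML")

# Cluster-robust (RVE) inference with the small-sample adjustment via {clubSandwich}
summary(robust(che, cluster = dat$study, clubSandwich = TRUE))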


4. Meta-regression

Adds a moderator to a fitted meta-analytic model to test whether the pooled effect varies as a function of a study- or effect-level covariate (e.g., time since dosing, cumulative dose, % female).

reg <- metaRegression(time_results$model.threelevel.che, ~time_days)
reg

Argument-by-argument:

  • First argument — a fitted model object from runMetaAnalysis(). We almost always regress on top of the three-level CHE model so that within-study dependence is correctly handled.
  • ~time_days — a one-sided R formula listing the moderator(s). Use + to add multiple moderators (~time_days + diagnosis) or * for interactions. Categorical moderators are dummy-coded automatically.
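
For example (diagnosis is a hypothetical categorical column, used purely to illustrate the formula syntax):

# Two moderators, additive
reg_add <- metaRegression(time_results$model.threelevel.che, ~time_days + diagnosis)

# Interaction: does the time trend differ by diagnosis?
reg_int <- metaRegression(time_results$model.threelevel.che, ~time_days * diagnosis)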

metaRegression() is a thin wrapper around metafor::rma.mv()’s mods= argument that preserves the parent model’s τ² estimator, RVE settings, and Knapp–Hartung adjustment, so the moderator test inherits the same inference machinery as the parent model. See Doing Meta-Analysis ch. 8 (Meta-Regression) and the metafor meta-regression docs.

Tips:

  • Don’t run a separate meta-regression on top of an overall model when you have multiple effect sizes per study — fit it on the three-level model instead.
  • Plot the result with regplot(reg, mod = "time_days") to visualize the moderator slope and study-weighted points.
  • For categorical moderators, the omnibus test (QM) tells you whether the moderator explains a significant share of heterogeneity overall; individual coefficients give you the contrast against the reference level.
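
A short example of the categorical case, again with the hypothetical diagnosis moderator; because the model is refit through rma.mv(), the printed output should follow {metafor}'s conventions:

reg_dx <- metaRegression(time_results$model.threelevel.che, ~diagnosis)
reg_dx   # "Test of Moderators" (QM) is the omnibus test; coefficients are contrasts vs. the reference level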

Where to go next