Create a causal data generating process
Defines a causal DGP with known ground truth. Covariates can be specified
via shorthand count arguments (Option B), an explicit named list of
covar() objects (Option A), or both combined.
Usage
causalsim_dgp(
  n,
  effect = 1,
  propensity = "moderate",
  baseline = 0,
  sigma = 1,
  covariates = list(),
  n_confounders = 0L,
  n_effect_modifiers = 0L,
  n_instruments = 0L,
  n_noise = 0L,
  mc_draws = 10000L
)
Arguments
- n
Positive integer. Sample size for each simulated dataset.
- effect
Numeric scalar or function. A scalar specifies a constant (homogeneous) treatment effect, so the ATE equals the CATE at every covariate value. A function should accept named arguments matching covariate names defined in the DGP and return a numeric vector of individual-level causal effects (CATE). See Details.
- propensity
Numeric scalar, preset string, or function. Treatment assignment probability. A scalar (e.g. 0.5) gives a constant propensity (randomized trial). Preset strings "low", "moderate", and "high" generate a logistic propensity over confounders with coefficients 0.25, 0.5, and 1.0 respectively. A function follows the same named-argument convention as effect and must return values in [0, 1]. Defaults to "moderate".
- baseline
Numeric scalar, preset string, or function. Mean potential outcome under control, E[Y(0) | W]. Preset strings follow the same levels as propensity and apply a linear combination of confounders. Defaults to 0.
- sigma
Positive numeric. Standard deviation of the outcome noise term. Default 1.
- covariates
Named list of covar() objects (Option A / explicit path). Each name becomes the column name in generated data and the argument name expected by the effect, propensity, and baseline functions. Merged with any auto-generated covariates; a name collision raises an error.
- n_confounders
Non-negative integer. Auto-generates standard normal confounders as W (single) or W1, W2, ... (multiple).
- n_effect_modifiers
Non-negative integer. Auto-generates standard normal effect modifiers as V or V1, V2, ....
- n_instruments
Non-negative integer. Auto-generates standard normal instruments as Z or Z1, Z2, ....
- n_noise
Non-negative integer. Auto-generates standard normal noise covariates as X or X1, X2, ....
- mc_draws
Positive integer. Monte Carlo draws for true ATE approximation. Default 10000L. Ignored for scalar effect.
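The preset propensity levels described under propensity can be pictured with a small base R sketch. It does not use the package's internals: preset_propensity() is a hypothetical helper, and applying the same coefficient to every confounder with no intercept is an assumption made only for illustration.

```r
# Hypothetical helper (not part of the package): maps a preset name to a
# logistic propensity function over a matrix of confounders.
preset_propensity <- function(level = c("low", "moderate", "high")) {
  level <- match.arg(level)
  # Coefficients taken from the argument description above
  coef <- c(low = 0.25, moderate = 0.5, high = 1.0)[[level]]
  function(W) {
    W <- as.matrix(W)
    # Assumption: equal coefficient on each confounder, zero intercept
    plogis(as.vector(W %*% rep(coef, ncol(W))))
  }
}

set.seed(1)
W <- matrix(rnorm(2000), ncol = 2)   # two standard normal confounders
p_low  <- preset_propensity("low")(W)
p_high <- preset_propensity("high")(W)

# Stronger presets spread the treatment probabilities further from 0.5
range(p_low)
range(p_high)
```

Under these assumptions, a larger coefficient means treatment assignment depends more strongly on the confounders, which is what makes naive comparisons more biased.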
Value
An S3 object of class causalsim_dgp with components:
- n
Sample size (integer).
- covar_spec
Named list of covar() objects.
- effect_fn
Normalized effect function.
- propensity_fn
Normalized propensity function.
- baseline_fn
Normalized baseline function.
- sigma
Outcome noise standard deviation.
- true_ate
True ATE: exact for a scalar effect, Monte Carlo approximation otherwise.
- heterogeneous
Logical; TRUE if effect was a function.
- mc_draws
Monte Carlo draws used (integer).
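To make the true_ate component concrete, here is a minimal base R sketch of a Monte Carlo approximation for a heterogeneous effect. It only illustrates the idea of averaging the effect function over covariate draws; how the package actually computes the value is not shown in this page. The names effect_fn, ate_mc, and ate_exact are illustrative.

```r
set.seed(42)
mc_draws <- 10000L

# Heterogeneous effect depending on a binary effect modifier V
effect_fn <- function(V) 2 + 1.5 * V

# Draw the effect modifier from its assumed distribution, Bernoulli(0.4)
V <- rbinom(mc_draws, size = 1, prob = 0.4)

# Monte Carlo ATE: average the individual-level effects over the draws
ate_mc <- mean(effect_fn(V))

# Analytic value for comparison: 2 + 1.5 * P(V = 1)
ate_exact <- 2 + 1.5 * 0.4

abs(ate_mc - ate_exact)
```

For a scalar effect no averaging is needed, which is consistent with mc_draws being ignored in that case.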
Details
Function calling convention
The effect, propensity, and baseline functions are called with named
arguments matching covariate names, not with a data frame. Write:
effect = function(W) 2 + 1.5 * W
propensity = function(W1, W2) plogis(0.3 * W1 + 0.5 * W2)
Every argument name is validated against the DGP's covariate spec at construction time, so mismatches surface immediately rather than at draw time. The propensity function is additionally evaluated on a small test draw to confirm it returns values in [0, 1].
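One way such a convention could be implemented in base R is sketched here, using formals() to read the argument names and do.call() to pass matching columns. Both helpers, call_with_covariates() and check_propensity(), are hypothetical and are not the package's actual code.

```r
# Hypothetical helper: call fn with the columns of data named by its arguments,
# failing early if an argument does not match any covariate.
call_with_covariates <- function(fn, data) {
  args <- names(formals(fn))
  unknown <- setdiff(args, names(data))
  if (length(unknown) > 0) {
    stop("Unknown covariate(s) in function arguments: ",
         paste(unknown, collapse = ", "), call. = FALSE)
  }
  do.call(fn, as.list(data[args]))
}

# Hypothetical helper: evaluate a propensity function on a small test draw
# and confirm that every value lies in [0, 1].
check_propensity <- function(fn, data) {
  p <- call_with_covariates(fn, data)
  if (!is.numeric(p) || anyNA(p) || any(p < 0 | p > 1)) {
    stop("Propensity function must return values in [0, 1].", call. = FALSE)
  }
  invisible(TRUE)
}

set.seed(7)
test_draw <- data.frame(W1 = rnorm(20), W2 = rnorm(20))

propensity <- function(W1, W2) plogis(0.3 * W1 + 0.5 * W2)
check_propensity(propensity, test_draw)

# An argument name that matches no covariate fails immediately
bad <- function(W3) plogis(W3)
res <- tryCatch(call_with_covariates(bad, test_draw),
                error = function(e) conditionMessage(e))
res
```

Checking names up front is what lets a misspelled covariate surface at construction time instead of in the middle of a simulation run.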
Examples
# Minimal: one confounder, constant effect, moderate confounding
dgp <- causalsim_dgp(n = 500, n_confounders = 1, effect = 2)
dgp
#> <causalsim_dgp>
#> n : 500
#> true ATE : 2.0000
#> heterogeneous: FALSE
#> sigma : 1.00
#> covariates :
#> W normal [confounder]
# Heterogeneous effect, explicit covariate spec
dgp2 <- causalsim_dgp(
  n = 500,
  covariates = list(
    W = covar("normal", role = "confounder"),
    V = covar("binary", role = "effect_modifier", prob = 0.4)
  ),
  effect = function(V) 2 + 1.5 * V,
  propensity = function(W) plogis(0.5 * W),
  baseline = function(W) 1.5 * W
)
# Mixed: shorthand confounders + explicit instrument + RCT propensity
dgp3 <- causalsim_dgp(
  n = 1000,
  n_confounders = 2,
  covariates = list(Z = covar("normal", role = "instrument")),
  effect = 1,
  propensity = 0.5
)
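For intuition about what a single dataset from a DGP like dgp2 might look like, here is a rough base R sketch that mirrors its specification by hand. It does not call the package: the outcome model (baseline plus effect times treatment plus additive Gaussian noise) is an assumption inferred from the argument descriptions, and the column name A for treatment is illustrative.

```r
set.seed(123)
n <- 500

# Covariates matching the dgp2 specification
W <- rnorm(n)                        # confounder, standard normal
V <- rbinom(n, size = 1, prob = 0.4) # binary effect modifier

tau <- 2 + 1.5 * V                   # individual-level effect (CATE)
p   <- plogis(0.5 * W)               # propensity depends on the confounder
mu0 <- 1.5 * W                       # baseline E[Y(0) | W]

A <- rbinom(n, size = 1, prob = p)   # treatment assignment
# Assumed outcome model: additive Gaussian noise with sigma = 1
Y <- mu0 + tau * A + rnorm(n, sd = 1)

sim <- data.frame(W, V, A, Y)

# Naive difference in means ignores W, so it is confounded
naive <- mean(sim$Y[sim$A == 1]) - mean(sim$Y[sim$A == 0])
naive
```

Because W raises both the treatment probability and the baseline outcome in this setup, the naive difference is biased upward in expectation relative to the true ATE of 2 + 1.5 * 0.4, which is the kind of gap a known-ground-truth DGP is meant to expose.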