Validation Scenarios
Three built-in scenarios validate different model configurations against the reference book.
Baseline (Section 3.9.1)
Standard BAM model behavior, 25 metrics across 3 categories:
- TIME_SERIES (10): Unemployment, inflation, GDP trend/growth, vacancy rates
- CURVES (6): Phillips, Okun, Beveridge curve correlations
- DISTRIBUTION (4): Firm size metrics (skewness, tail ratios)
from validation import run_validation, run_stability_test
result = run_validation(seed=42, n_periods=1000)
stability = run_stability_test(seeds=list(range(100)))
Growth+ (Section 3.9.2)
Endogenous productivity growth via R&D investment, 65 metrics across 6 categories:
| Category | Count | Key Metrics |
|---|---|---|
| TIME_SERIES | 14 | Unemployment, inflation, GDP trend/growth, vacancy rates |
| CURVES | 6 | Phillips, Okun, Beveridge correlations |
| DISTRIBUTION | 4 | Firm size metrics (skewness, tail ratios) |
| GROWTH | 11 | Productivity/wage growth, co-movement, recession detection |
| FINANCIAL | 20 | Interest rates, fragility, price ratio, Minsky classification |
| GROWTH_RATE_DIST | 10 | Tent-shape R², bounds checks, outlier percentages |
from validation import run_growth_plus_validation
result = run_growth_plus_validation(seed=42, n_periods=1000)
Buffer-Stock (Section 3.9.4)
Buffer-stock consumption with individual adaptive MPC. Validation uses a two-layer approach:
- Unique metrics (8, per seed): Wealth distribution fitting (Figure 3.8) and MPC behavioral metrics. These alone determine per-seed PASS/FAIL.
- Improvement over Growth+ (aggregate): Mean score deltas across all seeds, checked after stability testing. This catches systematic degradation without being affected by per-seed noise.
| Category | Count | Key Metrics |
|---|---|---|
| DISTRIBUTION | 6 | Wealth CCDF fitting (Singh-Maddala, Dagum, GB2 R²), Gini, skewness |
| FINANCIAL | 2 | Mean MPC, percent dissaving |
from validation import run_buffer_stock_validation
# Automatic: runs Growth+ internally as baseline
result = run_buffer_stock_validation(seed=42, n_periods=1000)
# With reuse: pass pre-computed Growth+ result
from validation import run_growth_plus_validation
gp = run_growth_plus_validation(seed=42, n_periods=1000)
result = run_buffer_stock_validation(seed=42, growth_plus_result=gp)
# Access improvement data
print(result.improvement_deltas) # dict of metric -> delta
print(result.degraded_metrics) # metrics that degraded beyond threshold
print(result.baseline_score) # Growth+ ValidationScore
- class validation.types.BufferStockValidationScore(metric_results, total_score, n_pass, n_warn, n_fail, config=<factory>, baseline_score=None, improvement_deltas=<factory>, degraded_metrics=<factory>, blend_alpha=0.6)
Buffer-stock validation result with improvement tracking over Growth+.
Per-seed PASS/FAIL is determined solely by the 8 unique buffer-stock metrics (wealth distribution fits, MPC, dissaving). Improvement over Growth+ is assessed at the aggregate level after stability testing.
The improvement_deltas are computed per seed (informational) but do not affect passed or total_score.
- baseline_score = None
Growth+ baseline result used for comparison (same seed).
- improvement_deltas
Per-metric score deltas: bs_score - gp_score (informational).
- degraded_metrics
Growth+ metrics with systematic degradation (populated at aggregate level by run_buffer_stock_stability_test(), not per seed).
- blend_alpha = 0.6
Informational only. Not used in score computation.
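The relationship between per-seed deltas and aggregate degradation can be illustrated with a small, self-contained sketch. It does not call the `validation` package: the metric scores are made-up numbers, and `DEGRADATION_THRESHOLD`, `per_seed_deltas`, and `aggregate_deltas` are hypothetical names chosen for this example only.

```python
from statistics import mean

# Hypothetical cutoff: a mean delta below this counts as systematic degradation.
DEGRADATION_THRESHOLD = -0.05


def per_seed_deltas(bs_scores, gp_scores):
    """Per-metric deltas for one seed: bs_score - gp_score."""
    return {name: bs_scores[name] - gp_scores[name] for name in gp_scores}


def aggregate_deltas(deltas_by_seed):
    """Average each metric's delta across seeds and flag degraded metrics."""
    metrics = deltas_by_seed[0].keys()
    mean_deltas = {m: mean(d[m] for d in deltas_by_seed) for m in metrics}
    degraded = sorted(m for m, v in mean_deltas.items() if v < DEGRADATION_THRESHOLD)
    return mean_deltas, degraded


# Made-up Growth+ (gp) and buffer-stock (bs) metric scores for two seeds.
gp = [{"unemployment": 0.90, "inflation": 0.80},
      {"unemployment": 0.85, "inflation": 0.75}]
bs = [{"unemployment": 0.92, "inflation": 0.60},
      {"unemployment": 0.80, "inflation": 0.65}]

deltas_by_seed = [per_seed_deltas(b, g) for b, g in zip(bs, gp)]
mean_deltas, degraded = aggregate_deltas(deltas_by_seed)
print(degraded)
```

In this toy data the unemployment score dips in one seed but recovers on average, so only the metric that is worse in every seed is flagged. That is the sense in which aggregate checking is less sensitive to per-seed noise.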
Target File Structure
Target values are defined in co-located targets.yaml files with standardized
keys per check type:
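| CheckType | YAML Keys | Purpose |
|---|---|---|
| MEAN_TOLERANCE | target, tolerance | Value within target ± tolerance |
| RANGE | min, max | Value within [min, max] |
| PCT_WITHIN | target, min | Percentage meeting threshold |
| OUTLIER | max_outlier (optional penalty_weight) | Penalize excess outliers |
| BOOLEAN | none (threshold set on MetricSpec) | Simple > or < check |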
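As a rough illustration of how two of these check types read their keys, the sketch below evaluates MEAN_TOLERANCE and RANGE against a plain dictionary standing in for a loaded targets.yaml. The metric names, numbers, and function names are invented for the example, and the real package may score rather than simply pass or fail.

```python
# Targets as they might look after loading a targets.yaml file.
# Metric names and values are invented for illustration.
targets = {
    "unemployment_mean": {"target": 0.06, "tolerance": 0.02},
    "inflation_mean": {"min": 0.0, "max": 0.05},
}


def check_mean_tolerance(actual, spec):
    """MEAN_TOLERANCE: pass when actual lies within target ± tolerance."""
    return abs(actual - spec["target"]) <= spec["tolerance"]


def check_range(actual, spec):
    """RANGE: pass when actual lies within [min, max]."""
    return spec["min"] <= actual <= spec["max"]


results = {
    "unemployment_ok": check_mean_tolerance(0.07, targets["unemployment_mean"]),
    "unemployment_bad": check_mean_tolerance(0.09, targets["unemployment_mean"]),
    "inflation_ok": check_range(0.03, targets["inflation_mean"]),
    "inflation_bad": check_range(0.06, targets["inflation_mean"]),
}
print(results)
```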
Core Types
- class validation.types.ValidationScore(metric_results, total_score, n_pass, n_warn, n_fail, config=<factory>)
Overall validation result with scoring for comparison.
- metric_results
- total_score
- n_pass
- n_warn
- n_fail
- config
- property passed
True if no metrics failed validation.
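The rule behind passed can be sketched with a stand-in dataclass. `ScoreSketch` is a hypothetical name that carries only the count fields, and treating warnings as non-blocking is an inference from the wording above rather than something the source states outright.

```python
from dataclasses import dataclass


@dataclass
class ScoreSketch:
    """Stand-in carrying only the count fields of ValidationScore."""
    n_pass: int
    n_warn: int
    n_fail: int

    @property
    def passed(self) -> bool:
        # Only failures block a pass; warnings are assumed not to.
        return self.n_fail == 0


clean = ScoreSketch(n_pass=20, n_warn=5, n_fail=0)
broken = ScoreSketch(n_pass=24, n_warn=0, n_fail=1)
print(clean.passed, broken.passed)
```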
- class validation.types.StabilityResult(seed_results, mean_score, std_score, min_score, max_score, pass_rate, n_seeds, metric_stats)
Result of multi-seed stability testing.
- seed_results
- mean_score
- std_score
- min_score
- max_score
- pass_rate
- n_seeds
- metric_stats
- property is_stable
True if pass_rate >= 90% and std_score <= 0.15.
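A minimal sketch of the stability rule follows, assuming pass_rate is stored as a fraction between 0 and 1 and std_score is a population standard deviation. Both are assumptions; the source gives only the two thresholds. `summarize` and the score lists are hypothetical.

```python
from statistics import mean, pstdev


def summarize(scores, passed_flags):
    """Compute summary statistics and apply the is_stable thresholds."""
    mean_score = mean(scores)
    std_score = pstdev(scores)  # assumption: population standard deviation
    pass_rate = sum(passed_flags) / len(passed_flags)  # assumption: a fraction
    is_stable = pass_rate >= 0.90 and std_score <= 0.15
    return {
        "mean_score": mean_score,
        "std_score": std_score,
        "pass_rate": pass_rate,
        "is_stable": is_stable,
    }


# Made-up total scores for ten seeds.
scores = [0.80, 0.82, 0.78, 0.81, 0.79, 0.80, 0.83, 0.77, 0.80, 0.80]

nine_of_ten = summarize(scores, [True] * 9 + [False])
eight_of_ten = summarize(scores, [True] * 8 + [False] * 2)
print(nine_of_ten["is_stable"], eight_of_ten["is_stable"])
```

With nine of ten seeds passing, the pass rate sits exactly on the 90% boundary and the run still counts as stable. Dropping to eight of ten fails the check even though the score spread is unchanged.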
- class validation.types.MetricResult(name, status, actual, target_desc, score, weight=1.0, message='', group=MetricGroup.TIME_SERIES, format=MetricFormat.DEFAULT)
Result of validating a single metric.
- name
- status
- actual
- target_desc
- score
- weight = 1.0
- message = ''
- group = MetricGroup.TIME_SERIES
- format = MetricFormat.DEFAULT
- class validation.types.MetricSpec(name, field, check_type, target_path, weight=1.0, group=MetricGroup.TIME_SERIES, format=MetricFormat.DEFAULT, threshold=0.0, invert=False, target_desc=None)
Unified specification for a validation metric.
This dataclass captures everything needed to validate a single metric. Target values are looked up from YAML using standardized keys:
  - MEAN_TOLERANCE: expects 'target' and 'tolerance' keys
  - RANGE: expects 'min' and 'max' keys
  - PCT_WITHIN: expects 'target' and 'min' keys
  - OUTLIER: expects 'max_outlier' key (and optional 'penalty_weight')
  - BOOLEAN: uses threshold defined here
- name
- field
- check_type
- target_path
- weight = 1.0
- group = MetricGroup.TIME_SERIES
- format = MetricFormat.DEFAULT
- threshold = 0.0
- invert = False
- target_desc = None
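For the BOOLEAN check type, the threshold lives on the spec rather than in YAML. The sketch below shows one plausible reading, in which invert flips the comparison from "greater than" to "less than". That interpretation of invert is an assumption, and `BooleanSpecSketch`, `check_boolean`, and the metric names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BooleanSpecSketch:
    """Stand-in holding only the fields a BOOLEAN check would need."""
    name: str
    threshold: float = 0.0
    invert: bool = False


def check_boolean(actual, spec):
    """Pass when actual > threshold, or actual < threshold if inverted."""
    if spec.invert:
        return actual < spec.threshold  # assumed meaning of invert
    return actual > spec.threshold


# Invented metric names and values for illustration.
positive_corr = BooleanSpecSketch(name="okun_corr_positive")
negative_corr = BooleanSpecSketch(name="phillips_corr_negative", invert=True)

print(check_boolean(0.4, positive_corr), check_boolean(-0.3, negative_corr))
```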
- class validation.types.Scenario(name, metric_specs, collect_config, targets_path, compute_metrics, default_config=<factory>, setup_hook=None, title='', stability_title='')
Configuration for a validation scenario.
This dataclass bundles everything needed to run validation for a specific scenario (baseline, growth_plus, or buffer_stock).
- name
- metric_specs
- collect_config
- targets_path
- compute_metrics
- default_config
- setup_hook = None
Optional hook called twice: first with None (to trigger imports/registration), then with the Simulation instance (to attach roles/extensions).
- title = ''
- stability_title = ''
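The two-call convention for setup_hook can be sketched as follows. `SimulationSketch` and `run_with_hook` are hypothetical stand-ins for the real Simulation class and runner, and the "buffer_stock" extension name is illustrative only.

```python
calls = []


class SimulationSketch:
    """Stand-in for the real Simulation class."""

    def __init__(self):
        self.extensions = []


def setup_hook(sim):
    if sim is None:
        # First call: trigger imports/registration before a simulation exists.
        calls.append("register")
        return
    # Second call: attach roles/extensions to the live instance.
    calls.append("attach")
    sim.extensions.append("buffer_stock")


def run_with_hook(hook):
    """Call the hook with None, build the simulation, then call it again."""
    if hook is not None:
        hook(None)
    sim = SimulationSketch()
    if hook is not None:
        hook(sim)
    return sim


sim = run_with_hook(setup_hook)
print(calls, sim.extensions)
```

Splitting the hook this way lets registration happen before any simulation is constructed, while instance-specific wiring waits until the object exists.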