Validation Scenarios

Three built-in scenarios validate different model configurations against the reference book.

Baseline (Section 3.9.1)

Standard BAM model behavior, 25 metrics across 3 categories:

TIME_SERIES (10): Unemployment, inflation, GDP trend/growth, vacancy rates
CURVES (6): Phillips, Okun, Beveridge curve correlations
DISTRIBUTION (4): Firm size metrics (skewness, tail ratios)

from validation import run_validation, run_stability_test

# Single validation run with a fixed seed
result = run_validation(seed=42, n_periods=1000)

# Multi-seed stability test over seeds 0-99
stability = run_stability_test(seeds=list(range(100)))

Growth+ (Section 3.9.2)

Endogenous productivity growth via R&D investment, 65 metrics across 6 categories:

Category	Count	Key Metrics
TIME_SERIES	14	Unemployment, inflation, GDP trend/growth, vacancy rates
CURVES	6	Phillips, Okun, Beveridge correlations
DISTRIBUTION	4	Firm size metrics (skewness, tail ratios)
GROWTH	11	Productivity/wage growth, co-movement, recession detection
FINANCIAL	20	Interest rates, fragility, price ratio, Minsky classification
GROWTH_RATE_DIST	10	Tent-shape R², bounds checks, outlier percentages

from validation import run_growth_plus_validation

result = run_growth_plus_validation(seed=42, n_periods=1000)

Buffer-Stock (Section 3.9.4)

Buffer-stock consumption with individual adaptive MPC. Uses a two-layer validation approach:

Unique metrics (8, per-seed): Wealth distribution fitting (Figure 3.8) and MPC behavioral metrics. These determine per-seed PASS/FAIL.
Improvement over Growth+ (aggregate): Mean score deltas across all seeds are checked after stability testing. This catches systematic degradation without being affected by per-seed noise.

Category	Count	Key Metrics
DISTRIBUTION	6	Wealth CCDF fitting (Singh-Maddala, Dagum, GB2 R²), Gini, skewness
FINANCIAL	2	Mean MPC, percent dissaving

from validation import run_buffer_stock_validation

# Automatic: runs Growth+ internally as baseline
result = run_buffer_stock_validation(seed=42, n_periods=1000)

# With reuse: pass pre-computed Growth+ result
from validation import run_growth_plus_validation

gp = run_growth_plus_validation(seed=42, n_periods=1000)
result = run_buffer_stock_validation(seed=42, growth_plus_result=gp)

# Access improvement data
print(result.improvement_deltas)  # dict of metric -> delta
print(result.degraded_metrics)  # metrics that degraded beyond threshold
print(result.baseline_score)  # Growth+ ValidationScore

class validation.types.BufferStockValidationScore(metric_results, total_score, n_pass, n_warn, n_fail, config=<factory>, baseline_score=None, improvement_deltas=<factory>, degraded_metrics=<factory>, blend_alpha=0.6)

Buffer-stock validation result with improvement tracking over Growth+.

Per-seed PASS/FAIL is determined solely by the 8 unique buffer-stock metrics (wealth distribution fits, MPC, dissaving). Improvement over Growth+ is assessed at the aggregate level after stability testing.

The improvement_deltas are computed per seed (informational) but do not affect passed or total_score.

baseline_score = None: Growth+ baseline result used for comparison (same seed).

improvement_deltas: Per-metric score deltas, bs_score - gp_score (informational).

degraded_metrics: Growth+ metrics with systematic degradation (populated at aggregate level by run_buffer_stock_stability_test(), not per seed).

blend_alpha = 0.6: Informational only. Not used in score computation.
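The aggregate layer described above can be sketched in a few lines. This is an illustration only, not the library's implementation: the helper names, the `DEGRADATION_THRESHOLD` value, and the toy scores are all assumptions, since the source does not state the actual threshold or aggregation rule.

```python
from statistics import mean

# Hypothetical threshold; the source does not state the real value.
DEGRADATION_THRESHOLD = 0.05


def improvement_deltas(bs_scores, gp_scores):
    """Per-metric deltas for one seed: buffer-stock score minus Growth+ score."""
    return {
        name: bs_scores[name] - gp_scores[name]
        for name in gp_scores
        if name in bs_scores
    }


def degraded_metrics(per_seed_deltas, threshold=DEGRADATION_THRESHOLD):
    """Return metrics whose mean delta across seeds falls below -threshold."""
    names = set().union(*(d.keys() for d in per_seed_deltas))
    degraded = []
    for name in sorted(names):
        values = [d[name] for d in per_seed_deltas if name in d]
        if mean(values) < -threshold:
            degraded.append(name)
    return degraded


# Toy scores for two seeds (invented numbers, for illustration only).
seeds = [
    improvement_deltas(
        {"unemployment": 0.80, "inflation": 0.60},
        {"unemployment": 0.78, "inflation": 0.75},
    ),
    improvement_deltas(
        {"unemployment": 0.70, "inflation": 0.65},
        {"unemployment": 0.74, "inflation": 0.75},
    ),
]
print(degraded_metrics(seeds))
```

Averaging before thresholding is what keeps a single noisy seed from flagging a metric: here `unemployment` dips on one seed but its mean delta stays above the cutoff, while `inflation` is lower on both seeds and is reported.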

Target File Structure

Target values are defined in co-located targets.yaml files with standardized keys per check type:

CheckType	YAML Keys	Purpose
`MEAN_TOLERANCE`	`target`, `tolerance`	Value within target ± tolerance
`RANGE`	`min`, `max`	Value within [min, max]
`PCT_WITHIN`	`target`, `min`	Percentage meeting threshold
`OUTLIER`	`max_outlier`, `penalty_weight`	Penalize excess outliers
`BOOLEAN`	`threshold` (in MetricSpec)	Simple > or < check
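As a rough illustration of how the keys in the table map onto checks, the sketch below evaluates three of the check types against a dictionary standing in for a parsed targets.yaml. The function names, metric names, and numbers are hypothetical, and the handling of `invert` is an assumption; the library's actual scoring may differ.

```python
def check_mean_tolerance(value, spec):
    """MEAN_TOLERANCE: pass when value lies within target +/- tolerance."""
    return abs(value - spec["target"]) <= spec["tolerance"]


def check_range(value, spec):
    """RANGE: pass when value lies within [min, max]."""
    return spec["min"] <= value <= spec["max"]


def check_boolean(value, threshold=0.0, invert=False):
    """BOOLEAN: value > threshold, or value < threshold when invert is set.

    The meaning of `invert` is assumed here, not taken from the source.
    """
    return value < threshold if invert else value > threshold


# Stand-in for a parsed targets.yaml; metric names and values are invented.
targets = {
    "unemployment_mean": {"target": 0.06, "tolerance": 0.02},
    "inflation_mean": {"min": 0.0, "max": 0.10},
}

print(check_mean_tolerance(0.07, targets["unemployment_mean"]))
print(check_range(0.12, targets["inflation_mean"]))
print(check_boolean(-0.4, threshold=0.0, invert=True))
```

The BOOLEAN check takes its threshold as an argument rather than from the YAML dictionary, mirroring the table's note that the threshold lives in MetricSpec.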

Core Types

class validation.types.ValidationScore(metric_results, total_score, n_pass, n_warn, n_fail, config=<factory>)

Overall validation result with scoring for comparison.

metric_results

total_score

n_pass

n_warn

n_fail

config

property passed: True if no metrics failed validation.
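To make the relationship between the counters and `passed` concrete, here is a small sketch built on a stand-in class. `ToyMetric`, the status strings, and the weight-averaged `total_score` formula are assumptions for illustration; only the rule that `passed` is true when nothing failed comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class ToyMetric:
    """Hypothetical stand-in for MetricResult (only the fields used here)."""

    name: str
    status: str  # assumed to be "PASS", "WARN", or "FAIL"
    score: float
    weight: float = 1.0


def summarize(metrics):
    """Tally statuses and compute a weight-averaged score (assumed formula)."""
    counts = {"PASS": 0, "WARN": 0, "FAIL": 0}
    for m in metrics:
        counts[m.status] += 1
    total_weight = sum(m.weight for m in metrics)
    weighted = sum(m.score * m.weight for m in metrics)
    return {
        "n_pass": counts["PASS"],
        "n_warn": counts["WARN"],
        "n_fail": counts["FAIL"],
        "total_score": weighted / total_weight if total_weight else 0.0,
        # Warnings do not block a pass; only failures do.
        "passed": counts["FAIL"] == 0,
    }


summary = summarize(
    [
        ToyMetric("unemployment", "PASS", 1.0),
        ToyMetric("inflation", "WARN", 0.5),
        ToyMetric("gdp_growth", "PASS", 0.75, weight=2.0),
    ]
)
print(summary)
```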

class validation.types.StabilityResult(seed_results, mean_score, std_score, min_score, max_score, pass_rate, n_seeds, metric_stats)

Result of multi-seed stability testing.

seed_results

mean_score

std_score

min_score

max_score

pass_rate

n_seeds

metric_stats

property is_stable: True if pass_rate >= 90% and std_score <= 0.15.
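The stability rule can be sketched from per-seed scores as follows. This is a minimal illustration under assumptions: `summarize_stability` is a hypothetical helper, `pass_rate` is treated as a fraction in [0, 1], and the population standard deviation is used because the source does not say which estimator the library applies.

```python
from statistics import mean, pstdev


def summarize_stability(seed_scores, seed_passed):
    """Aggregate per-seed scores and apply the documented is_stable rule."""
    std_score = pstdev(seed_scores)  # population std; the estimator is an assumption
    pass_rate = sum(seed_passed) / len(seed_passed)
    return {
        "mean_score": mean(seed_scores),
        "std_score": std_score,
        "min_score": min(seed_scores),
        "max_score": max(seed_scores),
        "pass_rate": pass_rate,
        "n_seeds": len(seed_scores),
        "is_stable": pass_rate >= 0.90 and std_score <= 0.15,
    }


# Invented scores: a tight cluster where every seed passes...
stable = summarize_stability([0.80, 0.82, 0.78, 0.81, 0.79], [True] * 5)
# ...and a spread-out set where half the seeds fail.
unstable = summarize_stability([0.9, 0.2, 0.8, 0.3], [True, False, True, False])
print(stable["is_stable"], unstable["is_stable"])
```

Both conditions must hold, so a run with a high pass rate but widely scattered scores would still be reported as unstable.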

class validation.types.MetricResult(name, status, actual, target_desc, score, weight=1.0, message='', group=MetricGroup.TIME_SERIES, format=MetricFormat.DEFAULT)

Result of validating a single metric.

name

status

actual

target_desc

score

weight = 1.0

message = ''

group = MetricGroup.TIME_SERIES

format = MetricFormat.DEFAULT

class validation.types.MetricSpec(name, field, check_type, target_path, weight=1.0, group=MetricGroup.TIME_SERIES, format=MetricFormat.DEFAULT, threshold=0.0, invert=False, target_desc=None)

Unified specification for a validation metric.

This dataclass captures everything needed to validate a single metric. Target values are looked up from YAML using standardized keys:

- MEAN_TOLERANCE: expects ‘target’ and ‘tolerance’ keys
- RANGE: expects ‘min’ and ‘max’ keys
- PCT_WITHIN: expects ‘target’ and ‘min’ keys
- OUTLIER: expects ‘max_outlier’ key (and optional ‘penalty_weight’)
- BOOLEAN: uses threshold defined here

name

field

check_type

target_path

weight = 1.0

group = MetricGroup.TIME_SERIES

format = MetricFormat.DEFAULT

threshold = 0.0

invert = False

target_desc = None

class validation.types.Scenario(name, metric_specs, collect_config, targets_path, compute_metrics, default_config=<factory>, setup_hook=None, title='', stability_title='')

Configuration for a validation scenario.

This dataclass bundles everything needed to run validation for a specific scenario (baseline, growth_plus, or buffer_stock).

name

metric_specs

collect_config

targets_path

compute_metrics

default_config

setup_hook = None: Optional hook called twice: first with None (to trigger imports/registration), then with the Simulation instance (to attach roles/extensions).

title = ''

stability_title = ''
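The two-call contract of setup_hook can be pictured with a sketch like the one below. Everything here is hypothetical: `ToySimulation`, `REGISTERED`, and `buffer_stock_hook` are stand-ins invented for illustration, and the single-argument signature is an assumption based only on the description of the hook being called first with None and then with the Simulation instance.

```python
class ToySimulation:
    """Hypothetical stand-in for the Simulation object; not the library's class."""

    def __init__(self):
        self.extensions = []


# Module-level registry used to mimic one-time registration.
REGISTERED = []


def buffer_stock_hook(sim):
    """Handle both calls of the assumed two-phase hook protocol."""
    if sim is None:
        # First call: perform imports/registration exactly once.
        if "buffer_stock" not in REGISTERED:
            REGISTERED.append("buffer_stock")
        return
    # Second call: attach roles/extensions to the concrete instance.
    sim.extensions.append("buffer_stock")


buffer_stock_hook(None)  # phase 1: registration
sim = ToySimulation()
buffer_stock_hook(sim)  # phase 2: attach to the instance
print(REGISTERED, sim.extensions)
```

Branching on `sim is None` keeps registration idempotent, so calling the first phase repeatedly (for example, once per seed) does not register the extension more than once.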