Validation Scenarios

Three built-in scenarios validate different model configurations against the reference book.

Baseline (Section 3.9.1)

Standard BAM model behavior, 25 metrics across 3 categories:

  • TIME_SERIES (10): Unemployment, inflation, GDP trend/growth, vacancy rates

  • CURVES (6): Phillips, Okun, Beveridge curve correlations

  • DISTRIBUTION (4): Firm size metrics (skewness, tail ratios)

```python
from validation import run_validation, run_stability_test

# Single-seed validation run
result = run_validation(seed=42, n_periods=1000)

# Multi-seed stability test over seeds 0..99
stability = run_stability_test(seeds=list(range(100)))
```

Growth+ (Section 3.9.2)

Endogenous productivity growth via R&D investment, 65 metrics across 6 categories:

| Category | Count | Key Metrics |
|---|---|---|
| TIME_SERIES | 14 | Unemployment, inflation, GDP trend/growth, vacancy rates |
| CURVES | 6 | Phillips, Okun, Beveridge correlations |
| DISTRIBUTION | 4 | Firm size metrics (skewness, tail ratios) |
| GROWTH | 11 | Productivity/wage growth, co-movement, recession detection |
| FINANCIAL | 20 | Interest rates, fragility, price ratio, Minsky classification |
| GROWTH_RATE_DIST | 10 | Tent-shape R², bounds checks, outlier percentages |

```python
from validation import run_growth_plus_validation

result = run_growth_plus_validation(seed=42, n_periods=1000)
```

Buffer-Stock (Section 3.9.4)

Buffer-stock consumption with individual adaptive MPC. Uses a two-layer validation approach:

  1. Unique metrics (8, per-seed): Wealth distribution fitting (Figure 3.8) and MPC behavioral metrics. These determine per-seed PASS/FAIL.

  2. Improvement over Growth+ (aggregate): Mean score deltas across all seeds checked after stability testing. Catches systematic degradation without being affected by per-seed noise.
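The second layer can be pictured with a small sketch. This is not the library's implementation: the helper names `per_seed_deltas` and `aggregate_degradation`, the metric names, and the `-0.05` threshold are all hypothetical and only illustrate how averaging deltas across seeds separates systematic degradation from per-seed noise.

```python
from statistics import mean


def per_seed_deltas(bs_scores, gp_scores):
    """Per-metric score delta for one seed: buffer-stock minus Growth+."""
    return {m: bs_scores[m] - gp_scores[m] for m in gp_scores if m in bs_scores}


def aggregate_degradation(deltas_by_seed, threshold=-0.05):
    """Average each metric's delta over all seeds and flag systematic drops.

    `threshold` is a hypothetical cutoff, not a value taken from the library.
    """
    metrics = sorted({m for d in deltas_by_seed for m in d})
    mean_delta = {m: mean(d[m] for d in deltas_by_seed if m in d) for m in metrics}
    degraded = [m for m in metrics if mean_delta[m] < threshold]
    return mean_delta, degraded


# Toy scores for two seeds (illustrative numbers only).
deltas_by_seed = [
    per_seed_deltas({"unemployment": 0.75, "inflation": 0.5},
                    {"unemployment": 0.5, "inflation": 0.75}),
    per_seed_deltas({"unemployment": 0.5, "inflation": 0.5},
                    {"unemployment": 0.75, "inflation": 0.75}),
]
mean_delta, degraded = aggregate_degradation(deltas_by_seed)
print(mean_delta, degraded)
```

In this toy data the unemployment score moves up in one seed and down in the other, so its mean delta cancels out, while inflation drops in both seeds and is the only metric flagged.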

| Category | Count | Key Metrics |
|---|---|---|
| DISTRIBUTION | 6 | Wealth CCDF fitting (Singh-Maddala, Dagum, GB2 R²), Gini, skewness |
| FINANCIAL | 2 | Mean MPC, percent dissaving |

```python
from validation import run_buffer_stock_validation, run_growth_plus_validation

# Automatic: runs Growth+ internally as baseline
result = run_buffer_stock_validation(seed=42, n_periods=1000)

# With reuse: pass pre-computed Growth+ result
gp = run_growth_plus_validation(seed=42, n_periods=1000)
result = run_buffer_stock_validation(seed=42, growth_plus_result=gp)

# Access improvement data
print(result.improvement_deltas)  # dict of metric -> delta
print(result.degraded_metrics)  # metrics that degraded beyond threshold
print(result.baseline_score)  # Growth+ ValidationScore
```

class validation.types.BufferStockValidationScore(metric_results, total_score, n_pass, n_warn, n_fail, config=<factory>, baseline_score=None, improvement_deltas=<factory>, degraded_metrics=<factory>, blend_alpha=0.6)

Buffer-stock validation result with improvement tracking over Growth+.

Per-seed PASS/FAIL is determined solely by the 8 unique buffer-stock metrics (wealth distribution fits, MPC, dissaving). Improvement over Growth+ is assessed at the aggregate level after stability testing.

The improvement_deltas are computed per seed (informational) but do not affect passed or total_score.

baseline_score = None

Growth+ baseline result used for comparison (same seed).

improvement_deltas

Per-metric score deltas: bs_score - gp_score (informational).

degraded_metrics

Growth+ metrics with systematic degradation (populated at aggregate level by run_buffer_stock_stability_test(), not per seed).

blend_alpha = 0.6

Informational only. Not used in score computation.

Target File Structure

Target values are defined in co-located targets.yaml files with standardized keys per check type:

| CheckType | YAML Keys | Purpose |
|---|---|---|
| MEAN_TOLERANCE | target, tolerance | Value within target ± tolerance |
| RANGE | min, max | Value within [min, max] |
| PCT_WITHIN | target, min | Percentage meeting threshold |
| OUTLIER | max_outlier, penalty_weight | Penalize excess outliers |
| BOOLEAN | threshold (in MetricSpec) | Simple > or < check |
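As an illustration of the first two check types, the sketch below evaluates a value against a target entry using the keys listed above. It is a simplified pass/fail reading of the "Purpose" column, not the library's scoring code, which also produces scores and warning statuses. The function name `within_target`, the metric names, and every number in `targets` are hypothetical.

```python
def within_target(check_type, value, entry):
    """Return True if `value` satisfies the target entry for this check type.

    Only MEAN_TOLERANCE and RANGE are sketched, since their semantics are
    stated directly in the table; the other check types are left out.
    """
    if check_type == "MEAN_TOLERANCE":
        return abs(value - entry["target"]) <= entry["tolerance"]
    if check_type == "RANGE":
        return entry["min"] <= value <= entry["max"]
    raise ValueError(f"check type not covered by this sketch: {check_type}")


# Hypothetical target entries, shaped like what a targets.yaml might parse to.
targets = {
    "unemployment_mean": {"target": 0.0625, "tolerance": 0.03125},
    "inflation_mean": {"min": 0.0, "max": 0.125},
}

ok_unemployment = within_target("MEAN_TOLERANCE", 0.07, targets["unemployment_mean"])
ok_inflation = within_target("RANGE", 0.2, targets["inflation_mean"])
print(ok_unemployment, ok_inflation)
```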

Core Types

class validation.types.ValidationScore(metric_results, total_score, n_pass, n_warn, n_fail, config=<factory>)

Overall validation result with scoring for comparison.

metric_results
total_score
n_pass
n_warn
n_fail
config
property passed

True if no metrics failed validation.
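A minimal stand-in shows how such a property behaves. `ScoreSketch` is a hypothetical class, not the real ValidationScore, and it assumes that "no metrics failed" is equivalent to `n_fail == 0`, so warnings alone do not block a pass.

```python
from dataclasses import dataclass


@dataclass
class ScoreSketch:
    """Hypothetical stand-in carrying only the pass/warn/fail counters."""

    n_pass: int
    n_warn: int
    n_fail: int

    @property
    def passed(self) -> bool:
        # Assumption: only failures matter; warnings are tolerated.
        return self.n_fail == 0


with_warnings = ScoreSketch(n_pass=18, n_warn=2, n_fail=0)
with_failure = ScoreSketch(n_pass=19, n_warn=0, n_fail=1)
print(with_warnings.passed, with_failure.passed)
```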

class validation.types.StabilityResult(seed_results, mean_score, std_score, min_score, max_score, pass_rate, n_seeds, metric_stats)

Result of multi-seed stability testing.

seed_results
mean_score
std_score
min_score
max_score
pass_rate
n_seeds
metric_stats
property is_stable

True if pass_rate >= 90% and std_score <= 0.15.
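The stability rule can be sketched from per-seed scores as follows. The helpers `summarize` and `is_stable` are hypothetical, and two details are assumptions rather than documented behavior: `pass_rate` is stored as a fraction (so 90% is `0.90`), and the standard deviation is the population form.

```python
from statistics import mean, pstdev


def summarize(scores, passed_flags):
    """Collapse per-seed results into the statistics the rule needs."""
    return {
        "mean_score": mean(scores),
        "std_score": pstdev(scores),  # assumption: population std
        "pass_rate": sum(passed_flags) / len(passed_flags),
    }


def is_stable(summary):
    """Stable if at least 90% of seeds pass and the score spread is small."""
    return summary["pass_rate"] >= 0.90 and summary["std_score"] <= 0.15


all_pass = summarize([0.75, 0.75, 0.75, 0.75], [True, True, True, True])
one_fail = summarize([0.75, 0.75, 0.75, 0.75], [True, True, True, False])
print(is_stable(all_pass), is_stable(one_fail))
```

With only four seeds, a single failing seed already pulls the pass rate down to 75%, which is why the second summary is not considered stable.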

class validation.types.MetricResult(name, status, actual, target_desc, score, weight=1.0, message='', group=MetricGroup.TIME_SERIES, format=MetricFormat.DEFAULT)

Result of validating a single metric.

name
status
actual
target_desc
score
weight = 1.0
message = ''
group = MetricGroup.TIME_SERIES
format = MetricFormat.DEFAULT

class validation.types.MetricSpec(name, field, check_type, target_path, weight=1.0, group=MetricGroup.TIME_SERIES, format=MetricFormat.DEFAULT, threshold=0.0, invert=False, target_desc=None)

Unified specification for a validation metric.

This dataclass captures everything needed to validate a single metric. Target values are looked up from YAML using standardized keys:

  • MEAN_TOLERANCE: expects 'target' and 'tolerance' keys

  • RANGE: expects 'min' and 'max' keys

  • PCT_WITHIN: expects 'target' and 'min' keys

  • OUTLIER: expects 'max_outlier' key (and optional 'penalty_weight')

  • BOOLEAN: uses threshold defined here
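The lookup described above might work roughly like the sketch below. The dotted form of `target_path`, the helper names `lookup_target` and `missing_keys`, and the sample targets are assumptions made for illustration; only the required key names per check type come from the description above.

```python
# Required YAML keys per check type, as listed above. BOOLEAN needs none
# because its threshold lives on the MetricSpec itself.
REQUIRED_KEYS = {
    "MEAN_TOLERANCE": ("target", "tolerance"),
    "RANGE": ("min", "max"),
    "PCT_WITHIN": ("target", "min"),
    "OUTLIER": ("max_outlier",),
    "BOOLEAN": (),
}


def lookup_target(targets, target_path):
    """Walk a nested mapping using a dotted path (assumed path format)."""
    node = targets
    for key in target_path.split("."):
        node = node[key]
    return node


def missing_keys(check_type, entry):
    """List the required keys that the target entry does not provide."""
    return [k for k in REQUIRED_KEYS[check_type] if k not in entry]


# Hypothetical parsed targets file.
targets = {"time_series": {"unemployment": {"target": 0.0625, "tolerance": 0.03125}}}

entry = lookup_target(targets, "time_series.unemployment")
print(missing_keys("MEAN_TOLERANCE", entry))
print(missing_keys("RANGE", entry))
```

Checking for missing keys up front gives a clearer error than a KeyError raised midway through a validation run.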

name
field
check_type
target_path
weight = 1.0
group = MetricGroup.TIME_SERIES
format = MetricFormat.DEFAULT
threshold = 0.0
invert = False
target_desc = None

class validation.types.Scenario(name, metric_specs, collect_config, targets_path, compute_metrics, default_config=<factory>, setup_hook=None, title='', stability_title='')

Configuration for a validation scenario.

This dataclass bundles everything needed to run validation for a specific scenario (baseline, growth_plus, or buffer_stock).

name
metric_specs
collect_config
targets_path
compute_metrics
default_config
setup_hook = None

Optional hook called twice: first with None (to trigger imports/registration), then with the Simulation instance (to attach roles/extensions).
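The two-call protocol can be illustrated with stand-ins. `FakeSimulation`, `run_with_hook`, and the `extensions` list are hypothetical and exist only to make the sketch runnable; the real Simulation class and runner are not shown in this section.

```python
calls = []


class FakeSimulation:
    """Hypothetical stand-in for the real Simulation object."""

    def __init__(self):
        self.extensions = []


def setup_hook(sim):
    if sim is None:
        # First call: no simulation yet, so only do imports/registration.
        calls.append("register")
        return
    # Second call: a simulation exists, so attach roles/extensions to it.
    calls.append("attach")
    sim.extensions.append("buffer_stock")


def run_with_hook(hook):
    """Call the hook with None, build the simulation, then call it again."""
    if hook is not None:
        hook(None)
    sim = FakeSimulation()
    if hook is not None:
        hook(sim)
    return sim


sim = run_with_hook(setup_hook)
print(calls, sim.extensions)
```

Branching on `sim is None` inside a single callable keeps both phases of a scenario's setup in one place.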

title = ''
stability_title = ''