API Reference
Full autodoc reference for all calibration modules.
Analysis
Result types, parameter pattern analysis, config export, and comparison.
This module provides:
- Core result types (CalibrationResult, ComparisonResult)
- Progress formatting helpers
- Parameter pattern analysis for identifying best values
- Config export (YAML) and before/after comparison
- class calibration.analysis.ScenarioResult(mean_score, std_score, combined_score, pass_rate, n_fail, seed_scores)
Per-scenario results for cross-scenario evaluation.
- Variables:
mean_score (float) – Mean score across seeds.
std_score (float) – Standard deviation of scores across seeds.
combined_score (float) – Combined score: mean * (1 - std).
pass_rate (float) – Fraction of seeds with zero FAIL metrics.
n_fail (int) – Total number of seed-level failures.
seed_scores (list[float]) – Individual seed scores.
- mean_score
- std_score
- combined_score
- pass_rate
- n_fail
- seed_scores
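The combined score documented above, mean * (1 - std), can be illustrated with a minimal sketch. The helper name summarize_seed_scores is hypothetical, and the use of the population standard deviation is an assumption; the module may compute the spread differently.

```python
from statistics import mean, pstdev


def summarize_seed_scores(seed_scores):
    """Return (mean, std, combined) where combined = mean * (1 - std)."""
    m = mean(seed_scores)
    # Assumption: population standard deviation; the real module may use
    # the sample standard deviation instead.
    s = pstdev(seed_scores)
    return m, s, m * (1 - s)


m, s, combined = summarize_seed_scores([0.80, 0.90, 0.70])
print(f"mean={m:.3f} std={s:.3f} combined={combined:.3f}")
```

A config with the same mean but a wider spread across seeds receives a lower combined score, which is the point of the penalty term.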
- class calibration.analysis.CalibrationResult(params, single_score, n_pass, n_warn, n_fail, mean_score=None, std_score=None, pass_rate=None, combined_score=None, stability_result=None, seed_scores=None, seed_fails=None, scenario_results=None)
Result from calibration optimization.
- Variables:
params (dict) – Parameter configuration.
single_score (float) – Validation score from single-seed run.
n_pass (int) – Number of metrics that passed.
n_warn (int) – Number of metrics with warnings.
n_fail (int) – Number of metrics that failed.
mean_score (float, optional) – Mean score across stability seeds.
std_score (float, optional) – Standard deviation of scores across seeds.
pass_rate (float, optional) – Fraction of seeds that passed (no FAIL metrics).
combined_score (float, optional) – Combined score balancing accuracy and stability.
stability_result (StabilityResult, optional) – Full stability test result.
seed_scores (list[float], optional) – Individual seed scores (for incremental stability).
seed_fails (list[int], optional) – Per-seed fail counts (for incremental stability).
scenario_results (dict[str, ScenarioResult], optional) – Per-scenario results for cross-scenario evaluation.
- params
- single_score
- n_pass
- n_warn
- n_fail
- mean_score = None
- std_score = None
- pass_rate = None
- combined_score = None
- stability_result = None
- seed_scores = None
- seed_fails = None
- scenario_results = None
- classmethod from_cross_eval(params, scenario_results)
Create a CalibrationResult from cross-scenario evaluation data.
Computes aggregate fields from per-scenario results.
- Parameters:
params (dict) – Parameter configuration.
scenario_results (dict[str, ScenarioResult]) – Per-scenario evaluation results.
- Return type:
CalibrationResult
- class calibration.analysis.ComparisonResult(scenario, default_metrics, calibrated_metrics, default_score, calibrated_score, improvements)
Result from before/after config comparison.
- scenario
- default_metrics
- calibrated_metrics
- default_score
- calibrated_score
- improvements
- calibration.analysis.format_eta(remaining, avg_time, n_workers)
Format an ETA string from remaining items and average time.
- calibration.analysis.format_progress(completed, total, remaining, eta)
Format a progress line.
- calibration.analysis.analyze_parameter_patterns(results, top_n=50)
Analyze which parameter values consistently appear in top configs.
- Parameters:
results (list[CalibrationResult]) – Screening results sorted by score (best first).
top_n (int) – Number of top configs to analyze.
- Returns:
For each parameter, a dict mapping value -> count in top configs.
- Return type:
dict[str, dict[Any, int]]
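The value -> count mapping returned by analyze_parameter_patterns can be pictured with a small sketch. It works on plain parameter dicts rather than CalibrationResult objects, and the helper name count_top_values is hypothetical; the real function presumably reads each result's params.

```python
from collections import Counter


def count_top_values(param_dicts, top_n=50):
    """Count how often each parameter value appears in the first top_n configs."""
    patterns = {}
    for params in param_dicts[:top_n]:  # input is assumed sorted best first
        for name, value in params.items():
            patterns.setdefault(name, Counter())[value] += 1
    return {name: dict(counts) for name, counts in patterns.items()}


top = [
    {"beta": 2.5, "max_M": 4},
    {"beta": 2.5, "max_M": 2},
    {"beta": 5.0, "max_M": 4},
]
patterns = count_top_values(top, top_n=2)
print(patterns)
```

A value that dominates the counts for a parameter is a candidate for fixing that parameter in later phases.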
- calibration.analysis.print_parameter_patterns(patterns, top_n=50)
Print parameter pattern analysis.
- calibration.analysis.export_best_config(result, scenario, path=None)
Export best calibration result as a ready-to-use YAML config.
- Parameters:
result (CalibrationResult) – Best calibration result.
scenario (str) – Scenario name.
path (Path, optional) – Output path. Defaults to output/{scenario}_best_config.yml.
- Returns:
Path to exported config file.
- Return type:
Path
- calibration.analysis.compare_configs(default, calibrated, scenario, seed=0, n_periods=1000)
Run default and calibrated configs side-by-side and compare.
- Returns:
Side-by-side comparison of metrics.
- Return type:
ComparisonResult
- calibration.analysis.print_comparison(result)
Print before/after comparison table.
- Parameters:
result (ComparisonResult) – Output from compare_configs().
Morris Method
Morris Method (Elementary Effects) screening for global sensitivity analysis.
This module implements the Morris Method (Morris 1991), which runs multiple One-at-a-Time (OAT) trajectories from random starting points across the parameter space. Unlike standard OAT (which depends on a single baseline), Morris provides two measures per parameter:
- mu* (mu_star): mean absolute elementary effect, a measure of average importance
- sigma: standard deviation of elementary effects, an indicator of interaction or nonlinearity
Classification uses dual thresholds:
- INCLUDE: mu* > threshold OR sigma > threshold
- FIX: mu* <= threshold AND sigma <= threshold
This catches interaction-prone parameters that OAT would miss: a parameter with low mu* but high sigma has an effect that varies widely depending on the values of other parameters.
Supports multiple scenarios:
- baseline: Standard BAM model (Section 3.9.1)
- growth_plus: Endogenous productivity growth via R&D (Section 3.9.2)
- buffer_stock: Buffer-stock consumption with R&D (Section 3.9.4)
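The two Morris measures and the dual-threshold rule described in this module can be sketched as follows. The function names morris_stats and classify are hypothetical, the 0.02 defaults mirror the documented thresholds, and the population standard deviation is an assumption about how sigma is computed.

```python
from statistics import mean, pstdev


def morris_stats(elementary_effects):
    """Return (mu, mu_star, sigma) for one parameter's elementary effects."""
    mu = mean(elementary_effects)                        # signed mean, can cancel out
    mu_star = mean(abs(e) for e in elementary_effects)   # mean absolute effect
    sigma = pstdev(elementary_effects)                   # assumption: population std
    return mu, mu_star, sigma


def classify(mu_star, sigma, mu_star_threshold=0.02, sigma_threshold=0.02):
    """INCLUDE if either measure exceeds its threshold, otherwise FIX."""
    if mu_star > mu_star_threshold or sigma > sigma_threshold:
        return "INCLUDE"
    return "FIX"


# One trajectory shows a large effect, the others none: low mu*, high sigma.
mu, mu_star, sigma = morris_stats([0.0, 0.0, 0.0, 0.06])
label = classify(mu_star, sigma)
print(f"mu*={mu_star:.3f} sigma={sigma:.3f} -> {label}")
```

In this example mu* stays below 0.02, so the parameter is included only because of its sigma, which is the interaction-prone case the module is designed to catch.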
- class calibration.morris.MorrisParameterEffect(name, mu, mu_star, sigma, elementary_effects, value_scores=<factory>)
Morris method results for a single parameter.
- Variables:
name (str) – Parameter name.
mu (float) – Signed mean elementary effect (can cancel out).
mu_star (float) – Mean absolute elementary effect (primary importance measure).
sigma (float) – Standard deviation of elementary effects (interaction indicator).
elementary_effects (list[float]) – Raw elementary effects from each trajectory.
value_scores (dict[Any, list[float]]) – Observed scores for each grid value across trajectories. Used for best_value estimation and grid pruning.
- name
- mu
- mu_star
- sigma
- elementary_effects
- value_scores
- class calibration.morris.MorrisResult(effects, n_trajectories, n_evaluations, scenario='baseline', avg_time_per_run=0.0, n_seeds=1)
Full Morris method screening result.
- Variables:
effects (list[MorrisParameterEffect]) – Per-parameter results.
n_trajectories (int) – Number of Morris trajectories used.
n_evaluations (int) – Number of unique configs evaluated.
scenario (str) – The scenario that was analyzed.
avg_time_per_run (float) – Average wall-clock time per simulation run (seconds).
n_seeds (int) – Number of seeds used per evaluation.
- effects
- n_trajectories
- n_evaluations
- scenario = 'baseline'
- avg_time_per_run = 0.0
- n_seeds = 1
- property ranked
Effects ranked by mu_star (highest first).
- get_important(mu_star_threshold=0.02, sigma_threshold=0.02)
Categorize parameters using dual thresholds.
A parameter is INCLUDEd if it is either important (high mu*) or interaction-prone (high sigma). It is FIXed only if both are low.
- to_sensitivity_result()
Convert to SensitivityResult for downstream compatibility.
Maps mu* to sensitivity and reconstructs per-value scores from trajectory observations, so that build_focused_grid and all downstream calibration code work unchanged.
- Returns:
Compatible result that can be passed to build_focused_grid().
- Return type:
SensitivityResult
- calibration.morris.run_morris_screening(scenario='baseline', grid=None, n_trajectories=10, seed=0, n_seeds=1, n_periods=1000, n_workers=10, fixed_params=None)
Run Morris Method screening analysis.
Generates multiple OAT trajectories from random starting points, evaluates all unique configs in parallel, then computes per-parameter elementary effects (mu*, sigma) for importance and interaction classification.
- Parameters:
scenario (str) – Scenario to calibrate.
grid (dict, optional) – Parameter grid. Defaults to scenario-specific grid.
n_trajectories (int) – Number of Morris trajectories (more = more reliable estimates).
seed (int) – Base random seed for trajectory generation and evaluation.
n_seeds (int) – Number of seeds per config evaluation.
n_periods (int) – Number of simulation periods.
n_workers (int) – Number of parallel workers.
fixed_params (dict, optional) – Parameters to lock at specific values. These params are included in configs but not perturbed. Use for second-pass Morris screening after locking optimized params from a previous calibration.
- Returns:
Morris screening result with per-parameter mu*, sigma, and value scores.
- Return type:
MorrisResult
- calibration.morris.print_morris_report(result, mu_star_threshold=0.02, sigma_threshold=0.02)
Print formatted Morris method screening report.
- Parameters:
result (MorrisResult) – Result from run_morris_screening().
mu_star_threshold (float) – Threshold for mu* classification.
sigma_threshold (float) – Threshold for sigma classification.
OAT Sensitivity
One-At-a-Time (OAT) sensitivity analysis with pairwise interaction scanning.
This module provides sensitivity analysis functionality to identify which parameters have the most impact on validation scores.
Supports multiple scenarios:
- baseline: Standard BAM model (Section 3.9.1)
- growth_plus: Endogenous productivity growth via R&D (Section 3.9.2)
- buffer_stock: Buffer-stock consumption with R&D (Section 3.9.4)
- class calibration.sensitivity.ParameterSensitivity(name, values, scores, best_value, best_score, sensitivity, group_scores=<factory>)
Sensitivity result for a single parameter.
- Variables:
name (str) – Parameter name.
values (list) – All values tested for this parameter.
scores (list[float]) – Validation scores for each value (averaged across seeds).
best_value (Any) – Value that produced the highest score.
best_score (float) – Highest score achieved.
sensitivity (float) – Score range (max - min), indicating parameter importance.
group_scores (dict[str, list[float]]) – Per-metric-group scores for each value. Keys are MetricGroup names (e.g., "TIME_SERIES", "CURVES"), values are lists parallel to scores.
- name
- values
- scores
- best_value
- best_score
- sensitivity
- group_scores
- class calibration.sensitivity.SensitivityResult(parameters, baseline_score, scenario='baseline', avg_time_per_run=0.0, n_seeds=1)
Full sensitivity analysis result.
- Variables:
parameters (list[ParameterSensitivity]) – Sensitivity results for all parameters.
baseline_score (float) – Score with all default values.
scenario (str) – The scenario that was analyzed.
avg_time_per_run (float) – Average wall-clock time per simulation run (seconds).
n_seeds (int) – Number of seeds used per evaluation.
- parameters
- baseline_score
- scenario = 'baseline'
- avg_time_per_run = 0.0
- n_seeds = 1
- property ranked
Parameters ranked by sensitivity (highest first).
- get_important(sensitivity_threshold=0.02)
Categorize parameters by sensitivity.
- Parameters:
sensitivity_threshold (float) – Minimum sensitivity (Δ) for inclusion in grid search.
- Returns:
(included, fixed) parameter name lists.
- Return type:
tuple[list[str], list[str]]
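The OAT sensitivity measure (score range, max - min) and the included/fixed split can be sketched in a few lines. The helper names are hypothetical and the input is a plain dict of per-value scores rather than ParameterSensitivity objects; the strict "greater than" comparison follows the INCLUDE rule stated for build_focused_grid.

```python
def oat_sensitivity(scores):
    """Score range across the tested values of one parameter."""
    return max(scores) - min(scores)


def split_by_sensitivity(param_scores, sensitivity_threshold=0.02):
    """Return (included, fixed) name lists based on the score range."""
    included, fixed = [], []
    for name, scores in param_scores.items():
        if oat_sensitivity(scores) > sensitivity_threshold:
            included.append(name)
        else:
            fixed.append(name)
    return included, fixed


included, fixed = split_by_sensitivity({
    "beta": [0.70, 0.78, 0.74],   # range 0.08: searched in the grid
    "max_M": [0.75, 0.76, 0.75],  # range 0.01: fixed at its best value
})
print(included, fixed)
```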
- calibration.sensitivity.run_sensitivity_analysis(scenario='baseline', grid=None, baseline=None, seed=0, n_seeds=1, n_periods=1000, n_workers=10)
Run OAT sensitivity analysis.
Tests each parameter independently while holding others at baseline values. Supports multi-seed evaluation for more robust sensitivity measurement.
- Parameters:
scenario (str) – Scenario to calibrate ("baseline", "growth_plus", or "buffer_stock").
grid (dict, optional) – Parameter grid. Defaults to scenario-specific grid.
baseline (dict, optional) – Baseline parameter values. Defaults to scenario-specific defaults.
seed (int) – Base random seed (used as first seed).
n_seeds (int) – Number of seeds per evaluation. Seeds are [seed, seed+1, …, seed+n_seeds-1].
n_periods (int) – Number of simulation periods.
n_workers (int) – Number of parallel workers.
- Returns:
Sensitivity ranking of all parameters.
- Return type:
SensitivityResult
- calibration.sensitivity.print_sensitivity_report(result, sensitivity_threshold=0.02)
Print formatted sensitivity analysis report with score decomposition.
- Parameters:
result (SensitivityResult) – Result from run_sensitivity_analysis().
sensitivity_threshold (float) – Threshold for INCLUDE/FIX classification (informational preview).
- class calibration.sensitivity.PairInteraction(param_a, param_b, value_a, value_b, individual_a_score, individual_b_score, combined_score, baseline_score, interaction_strength)
Interaction result for a pair of parameters.
- param_a
- param_b
- value_a
- value_b
- individual_a_score
- individual_b_score
- combined_score
- baseline_score
- interaction_strength
- class calibration.sensitivity.PairwiseResult(interactions, scenario, baseline_score)
Full pairwise interaction analysis result.
- interactions
- scenario
- baseline_score
- property ranked
Interactions ranked by strength (highest first).
- property synergies
Positive interactions (combined > expected).
- property conflicts
Negative interactions (combined < expected).
- calibration.sensitivity.run_pairwise_analysis(params, grid, best_values, scenario='baseline', seed=0, n_seeds=3, n_periods=1000, n_workers=10)
Run pairwise interaction analysis on included parameters.
For each pair of included params, tests all value combinations while fixing the others at their best values, and measures interaction strength.
- Parameters:
params (list[str]) – List of included parameter names.
grid (dict) – Full parameter grid.
best_values (dict) – Best value for each parameter (from sensitivity analysis).
scenario (str) – Scenario name.
seed (int) – Base random seed.
n_seeds (int) – Seeds per evaluation.
n_periods (int) – Simulation periods.
n_workers (int) – Parallel workers.
- Returns:
Pairwise interaction results.
- Return type:
PairwiseResult
Grid Building
Grid building, loading, validation, and combination generation.
This module handles parameter grid operations:
- Building focused grids from sensitivity analysis results
- Loading grids from YAML/JSON files
- Validating grid structure
- Generating and counting parameter combinations
- calibration.grid.build_focused_grid(sensitivity, full_grid=None, scenario='baseline', sensitivity_threshold=0.02, pruning_threshold=0.04)
Build focused grid from sensitivity analysis.
- Parameters:
sensitivity (SensitivityResult) – Result from run_sensitivity_analysis().
full_grid (dict, optional) – Full parameter grid. Defaults to scenario-specific grid.
scenario (str) – Scenario name.
sensitivity_threshold (float) – Minimum sensitivity (delta) for inclusion in grid search.
pruning_threshold (float or None) – Maximum score gap from best value for keeping a grid value. None disables pruning.
- Returns:
(grid_to_search, fixed_params)
- INCLUDE params (delta > threshold): all grid values (pruned if enabled)
- FIX params (delta <= threshold): fix at best value
- Return type:
tuple[dict, dict]
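The include/fix split and the value pruning performed by build_focused_grid can be approximated with a standalone sketch. The helper name focus_grid and its input shape (name -> (values, scores)) are assumptions made for illustration; the real function takes a SensitivityResult.

```python
def focus_grid(param_results, sensitivity_threshold=0.02, pruning_threshold=0.04):
    """Split params into a search grid and fixed values.

    param_results maps name -> (values, scores), two parallel lists.
    """
    grid, fixed = {}, {}
    for name, (values, scores) in param_results.items():
        best = max(scores)
        if best - min(scores) <= sensitivity_threshold:
            # FIX: low sensitivity, lock the parameter at its best value.
            fixed[name] = values[scores.index(best)]
        elif pruning_threshold is None:
            # INCLUDE without pruning: keep every grid value.
            grid[name] = list(values)
        else:
            # INCLUDE with pruning: drop values too far below the best score.
            grid[name] = [v for v, s in zip(values, scores)
                          if best - s <= pruning_threshold]
    return grid, fixed


grid, fixed = focus_grid({
    "beta": ([1.0, 2.5, 5.0], [0.60, 0.78, 0.76]),
    "max_M": ([2, 4], [0.75, 0.76]),
})
print(grid, fixed)
```

Here beta keeps only the two values within 0.04 of its best score, while max_M is fixed because its score range is below the sensitivity threshold.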
- calibration.grid.load_grid(path)
Load parameter grid from YAML/JSON file.
Light validation: checks the dict-of-lists structure and warns about empty values. Supports .yaml/.yml and .json extensions.
- Parameters:
path (Path) – Path to grid file.
- Returns:
Parameter grid (param_name -> list of values).
- Return type:
dict[str, list[Any]]
- Raises:
ValueError – If the file contents are not a dict-of-lists structure.
FileNotFoundError – If the file does not exist.
- calibration.grid.validate_grid(grid)
Light validation of grid structure.
- Parameters:
grid (dict[str, list[Any]]) – Parameter grid to validate.
- Returns:
List of warnings (empty = OK).
- Return type:
list[str]
- calibration.grid.count_combinations(grid)
Count total combinations in grid.
- Parameters:
grid (dict[str, list[Any]]) – Parameter grid.
- Returns:
Number of combinations in the grid.
- Return type:
int
- calibration.grid.generate_combinations(grid, fixed=None, constraints=None)
Generate all parameter combinations, merged with fixed params.
- Parameters:
grid (dict[str, list[Any]]) – Parameter grid to generate combinations from.
fixed (dict, optional) – Fixed parameter values to merge into each combination.
constraints (list[callable], optional) – List of callables that take a combo dict and return bool. A combination is yielded only if ALL constraints return True. Useful for coupled params (e.g., lambda c: c['nfpf'] >= c['nfsf']).
- Yields:
dict[str, Any] – Dictionary mapping parameter names to values.
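Counting and generating combinations with fixed params and constraints can be sketched with itertools.product. The names count_combos and iter_combos are hypothetical stand-ins, and counting before constraints are applied is an assumption about count_combinations.

```python
from itertools import product
from math import prod


def count_combos(grid):
    """Size of the full Cartesian product (constraints not applied)."""
    return prod(len(values) for values in grid.values())


def iter_combos(grid, fixed=None, constraints=None):
    """Yield each grid combination merged with fixed params, if all constraints pass."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        combo = {**(fixed or {}), **dict(zip(names, values))}
        if all(check(combo) for check in (constraints or [])):
            yield combo


grid = {"nfpf": [1, 2], "nfsf": [1, 2]}
combos = list(iter_combos(
    grid,
    fixed={"beta": 2.5},
    constraints=[lambda c: c["nfpf"] >= c["nfsf"]],
))
print(count_combos(grid), len(combos))
```

Because iter_combos is a generator, large grids can be streamed into screening without materializing every combination at once.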
Screening
Single-seed grid screening with progress tracking and checkpointing.
This module handles the grid screening phase of calibration: testing many parameter combinations quickly using a single seed, with progress reporting and checkpoint-based resumption.
- calibration.screening.screen_single_seed(params, scenario, seed, n_periods)
Run single-seed validation for quick screening.
- calibration.screening.save_checkpoint(results, scenario, phase='screening')
Save intermediate results to a checkpoint file.
- calibration.screening.load_checkpoint(scenario, phase='screening')
Load checkpoint if it exists.
- calibration.screening.delete_checkpoint(scenario, phase='screening')
Delete checkpoint file if it exists.
- calibration.screening.run_screening(combinations, scenario, n_workers=10, n_periods=1000, avg_time_per_run=0.0, checkpoint_every=50, resume=False)
Screen parameter combinations with progress tracking and checkpointing.
- Parameters:
combinations (list[dict]) – Parameter combinations to test.
scenario (str) – Scenario name.
n_workers (int) – Parallel workers.
n_periods (int) – Simulation periods.
avg_time_per_run (float) – Estimated time per run (from sensitivity). 0 = measure during warmup.
checkpoint_every (int) – Save checkpoint every N completions.
resume (bool) – If True, load checkpoint and skip already-evaluated configs.
- Returns:
Results sorted by single_score (best first).
- Return type:
list[CalibrationResult]
Stability Testing
Multi-seed stability testing with tiered evaluation and ranking strategies.
This module handles the stability testing phase of calibration: evaluating top candidates from screening across multiple seeds with configurable ranking strategies and tiered pruning.
- calibration.stability.evaluate_stability(params, scenario, seeds, n_periods)
Run multi-seed stability test for full evaluation.
- calibration.stability.parse_stability_tiers(tiers_str)
Parse stability tiers from CLI string.
- Parameters:
tiers_str (str) – Format: "100:10,50:20,10:100", meaning top 100 x 10 seeds, top 50 x 20 seeds, top 10 x 100 seeds.
- Returns:
List of (n_configs, total_seeds) tuples.
- Return type:
list[tuple[int, int]]
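The tier string format can be parsed with a few lines of plain Python. This is a minimal sketch under the documented format only; the name parse_tiers is hypothetical and error handling for malformed input is omitted.

```python
def parse_tiers(tiers_str):
    """Parse "100:10,50:20" into [(100, 10), (50, 20)]."""
    tiers = []
    for chunk in tiers_str.split(","):
        n_configs, total_seeds = chunk.strip().split(":")
        tiers.append((int(n_configs), int(total_seeds)))
    return tiers


tiers = parse_tiers("100:10,50:20,10:100")
print(tiers)
```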
- calibration.stability.run_tiered_stability(candidates, scenario, tiers, n_workers=10, n_periods=1000, avg_time_per_run=0.0, rank_by='combined', k_factor=1.0)
Run incremental tiered stability testing.
Each tier runs only NEW seeds (not previously tested ones) and accumulates all seed scores for ranking.
- Parameters:
candidates (list[CalibrationResult]) – Screening results to stability-test.
scenario (str) – Scenario name.
tiers (list[tuple[int, int]]) – List of (n_configs, total_seeds). Each tier tests the top n_configs using enough new seeds to reach total_seeds cumulative.
n_workers (int) – Parallel workers.
n_periods (int) – Simulation periods.
avg_time_per_run (float) – Estimated time per run for ETA.
rank_by (str) – Ranking strategy: "combined" (mean*(1-k*std)), "stability" (pass_rate/n_fail priority), or "mean" (mean_score only).
k_factor (float) – Configurable k in mean - k*std formula (for "combined" ranking).
- Returns:
Final results sorted by ranking strategy (best first).
- Return type:
list[CalibrationResult]
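The incremental behaviour, where each tier runs only the seeds needed to reach its cumulative total, can be illustrated by planning the seed lists per tier. The helper plan_new_seeds is hypothetical, and consecutive seed numbering starting at 0 is an assumption.

```python
def plan_new_seeds(tiers, first_seed=0):
    """For each (n_configs, total_seeds) tier, list only the seeds not yet run."""
    plan, seeds_done = [], 0
    for n_configs, total_seeds in tiers:
        new_seeds = list(range(first_seed + seeds_done, first_seed + total_seeds))
        plan.append((n_configs, new_seeds))
        seeds_done = max(seeds_done, total_seeds)
    return plan


plan = plan_new_seeds([(100, 10), (50, 20), (10, 100)])
total_runs = sum(n_configs * len(seeds) for n_configs, seeds in plan)
print([len(seeds) for _, seeds in plan], total_runs)
```

With the default tiers this amounts to 2300 simulation runs, compared with 3000 if every tier re-ran all of its seeds from scratch.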
- calibration.stability.run_focused_calibration(grid, fixed_params, scenario='baseline', n_workers=10, n_periods=1000, stability_tiers=None, avg_time_per_run=0.0, resume=False, rank_by='combined', k_factor=1.0)
Run calibration on focused grid with fixed params.
- Parameters:
grid (dict) – Parameter grid to search (from build_focused_grid).
fixed_params (dict) – Fixed parameter values (from build_focused_grid).
scenario (str) – Scenario name.
n_workers (int) – Number of parallel workers.
n_periods (int) – Number of simulation periods.
stability_tiers (list[tuple[int, int]], optional) – Tiered stability config. Defaults to [(100, 10), (50, 20), (10, 100)].
avg_time_per_run (float) – Average time per simulation run (from sensitivity).
resume (bool) – If True, resume from checkpoint.
rank_by (str) – Ranking strategy for stability testing.
k_factor (float) – k in mean - k*std formula.
- Returns:
Results sorted by ranking strategy (best first).
- Return type:
list[CalibrationResult]
Serialization
Central serialization for calibration results: save/load for all result types and timestamped output directories.
All save/load operations use a consistent JSON schema with version tracking. Timestamped output directories keep results organized across runs.
- calibration.io.create_run_dir(scenario, output_dir=None)
Create timestamped output directory.
- Parameters:
scenario (str) – Scenario name (included in directory name).
output_dir (Path, optional) – Parent directory. Defaults to calibration/output/.
- Returns:
Path to the created directory.
- Return type:
Path
- calibration.io.save_screening(results, sensitivity, grid, fixed, patterns, scenario, path)
Save screening results to JSON.
- calibration.io.load_screening(path)
Load screening results from JSON. Returns (results, avg_time_per_run).
- calibration.io.save_stability(results, scenario, path)
Save stability testing results to JSON.
Reporting
Auto-generated markdown reports for calibration results.
Each phase of the calibration pipeline generates a markdown report alongside its JSON results in the timestamped output directory.
- calibration.reporting.generate_sensitivity_report(result, method, path)
Generate markdown report for sensitivity phase.
- Parameters:
result (SensitivityResult) – Sensitivity analysis result.
method (str) – Sensitivity method used ("morris" or "oat").
path (Path) – Output path for the markdown report.
- calibration.reporting.generate_screening_report(results, grid, fixed, patterns, sensitivity, scenario, path, top_n=50)
Generate markdown report for grid screening phase.
- Parameters:
results (list[CalibrationResult]) – Screening results (sorted by score).
grid (dict) – Grid parameters searched.
fixed (dict) – Fixed parameter values.
patterns (dict) – Parameter patterns from top configs.
sensitivity (SensitivityResult) – Sensitivity result used for grid building.
scenario (str) – Scenario name.
path (Path) – Output path for the markdown report.
top_n (int) – Number of top configs used for pattern analysis.
- calibration.reporting.generate_stability_report(results, scenario, tiers, comparison, path)
Generate markdown report for stability phase.
- calibration.reporting.generate_full_report(sensitivity, screening_results, stability_results, comparison, scenario, tiers, path)
Generate comprehensive calibration report combining all phases.
- Parameters:
sensitivity (SensitivityResult) – Sensitivity analysis result.
screening_results (list[CalibrationResult]) – Grid screening results.
stability_results (list[CalibrationResult]) – Stability testing results.
comparison (ComparisonResult or None) – Before/after comparison.
scenario (str) – Scenario name.
tiers (list[tuple[int, int]]) – Stability tiers used.
path (Path) – Output path for the markdown report.
Rescreen
Second-pass Morris screening after locking optimized params.
Delegates to run_morris_screening(fixed_params=...) and computes the sensitivity collapse between Phase 1 and Phase 2 Morris results.
- calibration.rescreen.resolve_params(params_str)
Resolve a param group name or comma-separated param names.
- Parameters:
params_str (str) – Either a PARAM_GROUPS key (e.g., "entry", "behavioral") or comma-separated full parameter names (e.g., "beta,max_M").
- Returns:
List of parameter names.
- Return type:
list[str]
- Raises:
ValueError – If the string is not a known group and doesn't look like param names.
- calibration.rescreen.load_fixed_from_result(path)
Load the #1-ranked result's params from a stability result file.
- Parameters:
path (Path) – Path to stability result JSON.
- Returns:
Parameter dict from the top-ranked result.
- Return type:
dict[str, Any]
- calibration.rescreen.compute_sensitivity_collapse(phase1, phase2)
Compute sensitivity collapse between two Morris screenings.
- Parameters:
phase1 (MorrisResult) – First-pass Morris result (before locking params).
phase2 (MorrisResult) – Second-pass Morris result (after locking params).
- Returns:
Per-parameter dict with phase1_mu_star, phase2_mu_star, collapse_pct.
- Return type:
dict[str, dict]
- calibration.rescreen.run_rescreen(scenario, fix_from, params, n_trajectories=20, n_seeds=5, n_periods=1000, n_workers=10, phase1_morris=None)
Run second-pass Morris screening on a subset of params.
- Parameters:
scenario (str) – Scenario name.
fix_from (Path) – Path to stability result JSON to load fixed params from.
params (list[str]) – Parameter names to screen (the rest are fixed).
n_trajectories (int) – Number of Morris trajectories.
n_seeds (int) – Seeds per evaluation.
n_periods (int) – Simulation periods.
n_workers (int) – Parallel workers.
phase1_morris (MorrisResult, optional) – Phase 1 Morris result for collapse comparison.
- Returns:
(phase2_result, collapse_table)
- Return type:
tuple[MorrisResult, dict]
Cost Analysis
Targeted cost analysis: measure the cost of swapping values into a base config.
Evaluates the impact of substituting preferred parameter values into an optimized base configuration. Classifies each swap by cost: FREE (<0.002), CHEAP (<0.005), MODERATE (<0.010), EXPENSIVE (>=0.010).
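The cost bands listed above map directly onto a threshold ladder. In this sketch the cost is taken as the absolute score change, which is an assumption; the module may instead treat only score decreases as a cost. The name classify_cost is hypothetical.

```python
def classify_cost(delta):
    """Map a score change (swap - base) onto the documented cost bands."""
    cost = abs(delta)  # assumption: the size of the change is what is classified
    if cost < 0.002:
        return "FREE"
    if cost < 0.005:
        return "CHEAP"
    if cost < 0.010:
        return "MODERATE"
    return "EXPENSIVE"


labels = [classify_cost(d) for d in (-0.001, -0.004, -0.007, -0.010)]
print(labels)
```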
- class calibration.cost.SwapResult(param, value, base_combined, swap_combined, delta, classification, pass_rate)
Result of swapping a single parameter value into the base config.
- Variables:
param (str) – Parameter name.
value (Any) – Swapped value.
base_combined (float) – Base config's combined score.
swap_combined (float) – Combined score with this value swapped in.
delta (float) – Score change (swap - base). Negative = worse.
classification (str) – Cost classification: FREE, CHEAP, MODERATE, or EXPENSIVE.
pass_rate (float) – Pass rate with swapped value.
- param
- value
- base_combined
- swap_combined
- delta
- classification
- pass_rate
- calibration.cost.parse_swaps(swap_args)
Parse swap arguments from CLI.
- Parameters:
swap_args (list[str]) – List of "param=v1,v2,v3" strings.
- Returns:
Parameter -> list of values to try.
- Return type:
dict[str, list]
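Parsing "param=v1,v2,v3" arguments can be sketched as below. The name parse_swap_args is hypothetical, and converting values with ast.literal_eval (falling back to the raw string) is an assumption about how value types are inferred.

```python
import ast


def parse_swap_args(swap_args):
    """Parse ["beta=2.5,5.0", ...] into {"beta": [2.5, 5.0], ...}."""
    swaps = {}
    for arg in swap_args:
        name, _, raw_values = arg.partition("=")
        values = []
        for token in raw_values.split(","):
            token = token.strip()
            try:
                # Assumption: numeric and boolean literals become Python values.
                values.append(ast.literal_eval(token))
            except (ValueError, SyntaxError):
                values.append(token)  # anything else is kept as a string
        swaps[name.strip()] = values
    return swaps


swaps = parse_swap_args(["beta=2.5,5.0", "max_M=2,4"])
print(swaps)
```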
- calibration.cost.run_cost_analysis(base_params, swaps, scenario, n_seeds=20, n_periods=1000, n_workers=10, base_combined=None)
Run targeted cost analysis for parameter swaps.
- Parameters:
base_params (dict) – Base configuration (the stability winner).
swaps (dict[str, list]) – Parameters to swap and their candidate values.
scenario (str) – Scenario name.
n_seeds (int) – Seeds per evaluation.
n_periods (int) – Simulation periods.
n_workers (int) – Parallel workers.
base_combined (float, optional) – Pre-computed base combined score. If None, evaluates the base.
- Returns:
Results for each swap, sorted by absolute delta.
- Return type:
list[SwapResult]
Cross-Scenario Evaluation
Cross-scenario evaluation: run configs across multiple scenarios and rank them with one of several strategies.
Evaluates parameter configurations on all specified scenarios simultaneously and ranks using cross-scenario criteria.
- calibration.cross_eval.rank_cross_scenario(results, strategy='stability-first')
Rank configs using cross-scenario criteria.
- Parameters:
results (list[CalibrationResult]) – Results with scenario_results populated.
strategy (str) – Ranking strategy:
- "stability-first": min(pass_rates) -> total fails -> min(combined)
- "score-first": min(combined) -> total fails
- "balanced": geometric mean of combined scores
- Returns:
Sorted results (best first).
- Return type:
list[CalibrationResult]
- Raises:
ValueError – If strategy is not recognized.
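The three ranking strategies can be expressed as sort keys. This sketch uses plain tuples of (pass_rate, n_fail, combined_score) per scenario instead of ScenarioResult objects, and it assumes that higher pass rates, fewer fails, and higher combined scores are better; the name rank_key is hypothetical.

```python
from math import prod


def rank_key(scenario_stats, strategy="stability-first"):
    """Build a sort key (larger = better) from scenario -> (pass_rate, n_fail, combined)."""
    pass_rates = [stats[0] for stats in scenario_stats.values()]
    total_fails = sum(stats[1] for stats in scenario_stats.values())
    combined = [stats[2] for stats in scenario_stats.values()]
    if strategy == "stability-first":
        return (min(pass_rates), -total_fails, min(combined))
    if strategy == "score-first":
        return (min(combined), -total_fails)
    if strategy == "balanced":
        # Geometric mean; assumes all combined scores are positive.
        return (prod(combined) ** (1 / len(combined)),)
    raise ValueError(f"unknown strategy: {strategy!r}")


configs = {
    "a": {"baseline": (1.00, 0, 0.85), "growth_plus": (0.90, 2, 0.75)},
    "b": {"baseline": (0.95, 1, 0.74), "growth_plus": (0.95, 1, 0.76)},
}


def rank(strategy):
    return sorted(configs, key=lambda name: rank_key(configs[name], strategy), reverse=True)


print(rank("stability-first"), rank("score-first"), rank("balanced"))
```

The example is built so that the strategies disagree: config b wins on worst-case pass rate, while config a wins on worst-case combined score and on the geometric mean.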
- calibration.cross_eval.evaluate_cross_scenario(configs, scenarios, n_seeds=100, n_periods=1000, n_workers=10)
Evaluate configs across multiple scenarios.
- Returns:
Results with scenario_results populated.
- Return type:
list[CalibrationResult]
- calibration.cross_eval.compute_scenario_tension(results, scenarios)
Analyze parameter tensions between scenarios.
Identifies params where the optimal value differs between scenarios, indicating a fundamental trade-off.
- Parameters:
results (list[CalibrationResult]) – Results with scenario_results populated.
scenarios (list[str]) – Scenario names to compare.
- Returns:
Per-parameter tension info: which value each scenario prefers, and the score gap.
- Return type:
dict[str, dict]
Structured Sweep
Structured multi-stage parameter sweep by category, carrying forward winners.
Each stage runs a grid of its parameters while holding everything else fixed from the base config (plus winners from prior stages). Optionally cross-evaluates against other scenarios at each stage.
- calibration.sweep.parse_stage(stage_str)
Parse a single stage definition.
Format: "LABEL:param1=v1,v2,v3 param2=v4,v5"
- Parameters:
stage_str (str) – Stage definition string.
- Returns:
(label, param_grid)
- Return type:
tuple[str, dict]
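The stage format "LABEL:param1=v1,v2,v3 param2=v4,v5" splits naturally on the first colon, then on whitespace, then on "=". The sketch below follows that reading; the name parse_stage_str is hypothetical, and literal-based value conversion is an assumption.

```python
import ast


def parse_stage_str(stage_str):
    """Parse "LABEL:p1=v1,v2 p2=v3" into (label, {"p1": [...], "p2": [...]})."""
    label, _, body = stage_str.partition(":")
    param_grid = {}
    for assignment in body.split():
        name, _, raw_values = assignment.partition("=")
        values = []
        for token in raw_values.split(","):
            try:
                values.append(ast.literal_eval(token))  # assumption: typed literals
            except (ValueError, SyntaxError):
                values.append(token)
        param_grid[name] = values
    return label.strip(), param_grid


label, param_grid = parse_stage_str("ENTRY:beta=2.5,5.0 max_M=2,4")
print(label, param_grid)
```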
- calibration.sweep.parse_stages(stage_args)
Parse multiple stage definitions.
- Parameters:
stage_args (list[str]) – List of stage definition strings.
- Returns:
List of (label, param_grid) tuples.
- Return type:
list[tuple[str, dict]]
- class calibration.sweep.StageResult(label, winner_params, combined_score, mean_score, pass_rate, n_candidates)
Result of a single sweep stage.
- label
- winner_params
- combined_score
- mean_score
- pass_rate
- n_candidates
- calibration.sweep.run_sweep(base_params, stages, scenario, n_workers=10, n_periods=1000, stability_tiers=None, rank_by='combined', k_factor=1.0, cross_scenario=None)
Run structured multi-stage parameter sweep.
- Parameters:
base_params (dict) – Starting configuration.
stages (list[tuple[str, dict]]) – List of (label, param_grid) stages to run in order.
scenario (str) – Scenario name.
n_workers (int) – Parallel workers.
n_periods (int) – Simulation periods.
stability_tiers (list[tuple[int, int]], optional) – Tiers for stability testing. Defaults to [(100, 10), (50, 20), (10, 100)].
rank_by (str) – Ranking strategy for stability.
k_factor (float) – k in combined score formula.
cross_scenario (str, optional) – If set, cross-evaluate the stage winner against this scenario.
- Returns:
Per-stage results with winner params and scores.
- Return type:
list[StageResult]
Parameter Space
Parameter space definition for calibration, covering all three scenarios.
This module defines the parameter grids and scenario-specific defaults used as baselines for sensitivity analysis.
Supports multiple scenarios:
- baseline: Standard BAM model (Section 3.9.1)
- growth_plus: Endogenous productivity growth via R&D (Section 3.9.2)
- buffer_stock: Buffer-stock consumption with R&D (Section 3.9.4)
For grid combination generation, see calibration.grid.
- calibration.parameter_space.get_parameter_grid(scenario='baseline')
Get the parameter grid for a scenario.
- Parameters:
scenario (str) – Scenario name ("baseline", "growth_plus", or "buffer_stock").
- Returns:
Parameter grid for the scenario.
- Return type:
dict
- Raises:
ValueError – If scenario is not recognized.
- calibration.parameter_space.get_default_values(scenario='baseline')
Get the scenario-specific parameter overrides.
- Parameters:
scenario (str) – Scenario name ("baseline", "growth_plus", or "buffer_stock").
- Returns:
Scenario overrides. For baseline, returns an empty dict (engine defaults). For extensions, returns extension-specific parameter defaults.
- Return type:
dict
- Raises:
ValueError – If scenario is not recognized.