Validation & Analysis
=====================

BAM Engine includes a validation framework for comparing simulation output
against target values from Delli Gatti et al. (2011). This ensures the model
reproduces the reference results and helps detect parameter configurations
that deviate from expected behavior.


Running Validation
------------------

The simplest way to validate is with ``run_validation()``:

.. code-block:: python

   from validation import run_validation

   result = run_validation(seed=42, n_periods=1000)

   print(f"Score: {result.total_score:.3f}")
   print(f"Passed: {result.passed}")
   print(f"Failures: {result.n_fail}")

The result object contains:

- ``total_score``: Weighted score from 0.0 (worst) to 1.0 (perfect)
- ``passed``: ``True`` if zero FAIL-status metrics
- ``n_pass``, ``n_warn``, ``n_fail``: Count of metrics by status
- ``metric_results``: Detailed per-metric breakdown


Validation Scenarios
--------------------

Three built-in scenarios correspond to sections of the reference book:

.. list-table::
   :header-rows: 1
   :widths: 18 22 60

   * - Scenario
     - Book Section
     - What It Validates
   * - ``baseline``
     - Section 3.9.1
     - Core model: unemployment, inflation, firm dynamics, business cycles
   * - ``growth_plus``
     - Section 3.9.2
     - R&D extension: productivity growth, firm size distribution
   * - ``buffer_stock``
     - Section 3.9.4
     - Buffer-stock extension: savings behavior, wealth distribution

Run a specific scenario:

.. code-block:: python

   from validation import run_validation, run_growth_plus_validation

   # Baseline (default)
   baseline_result = run_validation(seed=42, n_periods=1000)

   # Growth+ scenario
   growth_result = run_growth_plus_validation(seed=42, n_periods=1000)

Each scenario has its own targets (defined in ``targets.yaml`` files) and
metric weights tuned to the phenomena that matter most for that model variant.


Understanding Scores
--------------------

Validation uses a **two-layer system**:

**Status checks** (categorical):

- **PASS**: Metric is within acceptable range
- **WARN**: Metric is borderline (outside target but within tolerance)
- **FAIL**: Metric significantly deviates from target

**Scores** (continuous, 0 to 1):

Each metric produces a score between 0.0 and 1.0. The ``total_score`` is a
weighted average across all metrics. Metric weights range from 0.5 (low
importance) to 5.0 (critical).

**Weight-based fail escalation**: High-weight metrics have stricter WARN/FAIL
thresholds. The escalation formula
(:math:`\text{clamp}(5 - 2w, 0.5, 5.0)`) means a weight-3.0 metric fails at
deviations that would only warn for a weight-0.5 metric.

**Metric types**:

.. list-table::
   :header-rows: 1
   :widths: 22 78

   * - Type
     - How It Works
   * - ``RANGE``
     - Value must fall within [min, max] range
   * - ``TOLERANCE``
     - Value must be within percentage of target
   * - ``PCT_WITHIN``
     - Percentage of time series within a band
   * - ``OUTLIER_PENALTY``
     - Penalizes extreme values in distribution
   * - ``BOOLEAN``
     - Binary pass/fail check (e.g., "economy did not collapse")


Robustness Analysis
-------------------

The robustness package tests whether results hold across multiple random seeds,
parameter variations, and structural mechanism changes (Section 3.10):

.. code-block:: bash

   # Full robustness analysis
   python -m validation.robustness

   # Individual parts
   python -m validation.robustness --internal-only
   python -m validation.robustness --sensitivity-only
   python -m validation.robustness --structural-only

.. seealso::

   :doc:`/validation/robustness/index` for complete robustness analysis
   documentation including internal validity, sensitivity analysis, and
   structural experiments.


Parameter Calibration
---------------------

The calibration package finds parameters that maximize validation scores
through a multi-phase pipeline: Morris screening → grid search → stability
testing.

.. code-block:: bash

   python -m calibration --scenario baseline --workers 10

.. seealso::

   - :doc:`calibration` for the user guide calibration tutorial
   - :doc:`/calibration/index` for the full calibration reference


Visualization
-------------

**Scenario plots.** Run a scenario with visualization:

.. code-block:: python

   from validation.scenarios.baseline import run_scenario

   run_scenario(seed=0, show_plot=True)

**Diagnostic dashboards.** Comprehensive multi-figure analysis:

.. code-block:: bash

   python diagnostics/baseline_diagnostics.py
   python diagnostics/growth_plus_diagnostics.py


.. seealso::

   - :doc:`/validation/index` for the full validation reference
   - :doc:`/validation/scoring` for the scoring system details
   - :doc:`calibration` for the calibration tutorial
   - :doc:`extensions` for setting up model extensions before validation
   - :doc:`configuration` for parameter definitions