Data Collection#
sim.run() returns a SimulationResults object by default,
containing time series data collected during the simulation. The collect
parameter controls what data is captured.
Quick Example#
import bamengine as bam
sim = bam.Simulation.init(seed=42)
results = sim.run(n_periods=100)
# Access data via bracket syntax
unemployment = results["Economy.unemployment_rate"]
inflation = results["Economy.inflation"]
# Or via attribute-style access
prices = results.Producer.price # shape: (n_periods, n_firms)
# Export to pandas DataFrame (requires pandas)
df = results.to_dataframe()
Collection Options#
The collect parameter accepts three forms:
Boolean (simplest):
# Collect all roles unaggregated + economy metrics (the default)
results = sim.run(n_periods=100)
# Skip collection for benchmarks or when only final state is needed
sim.run(n_periods=100, collect=False)
List (select roles):
# Collect specific roles with all their variables
# Economy metrics are always included automatically
results = sim.run(
    n_periods=100,
    collect=["Producer", "Worker"],
)
Dict (full control):
# Specify exactly what to collect (full per-agent data by default)
# Economy metrics are always included automatically
results = sim.run(
    n_periods=100,
    collect={
        "Producer": ["price", "inventory"],  # Specific variables
        "Worker": True,                      # All Worker variables
        "aggregate": "mean",                 # Explicit aggregation (default: None)
    },
)
Collection Settings#
In dict form, the following keys are recognized:
- Role names (e.g., "Producer", "Worker"): values are either True (all variables) or a list of variable names.
- "aggregate": how to aggregate across agents. Options: None (default, full per-agent data), "mean", "median", "sum", or "std".
Economy metrics (avg_price, unemployment_rate, inflation) are
always collected regardless of the collect form used.
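The three forms can be read as shorthand for one explicit collection plan. The sketch below illustrates that reading only and is not bamengine's implementation: `normalize_collect`, its `all_roles` argument, and the layout of the returned dict are hypothetical names introduced for this example.

```python
from typing import Any

# Aggregation options listed in the settings above
AGGREGATIONS = {None, "mean", "median", "sum", "std"}


def normalize_collect(collect: Any, all_roles: list[str]) -> dict[str, Any]:
    """Expand a collect argument into an explicit plan (illustrative only)."""
    if collect is False:
        # Nothing is collected
        return {"enabled": False, "roles": {}, "aggregate": None}
    if collect is True:
        # Every role, every variable, no aggregation
        return {"enabled": True, "roles": {r: True for r in all_roles}, "aggregate": None}
    if isinstance(collect, list):
        # Listed roles only, each with all of its variables
        return {"enabled": True, "roles": {r: True for r in collect}, "aggregate": None}
    if isinstance(collect, dict):
        spec = dict(collect)  # copy so the caller's dict is not modified
        aggregate = spec.pop("aggregate", None)
        if aggregate not in AGGREGATIONS:
            raise ValueError(f"unknown aggregate: {aggregate!r}")
        return {"enabled": True, "roles": spec, "aggregate": aggregate}
    raise TypeError("collect must be a bool, a list, or a dict")


plan = normalize_collect(
    {"Producer": ["price", "inventory"], "Worker": True, "aggregate": "mean"},
    all_roles=["Producer", "Worker"],
)
print(plan["roles"])      # role -> True or list of variable names
print(plan["aggregate"])  # mean
```

Separating the "aggregate" key from the role entries up front keeps the rest of a collection loop free of special cases.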
Discoverability#
Use sim.collectables() before running to see all available variables,
and results.available() after running to see what was collected:
sim = bam.Simulation.init(seed=42)
# Before running: see what can be collected
sim.collectables()
# ['Consumer.income', 'Economy.avg_price', 'Economy.inflation',
# 'Economy.unemployment_rate', 'Producer.price', 'Producer.production', ...]
results = sim.run(n_periods=100)
# After running: see what was collected
results.available()
# ['Consumer.income', 'Economy.avg_price', 'Economy.inflation', ...]
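Because the names follow the flat "Name.variable" pattern, they are easy to group by owner. The helper below is a small sketch, assuming the listing is a plain list of strings as the abridged output above suggests; `group_by_owner` is a hypothetical name, and the sample list is copied from that output rather than produced by a real run.

```python
from collections import defaultdict


def group_by_owner(names: list[str]) -> dict[str, list[str]]:
    """Group flat "Name.variable" strings into {name: [variables]}."""
    grouped = defaultdict(list)
    for name in names:
        owner, _, variable = name.partition(".")
        grouped[owner].append(variable)
    return dict(grouped)


# Stand-in for the list returned by sim.collectables() or results.available()
names = [
    "Consumer.income",
    "Economy.avg_price",
    "Economy.inflation",
    "Economy.unemployment_rate",
    "Producer.price",
    "Producer.production",
]

grouped = group_by_owner(names)
print(grouped["Economy"])   # ['avg_price', 'inflation', 'unemployment_rate']
print(grouped["Producer"])  # ['price', 'production']
```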
Economy Metrics#
Economy metrics are 1D arrays (one value per period) and are always collected:
| Metric | Description |
|---|---|
| avg_price | Average market price across firms (production-weighted) |
| unemployment_rate | Fraction of households without an employer |
| inflation | Year-over-year change in average market price |
These are also available directly on the economy object during simulation:
import numpy as np

sim.ec.avg_mkt_price          # Current average price (scalar)
sim.ec.avg_mkt_price_history  # Full time series (array)
sim.ec.inflation_history      # Full inflation time series
np.mean(~sim.wrk.employed)    # Current unemployment rate
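To make the three definitions concrete, here is a sketch of how each metric could be computed from raw per-agent values. The formulas follow the descriptions in the table, but the exact expressions bamengine uses are not given here, so treat them as assumptions. The function names and the `lag` parameter are hypothetical; in particular, `lag=4` below assumes four periods per year, which this page does not state.

```python
def production_weighted_price(prices, production):
    """Average price with each firm weighted by its production."""
    total = sum(production)
    if total == 0:
        return float("nan")  # no production, so the weighted average is undefined
    return sum(p * q for p, q in zip(prices, production)) / total


def unemployment_rate(employed):
    """Fraction of households whose employed flag is False."""
    return sum(1 for e in employed if not e) / len(employed)


def inflation(avg_prices, lag):
    """Relative price change versus `lag` periods earlier (None where unavailable)."""
    series = []
    for t, price in enumerate(avg_prices):
        if t < lag:
            series.append(None)  # not enough history yet
        else:
            previous = avg_prices[t - lag]
            series.append((price - previous) / previous)
    return series


avg_price = production_weighted_price([1.0, 2.0], [3.0, 1.0])
rate = unemployment_rate([True, False, False, True])
infl = inflation([1.0, 1.0, 1.0, 1.0, 1.1], lag=4)  # lag=4 is an assumption

print(avg_price)  # 1.25
print(rate)       # 0.5
```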
Full Per-Agent Data#
By default (collect=True), role data is unaggregated: each variable is a
2D array of shape (n_periods, n_agents).
results = sim.run(n_periods=100)
# Shape: (n_periods, n_firms)
prices = results["Producer.price"]
prices = results.Producer.price # equivalent
# Aggregate on access if needed
avg_prices = results.get("Producer", "price", aggregate="mean")
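Aggregating across agents collapses each period's row of per-agent values into a single number, so a (n_periods, n_agents) array becomes a series with one value per period. The sketch below shows that reduction on plain nested lists using only the standard library. `aggregate_agents` is a hypothetical helper, and the use of the population standard deviation for "std" is an assumption; the library may use a different convention.

```python
import statistics

# One reducer per documented aggregation name
REDUCERS = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "sum": sum,
    "std": statistics.pstdev,  # assumption: population standard deviation
}


def aggregate_agents(data, how):
    """Reduce each period's row of per-agent values to one value."""
    if how is None:
        return data  # keep the full per-agent data
    reducer = REDUCERS[how]
    return [reducer(row) for row in data]


# Stand-in data: 2 periods x 3 firms
prices = [
    [1.0, 2.0, 3.0],
    [2.0, 2.0, 5.0],
]

mean_prices = aggregate_agents(prices, "mean")
median_prices = aggregate_agents(prices, "median")
total_prices = aggregate_agents(prices, "sum")

print(mean_prices)    # [2.0, 3.0]
print(median_prices)  # [2.0, 2.0]
print(total_prices)   # [6.0, 9.0]
```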
Relationship Data Collection#
Relationships (like LoanBook) can also be collected. Unlike roles,
relationships are opt-in only: they are NOT included when using
collect=True.
# Collect LoanBook data along with role data
results = sim.run(
    n_periods=100,
    collect={
        "Producer": ["price"],
        "LoanBook": ["principal", "rate"],  # Relationship fields
        "aggregate": "sum",                 # Sum across all active loans
    },
)
# Access relationship data
total_principal = results["LoanBook.principal"]
rate_sum = results.get("LoanBook", "rate")  # summed, because "aggregate" is "sum" above
Available aggregations for relationships:
- "sum": Total across all edges (e.g., total outstanding principal)
- "mean": Average value across all edges (e.g., average interest rate)
- "std": Standard deviation across edges
- None: Full edge data (list of variable-length arrays per period)
Non-aggregated relationship data:
When aggregate=None, relationship data cannot be stacked into 2D arrays
because edge counts vary per period. Instead, data is stored as a list of
arrays:
results = sim.run(
    n_periods=50,
    collect={
        "LoanBook": ["principal"],
    },
)
# List of variable-length arrays (one per period)
principal_per_period = results.relationship_data["LoanBook"]["principal"]
# e.g. principal_per_period[0] might hold 5 loans while principal_per_period[10] holds 12
Warning
Non-aggregated relationship data cannot be included in DataFrame exports
due to variable lengths. Use results.relationship_data directly or
use aggregation during collection.
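If the full edge data has already been collected, a per-period series can still be derived afterwards by reducing each period's array separately. The following is a minimal sketch on stand-in lists. `reduce_per_period` is a hypothetical helper, and returning NaN for the mean of a period with no edges is a choice made for this example, not documented library behaviour.

```python
def reduce_per_period(per_period, how):
    """Reduce a list of variable-length sequences to one value per period."""
    series = []
    for edges in per_period:
        values = list(edges)
        if how == "sum":
            series.append(sum(values))  # an empty period sums to 0
        elif how == "mean":
            # Mean of an empty period is undefined; NaN marks it explicitly
            series.append(sum(values) / len(values) if values else float("nan"))
        else:
            raise ValueError(f"unsupported reduction: {how!r}")
    return series


# Stand-in for results.relationship_data["LoanBook"]["principal"]
principal_per_period = [
    [100.0, 50.0],       # period 0: two loans
    [],                  # period 1: no loans
    [10.0, 20.0, 30.0],  # period 2: three loans
]

totals = reduce_per_period(principal_per_period, "sum")
means = reduce_per_period(principal_per_period, "mean")

print(totals)    # [150.0, 0, 60.0]
print(means[0])  # 75.0
```

The resulting series has a fixed length, one entry per period, so it can sit alongside other per-period columns.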
Accessing Results#
SimulationResults provides several ways to access data:
# Bracket syntax (flat "Name.variable" key)
results["Producer.price"]
results["Economy.unemployment_rate"]
results["LoanBook.principal"] # if collected
# Attribute-style access
results.Producer.price
results.Economy.unemployment_rate
# get() method (supports on-the-fly aggregation)
results.get("Producer", "price")
results.get("Producer", "price", aggregate="mean")
results.get("Economy", "unemployment_rate")
results.get("LoanBook", "principal") # if collected
# Direct access to nested dicts
results.role_data["Producer"]["price"]
results.economy_data["unemployment_rate"]
results.relationship_data["LoanBook"]["principal"] # if collected
# Get role/relationship as DataFrame
prod_df = results.get_role_data("Producer")
loans_df = results.get_relationship_data("LoanBook") # if collected
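The access styles above all point at the same underlying nested dicts. As an illustration of how a flat "Name.variable" key could map onto them, here is a sketch using plain dicts as stand-ins. The `lookup` function and its resolution order are assumptions made for this example, not the library's actual code.

```python
def lookup(key, role_data, economy_data, relationship_data):
    """Resolve a flat "Name.variable" key against three nested dicts."""
    owner, sep, variable = key.partition(".")
    if not sep:
        raise KeyError(f"expected 'Name.variable', got {key!r}")
    if owner == "Economy":
        return economy_data[variable]
    if owner in role_data:
        return role_data[owner][variable]
    if owner in relationship_data:
        return relationship_data[owner][variable]
    raise KeyError(key)


# Stand-in data shaped like the nested dicts shown above
role_data = {"Producer": {"price": [[1.0, 2.0], [1.5, 2.5]]}}
economy_data = {"unemployment_rate": [0.1, 0.2]}
relationship_data = {"LoanBook": {"principal": [150.0, 60.0]}}

price = lookup("Producer.price", role_data, economy_data, relationship_data)
unemployment = lookup("Economy.unemployment_rate", role_data, economy_data, relationship_data)
principal = lookup("LoanBook.principal", role_data, economy_data, relationship_data)

print(unemployment)  # [0.1, 0.2]
```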
Exporting Data#
Export collected data to pandas DataFrames or files for external analysis:
# Export all collected data to a single DataFrame
df = results.to_dataframe()
# Save to various formats (requires pandas)
df.to_csv("results.csv")
df.to_parquet("results.parquet")  # also needs a Parquet engine (pyarrow or fastparquet)
# Export individual roles
prod_df = results.get_role_data("Producer")
prod_df.to_csv("producer_data.csv")
Tip
For long simulations or parameter sweeps, saving to Parquet format is recommended: it is compressed, fast to read, and preserves column types.
See also
See the examples for more data collection patterns.