Every Run record on the platform stores enough state to replay the solve a year later — same Σ matrix, same constraint set, same lot history, same wash-sale lock vector, same solver, same seed. The replay reproduces the same trades exactly, modulo floating-point determinism. This isn't a nice-to-have; it's the property that makes audit, regression-testing, and cross-solver comparison cheap. This post is about what gets stored, what we deliberately don't store, and why the storage cost is a bargain.
## Five fields, two binary blobs
A Run row carries the standard metadata (account_id, run_seq, as_of_date, solver, status, timing). The reproducibility-relevant fields are these:
| Field | Type | What it stores |
|---|---|---|
| snapshot | JSONB | Universe, lot history, wash-sale lock vector, marginal rates, today's prices, borrow curve |
| sigma_blob | Binary (LZ4-compressed) | Covariance matrix Σ as a flat float32 array; size N × N |
| loadings_blob | Binary (LZ4-compressed) | Factor loadings matrix B; size N × 6 |
| params_snapshot | JSONB | Strategy params after override merge; the full set the optimizer actually used |
| result | JSONB | Target weights, realised P&L, tax accruals, solver status, iter count |
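The blob round-trip is the only non-obvious part of the table. A minimal sketch of how Σ could be packed and unpacked, assuming NumPy for the array; `zlib` stands in for LZ4 here so the sketch runs without the third-party `lz4` package — the platform's actual codec is LZ4, per the table above:

```python
import zlib

import numpy as np


def pack_sigma(sigma: np.ndarray) -> bytes:
    """Flatten Σ to row-major float32 bytes, then compress."""
    flat = np.ascontiguousarray(sigma, dtype=np.float32)
    return zlib.compress(flat.tobytes())


def unpack_sigma(blob: bytes, n: int) -> np.ndarray:
    """Decompress and reshape back to N × N float32."""
    flat = np.frombuffer(zlib.decompress(blob), dtype=np.float32)
    return flat.reshape(n, n)


rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4)).astype(np.float32)
sigma = a @ a.T                              # symmetric test matrix
roundtrip = unpack_sigma(pack_sigma(sigma), 4)
assert np.array_equal(sigma, roundtrip)      # float32 round-trip is exact
```

Because the bytes are stored and recovered verbatim, the replayed solve sees bit-identical Σ — no re-estimation, no codec-induced drift.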
## One method, deterministic if the inputs are
```python
def replay(run_id: int, *, verify: bool = True) -> ReplayResult:
    run = repo.get_run(run_id)
    # Rehydrate the pinned inputs: universe, lots, locks, prices, Σ, B.
    snap = Snapshot.deserialise(run.snapshot, sigma_blob=run.sigma_blob,
                                loadings_blob=run.loadings_blob)
    params = run.params_snapshot
    # Same solver back-end the original used; same tolerance.
    solver = solver_factory(run.solver, tolerance=run.solver_tolerance)
    result = solver.solve(snap, params)
    if verify:
        assert_weights_close(result.target, run.result["target"], atol=1e-8)
        assert_trades_close(result.trades, run.result["trades"])
    return ReplayResult(snap=snap, params=params, result=result)
```

## What we don't store, and why the omissions are correct
- The full price history. The snapshot stores today's prices for the universe. Σ is built from the trailing 504-day return panel, which we recompute from the Securities table on demand. Storing the panel on every Run would be wasteful and (worse) would duplicate the source of truth.
- The benchmark weights as numbers. The snapshot stores the benchmark_id and rebalance date; the weights are recomputed from the Benchmark table on replay. The Benchmark methodology archive is the source of truth for point-in-time membership.
- The intermediate solver state. We store the solver name, tolerance, status, and iteration count — but not the dual variables or the intermediate iterates. They're recoverable from a replay if needed and would dominate the storage budget if persisted.
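Recomputing the panel-derived Σ on demand is cheap. A minimal sketch, assuming a T × N daily-return array pulled from the Securities table; the 504-day window is from the text, but the plain sample-covariance estimator here is illustrative — the platform's actual risk model also involves the factor loadings B:

```python
import numpy as np


def trailing_cov(returns: np.ndarray, window: int = 504) -> np.ndarray:
    """Sample covariance over the trailing `window` rows of a T × N
    daily-return panel, cast to float32 like sigma_blob."""
    panel = returns[-window:]
    return np.cov(panel, rowvar=False).astype(np.float32)


rng = np.random.default_rng(1)
panel = rng.standard_normal((600, 5)) * 0.01  # 600 days, 5 names
sigma = trailing_cov(panel)                   # uses the last 504 rows
assert sigma.shape == (5, 5)
```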
## Three questions the snapshot answers, every time
The audit-grade properties of the system fall out of the snapshot for free:
- Why was this trade made? Re-solve from the snapshot; the trade ticket and the lot identification are reproduced exactly. The audit doesn't require re-deriving anything.
- What constraints applied? params_snapshot is the full set the optimizer used — including any overrides the account had on that date.
- What was the risk model that day? sigma_blob and loadings_blob are the actual matrices, not a reference to a model that may have been retrained.
The audit story is "click Replay; the system reproduces the decision." Not "open a notebook and try to remember what the risk model looked like in 2024."
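The "full set after override merge" point is worth making concrete. A sketch of what params_snapshot captures, assuming a shallow dict merge — the real merge semantics live in the platform, not in this illustration, and the parameter names here are hypothetical:

```python
def merge_params(strategy_defaults: dict, account_overrides: dict) -> dict:
    """Defaults with per-account overrides applied on top; the merged
    result, not the two inputs, is what params_snapshot persists."""
    return {**strategy_defaults, **account_overrides}


defaults = {"te_limit": 0.02, "max_names": 150, "tax_rate_st": 0.37}
overrides = {"te_limit": 0.01}        # this account tightened its TE band
merged = merge_params(defaults, overrides)
assert merged["te_limit"] == 0.01     # override wins
assert merged["max_names"] == 150     # default survives
```

Persisting the merged set means a replay never has to reconstruct which overrides were live on a given date — the question is answered by the row itself.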
## How the cuOPT vs CVXPY benchmark uses replay
The cuOPT vs CVXPY benchmark is built on the replay machinery. We pulled 252 trading days of Runs from the synthetic benchmark fleet, replayed each one through both solvers, and compared the results trade-by-trade. Because the snapshot already pins the inputs, the comparison is apples-to-apples by construction. No re-solving with stale Σ; no off-by-one corporate-action mismatch; no surprise "the cuOPT trades look different because the universe was slightly different on that day."
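One way the trade-by-trade check could be phrased once both replays are in hand. The helper names and the 1e-6 "minimum tradable increment" threshold are assumptions for illustration, not the benchmark's actual tolerances:

```python
import numpy as np


def max_weight_gap(w_a, w_b) -> float:
    """Largest per-name target-weight difference between two solvers'
    replays of the same pinned snapshot."""
    return float(np.max(np.abs(np.asarray(w_a) - np.asarray(w_b))))


def trades_match(w_a, w_b, lot_eps: float = 1e-6) -> bool:
    """Two solves yield the same trades if no weight differs by more
    than the smallest increment that would change a ticket."""
    return max_weight_gap(w_a, w_b) < lot_eps


w_cvxpy = [0.10, 0.25, 0.65]
w_cuopt = [0.10 + 1e-11, 0.25, 0.65 - 1e-11]  # last-decimal drift only
assert trades_match(w_cvxpy, w_cuopt)
```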
## What the snapshot doesn't pin
- Floating-point determinism across hardware. cuOPT on an L4 GPU and cuOPT on a different GPU model can diverge in the last few decimal places of the dual. The result.target differences are typically < 1e-10 and don't change trades, but a strict bit-equivalent replay across hardware is not guaranteed.
- External feed corrections. If a corporate action correction comes in for a date in the past, the prices in the Securities table change. A replay then catches the drift — the trades replay differently, by design. The original Run's persisted trades remain authoritative for audit.
- Source: packages/portfolios/taxview_portfolios/db/models/run.py — Run model with sigma_blob and loadings_blob columns. Replay logic in replay.py.
Engineering note · reproducibility falls out of pinning the inputs, not the outputs. Save the questions, not the answers.