Twenty accounts get rebalanced every morning before market open. Each one runs a constrained convex optimization with its own constraint set, its own lot history, and its own wash-sale lock vector. The DailyRunner is the orchestration layer between the API surface and the optimizer — the thing that takes "rebalance the fleet" as a verb and turns it into a sequence of idempotent, account-scoped solves with audit trail. This post is about how it's wired.
For each account in the fleet, build a snapshot, solve, persist
The pseudocode is roughly twenty lines. The interesting parts are what it doesn't have to do — the runner is deliberately thin because the strategy module already speaks snapshot-in/trades-out.

    def run_account(self, account: Account, as_of: date) -> Run:
        # Kill switch is read at the start of every account solve.
        if self.flag_frozen():
            return self._noop_run(account, as_of, reason="kill-switch")
        snap = self._build_snapshot(account, as_of)       # cov, lots, prices,
                                                          # wash-sale lock vec,
                                                          # marginal rates
        params = self._merge_overrides(snap, account.tags)
        result = self.solver.solve(snap, params)          # cuOPT or CLARABEL
        trades = self._lot_identify(result.target, snap)  # HIFO/ACB/FIFO
        run_seq = self._next_account_seq(account.id)      # per-account
        return self.repo.persist_run(
            account_id=account.id,
            run_seq=run_seq,
            snapshot=snap,
            params_snapshot=params,
            result=result,
            trades=trades,
            as_of=as_of,
        )

Idempotency, account-scoped sequencing, kill-switch
| Property | How it's enforced | Why it matters |
|---|---|---|
| Idempotent per (account, as_of) | Unique constraint on (account_id, as_of_date) in Postgres | Re-running the morning solve from a cron retry doesn't double-trade |
| Account-scoped run_seq | next_seq = max(run_seq for account_id) + 1, inside the same tx | URLs and UI use RUN-001, RUN-002 per account — not the global PK |
| Inputs frozen on Run record | snapshot, params_snapshot, solver, solver_status persisted | Replays are deterministic; audit doesn't require re-deriving |
| Single kill switch | RUNNER_FLAG_FROZEN env var; no-ops every account solve that starts while it is set | Emergency freeze if a data feed is bad: takes minutes, not a deploy |
Why we keep two run identifiers
The autoincrement primary key on the Run table jumps around globally — for twenty accounts that all run at 5am, the run_id values for Account 1 might be 10001, 10021, 10041 across three days. That's surprising in URLs and confusing in UI. So every Run also carries a per-account run_seq starting at 1 for each account. URLs reference the seq; the database joins use the PK.
The seq is allocated atomically inside the same transaction that inserts the Run, with a unique constraint on (account_id, run_seq). If two cron retries race, one wins, the other gets a constraint violation and bails — exactly what idempotency requires.
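The allocation described above can be sketched with the standard-library `sqlite3` module standing in for Postgres. This is a minimal illustration, not the runner's actual repository code: the `run` table layout, the `persist_run` signature, and the `None` return on conflict are all assumptions made for the example. The sequence number is computed inside the same `INSERT` statement, and the two unique constraints reject a retry for the same day.

```python
import sqlite3
from typing import Optional

# Hypothetical schema: column names follow the post, the rest is assumed.
SCHEMA = """
CREATE TABLE run (
    run_id     INTEGER PRIMARY KEY AUTOINCREMENT,  -- global PK, used for joins
    account_id INTEGER NOT NULL,
    run_seq    INTEGER NOT NULL,                   -- per-account, used in URLs
    as_of_date TEXT    NOT NULL,
    UNIQUE (account_id, as_of_date),               -- idempotency per account and day
    UNIQUE (account_id, run_seq)                   -- no duplicate seq per account
);
"""


def persist_run(conn: sqlite3.Connection, account_id: int, as_of_date: str) -> Optional[int]:
    """Insert a Run and return its run_seq, or None if a constraint rejects it."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(
                """
                INSERT INTO run (account_id, run_seq, as_of_date)
                SELECT ?, COALESCE(MAX(run_seq), 0) + 1, ?
                FROM run WHERE account_id = ?
                """,
                (account_id, as_of_date, account_id),
            )
    except sqlite3.IntegrityError:
        return None  # a retry for the same (account, as_of) bails here
    row = conn.execute(
        "SELECT run_seq FROM run WHERE account_id = ? AND as_of_date = ?",
        (account_id, as_of_date),
    ).fetchone()
    return row[0]
```

`COALESCE` covers the first run of an account, where `MAX(run_seq)` is NULL. SQLite serializes writers, so this sketch cannot reproduce a true concurrent race; in Postgres, two racing transactions could compute the same `MAX + 1`, and the unique constraint is what makes the loser fail.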
The user-visible run number is per-account because that's how users see the world: "show me Run 47 of Account A," not "show me global Run 102,419."
Why the kill switch is a boolean, not a queue drain
When something is wrong with the data feed — yfinance returning stale prices, the risk model failing to reload, a borrow curve that didn't update — the right answer is usually "freeze everything, page someone, fix the underlying data, re-run." The flag is an environment variable read at the start of every account solve. Setting it doesn't roll back in-flight work, but no new accounts will solve until it's flipped back. The existing day's trades stay in the persisted state; the next day picks up from there.
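A minimal sketch of that behavior, assuming the flag is considered set when the variable holds a truthy string (the post does not specify the encoding). `run_fleet`, the `solve` callable, and the outcome strings are hypothetical names for illustration. The flag is re-read before each account, so a solve that has already started finishes, and every later account becomes a no-op.

```python
import os
from typing import Callable, Dict, Iterable, Mapping

FLAG_NAME = "RUNNER_FLAG_FROZEN"     # env var named in the post
TRUTHY = {"1", "true", "yes", "on"}  # assumed encoding of "set"


def flag_frozen(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when the kill switch holds a truthy value."""
    return env.get(FLAG_NAME, "").strip().lower() in TRUTHY


def run_fleet(
    account_ids: Iterable[int],
    solve: Callable[[int], str],
    env: Mapping[str, str] = os.environ,
) -> Dict[int, str]:
    """Solve accounts in order, re-reading the flag before each one."""
    outcomes: Dict[int, str] = {}
    for account_id in account_ids:
        if flag_frozen(env):
            # Nothing is rolled back; this account is simply skipped.
            outcomes[account_id] = "noop:kill-switch"
            continue
        outcomes[account_id] = solve(account_id)
    return outcomes
```

Passing `env` explicitly keeps the check testable without touching the real process environment.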
What goes into the per-account input bundle
The snapshot is the deterministic input to the solve — every non-strategy variable the optimizer sees. It's persisted with the Run so a replay reproduces the same trades.
| Field | Source | Typical size |
|---|---|---|
| Universe | Benchmark.constituents (point-in-time) | 100 to 1,000 names |
| Prices · today | Securities.daily_prices | N rows |
| Σ matrix | RiskModel cache, refreshed nightly | N × N floats |
| Factor loadings B | Same risk model | N × 6 floats |
| Lot history | Account.lots | ≈ 4 × N rows |
| Wash-sale lock vector | Account.recent_trades, last 30 days | Boolean N-vector |
| Marginal rates | Account.tax_settings | {st, lt, niit} |
| Borrow curve (L/S only) | Broker feed or default | N rates |
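As one example of how a snapshot field could be derived, here is a sketch of the wash-sale lock vector. The post gives only its source (recent trades, last 30 days) and shape (Boolean N-vector); the rule used here, that a name is locked when the account sold it at a loss inside the window, is an assumption, as are the `Trade` type and the function name.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Sequence


@dataclass(frozen=True)
class Trade:
    """Hypothetical record of a closed sale."""
    symbol: str
    trade_date: date
    realized_pnl: float  # negative means the sale realized a loss


def wash_sale_lock_vector(
    universe: Sequence[str],
    recent_trades: Sequence[Trade],
    as_of: date,
    window_days: int = 30,
) -> List[bool]:
    """Return one flag per universe name, aligned with the universe order."""
    cutoff = as_of - timedelta(days=window_days)
    locked = {
        t.symbol
        for t in recent_trades
        if t.realized_pnl < 0 and cutoff <= t.trade_date <= as_of
    }
    return [symbol in locked for symbol in universe]
```

Because the vector is computed from `as_of` and persisted trades only, rebuilding it for a past date gives the same answer, which is the property the snapshot relies on.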
Above the optimizer, below the API
The runner depends on packages/portfolios (for the snapshot builders, account services, and run repository) and packages/optimizer (for the solver). It's depended on by services/api (the optimize router calls into DailyRunner.run_account for one-off rebalances) and by the standalone CLI (`taxview-runner run-daily`) that the nightly cron triggers.
The intentional shape: every API request that triggers a rebalance goes through the same runner code path that the nightly cron uses. There's no second copy of the orchestration logic for "interactive" vs "scheduled" — both are the same function call. That's the property that makes audit trail trivial.
Solver non-optimal, missing prices, stale risk model
- Solver non-optimal. Both cuOPT and CLARABEL occasionally return a non-OPTIMAL status (numerical trouble or a near-degenerate problem). The runner retries once with CLARABEL at tighter tolerance; if that also fails, it persists the previous day's weights with status = NON_OPTIMAL and emits an alert. The alert is scoped to the account, so one failing account doesn't page on-call as a fleet-wide incident.
- Missing prices. A name with no price for as_of (vendor downtime) is treated as held-at-prior-price for the solve, with a flag on the Run record. The next day's price refresh resolves the gap.
- Stale risk model. The Σ matrix is loaded once at API startup (and refreshed nightly via a separate process). If the loaded matrix is older than 36 hours, the runner refuses to solve and emits a STALE_RISK alert. This is a deliberate choice: a stale Σ can produce silently wrong tracking-error bounds.
- Source code: packages/runner/taxview_runner/. The CLI entry point is taxview-runner; see the runner's pyproject.toml for the script reference.
- Related: "Reproducibility by snapshot" for how the Run record stores everything needed to replay a solve, and "Daily, weekly, or monthly" for the cadence-cost analysis that motivated the daily default.
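The solver non-optimal path from the failure-mode list above can be sketched as a three-step ladder. Everything named here is hypothetical: `SolveResult`, `solve_with_fallback`, the zero-argument solver callables, and the alert message format are stand-ins for illustration, not the runner's real interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Weights = Dict[str, float]


@dataclass(frozen=True)
class SolveResult:
    status: str      # "OPTIMAL" or a non-optimal status string
    weights: Weights
    solver: str      # which step of the ladder produced the weights


def solve_with_fallback(
    primary: Callable[[], SolveResult],
    clarabel_tight: Callable[[], SolveResult],
    previous_weights: Weights,
    alert: Callable[[str], None],
    account_id: str,
) -> SolveResult:
    """Try the primary solve, retry once, then fall back to prior weights."""
    result = primary()
    if result.status == "OPTIMAL":
        return result
    # Single retry with CLARABEL at tighter tolerance.
    retry = clarabel_tight()
    if retry.status == "OPTIMAL":
        return retry
    # Last resort: keep yesterday's weights and raise an account-level alert.
    alert(f"NON_OPTIMAL account={account_id}")
    return SolveResult("NON_OPTIMAL", dict(previous_weights), "fallback:previous-day")
```

Taking the solvers as callables keeps the ladder independent of how each solver is configured, so the retry policy can be tested without a real optimizer.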
Engineering note · the runner is intentionally thin. If something can be expressed as a snapshot field or a strategy parameter rather than runner logic, it should be.