The DailyRunner: orchestration, idempotency, and the kill switch

Twenty accounts get rebalanced every morning before market open. Each one runs a constrained convex optimization with its own constraint set, its own lot history, and its own wash-sale lock vector. The DailyRunner is the orchestration layer between the API surface and the optimizer: the thing that takes "rebalance the fleet" as a verb and turns it into a sequence of idempotent, account-scoped solves with an audit trail. This post is about how it's wired.

Module

packages/runner

Entry point

taxview-runner run-daily

Solves per account / day

Kill switch

1 boolean

What it does in one sentence

For each account in the fleet, build a snapshot, solve, persist

The pseudocode is roughly twenty lines. The interesting parts are what it doesn't have to do — the runner is deliberately thin because the strategy module already speaks snapshot-in/trades-out.

DailyRunner.run_account

def run_account(self, account: Account, as_of: date) -> Run:
    if self.flag_frozen():
        return self._noop_run(account, as_of, reason="kill-switch")

    snap = self._build_snapshot(account, as_of)        # cov, lots, prices,
                                                        # wash-sale lock vec,
                                                        # marginal rates
    params = self._merge_overrides(snap, account.tags)

    result = self.solver.solve(snap, params)            # cuOPT or CLARABEL
    trades = self._lot_identify(result.target, snap)    # HIFO/ACB/FIFO

    run_seq = self._next_account_seq(account.id)        # per-account
    return self.repo.persist_run(
        account_id=account.id,
        run_seq=run_seq,
        snapshot=snap,
        params_snapshot=params,
        result=result,
        trades=trades,
        as_of=as_of,
    )

Source: packages/runner/taxview_runner/daily.py — annotated. Real source has more error handling and structured logging; the shape is what's shown.

Four properties the runner enforces

Idempotency, account-scoped sequencing, frozen inputs, kill switch

Runner invariants

Property	How it's enforced	Why it matters
Idempotent per (account, as_of)	Unique constraint on (account_id, as_of_date) in Postgres	Re-running the morning solve from a cron retry doesn't double-trade
Account-scoped run_seq	next_seq = max(run_seq for account_id) + 1, inside the same tx	URLs and UI use RUN-001, RUN-002 per account — not the global PK
Inputs frozen on Run record	snapshot, params_snapshot, solver, solver_status persisted	Replays are deterministic; audit doesn't require re-deriving
Single kill switch	RUNNER_FLAG_FROZEN env var; no-ops every account that has not yet started its solve	Emergency freeze if a data feed is bad: it takes minutes, not a deploy

Per-account run_seq vs global run_id

Why we keep two

The autoincrement primary key on the Run table jumps around globally — for twenty accounts that all run at 5am, the run_id values for Account 1 might be 10001, 10021, 10041 across three days. That's surprising in URLs and confusing in UI. So every Run also carries a per-account run_seq starting at 1 for each account. URLs reference the seq; the database joins use the PK.

The seq is allocated atomically inside the same transaction that inserts the Run, with a unique constraint on (account_id, run_seq). If two cron retries race, one wins; the other hits a constraint violation, on (account_id, run_seq) or on (account_id, as_of_date), and bails, which is exactly what idempotency requires.

The user-visible run number is per-account because that's how users see the world: "show me Run 47 of Account A," not "show me global Run 102,419."
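
The two constraints can be sketched together in a few lines. This is a minimal illustration, not the runner's actual repository code: it uses the standard-library sqlite3 module as a stand-in for Postgres so it runs anywhere, and the table name `run`, the column layout, and the `persist_run` signature here are assumptions made for the example.

```python
import sqlite3
from datetime import date
from typing import Optional

# Hypothetical schema: a global autoincrement PK plus the two unique constraints.
SCHEMA = """
CREATE TABLE run (
    run_id     INTEGER PRIMARY KEY AUTOINCREMENT,  -- global PK, used for joins
    account_id INTEGER NOT NULL,
    run_seq    INTEGER NOT NULL,                   -- per-account, shown in URLs
    as_of_date TEXT    NOT NULL,
    UNIQUE (account_id, as_of_date),               -- one run per account per day
    UNIQUE (account_id, run_seq)                   -- no duplicate sequence numbers
);
"""


def persist_run(conn: sqlite3.Connection, account_id: int, as_of: date) -> Optional[int]:
    """Insert a run and return its per-account run_seq, or None if it already exists."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading MAX
            (next_seq,) = conn.execute(
                "SELECT COALESCE(MAX(run_seq), 0) + 1 FROM run WHERE account_id = ?",
                (account_id,),
            ).fetchone()
            conn.execute(
                "INSERT INTO run (account_id, run_seq, as_of_date) VALUES (?, ?, ?)",
                (account_id, next_seq, as_of.isoformat()),
            )
        return next_seq
    except sqlite3.IntegrityError:
        return None  # a retry or a racing writer lost; nothing was written


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```

Calling `persist_run(conn, 1, date(2024, 1, 2))` twice returns a sequence number the first time and `None` the second, while a different account on the same date starts its own sequence at 1. In Postgres the locking details differ, but the shape (read the max, insert, let the constraint arbitrate) is the same.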

The kill switch

Why it's a boolean, not a queue drain

When something is wrong with the data feed — yfinance returning stale prices, the risk model failing to reload, a borrow curve that didn't update — the right answer is usually "freeze everything, page someone, fix the underlying data, re-run." The flag is an environment variable read at the start of every account solve. Setting it doesn't roll back in-flight work, but no new accounts will solve until it's flipped back. The existing day's trades stay in the persisted state; the next day picks up from there.
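
A rough sketch of that behavior, under stated assumptions: the variable name RUNNER_FLAG_FROZEN comes from the invariants table, but the set of values treated as "frozen" and the `run_fleet` loop are hypothetical, chosen only to show that the flag is re-read before each account rather than once per fleet run.

```python
import os
from typing import Callable, Dict, Iterable, Mapping

# Assumption: which strings count as "frozen" is not specified in the post.
_TRUTHY = {"1", "true", "yes", "on"}


def flag_frozen(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when the kill switch is set in the given environment."""
    return environ.get("RUNNER_FLAG_FROZEN", "").strip().lower() in _TRUTHY


def run_fleet(
    account_ids: Iterable[int],
    solve: Callable[[int], str],
    environ: Mapping[str, str] = os.environ,
) -> Dict[int, str]:
    """Solve each account in turn, re-checking the flag before every solve."""
    outcomes: Dict[int, str] = {}
    for account_id in account_ids:
        if flag_frozen(environ):
            # Accounts not yet started become no-ops; finished ones are untouched.
            outcomes[account_id] = "noop:kill-switch"
            continue
        outcomes[account_id] = solve(account_id)
    return outcomes
```

If the flag flips while account 2 is solving, account 2 still completes and account 3 onward are recorded as no-ops, which matches the "doesn't roll back in-flight work" behavior described above.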

Snapshot construction

What goes into the per-account input bundle

The snapshot is the deterministic input to the solve — every non-strategy variable the optimizer sees. It's persisted with the Run so a replay reproduces the same trades.

Snapshot contents

Field	Source	Typical size
Universe	Benchmark.constituents (point-in-time)	100 — 1,000 names
Prices · today	Securities.daily_prices	N rows
Σ matrix	RiskModel cache, refreshed nightly	N × N floats
Factor loadings B	Same risk model	N × 6 floats
Lot history	Account.lots	≈ 4 × N rows
Wash-sale lock vector	Account.recent_trades, last 30 days	Boolean N-vector
Marginal rates	Account.tax_settings	{st, lt, niit}
Borrow curve (L/S only)	Broker feed or default	N rates

For more on what gets stored alongside each Run for replayability, see Reproducibility by snapshot.
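
As one concrete example of a snapshot field, here is a sketch of how the wash-sale lock vector might be derived. The table only says it is a Boolean N-vector built from the last 30 days of Account.recent_trades; the `Trade` shape, the rule "locked if the name was sold at a loss inside the window", and the inclusive window boundary are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Sequence


@dataclass(frozen=True)
class Trade:
    """Hypothetical record of one closed trade."""
    ticker: str
    trade_date: date
    realized_pnl: float  # negative means the sale realized a loss


def wash_sale_lock_vector(
    universe: Sequence[str],
    recent_trades: Sequence[Trade],
    as_of: date,
    window_days: int = 30,
) -> List[bool]:
    """Return one flag per universe name, aligned with the universe order."""
    cutoff = as_of - timedelta(days=window_days)
    locked = {
        t.ticker
        for t in recent_trades
        if cutoff <= t.trade_date <= as_of and t.realized_pnl < 0
    }
    return [ticker in locked for ticker in universe]
```

Because the vector depends only on the universe, the trade list, and as_of, persisting those inputs (or the vector itself) with the Run is enough to reproduce it on replay.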

Where the runner lives in the dependency graph

Above the optimizer, below the API

The runner depends on packages/portfolios (for the snapshot builders, account services, and run repository) and packages/optimizer (for the solver). Two things depend on it in turn: services/api (the optimize router calls into DailyRunner.run_account for one-off rebalances) and the standalone CLI (`taxview-runner run-daily`) that the nightly cron triggers.

The intentional shape: every API request that triggers a rebalance goes through the same runner code path that the nightly cron uses. There's no second copy of the orchestration logic for "interactive" vs "scheduled"; both are the same function call. That's the property that makes the audit trail trivial.
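
The single-code-path idea can be shown with a toy. Everything here is hypothetical: `RecordingRunner` stands in for DailyRunner and only records calls, and `run_daily` and `optimize_one` are illustrative names for the scheduled and interactive entry points, not the real CLI or router functions.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Iterable, List, Tuple


@dataclass
class RecordingRunner:
    """Hypothetical stand-in for DailyRunner that only records its calls."""
    calls: List[Tuple[str, date]] = field(default_factory=list)

    def run_account(self, account_id: str, as_of: date) -> str:
        self.calls.append((account_id, as_of))
        return f"run:{account_id}:{as_of.isoformat()}"


def run_daily(runner: RecordingRunner, account_ids: Iterable[str], as_of: date) -> List[str]:
    """Scheduled path: loop over the fleet and delegate each account."""
    return [runner.run_account(account_id, as_of) for account_id in account_ids]


def optimize_one(runner: RecordingRunner, account_id: str, as_of: date) -> str:
    """Interactive path: delegate a single account to the same method."""
    return runner.run_account(account_id, as_of)
```

Both entry points are thin wrappers, so anything `run_account` persists is identical regardless of who asked for the rebalance.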

Failure modes the runner handles explicitly

Solver non-optimal, missing prices, stale risk model

Solver non-optimal. Both cuOPT and CLARABEL occasionally return a non-OPTIMAL status (numerical trouble or a near-degenerate problem). The runner retries once with CLARABEL at tighter tolerance, then falls through to "persist previous-day weights with status = NON_OPTIMAL" and emits an alert. The alert is raised per account, so one bad solve does not page on-call as though the whole fleet had failed.
Missing prices. A name with no price for as_of (vendor downtime) is treated as held-at-prior-price for the solve, with a flag on the Run record. The next day's price refresh resolves the gap.
Stale risk model. The Σ matrix is loaded once at API startup (and refreshed nightly via a separate process). If the loaded matrix is older than 36 hours, the runner refuses to solve and emits a STALE_RISK alert. This is a deliberate choice: a stale Σ can produce silently wrong tracking-error bounds.
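
The three responses above can be sketched as small pure functions. This is an illustration under assumptions, not the runner's source: the function names, the status strings other than OPTIMAL and NON_OPTIMAL, and the representation of solvers as zero-argument callables are all hypothetical, and alert emission is left out.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional, Set, Tuple

Weights = Dict[str, float]
SolveFn = Callable[[], Tuple[str, Optional[Weights]]]

MAX_RISK_MODEL_AGE = timedelta(hours=36)  # threshold stated in the post


class StaleRiskModel(RuntimeError):
    """Raised instead of solving against an old covariance matrix."""


def require_fresh_risk_model(loaded_at: datetime, now: datetime) -> None:
    """Refuse to proceed when the loaded risk model is older than 36 hours."""
    age = now - loaded_at
    if age > MAX_RISK_MODEL_AGE:
        raise StaleRiskModel(f"STALE_RISK: risk model loaded {age} ago")


def carry_forward_prices(
    today: Dict[str, float], prior: Dict[str, float]
) -> Tuple[Dict[str, float], Set[str]]:
    """Fill names missing today with their prior price and report which were filled."""
    missing = set(prior) - set(today)
    filled = dict(today)
    for ticker in missing:
        filled[ticker] = prior[ticker]
    return filled, missing


def solve_with_fallback(
    primary: SolveFn, retry: SolveFn, previous: Weights
) -> Tuple[str, Weights]:
    """Try the primary solve, then one retry, then fall back to previous weights."""
    for attempt in (primary, retry):
        status, weights = attempt()
        if status == "OPTIMAL" and weights is not None:
            return "OPTIMAL", weights
    return "NON_OPTIMAL", previous
```

The set returned by `carry_forward_prices` is what would back the flag on the Run record, and the NON_OPTIMAL status is the point where the account-level alert would fire.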

Notes & references

Source code: packages/runner/taxview_runner/. The CLI entry point is taxview-runner; see the runner's pyproject.toml for the script reference.
Related: Reproducibility by snapshot for how the Run record stores everything needed to replay a solve, and Daily, weekly, or monthly for the cadence-cost analysis that motivated the daily default.

Engineering note · the runner is intentionally thin. If something can be expressed as a snapshot field or a strategy parameter rather than runner logic, it should be.