Building a backtest you can defend

Backtest results are easy to generate and easy to fool yourself with. The same strategy can show 200 bp of after-tax alpha or 20 bp depending on three choices: whether the panel uses today's index constituents or the constituents as of each rebalance date, whether dividends are added back into the price series or re-invested in the basis, and whether the optimizer can read a return that hasn't happened yet. This post is the hygiene checklist behind every backtest the platform publishes, and why each item is on it.

Items on the list: 9
Most common error: Survivorship
Most expensive error: Lookahead in Σ
Hardest to catch: Dividend timing

The nine items

What we check before publishing any backtest number

The hygiene checklist

#	Check	Why
1	Point-in-time index membership	Today's US large-cap (100) constituent set didn't exist in 2016. Use the index methodology archive.
2	No dropped delisted names	Names that delisted between rebalance and today belong in the universe on the rebalance date.
3	Σ excludes the rebalance-day return	Building Σ from prices through the rebalance day, then solving on that same day, leaks a return the optimizer could not have known at decision time.
4	Trades execute at next-bar prices	Solver runs on close-of-day data; trades execute the next morning. Don't fill at solve-day close.
5	Holiday calendar from the exchange	Half-day closes and country-specific holidays affect when a rebalance can actually clear.
6	Corporate actions applied at effective date	A 3:1 split effective on day t divides the lot's per-share basis by three on day t; the optimizer sees the split-adjusted price from that day on.
7	Dividends re-invested in basis	Cash dividends adjust the lot's basis (or get re-invested per a stated policy), not the price series.
8	Wash-sale lock vector built from real history	A backtest that ignores the wash-sale lookback overstates harvestable loss.
9	Marginal rates applied per-period correctly	Long-term boundary calculation must use the lot's actual acquisition date, not a calendar-quarter approximation.
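
Items 8 and 9 both come down to date arithmetic on individual lots. The sketch below is a minimal illustration, not the platform's implementation: the function names are hypothetical, and the rules encoded (a holding period of more than one year for long-term treatment, a 30-day wash-sale lookback) are assumed US-style conventions. Only the backward-looking half of the wash-sale window is shown.

```python
from datetime import date

def is_long_term(acquired: date, sold: date) -> bool:
    """True if the lot was held for more than one year (assumed rule).

    Uses the lot's actual acquisition date, not a quarter-end proxy.
    """
    try:
        anniversary = acquired.replace(year=acquired.year + 1)
    except ValueError:
        # Acquired on Feb 29: the next year has no Feb 29, so use Feb 28.
        anniversary = date(acquired.year + 1, 2, 28)
    return sold > anniversary

def wash_sale_locked(recent_buys, sale_date: date, lookback_days: int = 30) -> bool:
    """True if any other purchase of the name falls inside the lookback.

    `recent_buys` holds purchase dates taken from the real trade history,
    excluding the lot being sold. `lookback_days=30` is an assumption.
    """
    return any(0 <= (sale_date - b).days <= lookback_days for b in recent_buys)
```

A lot bought on 2023-01-15 is short-term if sold on 2024-01-15 and long-term one day later; a quarter-end approximation would misclassify sales made between the true anniversary and the quarter boundary.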

The deep ones

Three errors that look small and aren't

Survivorship. The most cited example: a 2010-start backtest of "the US large-cap (100) universe" using its 2026 constituents will overstate returns by 100–200 bp/yr because the names that fell out of the index in the intervening years are systematically the underperformers. Easy to spot when you look for it; easy to miss when you don't.
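
One way to make point-in-time membership hard to get wrong is to make the rebalance date a required argument of every universe lookup. The following is a minimal sketch under stated assumptions: the snapshot layout, the function name `members_as_of`, and the tickers are all made up for illustration, and a real archive would come from the index provider's methodology documents.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical membership snapshots: (effective_date, constituents),
# sorted by effective date. Tickers are placeholders, not real names.
SNAPSHOTS = [
    (date(2015, 12, 21), frozenset({"AAA", "BBB", "CCC"})),
    (date(2016, 12, 19), frozenset({"AAA", "BBB", "DDD"})),
    (date(2017, 12, 18), frozenset({"AAA", "DDD", "EEE"})),
]

def members_as_of(snapshots, rebalance_date):
    """Return the constituent set in force on `rebalance_date`."""
    effective = [d for d, _ in snapshots]
    # Index of the first snapshot strictly after the rebalance date.
    i = bisect_right(effective, rebalance_date)
    if i == 0:
        raise ValueError("no membership snapshot on or before rebalance date")
    return snapshots[i - 1][1]
```

With this data, a mid-2016 rebalance still sees `CCC`, which later left the index; taking the last snapshot for every date would silently drop it.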

Lookahead in the risk model. Σ is an estimate of forward covariance from a backward window. If the window includes the day you're solving on, the optimizer sees a Σ informed by returns it shouldn't yet know. The usual symptom is suspiciously low realised tracking error (TE) in backtest: the optimizer is "hugging" the index in part because it can read the future. The fix is a one-day shift; we use the trailing 504-day panel ending the previous business day.
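
The one-day shift is easiest to enforce where the estimation window is selected. This sketch shows only that selection step, assuming a sorted list of trading days; the function name `sigma_window` is hypothetical, and the 504-day default is taken from the text above.

```python
from bisect import bisect_left
from datetime import date

def sigma_window(trading_days, solve_date, window=504):
    """Trading days whose returns may feed Σ for a solve on `solve_date`.

    `trading_days` must be sorted ascending. The returned window ends on
    the last trading day strictly before `solve_date`.
    """
    # bisect_left gives the index of the first day >= solve_date, so the
    # slice below can never include the solve day itself.
    end = bisect_left(trading_days, solve_date)
    if end < window:
        raise ValueError("not enough history before solve date")
    return trading_days[end - window:end]
```

Because the slice stops before the solve day by construction, a caller cannot reintroduce the leak by passing a panel that already contains the solve-day return.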

Dividend timing. An adjusted-price series rolls dividends into the price retroactively, so the holder always looks like they re-invested at the moment of ex-div. Real holders receive cash on the pay date, which is usually 2–4 weeks after ex-div. The optimizer's TE numbers change by a few basis points if you model the lag correctly. We do.
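
The gap between the two conventions can be seen in a small cash ledger. This is a sketch under simplifying assumptions: the `Dividend` record and `dividend_cash` function are hypothetical names, the share count is held constant, and entitlement rules around the ex-date are not modelled.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Dividend:
    ex_date: date
    pay_date: date
    per_share: float

def dividend_cash(dividends, shares, as_of, convention="pay"):
    """Dividend cash booked by `as_of` under a timing convention.

    convention="ex"  books cash on the ex-date (what an adjusted-price
                     series implicitly assumes).
    convention="pay" books cash on the pay date (what a holder receives).
    """
    if convention not in ("ex", "pay"):
        raise ValueError("convention must be 'ex' or 'pay'")
    total = 0.0
    for d in dividends:
        booked_on = d.pay_date if convention == "pay" else d.ex_date
        if booked_on <= as_of:
            total += shares * d.per_share
    return total
```

For a dividend with an ex-date of 2024-03-08 and a pay date of 2024-03-28, a rebalance on 2024-03-15 has the cash available under the ex-date convention and not under the pay-date convention, so the two backtests would size trades differently on that day.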

A backtest that uses today's index constituents on a 2016 rebalance date is the canonical survivorship-bias mistake — and it's still the most common one in published material.

What 'illustrative' actually means on this site

The honest framing

Charts and tables on this site are flagged "illustrative" in oxblood mono when they use synthetic numbers — usually because the real benchmark suite hasn't landed yet, or because the production risk model isn't open. The hygiene checklist applies to both: synthetic numbers have to be plausibly consistent with what the real run would produce, not optimised for the post.

Where this checklist runs

Pre-publication, every figure

Internally, every chart that lands on a marketing page is produced by either (a) the production runner against pinned snapshots, or (b) a synthetic data script that's checked in and version-controlled. There's no notebook-on-someone's-laptop path. Numbers that change on republish change because the underlying snapshot or generator changed, and the diff is visible.

For the snapshot mechanism that pins production runs, see Reproducibility by snapshot. For the risk-model construction that feeds Σ, see Risk-model construction.

Notes & references

Lo (2002). The Statistics of Sharpe Ratios. Financial Analysts Journal 58(4). The point-in-time discipline applies as much to performance attribution as to factor construction.
Index provider methodology documents — the published index-methodology archives are the canonical source for point-in-time membership.

Methodology note · the test of an honest backtest is whether the result holds when you fix one bias. Most don't.

