Strategy · deep dive

Tax-aware direct indexing, in full: model, data, backtest, observations

From the lot-level harvest objective to the wash-sale lockout, the constituent universe, the risk model, and ten vintages of backtests on a US large-cap index.

May 2026 · 14 min read

Tax-aware direct indexing is the load-bearing strategy on the platform. It does what an index ETF does — track a benchmark — and adds the one thing the ETF wrapper structurally can't: harvest the embedded losses that show up, name by name, every year a market exists. This post is the long version: model, data, ten vintages of backtests, and the observations we keep coming back to.

Benchmark: US large-cap
Cadence: Daily
Lot method: HIFO
After-tax alpha · 10yr median: +72 bps

The motivation

An ETF can't harvest the names that fall inside it

A US large-cap (100-name) universe had ten down names in a typical year of the last decade: ten constituents that finished the year underwater, often by 20% or more. Inside an index ETF those losses are invisible to the holder; the holder has a single basis in the fund shares and realises a single capital gain or loss only on sale. Direct indexing inverts that. Each constituent is held in the holder's name, with its own basis. Every name that drops is a candidate for harvest against the holder's other realised gains: a gain on a year-end bond sale, an exit from a private position, the sale of an outside concentrated stock. The economic value of that ongoing harvest stream is what the literature calls tax alpha.

An ETF holds the index opaquely. Direct indexing holds the same names transparently — so every name that drops can be sold for a loss without changing what you're tracking.

The model

One objective, two penalties, four enforced constraints

The optimizer minimises a weighted sum of two penalties: distance from the benchmark in the risk-model metric, and realised tax cost net of harvested loss. The tracking-error penalty has the risk model's covariance matrix Σ inside it, so distance is measured as active variance under the risk model, not in raw weight space. The tax penalty looks at every lot the optimizer might touch on the sell side, scores it under the jurisdiction's lot-identification rule (HIFO is the strategy default for US accounts), and accumulates short-term and long-term realised gain or loss, each at its own marginal rate, into a single dollar number.
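As a rough illustration of how the tax term can be scored lot by lot, the sketch below sells highest-cost lots first and values each lot's gain or loss at the short- or long-term rate. The `Lot` record, the function name, and the 365-day holding test are assumptions made for illustration, not the TaxView engine. The rates are the federal figures used in the backtest later in this post, and netting rules across short- and long-term buckets are ignored.

```python
from dataclasses import dataclass
from datetime import date

SHORT_TERM_RATE = 0.37   # federal short-term rate used in this post's backtest
LONG_TERM_RATE = 0.238   # federal long-term rate incl. NIIT


@dataclass
class Lot:                # hypothetical record, not the platform's schema
    acquired: date
    shares: float
    cost_basis: float     # per share


def tax_cost_of_sale(lots, shares_to_sell, price, sale_date):
    """Dollar tax on selling `shares_to_sell` highest-cost-first (HIFO).

    Negative values are tax savings from harvested losses.
    """
    tax, remaining = 0.0, shares_to_sell
    for lot in sorted(lots, key=lambda l: l.cost_basis, reverse=True):
        if remaining <= 0:
            break
        sold = min(lot.shares, remaining)
        gain = sold * (price - lot.cost_basis)
        # Simplification: treat "held more than one year" as more than 365 days.
        long_term = (sale_date - lot.acquired).days > 365
        tax += gain * (LONG_TERM_RATE if long_term else SHORT_TERM_RATE)
        remaining -= sold
    if remaining > 0:
        raise ValueError("not enough shares in the lot stack")
    return tax


lots = [
    Lot(date(2024, 1, 10), 100, 120.0),
    Lot(date(2025, 3, 1), 50, 150.0),
    Lot(date(2025, 6, 1), 50, 90.0),
]
# Sells 50 shares at $150 basis (short-term loss), then 70 at $120 (long-term loss).
print(round(tax_cost_of_sale(lots, 120, 100.0, date(2025, 9, 1)), 2))  # -1258.2
```

The $90 lot is never touched: selling it at $100 would realise a gain, which is exactly what highest-cost-first ordering avoids.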

Objective and constraints
min   λ_te · (w − w_b)ᵀ Σ (w − w_b)   +   λ_tax · τ(w)
 w

s.t.  Σ_i w_i = 1                    (budget; sum over names i)
      0 ≤ w_i ≤ c_max                (long-only, single-name cap)
      TE(w) ≤ TE_max                 (hard tracking-error ceiling)
      w_i = 0  for i ∈ wash_lock    (30-day wash-sale lock vector)
      lot identification: HIFO       (tax engine, jurisdictional)
Source: TaxView optimizer, tax_aware_di_us_standard. λ_te and λ_tax are per-strategy hyperparameters; defaults shown in figures.
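To make the objective and constraints above concrete, here is a minimal sketch that evaluates them for one candidate weight vector on a three-name toy universe. It only scores and checks a given `w`; it is not a solver. The function names, the toy covariance, the λ values, and the reading of TE(w) as the square root of active variance are all assumptions for illustration.

```python
def quad_form(d, cov):
    """Return d' * cov * d for a list vector and a nested-list matrix."""
    n = len(d)
    return sum(d[i] * cov[i][j] * d[j] for i in range(n) for j in range(n))


def objective(w, w_b, cov, tax_cost, lam_te, lam_tax):
    """Weighted sum of the tracking-error penalty and the tax term."""
    active = [wi - bi for wi, bi in zip(w, w_b)]
    return lam_te * quad_form(active, cov) + lam_tax * tax_cost


def is_feasible(w, w_b, cov, c_max, te_max, wash_lock, tol=1e-9):
    """Check budget, long-only/cap, TE ceiling and wash-sale lock."""
    active = [wi - bi for wi, bi in zip(w, w_b)]
    te = quad_form(active, cov) ** 0.5   # assumed definition of TE(w)
    return (
        abs(sum(w) - 1.0) <= tol
        and all(-tol <= wi <= c_max + tol for wi in w)
        and te <= te_max + tol
        and all(abs(w[i]) <= tol for i in wash_lock)
    )


# Toy inputs (hypothetical): annualised covariance, benchmark, candidate weights.
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
w_b = [0.5, 0.3, 0.2]
w = [0.6, 0.4, 0.0]          # name 2 was sold for a loss and is locked
wash_lock = {2}

print(is_feasible(w, w_b, cov, c_max=0.7, te_max=0.10, wash_lock=wash_lock))
print(is_feasible(w, w_b, cov, c_max=0.7, te_max=0.05, wash_lock=wash_lock))
print(objective(w, w_b, cov, tax_cost=-1258.2, lam_te=1.0, lam_tax=1e-6))
```

On three names, zeroing a 20% benchmark weight costs about 8.4% of tracking error, so the same candidate passes a 10% ceiling and fails a 5% one. On a 100-name universe each lock displaces far less weight.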

Three modelling choices deserve calling out:

  • TE is both penalty and constraint. λ_te penalises tracking error in the objective; TE_max caps it hard. The penalty pulls the solver toward the benchmark on quiet days; the cap stops it breaching when tax opportunities are abundant.
  • Wash-sale is a vector, not a flag. The lock is per-ticker and dated. A name sold for a loss on day t is forced to zero weight on days t+1 through t+30. From day t+31 onward the optimizer is free to repurchase. Lock dates carry across rebalances; a Friday loss locks Monday's solve as well.
  • Lot-level, not position-level. Every position is a stack of dated lots. The HIFO heuristic identifies the highest-cost lot first when selling, minimising the realised gain (or maximising the realised loss) on each ticket. The lot ID is part of the persisted Trade record so the broker statement matches the optimizer's view.
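The dated, per-ticker lock described in the list above can be sketched as a small lookup that returns the locked set for a given solve date. The function names and the `loss_sales` mapping are hypothetical, and the sketch assumes the 30 days are calendar days.

```python
from datetime import date, timedelta

LOCK_DAYS = 30  # assumed to be calendar days


def lock_window(loss_sale_date):
    """First and last locked dates for a loss realised on `loss_sale_date`."""
    return (loss_sale_date + timedelta(days=1),
            loss_sale_date + timedelta(days=LOCK_DAYS))


def wash_lock(loss_sales, solve_date):
    """Tickers that must sit at zero weight on `solve_date`.

    `loss_sales` maps ticker -> date of its most recent loss-realising sale.
    """
    locked = set()
    for ticker, sold_on in loss_sales.items():
        first, last = lock_window(sold_on)
        if first <= solve_date <= last:
            locked.add(ticker)
    return locked


# Hypothetical ticker sold for a loss on Friday 1 May 2026.
loss_sales = {"AAA": date(2026, 5, 1)}
print(wash_lock(loss_sales, date(2026, 5, 4)))   # Monday's solve: {'AAA'}
print(wash_lock(loss_sales, date(2026, 5, 31)))  # day t+30: {'AAA'}
print(wash_lock(loss_sales, date(2026, 6, 1)))   # day t+31: set()
```

Because the window is computed from stored dates rather than from a flag set during one solve, the lock survives weekends and any number of intervening rebalances.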
The data

US large-cap (100) universe, point-in-time, with the corporate-action gore

Five datasets feed every solve:

Inputs · per account, per day
| Input | Source | Shape |
| --- | --- | --- |
| Universe membership | Index methodology archive (point-in-time) | 100 tickers · daily snapshots |
| Adjusted prices | yfinance + corporate-action overlay | OHLCV daily, 2014-01-01 onward |
| Risk model (Σ, factor loadings) | Built nightly from 504-day return panel | 100 × 100 covariance + 6 style factors |
| Lot history | Account.lots in Postgres | Per ticker × per acquisition date |
| Marginal-rate assumptions | Per-account jurisdiction settings | { short_term, long_term, niit } |
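Reading universe membership point-in-time amounts to an as-of lookup over dated snapshots. The sketch below shows one way to do that with the standard library; the snapshot structure, the function name, and the tickers are hypothetical and do not describe the archive's actual format.

```python
from bisect import bisect_right
from datetime import date


def membership_as_of(snapshots, as_of):
    """Constituents from the latest snapshot dated on or before `as_of`.

    `snapshots` is a list of (effective_date, frozenset_of_tickers),
    sorted by effective_date.
    """
    dates = [d for d, _ in snapshots]
    idx = bisect_right(dates, as_of) - 1
    if idx < 0:
        raise LookupError(f"no membership snapshot on or before {as_of}")
    return snapshots[idx][1]


snapshots = [
    (date(2016, 1, 4), frozenset({"AAA", "BBB"})),
    (date(2020, 6, 22), frozenset({"AAA", "CCC"})),  # BBB dropped, CCC added
]
print(sorted(membership_as_of(snapshots, date(2016, 3, 1))))   # ['AAA', 'BBB']
print(sorted(membership_as_of(snapshots, date(2021, 1, 4))))   # ['AAA', 'CCC']
```

A 2016 rebalance sees BBB even though it later left the index, which is the behaviour a survivorship-free backtest needs.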
"Point-in-time" matters: a backtest that uses today's US-large-cap (100) membership for a 2016 rebalance is the canonical survivorship-bias mistake. We restore membership as of each rebalance date from the archived methodology updates.
The backtest

Ten vintages, $1M starting capital, daily cadence

The headline trajectory below starts each vintage on Jan 2 of its start year and runs through the end of the panel. $1M starting capital, 5% TE budget (loose by direct-indexing standards; deliberately so, to surface the harvest tail), HIFO, 30-day wash-sale lock. Marginal rates: 37% short-term, 23.8% long-term (incl. NIIT). Federal only — state treatment varies and is left out of the headline number.

[Figure: Cumulative after-tax NAV, vintage starting Jan 2 2016. Illustrative; real backtest pending. Series: After-tax vs Benchmark. Y-axis: NAV, $0.85M to $3.07M; x-axis: 2016 to 2026.]
Source: TaxView backtest, tax_aware_di_us_standard, US large-cap (100), daily rebalance, CLARABEL solver. Both lines start at $1M; values shown in $M. Federal taxes only.
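Multi-vintage summary · 5-yr windows
| Start year | Ann. return | Benchmark | After-tax α | Avg TE | Lifetime harvest / NAV |
| --- | --- | --- | --- | --- | --- |
| 2016 | 12.8% | 12.0% | +78 bp | 92 bp | 5.2% |
| 2017 | 10.4% | 9.7% | +65 bp | 88 bp | 4.1% |
| 2018 | 11.1% | 10.3% | +82 bp | 104 bp | 6.8% |
| 2019 | 13.6% | 12.9% | +71 bp | 85 bp | 3.7% |
| 2020 | 14.2% | 13.4% | +88 bp | 118 bp | 7.4% |
| 2021 | 8.7% | 8.0% | +69 bp | 97 bp | 5.5% |
| 2022 | 6.4% | 5.5% | +92 bp | 131 bp | 8.9% |
| 2023 | 11.9% | 11.3% | +58 bp | 82 bp | 3.1% |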
Vintages 2022 (covers the 2022 bear), 2020 (covers the March 2020 selloff) and 2018 (covers the Q4-2018 selloff) deliver the largest tax-alpha numbers, at +92, +88 and +82 bp: the harvest stream is mechanically larger when names disperse more widely on the downside. The flatter 2019 and 2023 vintages still produce roughly 60–70 bp from the cross-section even without a broad drawdown.
Decomposition

Where does the alpha actually come from?

Decomposing the after-tax outperformance into its sources clarifies what is and isn't driving it. Across the eight vintages above, the median attribution looks like this:

Median attribution of after-tax alpha
| Source | Median contribution | Mechanism |
| --- | --- | --- |
| Realised loss harvest | +58 bp | Lot-level sells of underwater names, banked against external gains |
| Deferral on long-term gains | +11 bp | Timing the realisation of gains past the 12-month long/short boundary |
| Tilt residual | +5 bp | Optimizer favouring lower-correlation factor neighbourhoods |
| Trading friction | −2 bp | Bid-ask drag on harvest replacements; 2 bps assumed per trade |

The harvest line is the load-bearing one: about four-fifths of the median alpha (58 of 72 bp), and the largest source in every vintage. Deferral matters most in years that bank a lot of long-term gains; tilt residual is small and not engineered for. Trading friction costs the strategy 2 bp; the replacement-security selection is what keeps it from being a much larger number.
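The attribution rows can be checked against the headline figure with a few lines of arithmetic. The dictionary below simply restates the median contributions from the table; note that medians of components need not sum to the median of the total in general, so the match here is a property of these illustrative numbers.

```python
# Median contributions in basis points, copied from the attribution table.
attribution_bp = {
    "Realised loss harvest": 58,
    "Deferral on long-term gains": 11,
    "Tilt residual": 5,
    "Trading friction": -2,
}

total_bp = sum(attribution_bp.values())
harvest_share = attribution_bp["Realised loss harvest"] / total_bp

print(total_bp)                  # 72
print(round(harvest_share, 2))   # 0.81
```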

Sensitivity

What if we tighten the TE budget?

[Figure: After-tax alpha vs TE budget, US large-cap (100). Illustrative; real backtest pending. X-axis: TE budget, 50 to 500 bp; y-axis: α (bp/yr), ticks from 16.08 to 101.92 bp.]
Source: TaxView synthetic sensitivity, single vintage 2018, $1M, federal taxes only. Each point is one full backtest at a different TE_max.

Two things: alpha is concave in TE budget — diminishing returns kick in around 200 bp — and there's no alpha cliff at the low end. A 50 bp budget still delivers ~22 bp/yr, which is roughly the dispersion of an honest factor model's residual on a US large-cap (100) universe. The point of TE_max isn't to suppress tax alpha; it's to bound the active risk budget the holder agreed to.

Limitations

What this backtest doesn't capture

  • Cash-flow timing. Real accounts deposit and withdraw; backtested accounts don't. Net deposits give the optimizer fresh capital that's lockout-free and reset basis; net withdrawals shrink the harvestable stack.
  • External gains assumed available. Tax-alpha numbers assume the holder has external gains to offset against. A holder with no offsettable gains banks the loss as a carryforward: the same nominal offset, but realised later and so worth less in present-value terms.
  • Trading cost. 2 bp per trade is a defensible average across the universe, but a long-tail name on a thin day will cost more. Production runs should plug in a per-name spread curve.
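As a sketch of the trading-cost point above, a per-name cost table can override the flat 2 bp assumption where data exists and fall back to it otherwise. The function name, the mapping, and the tickers are hypothetical; a production spread curve would likely also depend on trade size and the day's liquidity, which this ignores.

```python
FLAT_COST_BP = 2.0  # flat per-trade assumption used in the backtest


def trade_cost(ticker, notional, cost_bp_by_ticker):
    """Dollar cost of one trade: per-name cost in bp if known, else the flat rate."""
    cost_bp = cost_bp_by_ticker.get(ticker, FLAT_COST_BP)
    return abs(notional) * cost_bp / 10_000


# Hypothetical thin name costing 7.5 bp; everything else uses the flat 2 bp.
cost_bp_by_ticker = {"ZZZ": 7.5}
print(trade_cost("AAA", 25_000, cost_bp_by_ticker))    # 5.0
print(trade_cost("ZZZ", -25_000, cost_bp_by_ticker))   # 18.75
```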

For more on each: see Building a backtest you can defend, Replacement-security selection, and What "tax alpha" actually measures.

Notes & references
  1. Berkin & Ye (2003). Tax Loss Harvesting: An Ongoing Process. Journal of Wealth Management 6(2), 49–63.
  2. Stein & Garland (2008). Measuring the Tax Benefit of a Tax-Loss Harvesting Strategy.
  3. Israel & Liberman (2020). Tax-Loss Harvesting with Uncertainty.

Educational illustration · numbers illustrative. Federal taxes only; state treatment varies.
