does_harden_choke.exe

The strongest finding

Players whose free throw attempts drop from the regular season to the postseason are more likely to have terrible games. The relationship is strong (r = −0.528, p = 0.002): when the whistle dries up, scoring floors drop. FGA and FTA collapse together (r = +0.428, p = 0.016), so foul-dependent scoring fails as a single unit when rim access is denied.
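As a sanity check on the reported numbers, here is a minimal sketch of the Pearson correlation and the t statistic that goes with it. The function names `pearson_r` and `t_statistic` are hypothetical, and the sample size of 31 is an assumption taken from the "31 players" footer; the write-up may have computed these differently.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

N_PLAYERS = 31  # assumption: both correlations use all 31 players

t_floor = t_statistic(-0.528, N_PLAYERS)  # FTA shift vs. terrible games
t_fga = t_statistic(0.428, N_PLAYERS)     # FGA shift vs. FTA shift
print(round(t_floor, 2), round(t_fga, 2))  # prints -3.35 2.55
```

Under that assumption, both t values sit on 29 degrees of freedom, which is in line with the reported p-values of 0.002 and 0.016.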

What I didn't find

I built a five-axis foul taxonomy, ran LLM video grading across four prompt modes, tested trigger taxonomies, shot-chart models, and architecture models. Most of it failed honestly:

  • The trigger taxonomy (pressure, elimination, opponent quality) doesn't replicate. Split-sample concordance is 31%, too low to indicate a stable trait.
  • The box-score architecture model explains almost nothing (R² = 0.128).
  • Shot-chart mode independence is dead (range 0.315–0.340 across splits).

Harden is a separate case

  • His elimination-game floor rate (14.6%) is actually lower than his non-elimination playoff rate (23.2%). The choker-in-big-moments framing doesn't hold on that axis.
  • The playoff whistle drought isn't league-wide: 15 of 31 players increase their FTA rate. The cohort mean shift is −0.05 FTA/36, essentially noise.
  • He floors at high rates against both elite and weak defenses. Most stars show a clear opponent-quality gradient; Harden doesn't.
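The FTA/36 shift in the list above can be sketched as follows, assuming FTA/36 means free throw attempts per 36 minutes and the shift is postseason minus regular season. The helper names and the three toy players are hypothetical; none of these numbers come from the actual data set.

```python
from statistics import mean

def per36(fta, minutes):
    """Free throw attempts scaled to a 36-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return fta / minutes * 36.0

def fta_shift(regular, playoffs):
    """Postseason FTA/36 minus regular-season FTA/36."""
    return per36(*playoffs) - per36(*regular)

# Hypothetical (FTA, minutes) totals: (regular season, postseason)
players = {
    "player_a": ((400, 2400), (50, 360)),  # loses whistle
    "player_b": ((300, 2400), (60, 400)),  # gains whistle
    "player_c": ((200, 2000), (30, 300)),  # unchanged
}

shifts = {name: fta_shift(reg, post) for name, (reg, post) in players.items()}
cohort_mean_shift = mean(shifts.values())
n_increasing = sum(1 for s in shifts.values() if s > 0)
print(shifts, cohort_mean_shift, n_increasing)
```

The toy cohort shows why a near-zero mean shift says little about any one player: individual gains and losses cancel in the average.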

Jimmy Butler is the clearest exception. He loses free throws and doesn't floor. The FTA-floor relationship is real but not universal.

What's next

The FTA-shift finding points to foul type as the missing variable: rim-finishing contact vs. perimeter foul-drawing. The LLM grader I built reached 71% binary accuracy on the timing axis, but the Giannis counterexample killed timing as a discriminator. I graded 300 shooting-foul clips by hand to build a video classifier for landing fouls. The models can't beat a coin flip yet. When they do, that's the next paper.

Full write-up in the does-harden-choke repo.

31 players · frozen archive