Dataset

86 named officials, 40 players, and defense-adjusted FTA/36 deltas (free-throw attempts per 36 minutes) built from play-by-play attribution. Every official is named by design: the names are the dataset's value, not something to hide.
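For readers new to per-36 rates, here is a minimal sketch of what an FTA/36 delta measures. It shows only the raw delta (a player's rate in games with one official minus the same player's overall rate). The defense adjustment is not described on this page, so it is left out, and the function names and sample numbers below are hypothetical.

```python
def fta_per_36(fta: float, minutes: float) -> float:
    """Free-throw attempts scaled to a 36-minute workload."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return fta / minutes * 36.0


def raw_fta36_delta(with_official: tuple[float, float],
                    overall: tuple[float, float]) -> float:
    """Raw (NOT defense-adjusted) delta: rate with one official minus overall rate.

    Each argument is a (free_throw_attempts, minutes_played) pair.
    """
    return fta_per_36(*with_official) - fta_per_36(*overall)


# Hypothetical player: 45 FTA in 144 minutes with one official,
# 450 FTA in 1800 minutes across the whole sample.
delta = raw_fta36_delta((45, 144), (450, 1800))
print(f"{delta:+.2f} FTA/36")
```

A positive delta means the player got to the line more often than usual in games with that official; a negative delta means less often.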

officials.db

Official profiles

Suppressor/amplifier scores and per-player defense-adjusted deltas.

Browse officials →

86 rows

players.db

Player profiles

FTA rates across the officials each player shared games with.

Browse players →

40 rows

downloads.zip

Downloads

CSV, JSON, and parquet exports (CC-BY-4.0).

Download artifacts →

CC-BY-4.0

Methodology

I attribute shooting-foul calls to individual officials by parsing the referee's name from the unstructured description field in NBA play-by-play data, a text field that programmatic consumers skip entirely. That attribution is Layer 1, the published, browseable layer. Contact-type classification (Layer 2) is the research frontier: I graded 300 clips by hand, and the LLMs topped out at 55% precision. Once the classifier works, its output will be published here.
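The attribution step can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual parser: it assumes a shooting foul is marked `S.FOUL` in the description and that the calling official appears as the last parenthesised token, after a foul-count token such as `(P2.T3)`. The function names and sample rows are hypothetical.

```python
import re
from collections import Counter

# ASSUMED description shape: "<player> S.FOUL (P2.T3) (J.Smith)"
SHOOTING_FOUL_RE = re.compile(r"\bS\.FOUL\b")
TRAILING_PAREN_RE = re.compile(r"\(([^()]+)\)\s*$")
# Foul-count tokens such as "P2.T3" or "P3.PN" are not referee names.
FOUL_COUNT_RE = re.compile(r"P\d+\.(?:T\d+|PN)")


def referee_for_shooting_foul(description):
    """Return the official's name for a shooting-foul row, else None."""
    if not description or not SHOOTING_FOUL_RE.search(description):
        return None
    match = TRAILING_PAREN_RE.search(description)
    if match is None:
        return None
    name = match.group(1).strip()
    if FOUL_COUNT_RE.fullmatch(name):
        return None  # row ends with the foul count; no official recorded
    return name


def count_shooting_fouls(descriptions):
    """Tally attributed shooting-foul calls per official."""
    counts = Counter()
    for description in descriptions:
        referee = referee_for_shooting_foul(description)
        if referee is not None:
            counts[referee] += 1
    return counts


# Hypothetical rows, not taken from the dataset.
sample = [
    "Player A S.FOUL (P1.T1) (J.Smith)",
    "Player B P.FOUL (P2.T2) (J.Smith)",  # personal foul: ignored
    "Player C S.FOUL (P3.PN) (A.Jones)",
    "Player D S.FOUL (P1.T3)",            # no official recorded: ignored
    None,                                 # empty description: ignored
]
print(count_shooting_fouls(sample))
```

Rows that do not fit the assumed pattern return `None` rather than a guess, so unattributed calls are dropped instead of being credited to the wrong official.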

Attribution methodology and ANOVA results →

Data: CC-BY-4.0 · Code: MIT · Author: Harris Gordon