Current Model (v2.0) — Walk-Forward Calibration (1,145 OOS fights)

Walk-forward accuracy: 66.9%
Brier score: 0.2303
Log loss: 0.6532
ECE: 0.1216
| Predicted Range | Fights | Avg Predicted | Actual Win Rate | Delta | Status |
|---|---|---|---|---|---|
| 50%-55% | 689 | 52.3% | 59.2% | +6.9% | Under-confident (wins MORE than predicted — extra value) |
| 55%-60% | 348 | 57.1% | 74.4% | +17.3% | Under-confident (wins MORE than predicted — extra value) |
| 60%-65% | 95 | 61.8% | 90.5% | +28.7% | Under-confident (wins MORE than predicted — extra value) |
| 65%-70% | 11 | 67.1% | 100.0% | +32.9% | Under-confident (wins MORE than predicted — extra value) |
| 70%-80% | 2 | 71.7% | 100.0% | +28.3% | Under-confident (wins MORE than predicted — extra value) |
Walk-forward = trained on pre-2024 data, tested on 2024+ fights (1,145 fights). This is true out-of-sample evaluation with no data leakage. The model is consistently under-confident: fighters it picks win even more often than predicted, so the real edge is larger than shown.
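A walk-forward split of this kind can be sketched in a few lines. This is an illustrative sketch, not the site's actual pipeline; the fight records and the `"date"` field are hypothetical:

```python
from datetime import date

def walk_forward_split(fights, cutoff=date(2024, 1, 1)):
    """Train on every fight before the cutoff, test on everything after.

    The model is fit only on the `train` slice, so test-period fights
    can never leak into training.
    """
    train = [f for f in fights if f["date"] < cutoff]
    test = [f for f in fights if f["date"] >= cutoff]
    return train, test

# Hypothetical records with only the fields the split needs.
fights = [
    {"date": date(2023, 6, 1), "winner": "A"},
    {"date": date(2023, 11, 12), "winner": "B"},
    {"date": date(2024, 3, 1), "winner": "A"},
]
train, test = walk_forward_split(fights)
print(len(train), len(test))  # 2 1
```

Because the cutoff is a hard date rather than a random shuffle, every test fight happens strictly after everything the model learned from.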
How to Read These Metrics

- Brier score — probability calibration: the mean squared error between predicted probability and the 0/1 outcome. 0.25 = coin flip; lower is better. Status: Good
- Log loss — information-theoretic quality. 0.693 = coin flip; lower is better. Status: Good
- ECE (Expected Calibration Error) — average gap between predicted and actual win rates across confidence buckets. Lower = better calibrated. Status: Reasonably calibrated
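All three metrics can be computed directly from predicted probabilities and 0/1 outcomes. A minimal sketch in plain Python (not the site's scoring code) that also reproduces the coin-flip baselines quoted above:

```python
import math

def brier(probs, outcomes):
    # Mean squared error between predicted probability and the 0/1 outcome.
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def log_loss(probs, outcomes):
    # Negative mean log-likelihood; ln 2 ≈ 0.693 is a constant 50% guess.
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for p, y in zip(probs, outcomes)) / len(probs)

def ece(probs, outcomes, bins=10):
    # Expected Calibration Error: average |predicted - actual| gap per
    # probability bin, weighted by how many predictions land in each bin.
    buckets = [[] for _ in range(bins)]
    for p, y in zip(probs, outcomes):
        buckets[min(int(p * bins), bins - 1)].append((p, y))
    n, total = len(probs), 0.0
    for b in buckets:
        if b:
            avg_p = sum(p for p, _ in b) / len(b)
            win_rate = sum(y for _, y in b) / len(b)
            total += len(b) / n * abs(avg_p - win_rate)
    return total

# A 50/50 guesser lands exactly on the coin-flip baselines:
probs = [0.5] * 4
outcomes = [1, 0, 1, 1]
print(round(brier(probs, outcomes), 4))     # 0.25
print(round(log_loss(probs, outcomes), 4))  # 0.6931
```

A model can have a decent Brier score yet a poor ECE (sharp but miscalibrated), which is why the dashboard reports all three.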
Live Tracking Data (13 predictions scored)

Scored: 13
Accuracy: 62%
Brier: 0.24
Calibration by Confidence Bucket

| Predicted Range | Count | Avg Predicted | Actual Win Rate | Delta | Status |
|---|---|---|---|---|---|
| 50%-55% | 7 | 53.0% | 57.1% | +4.1% | Well calibrated |
| 55%-60% | 2 | 56.4% | 50.0% | -6.4% | Over-confident |
| 60%-65% | 2 | 63.0% | 100.0% | +37.0% | Under-confident |
| 65%-70% | 1 | 67.8% | 0.0% | -67.8% | Over-confident |
| 75%-80% | 1 | 75.7% | 100.0% | +24.3% | Under-confident |

With only 1-2 fights in each bucket above 55%, the deltas there are dominated by small-sample noise; treat those labels as provisional until more predictions are scored.
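Tables like the one above fall out of a simple bucketing pass over (predicted probability, won?) pairs. A sketch under the assumption that predictions are stored that way; the `preds` data below is hypothetical:

```python
def calibration_table(preds, buckets):
    """Group (prob, won) pairs into confidence buckets.

    Returns one row per non-empty bucket: label, count, average
    predicted probability, actual win rate, and the delta between them.
    """
    rows = []
    for lo, hi in buckets:
        group = [(p, y) for p, y in preds if lo <= p < hi]
        if not group:
            continue
        avg_p = sum(p for p, _ in group) / len(group)
        actual = sum(y for _, y in group) / len(group)
        rows.append((f"{lo:.0%}-{hi:.0%}", len(group),
                     avg_p, actual, actual - avg_p))
    return rows

# Hypothetical predictions: (model probability for the pick, 1 if it won).
preds = [(0.52, 1), (0.53, 0), (0.62, 1)]
buckets = [(0.50, 0.55), (0.55, 0.60), (0.60, 0.65)]
for row in calibration_table(preds, buckets):
    print(row)
```

Empty buckets are skipped rather than printed as zero rows, matching how the dashboard omits ranges with no fights (e.g. 70%-75% above).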
Rolling Accuracy (10-fight window)

Fight 10: 70%
Fight 11: 70%
Fight 12: 60%
Fight 13: 60%
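The rolling series above is a trailing-window mean over per-fight correctness. A sketch follows; the win/loss sequence is hypothetical, chosen only to be consistent with the 62% headline (8 of 13 correct) and the four rolling values shown:

```python
def rolling_accuracy(results, window=10):
    """Trailing-window accuracy. `results` are 1 (correct) / 0 (wrong),
    oldest first; the first value appears once `window` fights exist."""
    return [sum(results[i - window:i]) / window
            for i in range(window, len(results) + 1)]

# Hypothetical sequence: 8 of 13 correct (62%), matching the series above.
results = [1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0]
print(rolling_accuracy(results))  # [0.7, 0.7, 0.6, 0.6]
```

Note that with a 10-fight window over only 13 predictions, consecutive points share 9 fights, so the line moves slowly even when recent form changes.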