Model honesty, public
Every Sentinel shutdown forecast ships with a 90% conformal interval. This page tracks how often the real outcome lands inside that interval — the closer to 90%, the more honest the model. Data lives at /v1/sentinel/calibration/history and updates every 24h.
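As a rough illustration of the coverage figure tracked on this page, the sketch below counts how many outcomes fall inside their intervals. The function name, the list-based inputs, and the toy numbers are assumptions made for illustration; they are not the Sentinel pipeline or the schema of /v1/sentinel/calibration/history.

```python
from typing import Sequence


def empirical_coverage(
    lower: Sequence[float],
    upper: Sequence[float],
    outcomes: Sequence[float],
) -> float:
    """Return the fraction of outcomes inside [lower, upper] (bounds inclusive)."""
    if not (len(lower) == len(upper) == len(outcomes)):
        raise ValueError("lower, upper and outcomes must have the same length")
    if len(outcomes) == 0:
        raise ValueError("need at least one forecast to compute coverage")
    hits = sum(lo <= y <= hi for lo, hi, y in zip(lower, upper, outcomes))
    return hits / len(outcomes)


# Toy data (hypothetical): four forecasts, one of which misses its interval.
lower = [0.0, 0.1, 0.2, 0.0]
upper = [0.3, 0.4, 0.6, 0.2]
outcomes = [0.1, 0.5, 0.3, 0.2]

coverage = empirical_coverage(lower, upper, outcomes)
gap_to_nominal = coverage - 0.90  # negative means under-coverage
```

On real data the sample would be far larger; with only four forecasts the gap to the 0.90 target says little.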
ACI details are exposed in the /v1/forecast/{cc}/7day response under the aci_alpha and aci.* fields. Full ACI methodology →
Live forecast accuracy (prod_rolling, 30-day window)
A Brier score below 0.10 is good; above 0.30 is concerning. A calibration MAE below 0.05 means predicted probabilities track observed rates closely. See /sentinel/backtest for the reliability diagram (predicted mean vs. observed rate scatter) and /methodology#validation for the full evaluation methodology and the 3-split honest baselines.
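The two metrics can be sketched in a few lines. The Brier score below is the standard mean squared error between predicted probability and binary outcome. The calibration MAE is written under an assumption: it is taken as the unweighted mean, over non-empty equal-width bins, of the absolute gap between mean predicted probability and observed rate. The page does not state its exact binning or weighting, so the function names, the bin count, and the toy data are all illustrative.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_mae(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Unweighted mean |mean predicted - observed rate| over non-empty bins.

    Assumed definition; the production metric may bin or weight differently.
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    gaps = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        gaps.append(abs(mean_pred - observed))
    return sum(gaps) / len(gaps)


# Toy data (hypothetical): confident forecasts that all turn out right.
probs = [0.1, 0.1, 0.9, 0.9]
outcomes = [0, 0, 1, 1]

brier = brier_score(probs, outcomes)
cal_mae = calibration_mae(probs, outcomes)
```

In this toy case the Brier score is about 0.01 and the calibration gap is about 0.1 in each occupied bin, which shows that the two metrics can disagree: sharp, correct forecasts score well on Brier even when the bin-level rates are not matched exactly.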
Empirical coverage — 90-day rolling
The blue line is the actual fraction of forecasts where the real outcome landed inside the 90% conformal interval. The green dashed line is the nominal target (0.90). If the blue stays close to the green, the model is well calibrated.
Blue: empirical coverage · Dashed green: nominal 0.90 target · Orange circles: drift alerts
Last 14 days
| Date | Coverage | q90 | n holdout | Drift? |
|---|---|---|---|---|
| Jun 15 | 91.3% | 0.125 | 2,203 | — |
| Jun 14 | 91.3% | 0.125 | 2,203 | — |
| Jun 13 | 91.3% | 0.125 | 2,203 | — |
| Jun 12 | 91.3% | 0.125 | 2,203 | — |
| Jun 11 | 91.3% | 0.125 | 2,203 | — |
| Jun 10 | 91.3% | 0.125 | 2,203 | ⚠️ |
| Jun 9 | 91.3% | 0.125 | 2,203 | ⚠️ |
| Jun 8 | 91.3% | 0.125 | 2,203 | ⚠️ |
| Jun 7 | 91.3% | 0.125 | 2,203 | ⚠️ |
| Jun 6 | 90.9% | 0.077 | 2,181 | — |
| Jun 5 | 90.9% | 0.077 | 2,181 | — |
| Jun 4 | 90.9% | 0.077 | 2,181 | ⚠️ |
| Jun 3 | 90.9% | 0.077 | 2,181 | ⚠️ |
| Jun 2 | 90.9% | 0.077 | 2,181 | ⚠️ |
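For readers unfamiliar with the q90 column, the sketch below shows how a split-conformal quantile is commonly computed from holdout nonconformity scores. It assumes that q90 is this kind of quantile and that "n holdout" is the size of the calibration set; the page does not confirm either, and the function name and toy scores are hypothetical.

```python
import math
from typing import Sequence


def conformal_quantile(scores: Sequence[float], alpha: float = 0.10) -> float:
    """Finite-sample conformal quantile of holdout nonconformity scores.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest score, the usual
    split-conformal correction. Returns infinity when n is too small
    for the requested level.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one holdout score")
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


# Toy data (hypothetical): 20 evenly spaced scores from 0.01 to 0.20.
scores = [0.01 * i for i in range(1, 21)]
q90 = conformal_quantile(scores, alpha=0.10)
```

Under these assumptions, a holdout of 2,203 points would use the score at rank ceil(2204 × 0.9) = 1984.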
What features the model actually uses
Importances are read from the scikit-learn-style feature_importances_ attribute of the underlying XGBoost model. 39 features total. Top-3 sum: 0.219 · Top-5: 0.317 · Top-10: 0.493. The distribution is healthy: no single feature dominates the model.
- 1. recent_shutdown: 10.0%
- 2. week_of_year: 6.0%
- 3. month: 5.9%
- 4. high_urgency_signals_7d: 5.8%
- 5. gdelt_unrest_30d: 4.0%
- 6. election_in_7days: 3.8%
- 7. high_importance_event: 3.7%
- 8. block_rate_roll30_mean: 3.6%
- 9. critical_incident_7d: 3.4%
- 10. ioda_alert_7d: 3.2%
- 11. blocked_count_roll14_mean: 3.0%
- 12. block_rate_roll14_mean: 2.9%
Interpretation: In the ranking above, the forecast model's top feature is recent_shutdown (10.0%), followed by the calendar features week_of_year and month and by high_urgency_signals_7d. gdelt_unrest_30d (protest and conflict signals from the GDELT 1.0 global news feed) ranks fifth at 4.0%; election and event flags, block_rate rolling means, and incident counts follow. risk_tier, the leaky country-level encoding that dominated our older classifier at 85%, contributes only ~2% here.
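The top-k sums quoted with the ranked list above can be rechecked from the listed percentages. The snippet below is a minimal sketch that hard-codes the ten largest rounded values from this page; the dictionary and the helper name are illustrative and are not the payload of /v1/sentinel/feature-importance.

```python
from typing import Dict

# Rounded importances copied from the ranked list on this page (top 10 only).
importances: Dict[str, float] = {
    "recent_shutdown": 0.100,
    "week_of_year": 0.060,
    "month": 0.059,
    "high_urgency_signals_7d": 0.058,
    "gdelt_unrest_30d": 0.040,
    "election_in_7days": 0.038,
    "high_importance_event": 0.037,
    "block_rate_roll30_mean": 0.036,
    "critical_incident_7d": 0.034,
    "ioda_alert_7d": 0.032,
}


def top_k_share(values: Dict[str, float], k: int) -> float:
    """Sum of the k largest importances."""
    ranked = sorted(values.values(), reverse=True)
    return sum(ranked[:k])


top3 = top_k_share(importances, 3)
top5 = top_k_share(importances, 5)
top10 = top_k_share(importances, 10)
```

With these rounded inputs the top-3 and top-5 sums match the reported 0.219 and 0.317. The top-10 sum comes out near 0.494 rather than 0.493, a gap small enough to be explained by rounding of the individual percentages.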
Raw JSON: /v1/sentinel/feature-importance
Related
- /methodology#validation — 3-split honest baselines (LOCO 0.91 AUC vs stratified 0.98)
- /v1/sentinel/accuracy — full evaluation JSON, updated nightly
- /atlas/elections — see the forecast in action (90-day upcoming elections)