SG · per-country backtest
SG forecast vs reality
Every Voidly Sentinel shutdown-risk forecast for SG, plotted against what actually happened. 13 (predicted, observed) pairs from the rolling 30-day evaluation window.
Updated every 30 min · CC BY 4.0
| Metric | Value | Detail |
|---|---|---|
| Forecasts evaluated | 13 | since May 21 |
| Accuracy @ 0.5 | 0.0% | 0/13 correct |
| Brier score | 0.248 | lower is better |
| Observed positive rate | 0% | mean predicted 49% |
Forecast time series
Blue line: forecast probability. Green ✓ markers: forecast was right. Red ✗ markers: forecast was wrong. Dashed line at 0.5 is the binary decision threshold.
Y axis: forecast probability · Faint tick: distance to observed outcome (0 or 1)
All 13 predictions (newest first)
| Eval date | Forecast | Pred ≥ 0.5? | Observed? | Correct? |
|---|---|---|---|---|
| Jun 1 | 44.8% | ↑ | no event | ✗ |
| May 31 | 43.7% | ↑ | no event | ✗ |
| May 30 | 43.2% | ↑ | no event | ✗ |
| May 29 | 45.6% | ↑ | no event | ✗ |
| May 28 | 44.8% | ↑ | no event | ✗ |
| May 27 | 44.2% | ↑ | no event | ✗ |
| May 26 | 44.5% | ↑ | no event | ✗ |
| May 25 | 56.7% | ↑ | no event | ✗ |
| May 24 | 56.2% | ↑ | no event | ✗ |
| May 23 | 56.3% | ↑ | no event | ✗ |
| May 22 | 48.9% | ↑ | no event | ✗ |
| May 21 | 60.0% | ↑ | no event | ✗ |
| May 21 | 54.3% | ↑ | no event | ✗ |
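The Brier score and mean predicted probability in the summary can be recomputed from the rows above. The following is a minimal sketch, assuming the standard Brier definition (mean squared difference between forecast probability and a 0/1 outcome); the function and variable names are illustrative and are not taken from the Voidly codebase.

```python
# Forecast probabilities copied from the table above (newest first).
forecasts = [
    0.448, 0.437, 0.432, 0.456, 0.448, 0.442, 0.445,
    0.567, 0.562, 0.563, 0.489, 0.600, 0.543,
]
# Observed outcomes: 1 = event, 0 = no event. Every row above is "no event".
observed = [0] * len(forecasts)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


brier = brier_score(forecasts, observed)
mean_predicted = mean(forecasts)
observed_rate = mean(observed)

print(f"Brier score: {brier:.3f}")
print(f"Mean predicted: {mean_predicted:.0%}")
print(f"Observed positive rate: {observed_rate:.0%}")
```

With the table values, the results round to a Brier score of 0.248 and a mean predicted probability of 49%, matching the summary figures. Because no event was observed in any row, the Brier score here reduces to the mean of the squared forecast probabilities.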
How to read this
- Each row is one historical forecast. We made the prediction on eval_date, waited out the 7-day horizon, then graded it against the observed outcome.
- Forecast % is the calibrated probability we published that day. Since the recalibration on 2026-05-20, these probabilities track observed rates much more closely; see the refit finding.
- Pred ≥ 0.5 shows the binary alert decision at a fixed 0.5 threshold. Live alerts usually fire at a lower threshold (see /v1/sentinel/global_heatmap for the live cutoff); the 0.5 column is used here so backtest scoring stays consistent with the global confusion matrix.
- Correct means the binary prediction (forecast ≥ 0.5) and the observed outcome agree: either both indicate an event or both indicate no event.
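The grading rule described in the list above can be sketched as follows. This is an illustration only: `grade_date`, `is_correct`, the constants, and the example inputs are hypothetical names and values chosen for the sketch, not part of any published Voidly API.

```python
from datetime import date, timedelta

HORIZON_DAYS = 7         # forecast horizon stated in the text
SCORING_THRESHOLD = 0.5  # backtest scoring threshold, not the live alert cutoff


def grade_date(eval_date: date) -> date:
    """Earliest date on which a forecast made on eval_date can be graded."""
    return eval_date + timedelta(days=HORIZON_DAYS)


def is_correct(forecast: float, event_observed: bool) -> bool:
    """True when the binary prediction and the observed outcome agree."""
    predicted_event = forecast >= SCORING_THRESHOLD
    return predicted_event == event_observed


# Hypothetical examples (the forecast values are not taken from the table).
print(grade_date(date(2026, 5, 21)))  # 2026-05-28
print(is_correct(0.72, True))         # True: alert predicted, event observed
print(is_correct(0.72, False))        # False: alert predicted, no event
print(is_correct(0.30, False))        # True: no alert predicted, no event
```

Keeping the threshold as a named constant makes it explicit that backtest scoring uses 0.5 even when live alerting uses a different cutoff.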
Related
- /atlas/forecast/sg — current 7-day calibrated forecast for SG with SHAP drivers
- /sg — SG country page (current state, recent incidents)
- /sentinel/backtest — global reliability diagram + per-country comparison table
- Calibration refit writeup