The Voidly Atlas ML stack
Every deployed censorship-intelligence model in one place — forecast, classifier, anomaly, causal, and the accountability surfaces that grade them. 34+ live endpoints, and 8 negative results we publish with the same permalinks as the wins.
Production defaults: classifier v3.3 + forecast v1 with per-country calibrators. Everything else is additive — new endpoints, not replacements. All CC BY 4.0.
Forecast models
| Model | What it does | Live |
|---|---|---|
| 7-day forecast | XGBoost + isotonic + per-country calibrators | API ↗ |
| Multi-horizon | 1d / 7d / 30d, per-horizon SHAP + conformal | API ↗ |
| Hourly | Within-day 6h / 12h / 24h horizons | API ↗ |
| 30-day trajectory | Seq2seq day-by-day path + bands | API ↗ |
| Per-region | Aggregate forecast per continent / MENA | API ↗ |
| Per-platform | 12 platforms (Twitter, WhatsApp, …) | API ↗ |
| Per-domain | Top-100 domains × 50 countries | API ↗ |
| Zero-shot transfer | Meta-feature model for low-data tail countries | API ↗ |
| Contagion chain | When X blocks, who follows in N days? | API ↗ |
| Live 24h watchlist | Proactive who-blocks-next ranking | API ↗ |
| Pre-shutdown signal | BGP / TLS / new-ASN precursors, ~31h median lead | API ↗ |
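The multi-horizon row mentions conformal bands. This page does not say which conformal variant is deployed, so the following is only a minimal sketch of the split-conformal idea, assuming a held-out calibration set of past forecasts and outcomes. The function name `split_conformal_interval`, the absolute-residual score, and the `alpha` default are illustrative assumptions, not Voidly's implementation.

```python
import math

def split_conformal_interval(cal_true, cal_pred, new_pred, alpha=0.1):
    """Return (lo, hi) around new_pred from held-out calibration residuals.

    Hypothetical helper: uses absolute residuals as the nonconformity score.
    """
    residuals = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(residuals)
    # Finite-sample-corrected rank: ceil((n + 1) * (1 - alpha))
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to support the requested coverage
        return (float("-inf"), float("inf"))
    q = residuals[rank - 1]
    return (new_pred - q, new_pred + q)
```

With exchangeable calibration and test points, an interval built this way covers the true value with probability at least `1 - alpha`; for a probability forecast the bounds would typically be clipped to [0, 1].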
Classifiers
| Model | What it does | Live |
|---|---|---|
| v3.3 (production) | GradientBoosting, regime-weighted contagion, LOCO median F1 0.870 | API ↗ |
| Per-country thresholds | F1-optimal threshold per country, +4pp without retraining | API ↗ |
| Per-method | HTTP / TLS / DNS / TCP specialised | API ↗ |
| Per-protocol | 8 protocol groups, AUC 0.98+ | API ↗ |
| Per-category | NEWS / ANON / COMT / … (7 promoted) | API ↗ |
| Per-measurement | Row-level classifier (Niaki KDD23) | API ↗ |
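The per-country thresholds row describes picking an F1-optimal decision threshold for each country without retraining the classifier. A minimal sketch of that idea, assuming rows of `(country, score, label)` from a validation set; the helper names and the exhaustive search over observed scores are illustrative assumptions rather than the deployed procedure.

```python
from collections import defaultdict

def f1_score(y_true, y_pred):
    """Binary F1 from 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_threshold(scores, labels):
    """Try every observed score as a cut-off; keep the lowest one with the best F1."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(scores)):
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(labels, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

def per_country_thresholds(rows):
    """rows: iterable of (country, score, label). Returns {country: threshold}."""
    grouped = defaultdict(lambda: ([], []))
    for country, score, label in rows:
        grouped[country][0].append(score)
        grouped[country][1].append(label)
    return {c: best_threshold(s, y)[0] for c, (s, y) in grouped.items()}
```

Thresholds chosen this way should be fitted on validation data and scored on a separate hold-out, otherwise the reported gain is optimistic.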
Anomaly detectors
| Model | What it does | Live |
|---|---|---|
| DBSCAN | CenDTect-style per-country shape anomaly | API ↗ |
| HDBSCAN domain-drift | Per-domain weekly blocking-pattern drift | API ↗ |
| STL seasonal | Deviation from a country’s own weekly pattern | API ↗ |
| Multi-country bursts | Coordinated-campaign detection, FDR-corrected | API ↗ |
| Fused ensemble | Composite of 4 detectors, AUC 0.684 | API ↗ |
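The multi-country bursts detector is described as FDR-corrected, but the page does not name the procedure. One common choice is Benjamini-Hochberg; the sketch below assumes that procedure and a list of per-country p-values, and the function name and `q` default are illustrative.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q.

    Assumed procedure (Benjamini-Hochberg step-up); the source does not confirm it.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * q / m
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected
```

Because the procedure is step-up, a p-value that misses its own rank threshold is still rejected when a larger p-value further down the ranking passes.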
Causal & attribution
| Model | What it does | Live |
|---|---|---|
| Synthetic DiD | Shutdown attribution vs democracy donor pool | API ↗ |
| Causal forest HTE | Election treatment effect, ATE +9.6pp | API ↗ |
| Outage attribution | Censorship vs DDoS / infrastructure / weather | API ↗ |
| Survival / duration | Random Survival Forest, c-index 0.728 | API ↗ |
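The survival row reports a c-index of 0.728. For readers unfamiliar with the metric, here is a minimal sketch of Harrell's concordance index for right-censored durations. It assumes a higher risk score means the event (for example, a shutdown ending) is expected sooner; the function name and input layout are illustrative, and the page does not state which c-index variant was used.

```python
def concordance_index(durations, events, risk_scores):
    """Harrell's C for right-censored data.

    events[i] is 1 if the event was observed, 0 if censored.
    A higher risk score is assumed to mean a shorter expected duration.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i's event was observed strictly before j's time
            if events[i] == 1 and durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    return concordant / comparable if comparable else float("nan")
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of durations.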
Accountability surfaces
| Model | What it does | Live |
|---|---|---|
| Prediction track record | Live 30-day out-of-sample accuracy per model | API ↗ |
| Baseline benchmark | Model lift over trivial baselines (honest) | API ↗ |
| Adversarial robustness | Detection rate under realistic evasion (88–93%) | API ↗ |
| Serving reliability | Uptime + latency across 38 ML endpoints | API ↗ |
| Alert lead-time | Real Sentinel TP / FP rate + lead time | API ↗ |
| Calibration drift | Per-country forecast calibration audit | API ↗ |
| Model uncertainty | Which predictions to question, per day | API ↗ |
| Data freshness | Per-country A–D probe-freshness grade | API ↗ |
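The baseline benchmark row reports model lift over trivial baselines without naming the metric. One standard way to express such lift for probability forecasts is a Brier skill score against a constant base-rate forecast; the sketch below assumes that metric and that baseline, and the function names are illustrative.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def brier_skill_score(model_probs, outcomes, baseline_probs=None):
    """1 - BS_model / BS_baseline. Positive means the model beats the baseline.

    If no baseline is given, a constant forecast at the observed base rate
    is used (an assumed trivial baseline, not necessarily the one Voidly uses).
    """
    if baseline_probs is None:
        base_rate = sum(outcomes) / len(outcomes)
        baseline_probs = [base_rate] * len(outcomes)
    bs_base = brier_score(baseline_probs, outcomes)
    if bs_base == 0:
        return float("nan")  # baseline is already perfect; skill is undefined
    return 1.0 - brier_score(model_probs, outcomes) / bs_base
```

A score of 0 means the model adds nothing over the baseline, and a negative score means it is worse.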
8 honest negative results
Models we built, evaluated, and did NOT promote. Each is a published finding with the full metrics. The headline lesson: stacking and transformer architectures did not beat the simpler regime-weighted GradientBoosting on cross-country generalisation.
| Model | Result |
|---|---|
| v3.4 regime-cluster fine-tune | −3.6pp LOCO F1; cluster heads added no signal |
| v3.5 TabPFN | −1pp stratified F1; small-data regime didn’t favour it |
| SSL tabular pretraining | −15.6pp F1; 9.7K unlabeled rows too few |
| Stacking ensemble v3.7 | +1.1pp, below the +2pp promote gate |
| Meta-ensemble v3.8 | LOCO median fails by 0.5pp even with 10 base models |
| Quantile regression | Zero-inflated target collapsed q5 / q50 to 0 |
| Temporal Fusion Transformer | LOCO AUC 0.927 ties the simpler XGBoost |
| Pre-protest GDELT correlator | Only 2 of 16 countries statistically significant |
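Several of these results turn on a promote gate applied to leave-one-country-out (LOCO) scores. A minimal sketch of such a gate, assuming each model is summarised by its per-country held-out F1 and compared on the median; the function names, the use of the median for the gate, and the dict layout are assumptions, with only the +2pp threshold taken from this page.

```python
from statistics import median

def loco_median(per_country_f1):
    """Median of held-out F1 scores, one per left-out country."""
    return median(per_country_f1.values())

def should_promote(candidate_f1, production_f1, gate_pp=2.0):
    """Return (promote, delta_pp): promote only if the candidate's LOCO median
    beats production by at least gate_pp percentage points."""
    delta_pp = (loco_median(candidate_f1) - loco_median(production_f1)) * 100
    return delta_pp >= gate_pp, delta_pp
```

Under this sketch a candidate that gains roughly 1.1pp is rejected, and a tie with a simpler model is rejected as well.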