The Voidly Atlas ML stack

Every deployed censorship-intelligence model in one place: forecast, classifier, anomaly, and causal models, plus the accountability surfaces that grade them. 34+ live endpoints, and 8 negative results published with the same permalinks as the wins.

Production defaults: classifier v3.3 + forecast v1 with per-country calibrators. Everything else is additive: new endpoints, not replacements. Everything is licensed CC BY 4.0.

Forecast models

Model	What it does	Live
7-day forecast	XGBoost + isotonic + per-country calibrators	API ↗
Multi-horizon	1d / 7d / 30d, per-horizon SHAP + conformal	API ↗
Hourly	Within-day 6h / 12h / 24h horizons	API ↗
30-day trajectory	Seq2seq day-by-day path + bands	API ↗
Per-region	Aggregate forecast per continent / MENA	API ↗
Per-platform	12 platforms (Twitter, WhatsApp, …)	API ↗
Per-domain	Top-100 domains × 50 countries	API ↗
Zero-shot transfer	Meta-feature model for low-data tail countries	API ↗
Contagion chain	When X blocks, who follows in N days?	API ↗
Live 24h watchlist	Proactive who-blocks-next ranking	API ↗
Pre-shutdown signal	BGP / TLS / new-ASN precursors, ~31h median lead	API ↗
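
The 7-day forecast row lists isotonic calibration with per-country calibrators. As a rough illustration of that idea, and not the production code, the sketch below fits one isotonic map per country with the pool-adjacent-violators algorithm using only the standard library. The function names (`pav`, `fit_calibrators`, `calibrate`), the country codes, and the toy scores are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def pav(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted


def fit_calibrators(rows):
    """rows: iterable of (country, raw_score, outcome) with outcome in {0, 1}."""
    by_country = defaultdict(list)
    for country, score, outcome in rows:
        by_country[country].append((score, outcome))
    calibrators = {}
    for country, pairs in by_country.items():
        pairs.sort()  # isotonic fit needs outcomes ordered by raw score
        scores = [s for s, _ in pairs]
        calibrators[country] = (scores, pav([y for _, y in pairs]))
    return calibrators


def calibrate(calibrators, country, score):
    """Step lookup: fitted value at the largest training score <= score."""
    scores, fitted = calibrators[country]
    i = bisect_right(scores, score) - 1
    return fitted[max(i, 0)]  # clamp below the smallest training score


# Toy data: raw model scores and observed outcomes for two made-up countries.
rows = [
    ("XX", 0.1, 0), ("XX", 0.4, 1), ("XX", 0.5, 0), ("XX", 0.9, 1),
    ("YY", 0.2, 0), ("YY", 0.8, 1),
]
cals = fit_calibrators(rows)
print(calibrate(cals, "XX", 0.45))
```

Keeping a separate map per country lets the same raw score translate into different probabilities where base rates differ. A real setup would need a fallback for countries with too few labelled days.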

Classifiers

Model	What it does	Live
v3.3 (production)	GradientBoosting, regime-weighted contagion, LOCO median F1 0.870	API ↗
Per-country thresholds	F1-optimal threshold per country, +4pp without retraining	API ↗
Per-method	HTTP / TLS / DNS / TCP specialised	API ↗
Per-protocol	8 protocol groups, AUC 0.98+	API ↗
Per-category	NEWS / ANON / COMT / … (7 promoted)	API ↗
Per-measurement	Row-level classifier (Niaki KDD23)	API ↗
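
The per-country thresholds row describes picking an F1-optimal decision threshold for each country instead of retraining. A minimal sketch of that selection, assuming a binary label and a single classifier score per row; the helper names and the toy data are hypothetical, and in practice the threshold would be tuned on held-out data, not on the rows it is later scored on.

```python
from collections import defaultdict


def f1_score(labels, preds):
    """F1 for binary labels/predictions; 0.0 when there are no positives at all."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores, labels):
    """Return (threshold, f1) maximising F1; candidates are the observed scores."""
    best = (0.5, -1.0)
    for t in sorted(set(scores)):
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(labels, preds)
        if f1 > best[1]:  # strict '>' keeps the lowest threshold on ties
            best = (t, f1)
    return best


def per_country_thresholds(rows):
    """rows: iterable of (country, score, label). Returns {country: threshold}."""
    by_country = defaultdict(lambda: ([], []))
    for country, score, label in rows:
        by_country[country][0].append(score)
        by_country[country][1].append(label)
    return {c: best_threshold(s, y)[0] for c, (s, y) in by_country.items()}


# Toy data: the same model scores "XX" low and "YY" high, so a single
# global cut-off of 0.5 is suboptimal for both.
rows = [
    ("XX", 0.2, 0), ("XX", 0.3, 1), ("XX", 0.35, 1), ("XX", 0.6, 1),
    ("YY", 0.4, 0), ("YY", 0.6, 0), ("YY", 0.7, 1), ("YY", 0.9, 1),
]
thresholds = per_country_thresholds(rows)
print(thresholds)  # -> {'XX': 0.3, 'YY': 0.7}
```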

Anomaly detectors

Model	What it does	Live
DBSCAN	CenDTect-style per-country shape anomaly	API ↗
HDBSCAN domain-drift	Per-domain weekly blocking-pattern drift	API ↗
STL seasonal	Deviation from a country’s own weekly pattern	API ↗
Multi-country bursts	Coordinated-campaign detection, FDR-corrected	API ↗
Fused ensemble	Composite of 4 detectors, AUC 0.684	API ↗
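
The multi-country bursts row says detections are FDR-corrected but does not name the procedure. The sketch below assumes the common Benjamini-Hochberg step-up rule; the function name, window labels, and p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans: True where the hypothesis is rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for five candidate burst windows.
windows = ["w1", "w2", "w3", "w4", "w5"]
p_values = [0.001, 0.045, 0.012, 0.3, 0.02]
flags = benjamini_hochberg(p_values, alpha=0.05)
print([w for w, keep in zip(windows, flags) if keep])  # -> ['w1', 'w3', 'w5']
```

Note that `w2` (p = 0.045) would pass an uncorrected 0.05 cut-off but is dropped once the correction accounts for testing five windows at once.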

Causal & attribution

Model	What it does	Live
Synthetic DiD	Shutdown attribution vs democracy donor pool	API ↗
Causal forest HTE	Election treatment effect, ATE +9.6pp	API ↗
Outage attribution	Censorship vs DDoS / infrastructure / weather	API ↗
Survival / duration	Random Survival Forest, c-index 0.728	API ↗
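
The survival row reports a c-index of 0.728. For readers unfamiliar with the metric, here is a small sketch of Harrell's concordance index: the fraction of comparable pairs in which the case with the shorter observed duration received the higher risk score. Whether the published figure uses exactly this variant is an assumption, and the durations, censoring flags, and risk scores below are invented.

```python
def concordance_index(durations, events, risk_scores):
    """Harrell's c-index. events[i] is 1 if the end was observed, 0 if censored."""
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        if not events[i]:
            continue  # a censored case cannot be the earlier event of a pair
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # shorter duration got the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: four shutdowns, durations in days; the third is still ongoing.
durations = [2, 5, 7, 10]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]  # higher = predicted to end sooner
c_index = concordance_index(durations, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking.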

Accountability surfaces

Model	What it does	Live
Prediction track record	Live 30-day out-of-sample accuracy per model	API ↗
Baseline benchmark	Model lift over trivial baselines (honest)	API ↗
Adversarial robustness	Detection rate under realistic evasion (88–93%)	API ↗
Serving reliability	Uptime + latency across 38 ML endpoints	API ↗
Alert lead-time	Real Sentinel TP / FP rate + lead time	API ↗
Calibration drift	Per-country forecast calibration audit	API ↗
Model uncertainty	Per-day flags for which predictions to question	API ↗
Data freshness	Per-country A–D probe-freshness grade	API ↗

8 honest negative results

Models we built, evaluated, and did NOT promote. Each is a published finding with the full metrics. The headline lesson: stacking and transformer architectures did not beat the simpler regime-weighted GradientBoosting on cross-country generalisation.
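
Cross-country generalisation is scored on this page as LOCO median F1. Assuming LOCO means leave-one-country-out, the evaluation loop can be sketched as follows: hold out each country in turn, train on the rest, score on the held-out country, then take the median. The midpoint-threshold "model", the single feature, and the data are placeholders, not the v3.3 classifier.

```python
from statistics import mean, median


def f1_score(labels, preds):
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def loco_median_f1(rows, fit, predict):
    """rows: list of (country, feature, label). Returns (median F1, per-country F1)."""
    scores = {}
    for held_out in sorted({c for c, _, _ in rows}):
        train = [r for r in rows if r[0] != held_out]
        test = [r for r in rows if r[0] == held_out]
        model = fit(train)  # the held-out country is never seen in training
        preds = [predict(model, x) for _, x, _ in test]
        scores[held_out] = f1_score([y for _, _, y in test], preds)
    return median(scores.values()), scores


# Placeholder model: threshold halfway between the class means of one feature.
def fit(train_rows):
    pos = [x for _, x, y in train_rows if y == 1]
    neg = [x for _, x, y in train_rows if y == 0]
    return (mean(pos) + mean(neg)) / 2


def predict(threshold, x):
    return 1 if x >= threshold else 0


# Toy data: "CC" has an unusually high baseline, so it generalises worst.
rows = [
    ("AA", 0.1, 0), ("AA", 0.8, 1),
    ("BB", 0.2, 0), ("BB", 0.9, 1),
    ("CC", 0.6, 0), ("CC", 0.7, 1),
]
med, per_country = loco_median_f1(rows, fit, predict)
print(med, per_country)
```

The median keeps one badly behaved country from dominating the headline number, while the per-country scores remain available for inspection.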

v3.4 regime-cluster fine-tune

−3.6pp LOCO F1 — cluster heads added no signal

v3.5 TabPFN

−1pp stratified F1 — small-data regime didn’t favour it

SSL tabular pretraining

−15.6pp F1 — 9.7K unlabeled rows too few

Stacking ensemble v3.7

+1.1pp — below the +2pp promote gate

Meta-ensemble v3.8

LOCO median fails by 0.5pp even with 10 base models

Quantile regression

Zero-inflated target collapsed q5 / q50 to 0
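
One way to see why a zero-inflated target can push low and middle quantiles to 0: the constant that minimises pinball (quantile) loss is the empirical quantile, and when most target values are zero, that quantile is zero. The toy series below is invented and the sketch only reproduces the mechanism, not the original experiment; reading q5 / q50 as the 5th and 50th percentiles is an assumption.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss of a constant prediction y_pred at quantile level q."""
    total = 0.0
    for y in y_true:
        diff = y - y_pred
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def best_constant(y_true, q, candidates):
    """Candidate value with the lowest pinball loss at quantile level q."""
    return min(candidates, key=lambda c: pinball_loss(y_true, c, q))


# Toy zero-inflated target: 7 of 10 days have no blocking events.
target = [0, 0, 0, 0, 0, 0, 0, 2, 5, 9]
candidates = sorted(set(target))  # [0, 2, 5, 9]

for q in (0.05, 0.5, 0.95):
    print(q, best_constant(target, q, candidates))
```

With 70% zeros, both the 0.05 and 0.5 levels select 0 and only the upper quantile carries information, which is the kind of collapse the result above describes.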