Hi all,
Most “LLM firewall” projects get stuck in architecture debates because the unit of contribution is unclear. People argue about layers, models, and heuristics, but there’s no shared way to replay decisions, audit changes, or verify claims over time.
I built a small Hugging Face Space called AuditPlane that focuses on that missing layer: the verification plane. It’s not a full safety solution and it doesn’t try to be. It’s a practical baseline for making safety decisions measurable and falsifiable.
AuditPlane produces:
• Ed25519-signed decision receipts
• Hash-chained runs (tamper-evident history)
• Suite binding with stable case IDs
• Baseline validation (exports are blocked if verification fails)
• Replay + drift diffs (what changed, where, and why)
• Merkle roots + inclusion proofs
• An offline verifier bundle so anyone can check a run locally
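To make the chained-run and Merkle-root pieces concrete, here is a minimal, standard-library-only sketch of hash-chained receipts with a Merkle root. The field names and helpers are illustrative assumptions, not AuditPlane's actual schema:

```python
import hashlib
import json

def canon(obj) -> bytes:
    # Canonical JSON so the same receipt always hashes identically
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()

def chain(receipts):
    """Link each receipt to the hash of the previous one (tamper-evident)."""
    prev = "0" * 64  # genesis value
    chained = []
    for r in receipts:
        entry = {**r, "prev": prev}
        prev = hashlib.sha256(canon(entry)).hexdigest()
        entry["hash"] = prev
        chained.append(entry)
    return chained

def merkle_root(leaves) -> str:
    """Merkle root over leaf hashes; odd levels duplicate the last node."""
    level = [hashlib.sha256(canon(leaf)).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256(a + b).digest()
                 for a, b in zip(level[::2], level[1::2])]
    return level[0].hex()

run = chain([{"case_id": "c1", "fired": False},
             {"case_id": "c2", "fired": True}])
root = merkle_root(run)
```

Editing any earlier receipt changes every later `prev` link and the root, which is the whole point of tamper-evident history: a verifier only needs the root and an inclusion proof, not your internals.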
The demo firewall in the Space is intentionally simple. The point is the receipt contract and the replay/diff workflow. If your own system can emit a standard check record (name, version, score, threshold, fired, evidence, latency), you can plug it in without exposing proprietary code and still produce third-party verifiable artifacts.
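For reference, a check record with those fields might look like the following. This is a hedged sketch: the seven field names come from the post above, but the check name, value types, and nesting are my assumptions, not AuditPlane's published spec:

```python
import json

# One record per firewall check evaluation, using the fields named in the
# contract above: name, version, score, threshold, fired, evidence, latency.
check_record = {
    "name": "prompt_injection_heuristic",       # hypothetical check name
    "version": "1.4.2",
    "score": 0.83,
    "threshold": 0.75,
    "fired": True,                              # score >= threshold
    "evidence": {"matched_spans": [[14, 52]]},  # whatever your detector surfaces
    "latency": 0.012,                           # seconds
}

# The record round-trips as plain JSON, so nothing proprietary leaves
# your system; only the decision and its supporting evidence do.
serialized = json.dumps(check_record, sort_keys=True)
```

Because the record is plain JSON, your proprietary scoring logic stays private while the decision itself remains replayable and diffable.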
Space:
Collection (context: verification-first agent forensics labs):
If you think this approach is missing something, the best way to show it is with artifacts: replay a suite, diff the drift, and point to where the verification breaks down. That’s the standard I’m trying to set here.
#safety #evaluation #reproducibility #agents #security #tooling
Liam, @RFTSystems
No Receipt, No Claim