Assay is the evidence layer. This page is a small agent-based demo of Assay in action.

Tamper-evident evidence for AI agent runs.

This is a minimal demo of Assay in an agent workflow. It demonstrates one narrow claim: a real proof pack can be verified offline, and even a one-character change is detected the next time the pack is verified.

Try it now — no clone needed

$ pip install 'assay-ai>=1.18.0'
$ assay try

This runs the full demo locally: it builds a proof pack, verifies it, tampers with one byte, and verifies again. You'll see PASS, then FAIL. No setup beyond the install. No API key.

Inspect the actual artifacts

$ git clone https://github.com/Haserjian/assay-agent-demo
$ cd assay-agent-demo
$ pip install 'assay-ai>=1.18.0'

$ assay verify-pack proof_pack/
VERIFICATION PASSED
  Integrity:  PASS
  Receipts:   1
  Errors:     0

$ assay verify-pack tampered_pack/
VERIFICATION FAILED
  E_MANIFEST_TAMPER: Hash mismatch for receipt_pack.jsonl
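
For scripting, the PASS/FAIL output above can be reduced to a boolean and a list of error codes. The sketch below is a minimal illustration in Python, not part of Assay: summarize_verify_output is a hypothetical helper, and it assumes the output keeps the exact shape shown above. The CLI may well expose exit codes or machine-readable output that would be more robust, so check assay's own help before relying on text parsing.

```python
import re

def summarize_verify_output(text: str) -> dict:
    """Reduce human-readable verify-pack output to a small dict.

    Hypothetical helper: assumes the first non-empty line is the verdict
    and that error codes look like E_SOMETHING followed by a colon.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    passed = bool(lines) and lines[0] == "VERIFICATION PASSED"
    error_codes = re.findall(r"\b(E_[A-Z_]+):", text)
    return {"passed": passed, "error_codes": error_codes}

# The two outputs shown in the session above, copied verbatim.
PASS_SAMPLE = """VERIFICATION PASSED
  Integrity:  PASS
  Receipts:   1
  Errors:     0
"""
FAIL_SAMPLE = """VERIFICATION FAILED
  E_MANIFEST_TAMPER: Hash mismatch for receipt_pack.jsonl
"""

print(summarize_verify_output(PASS_SAMPLE))
print(summarize_verify_output(FAIL_SAMPLE))
```

Anything that is not an explicit "VERIFICATION PASSED" is treated as a failure, so empty or unexpected output never counts as a pass.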

Step 1: Verify
Check a real proof pack from an AI agent call. Signature + hashes verified offline.

Step 2: See PASS
Evidence intact. Signature valid. Nothing modified since signing. PASS.

Step 3: Tamper & FAIL
Change one character. Verify again. FAIL. Tamper caught instantly.

That is the difference between logs and evidence.
Logs can be silently changed. Evidence fails visibly. No server. No vendor trust. Offline.
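
The mechanism behind that FAIL can be sketched in a few lines of Python. This is an illustration of hash-manifest checking in general, not Assay's implementation: the manifest layout, the helper names, and the receipt contents are all assumptions made for the example.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Record one SHA-256 digest per file name."""
    return {name: sha256_hex(content) for name, content in files.items()}

def find_mismatches(files: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return the names whose content is missing or no longer matches the manifest."""
    return sorted(
        name
        for name, expected in manifest.items()
        if name not in files or sha256_hex(files[name]) != expected
    )

# Hypothetical receipt content; only the file name comes from the demo output.
files = {"receipt_pack.jsonl": b'{"type": "model_call", "total_tokens": 110}\n'}
manifest = build_manifest(files)
print(find_mismatches(files, manifest))     # nothing changed, so no mismatches

# Change a single character and check again.
tampered = {"receipt_pack.jsonl": files["receipt_pack.jsonl"].replace(b"110", b"111")}
print(find_mismatches(tampered, manifest))  # the edited file is reported
```

A hash list by itself only moves the problem, since whoever can edit a file can also edit the list. That is why the verification above checks a signature as well as hashes.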

What the evidence actually says

Beyond PASS/FAIL, assay explain produces a plain-English interpretation of what the evidence means — and what it doesn't:

$ assay explain proof_pack/

WHAT HAPPENED
  1 receipts recorded: 1 model_call
  Providers: anthropic
  Models: claude-sonnet-4-20250514
  Total tokens: 110
  Signed by: ci-assay-signer

INTEGRITY CHECK
  PASSED
  All file hashes match. The Ed25519 signature is valid.
  This evidence has not been tampered with since creation.

WHAT THIS PROVES
  The recorded evidence is authentic (signed, hash-verified).

WHAT THIS DOES NOT PROVE
  - That every action was recorded
  - That model outputs are correct or safe
  - That receipts were honestly created
  - That timestamps are externally anchored

The evidence states its own limits. That honesty is part of the design.
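
Two of these limits can be seen in a short Python sketch. It uses HMAC-SHA256 from the standard library as a stand-in for the Ed25519 signature, purely so the example runs without extra dependencies; the key, the receipt fields, and the helper names are hypothetical. Unlike Ed25519, HMAC is symmetric, so a verifier would need the secret key. The point carries over regardless: a signature covers only what was handed to the signer.

```python
import hashlib
import hmac
import json

SECRET = b"demo-key-not-a-real-signer"  # hypothetical key, for illustration only

def sign(receipts: list[dict]) -> str:
    """Sign a canonical JSON encoding of the receipts (HMAC stand-in)."""
    payload = json.dumps(receipts, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def verify(receipts: list[dict], signature: str) -> bool:
    return hmac.compare_digest(sign(receipts), signature)

# Two model calls actually happened, but only the first was recorded.
everything = [{"type": "model_call", "id": 1}, {"type": "model_call", "id": 2}]
recorded = everything[:1]
signature = sign(recorded)

# The incomplete record still verifies: integrity says nothing about coverage.
print(verify(recorded, signature))

# Editing the record after signing is what verification does catch.
altered = [{"type": "model_call", "id": 1, "note": "edited later"}]
print(verify(altered, signature))
```

The first check passes even though a call is missing, and the second fails because the signed content changed. Verification answers "was this modified since signing?", not "is this the whole story?".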

Why this matters

Logs tell you what the operator says happened. You have to trust the system.

Evidence lets another party verify what happened. Independently. Offline. Without trusting anyone.

When an agent causes an incident, when a vendor claims "we logged everything," when an auditor asks "prove it" — logs are testimony. A signed, hash-chained proof pack is evidence.

We scanned 30 popular AI agent repos. None produced tamper-evident, independently verifiable evidence for model calls. The tooling for tracing exists. The tooling for evidence is what's been missing. (Full methodology, limits, and repo list.)

What this demo proves

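A proof pack proves that the recorded evidence is authentic: the file hashes match and the Ed25519 signature is valid, so the contents have not changed since signing.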

What this demo does not prove

Evidence that overclaims is worse than no evidence at all. Every proof pack states its limits explicitly.

Technical notes

The demo pack uses shadow mode (evidence collection without enforcement — the equivalent of a dry run). Assurance level is L0 (basic integrity verification; higher levels add witness anchoring and claim coverage). The included verify_report.json is a convenience snapshot — assay verify-pack recomputes verification independently every time. The signer identity ci-assay-signer is the default local signing key, not a CI system.

Go deeper

The full product story: how it works, what it catches, threat model, regulatory context.
Upload a proof pack and verify it client-side. Nothing leaves your machine.
What enterprise buyers can actually verify about an AI product.
