When an AI agent deletes a production branch at 3 AM, the first question is always 'what happened?' An audit trail answers that question - if you built it right. Here's what to log, how to store it, and why append-only matters.

Why audit trails matter for AI agents

Human engineers leave traces: commit messages, PR reviews, Slack threads, deploy logs. When something goes wrong, you can reconstruct what happened.

AI agents leave almost nothing. A LangChain agent that merges a PR doesn't write a commit message explaining its reasoning. A CrewAI workflow that sends 50 Slack messages doesn't log why it chose those recipients. An n8n automation that creates Stripe charges doesn't record which policy authorized the spend.

Without an audit trail, you're flying blind. When an incident happens - and with autonomous agents, it will - you need answers:

What action was taken?

Who (which agent/gateway) took it?

When exactly did it happen?

Why was it allowed? Which policy matched?

What was the context? Request parameters, risk score, connector, target API.

What to log

An effective agent audit trail captures six categories of data:

1. Action identity

Every logged event needs a unique, immutable identifier and a structured action type.

{
  "event_id": "evt_a1b2c3d4",
  "action_type": "github.pr.merge",
  "connector": "github",
  "timestamp": "2026-02-10T03:14:22.847Z"
}

Structured action types (not raw HTTP requests) are critical. github.pr.merge tells you what happened. PUT /repos/acme/api/pulls/42/merge requires you to decode the URL.
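One way to derive a structured action type from a raw request is a small routing table. The sketch below is illustrative only: ACTION_ROUTES and classify are hypothetical names, not TameFlare's connector API, and the table holds a single entry (GitHub's REST API documents the merge endpoint as a PUT).

```python
import re
from typing import Optional

# Hypothetical routing table: (HTTP method, path pattern, structured action type).
# A real connector would define many more entries.
ACTION_ROUTES = [
    (
        "PUT",
        re.compile(r"^/repos/(?P<owner>[^/]+)/(?P<repo>[^/]+)/pulls/(?P<pr_number>\d+)/merge$"),
        "github.pr.merge",
    ),
]


def classify(method: str, path: str) -> Optional[dict]:
    """Return a structured action for a raw request, or None if no route matches."""
    for route_method, pattern, action_type in ACTION_ROUTES:
        match = pattern.match(path)
        if route_method == method.upper() and match:
            return {
                "action_type": action_type,
                # The connector name is assumed to be the first segment of the action type.
                "connector": action_type.split(".")[0],
                "parameters": {
                    "repo": f"{match['owner']}/{match['repo']}",
                    "pr_number": int(match["pr_number"]),
                },
            }
    return None
```

Doing the decoding once, at logging time, means nobody has to reverse-engineer URLs during an incident.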

2. Actor identity

Which gateway processed this action? Which process was running behind it?

{
  "gateway_id": "gw_deploy_bot",
  "gateway_name": "deploy-bot",
  "org_id": "org_acme"
}

In a multi-agent environment, knowing which agent took an action is the difference between a 5-minute investigation and a 5-hour one.

3. Policy evaluation

The most valuable part of an agent audit trail: why was this action allowed or denied?

{
  "decision": "allow",
  "policy_id": "pol_github_merges",
  "policy_name": "Allow PR merges to non-main branches",
  "rules_evaluated": 3,
  "matching_rule": "branch != main AND action = github.pr.merge",
  "risk_score": "medium"
}

Without policy evaluation data, your audit trail is just a log. With it, you can answer "why did the system allow this?" for any action, at any time, months after the fact.
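To make the fields above concrete, here is a minimal sketch of a first-match policy evaluator that records how it reached its decision. Rule, evaluate, the example rules, and the default-deny fallback are all assumptions made for illustration; they do not describe TameFlare's policy engine.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    description: str                   # human-readable form, stored as matching_rule
    decision: str                      # "allow", "deny", or "require_approval"
    matches: Callable[[dict], bool]    # predicate over the structured action


# Hypothetical rules for a policy on PR merges.
EXAMPLE_RULES = [
    Rule(
        "branch = main AND action = github.pr.merge",
        "deny",
        lambda a: a.get("action_type") == "github.pr.merge" and a.get("target_branch") == "main",
    ),
    Rule(
        "branch != main AND action = github.pr.merge",
        "allow",
        lambda a: a.get("action_type") == "github.pr.merge" and a.get("target_branch") != "main",
    ),
]


def evaluate(policy_id: str, rules: List[Rule], action: dict) -> dict:
    """Return the decision plus the evidence needed to explain it later."""
    for index, rule in enumerate(rules, start=1):
        if rule.matches(action):
            return {
                "decision": rule.decision,
                "policy_id": policy_id,
                "rules_evaluated": index,
                "matching_rule": rule.description,
            }
    # Assumption: when nothing matches, the action is denied.
    return {
        "decision": "deny",
        "policy_id": policy_id,
        "rules_evaluated": len(rules),
        "matching_rule": None,
    }
```

The important design choice is that the evaluator returns its evidence alongside the decision, so the audit record is written from the same data that drove the outcome.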

4. Request context

The parameters of the action - enough to understand what happened without replaying the request.

{
  "parameters": {
    "repo": "acme/api",
    "pr_number": 42,
    "target_branch": "staging",
    "merge_method": "squash"
  }
}

Be careful with sensitive data. Log parameter names and non-sensitive values. Never log request bodies containing secrets, PII, or credentials.
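A simple way to follow that rule is to redact values by key name before the event is written. The sketch below is a rough illustration: SENSITIVE_KEY_HINTS and redact are hypothetical, the hint list is not exhaustive, and values nested inside lists are not inspected.

```python
# Hypothetical list of substrings that mark a parameter name as sensitive.
SENSITIVE_KEY_HINTS = ("token", "secret", "password", "authorization", "api_key", "email")


def redact(parameters: dict) -> dict:
    """Return a copy that keeps every parameter name but masks sensitive values."""
    cleaned = {}
    for key, value in parameters.items():
        if any(hint in key.lower() for hint in SENSITIVE_KEY_HINTS):
            cleaned[key] = "[REDACTED]"
        elif isinstance(value, dict):
            # Recurse so nested objects get the same treatment.
            cleaned[key] = redact(value)
        else:
            cleaned[key] = value
    return cleaned
```

A denylist like this is easy to bolt on but fails open when a new sensitive field appears. An allowlist of known-safe parameter names per action type is stricter, at the cost of more maintenance.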

5. Outcome

Did the upstream API accept the request? What was the response status?

{
  "outcome": "success",
  "upstream_status": 200,
  "latency_ms": 342
}

6. Approval chain (if applicable)

For actions that required human approval, log the full chain:

{
  "approval": {
    "required": true,
    "requested_at": "2026-02-10T03:14:22Z",
    "approved_at": "2026-02-10T03:16:45Z",
    "approved_by": "admin@acme.com",
    "method": "dashboard"
  }
}
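Put together, the six categories can be modeled as a single immutable record. This is a sketch under assumptions: the AuditEvent class and its flat field layout are chosen here for illustration, reusing field names from the samples above, and are not TameFlare's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class AuditEvent:
    # 1. Action identity
    event_id: str
    action_type: str
    connector: str
    timestamp: str
    # 2. Actor identity
    gateway_id: str
    org_id: str
    # 3. Policy evaluation
    decision: str
    policy_id: str
    # 4. Request context (already stripped of sensitive values)
    parameters: dict
    # 5. Outcome
    outcome: str
    upstream_status: int
    # 6. Approval chain, only present when approval was required
    approval: Optional[dict] = None

    def to_json_line(self) -> str:
        """Serialize to one JSON line, a convenient unit for appending and export."""
        return json.dumps(asdict(self), sort_keys=True)
```

Freezing the dataclass only guards against accidental reassignment in application code. Real immutability has to come from the storage layer.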

How to store it

Append-only is non-negotiable

An audit trail that can be edited is not an audit trail. It's a log file.

Append-only storage means:

Events can be created but never updated or deleted

The database enforces this at the schema level, not just the application level

Backups preserve the full history

TameFlare uses an append-only audit_events table. There is no UPDATE or DELETE endpoint. The only way to "correct" an audit event is to append a new event referencing the original.
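As an illustration of schema-level enforcement, the sketch below uses SQLite triggers to reject every UPDATE and DELETE on an audit_events table. The choice of SQLite, the column names, and the corrects column are assumptions made for the example; the source does not say which database or schema TameFlare uses.

```python
import sqlite3

SCHEMA = """
CREATE TABLE audit_events (
    event_id    TEXT PRIMARY KEY,
    action_type TEXT NOT NULL,
    decision    TEXT NOT NULL,
    -- Hypothetical column: points at the event this row corrects, if any.
    corrects    TEXT REFERENCES audit_events(event_id),
    created_at  TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);

CREATE TRIGGER audit_events_no_update BEFORE UPDATE ON audit_events
BEGIN
    SELECT RAISE(ABORT, 'audit_events is append-only');
END;

CREATE TRIGGER audit_events_no_delete BEFORE DELETE ON audit_events
BEGIN
    SELECT RAISE(ABORT, 'audit_events is append-only');
END;
"""


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the table and the triggers that make it append-only."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def append_event(conn, event_id, action_type, decision, corrects=None):
    """The only write path: insert a new row, optionally correcting an older one."""
    conn.execute(
        "INSERT INTO audit_events (event_id, action_type, decision, corrects) VALUES (?, ?, ?, ?)",
        (event_id, action_type, decision, corrects),
    )
    conn.commit()
```

Triggers stop ordinary writes, but anyone who can alter the schema can drop them. On a server database you would typically also grant the application role only INSERT and SELECT on the table.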

Retention policy

How long to keep audit data depends on your compliance requirements:

Regulation	Minimum retention
GDPR	No specific minimum (but must justify retention period)
NIS2	Sufficient for incident investigation (typically 1-2 years)
DORA	5 years for ICT incident records
SOC 2	No fixed minimum (1 year is typical, to cover the audit period)
Internal best practice	90 days minimum, 1 year recommended

TameFlare supports configurable retention via the AUDIT_RETENTION_DAYS environment variable. The maintenance cleanup job purges events older than this threshold.
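The cutoff logic behind a retention job is simple enough to sketch. Only the AUDIT_RETENTION_DAYS variable name comes from TameFlare; the function names, the 365-day fallback, and the in-memory event list are assumptions for illustration, and the real cleanup job may work differently.

```python
import os
from datetime import datetime, timedelta, timezone
from typing import List


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO-8601 timestamp such as 2026-02-10T03:14:22.847Z."""
    # fromisoformat() only accepts a trailing "Z" on Python 3.11+, so normalize it.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def retention_cutoff(now: datetime, default_days: int = 365) -> datetime:
    """Events older than the returned instant are eligible for purging."""
    days = int(os.environ.get("AUDIT_RETENTION_DAYS", default_days))
    if days <= 0:
        raise ValueError("AUDIT_RETENTION_DAYS must be a positive integer")
    return now - timedelta(days=days)


def expired_events(events: List[dict], now: datetime) -> List[dict]:
    """Select the events a cleanup job would purge at time `now`."""
    cutoff = retention_cutoff(now)
    return [e for e in events if parse_timestamp(e["timestamp"]) < cutoff]
```

Passing now in explicitly keeps the job testable, and rejecting non-positive values avoids a misconfiguration that would purge everything.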

Export format

Auditors and compliance teams need data in formats they can work with:

CSV export for spreadsheet analysis and compliance reviews

JSON for programmatic access and SIEM integration

Filtered views by date range, event type, gateway, or decision

TameFlare's dashboard provides all three: filtered table view, CSV export button, and API access via GET /api/v1/audit.
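If you build export yourself, filtering plus CSV serialization takes only a few lines. The sketch below assumes events are already loaded as dictionaries and that all timestamps share one UTC ISO-8601 format, so they sort as strings. CSV_COLUMNS, filter_events, and to_csv are hypothetical names and do not reflect the response shape of GET /api/v1/audit.

```python
import csv
import io

# Hypothetical flat column set for the export; nested fields are left out.
CSV_COLUMNS = ["event_id", "timestamp", "action_type", "gateway_id", "decision"]


def filter_events(events, decision=None, gateway_id=None, since=None, until=None):
    """Filter by decision, gateway, and a half-open [since, until) time range."""
    selected = []
    for event in events:
        if decision is not None and event.get("decision") != decision:
            continue
        if gateway_id is not None and event.get("gateway_id") != gateway_id:
            continue
        timestamp = event.get("timestamp", "")
        # String comparison is valid only because of the shared-format assumption.
        if since is not None and timestamp < since:
            continue
        if until is not None and timestamp >= until:
            continue
        selected.append(event)
    return selected


def to_csv(events) -> str:
    """Render events as CSV text, ignoring keys that are not export columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(events)
    return buffer.getvalue()
```

For example, to_csv(filter_events(events, decision="deny")) produces a spreadsheet-ready list of denied actions for a weekly review.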

Decision tokens: cryptographic proof

Logging what happened is necessary. Proving what happened is better.

TameFlare issues ES256 (ECDSA) decision tokens for every policy evaluation. Each token contains:

The action type and parameters

The policy that matched

The decision (allow/deny/require_approval)

A unique nonce (prevents replay)

A timestamp

A digital signature

The token is cryptographically signed by the gateway's private key. Anyone with the public key can verify that:

This decision was actually made by TameFlare (not fabricated)
The decision has not been tampered with
The nonce has not been replayed (this check also requires the verifier to keep track of nonces it has already seen)

This is the difference between "our logs say it was allowed" and "here is cryptographic proof it was allowed."
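A signature proves who issued a token and that it was not altered, but detecting a replay also requires remembering which nonces have been used. The sketch below shows that bookkeeping only. It assumes the ES256 signature has already been verified by a suitable library, and NonceRegistry with its five-minute window is a hypothetical design, not TameFlare's implementation.

```python
import time
from typing import Dict, Optional


class NonceRegistry:
    """Remembers recently used nonces so a token is accepted at most once."""

    def __init__(self, ttl_seconds: float = 300.0):
        self._ttl = ttl_seconds
        self._seen: Dict[str, float] = {}  # nonce -> token issue time (epoch seconds)

    def accept(self, nonce: str, issued_at: float, now: Optional[float] = None) -> bool:
        """Return True the first time a fresh nonce is presented, False otherwise."""
        current = time.time() if now is None else now

        # Forget nonces whose tokens are too old to be accepted anyway.
        expired = [n for n, ts in self._seen.items() if current - ts > self._ttl]
        for old_nonce in expired:
            del self._seen[old_nonce]

        if current - issued_at > self._ttl:
            return False  # stale token: outside the window we keep nonces for
        if nonce in self._seen:
            return False  # replay: this nonce was already used
        self._seen[nonce] = issued_at
        return True
```

Rejecting stale tokens is what allows old nonces to be forgotten safely: a token outside the window fails the age check, so its nonce no longer needs to be stored. This applies to checking a token at execution time. Verifying a stored token's signature months later during an audit does not need the registry.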

Common mistakes

1. Logging too little

A log entry that says action: github.pr.merge, result: allowed is almost useless for investigation. You need the policy name, the parameters, the risk score, and the approval chain.

2. Logging too much

Dumping full HTTP request and response bodies creates storage problems and privacy risks, and makes the audit trail harder to search. Log structured metadata, not raw payloads.

3. Mutable storage

If your audit events live in a regular database table with UPDATE and DELETE permissions, they are not audit events. They are logs that anyone with database access can alter.

4. No retention policy

Keeping audit data forever is expensive and may violate the GDPR's data minimization and storage limitation principles (Article 5). Define a retention period and enforce it automatically.

5. No export capability

An audit trail that only exists in your application's database is useless to external auditors. Provide CSV/JSON export and API access.

How TameFlare implements this

TameFlare's audit system covers all six categories:

Category	Implementation
Action identity	Structured action types from connectors, unique event IDs
Actor identity	Gateway ID and name, organization context
Policy evaluation	Full policy match details, risk score, rules evaluated
Request context	Parsed parameters (non-sensitive), connector metadata
Outcome	Upstream status, latency, final decision
Approval chain	Requested/approved timestamps, approver identity, method

Storage is append-only (audit_events table, no UPDATE/DELETE). Retention is configurable. Export is available via dashboard CSV button or API.

Decision tokens provide cryptographic proof via ES256 signatures with nonce replay protection.

Getting started with agent audit logging

Install TameFlare - audit logging is enabled by default on all plans
Configure gateways and connectors - every action through the proxy is automatically logged
Set retention - AUDIT_RETENTION_DAYS=365 for 1-year retention
Export regularly - download CSV from the dashboard or set up API polling
Review weekly - check denied actions and approval patterns for policy gaps

The audit trail is not a feature you turn on later. It's the foundation that makes everything else - policies, approvals, kill switches, compliance - trustworthy.

---

*TameFlare is a source-available transparent proxy gateway for AI agents. Read the docs for the full audit trail reference.*

Building an Audit Trail for AI Agent Actions: What to Log and Why