Why Application Logs Aren't Compliance Evidence

When a CMS auditor asks for proof that your Patient Access API was conformant last Tuesday at 03:00 UTC, the worst possible answer is a stack of application logs.

Engineering teams disagree with this assertion at first. The logs are extensive. They are searchable. They are stored in a hardened observability platform with retention policies and access controls. Surely that is compliance evidence.

It is not. And it is worth understanding why, because the gap between “we have great logs” and “we have audit-grade evidence” is the gap that surfaces six weeks before a 2027 enforcement window — when there is no time left to close it.

This post is for engineers and engineering managers who own the systems that will be audited. The conclusion is one most teams arrive at independently: an audit-grade compliance artifact is structurally different from a log line. Learning the difference sooner rather than later is cheaper.

Step into the auditor’s chair

Auditors examine claims for a living. The thing they are trained to ask, on every artifact you hand them, is: “how do I know this wasn’t edited?”

It is not a hostile question. It is a baseline question. Every threat scenario the auditor has worked with (a disgruntled engineer, a compromised account, a well-intentioned cleanup script, records altered ahead of a regulatory investigation) ends with the artifact in question having been modified after the fact. The auditor’s job is to trust only artifacts that can withstand that scenario.

Apply that lens to a typical payer logging stack:

  • Rotation. Logs older than 30 days are aggregated, downsampled, or deleted entirely. Common, defensible, ordinary policy. Also: the artifact you were going to use for an audit covering 2026-Q1 may not exist anymore by the time the audit happens in 2027-Q3.
  • Edits. Most observability platforms let admins delete log entries. Many let engineers redact fields. Some support bulk transformations during migrations. None of this is malicious; all of it makes the log mutable.
  • No signing. A log line in Datadog, Splunk, or CloudWatch is not cryptographically signed. There is no way for the auditor to detect that a particular entry was altered last week.
  • Vendor coupling. The audit story now depends on the auditor trusting your observability vendor’s chain of custody. That is a layer of indirection most auditors are not paid to investigate.

A single one of those properties is enough to disqualify the artifact. All four together are why the auditor will politely ask for something else.

What auditors actually want

The pattern is not specific to FHIR or healthcare. Across regulated domains (financial reporting under SOX, drug-trial integrity under 21 CFR Part 11, payments under PCI DSS, evidence authentication under Federal Rule of Evidence 901) the same three properties recur:

  1. Authenticity. A cryptographic signature from a key whose holder is identified and accountable.
  2. Integrity. A way to detect that the artifact has been altered since it was signed.
  3. Continuity. A way to detect that an entire artifact has been deleted or replaced wholesale, not just modified.

A signed log entry covers properties 1 and 2. It does not cover property 3 — an attacker can delete an entire signed log file and replace it with a different signed log file, and the cryptographic check still passes.

Property 3 is the one that matters for compliance.

What compliance evidence actually looks like

The structural shape of a compliance-evidence artifact is not new. It has been used in financial timestamping, certificate transparency, software-supply-chain provenance, and clinical-trial audit trails for over a decade. The pattern has four parts:

1. A verifiable claim, not a log line. “On 2026-04-19 at 03:00 UTC, the API at https://api.example-payer.com/fhir was structurally conformant to hl7.fhir.us.carin-bb@2.1.0” is a claim. It has a subject, a predicate, and a timestamp. A log line saying INFO: nightly conformance check completed is not a claim — it is a status update.

2. A canonical encoding. The claim must serialize to the same bytes every time. JSON with arbitrary key order, optional whitespace, and locale-dependent number formatting is not canonical. RFC 8785 JSON Canonicalization Scheme is. Without canonicalization, two systems can hash “the same claim” and get different hashes — at which point the cryptography is decorative.

3. A cryptographic signature on those canonical bytes. Ed25519 is a sensible default: 64-byte signatures, deterministic signing with no per-signature random nonce (nonce reuse is a failure mode that has leaked ECDSA private keys), a design that is easier to implement in constant time than ECDSA, and support in most mainstream cryptographic libraries. The signing key’s identity and accountability are part of your control posture, not the artifact’s.

4. A hash-link to the prior claim for the same subject. This is the property logs cannot provide. Each new claim references the SHA-256 hash of the prior claim. Editing any historical claim changes its hash, which breaks every downstream link. Deleting a historical claim breaks the chain at the deletion point. Either tamper is detectable by replaying the chain.
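The canonical-encoding requirement in part 2 is easy to demonstrate. The sketch below hashes the same claim twice with different key order, first through a naive JSON dump and then through a canonical one. The field names (subject, predicate, profile, checked_at) and the canonical_bytes helper are hypothetical, chosen only for illustration. The helper approximates RFC 8785 for claims that contain only ASCII keys and string values; it is not a full JCS implementation, which also pins down number and Unicode serialization.

```python
import hashlib
import json


def canonical_bytes(claim: dict) -> bytes:
    """Serialize a claim deterministically.

    Assumption: keys are ASCII and values are strings, so sorted keys plus
    compact separators match what RFC 8785 would emit. A real system should
    use a complete JCS implementation instead of this shortcut.
    """
    return json.dumps(
        claim, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


# The same claim, written by two systems that happen to order keys differently.
claim_a = {
    "subject": "https://api.example-payer.com/fhir",
    "predicate": "structurally-conformant-to",
    "profile": "hl7.fhir.us.carin-bb@2.1.0",
    "checked_at": "2026-04-19T03:00:00Z",
}
claim_b = {
    "checked_at": "2026-04-19T03:00:00Z",
    "profile": "hl7.fhir.us.carin-bb@2.1.0",
    "predicate": "structurally-conformant-to",
    "subject": "https://api.example-payer.com/fhir",
}

# Naive serialization preserves insertion order, so the hashes disagree.
naive_a = hashlib.sha256(json.dumps(claim_a).encode("utf-8")).hexdigest()
naive_b = hashlib.sha256(json.dumps(claim_b).encode("utf-8")).hexdigest()

# Canonical serialization yields identical bytes, so the hashes agree.
canon_a = hashlib.sha256(canonical_bytes(claim_a)).hexdigest()
canon_b = hashlib.sha256(canonical_bytes(claim_b)).hexdigest()

print("naive hashes match:    ", naive_a == naive_b)
print("canonical hashes match:", canon_a == canon_b)
```

The signature in part 3 would be computed over the output of canonical_bytes, never over whatever string a particular JSON library happened to produce.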

The first three parts are common in regulated industries. The fourth is what turns a stack of signed claims into an evidence chain: an artifact whose history cannot be rewritten without leaving a trace.
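To make the hash-link idea concrete, here is a minimal sketch of a chain and a replay check, using only SHA-256 from the Python standard library. The names (GENESIS, prev_hash, append_claim, first_broken_link) and the claim fields are hypothetical, and the JSON serialization is the same simplified stand-in for RFC 8785 described earlier. Signatures are left out so the example stays focused on the links.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first claim


def claim_hash(claim: dict) -> str:
    """SHA-256 over a deterministic serialization (simplified, not full JCS)."""
    data = json.dumps(claim, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append_claim(chain: list, body: dict) -> None:
    """Append a claim that links to the hash of the current last claim."""
    prev = claim_hash(chain[-1]) if chain else GENESIS
    chain.append({**body, "prev_hash": prev})


def first_broken_link(chain: list) -> Optional[int]:
    """Replay the chain; return the index of the first bad link, or None."""
    expected = GENESIS
    for index, claim in enumerate(chain):
        if claim["prev_hash"] != expected:
            return index
        expected = claim_hash(claim)
    return None


chain: list = []
append_claim(chain, {"checked_at": "2026-04-17T03:00:00Z", "result": "conformant"})
append_claim(chain, {"checked_at": "2026-04-18T03:00:00Z", "result": "non-conformant"})
append_claim(chain, {"checked_at": "2026-04-19T03:00:00Z", "result": "conformant"})

# Tamper 1: quietly rewrite the inconvenient middle claim.
edited = [dict(c) for c in chain]
edited[1]["result"] = "conformant"

# Tamper 2: remove the middle claim altogether.
deleted = chain[:1] + chain[2:]

print("intact chain :", first_broken_link(chain))
print("edited claim :", first_broken_link(edited))
print("deleted claim:", first_broken_link(deleted))
```

The replay flags the claim that follows an edit, because that is where the stored link stops matching. Two limits of this sketch are worth stating: an edit to the newest claim has no later link to break, so catching it depends on the per-claim signature, and removing claims from the end of the chain is only detectable if the verifier has an independent record of the latest hash.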

What this looks like in production

The mechanics of an evidence chain are not theoretical. We published a working description of one implementation covering Structural Contract Models, RFC 8785 JCS canonicalization, four-level Merkle hash trees, Ed25519 signing, and write-verified SQLite storage.

For a payer evaluating its own posture, the question is not “should we build one of these from scratch” — that is rarely the right answer. The questions are:

  • What artifact will I hand the auditor in 2027? If the answer is “logs,” start a conversation with your CISO about category fit.
  • Can the artifact be replayed by an external party using only public information? If verifying the chain requires a vendor’s proprietary tooling, your auditor’s security team will rightfully push back. If verifying it requires only standard primitives (an Ed25519 library, SHA-256, RFC 8785 canonicalization) and the public key, the conversation is short.
  • Does the artifact survive the routine operational events that destroy logs? Vendor migrations, retention-policy changes, schema upgrades, observability cost-cuts. A signed Merkle-chained verdict survives all of these because it is a self-contained file, not a row in a vendor’s database.

The honest counter-argument

Application logs are not useless. They are the right tool for debugging, performance analysis, on-call response, and anomaly detection. They are the wrong tool for compliance evidence.

The two needs do not collapse into each other. The CISO who tries to make the observability platform serve both ends up with a system that does neither well — log volume tuned for debugging is too verbose for retention, log retention tuned for compliance is too expensive for debugging, and signing every log line is operationally infeasible.

The clean separation: keep your observability platform for what it is good at. Build (or buy) a separate, narrowly scoped evidence chain for the specific claims that need to survive an audit. The two systems share no infrastructure and no failure modes.

What we built

Tessara produces a continuous evidence chain for FHIR API conformance against the published Implementation Guides — payload-free, signed, replayable. We probe only public /metadata and declared profile snapshots; no patient data ever touches the pipeline.

If your audit roadmap is currently leaning on application logs to answer the “was it conformant on date X” question, the time left to close that gap runs out on January 1, 2027, along with the rest of CMS-0057-F. See the pricing page or contact us to talk through your specific posture.


References: RFC 8785 JSON Canonicalization Scheme, RFC 8032 EdDSA / Ed25519, NIST SP 800-92 Guide to Computer Security Log Management, 21 CFR Part 11 Electronic Records and Electronic Signatures.