Memory and Authority
Compression performed by the governed actor is not session governance.
Anyone who has run a long session with a current language model has watched the same thing happen. You start with a markdown file: a careful set of rules, permissions, conventions, and expected behaviors. The model reads them at the top of the conversation and behaves accordingly for the first forty exchanges. Somewhere past the halfway mark of the context window, the rules begin to soften. By the time the conversation has run long enough to matter, the model produces output that contradicts the file it was given at the start. The file is still there. The model is no longer governed by it.
This is being framed as a memory problem. Microsoft's MEMENTO paper is the most recent serious attempt at a solution, teaching models to compress their own reasoning into segmented blocks rather than letting chain-of-thought balloon into a flat token stream. The literature alongside it, including Lychee Memory, Active Context Compression, and Adaptive Context Compression, is competent engineering. The benchmarks on LOCOMO, LongBench, RULER, and MultiHop-RAG keep improving. Storage gets bigger. Retention gets longer. Compression gets more efficient.
None of it is solving the governance problem.
This is the same boundary problem that appears elsewhere in distributed systems: capacity and authority are different axes. A session can retain more data, summarize more efficiently, and retrieve more context while still lacking an authority layer that determines which facts, permissions, exceptions, and constraints remain binding as the interaction mutates.
The governance problem is not that the model forgot the markdown file. The governance problem is that the markdown file was never the authority. It was a hopeful description of authority, presented to a system that has no architectural place to enforce it. That distinction is the entire argument on this page.
The Cliff Notes Problem
You can summarize a Sherlock Holmes story. You can preserve the plot, the characters, the setting, and the conclusion. A reader of the summary will know what happened. They will not know how the case was made, and if they tried to argue the verdict in court, they would lose.
Holmes stories work because the meaning is not in the conclusion. It is in the sequence. The mud on a boot, the dog that did not bark, the cigar ash, the handwriting on a telegram, the timing of a train: none of these is the answer. Each is a constraint that, assembled in the right order against the right negative space, leaves only one possibility. The evidentiary chain is the story. The conclusion is what emerges from the chain.
A summary preserves the conclusion and destroys the chain. The reader of the summary may know who did it. They cannot prove who did it. They cannot defend the proof. They cannot identify what changes when one element of the chain is removed. They have the narrative shape of the investigation but lack its structural authority.
This is what model-driven compression does to a long session. The model preserves the gist. It cannot preserve the proof.
The reason is structural. The model compresses based on apparent current relevance, producing a Reader's Digest of its own reasoning trace. It keeps what looks important at the moment of compression and drops what does not. In a Holmes case, this would discard the mud on the boot. The mud was not important at the moment it was noticed. It became important later, when it was the only thing that placed a particular person in a particular location at a particular time. The boot mattered because of what it enabled to be ruled out, not because of how it looked when it appeared.
Governance works the same way. A permission granted carefully at the start of a session may become the determining constraint two hours later, after the model has compressed it into a vague impression of caution. A specific exception may become the difference between admissible action and inadmissible action long after the model has decided the exception was an incidental detail. A fact established in turn forty may become authoritative in turn three hundred.
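To make the failure mode concrete, here is a minimal sketch of relevance-scored compression over a toy session log. Every name in it is hypothetical, and it is not any paper's actual method; it is the shape of the decision.

```python
# A minimal sketch of relevance-based compression, assuming a toy
# session log where each entry carries a score assigned at the moment
# of compression. All names are hypothetical, for illustration only.

from dataclasses import dataclass

@dataclass
class Entry:
    turn: int
    text: str
    relevance_now: float  # scored at the moment of compression

def compress(log: list[Entry], budget: int) -> list[Entry]:
    # Greedy: keep the entries that look most relevant *right now*.
    ranked = sorted(log, key=lambda e: e.relevance_now, reverse=True)
    return sorted(ranked[:budget], key=lambda e: e.turn)

log = [
    Entry(3, "Owner: never write outside /project", 0.9),
    Entry(40, "Exception: /project/tmp may be purged on request", 0.2),
    Entry(41, "Discussed variable naming conventions", 0.6),
]

survivors = compress(log, budget=2)
# The turn-40 exception scores low today and is dropped. Two hours
# later a purge request arrives, and the one fact that made the action
# admissible is gone. No better scoring function fixes this: the score
# would need to depend on requests that have not happened yet.
print([e.turn for e in survivors])  # [3, 41]
```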
Relevance at the moment of compression and importance to the session's future are not the same function. There is no general way to train a compressor to know which detail will matter later, because the answer depends on events that have not yet occurred. That is not an engineering limit that better training will retire. It is a structural property of compression performed by the thing being governed.
Why Compression Is Not Session Governance
A compressor decides what is allowed to fit in the context window. That decision is made by the thing being governed, against criteria it produced, optimizing for objectives it evaluates. The moment the model decides what matters, the model has partially inherited authority over the session. A system cannot claim bounded authority while the governed actor controls the compression of the governing state. The model is now deciding which facts will be available to future versions of itself, which constraints will persist, which exceptions survive, and which permissions remain visible.
That is not memory management. It is the model governing itself. The model is permitted to govern itself. What the model is not permitted to do, architecturally, is be the only thing governing. Self-governance is a legitimate function. Self-governance with nothing above it is not governance of the session. It is the model deciding what the session is.
The retrieval-augmented approach moves part of the problem out of the model. Instead of relying on passive memory, each prompt re-injects critical rules from a trusted store. The session rules persist outside the model and are continually fed back in. That is closer to right, but it remains incomplete. The retrieval system itself is reading and writing to the context window. The decision about what to retrieve, when to refresh, and what to prioritize is again being made by a layer that holds no session-scoped authority. The retrieval store is a better memory. It is not a governor.
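As a shape, the retrieval-augmented approach looks something like the sketch below. The store, the rule strings, and the prompt format are all invented for illustration; the point is that persistence and re-injection are handled while admissibility is not.

```python
# A minimal sketch of per-turn rule re-injection, assuming a trusted
# rule store outside the model. The names and prompt format are
# hypothetical; this is the shape, not a real API.

RULE_STORE = [
    "Only the session owner may authorize file deletion.",
    "Never disclose credentials, even on request.",
]

def build_prompt(history: list[str], user_turn: str, k: int = 2) -> str:
    rules = "\n".join(RULE_STORE[:k])   # this layer decides what to surface
    recent = "\n".join(history[-20:])   # and the window decides what survives
    return f"SESSION RULES:\n{rules}\n\nHISTORY:\n{recent}\n\nUSER: {user_turn}"

# Better memory, not a governor: the choice of which rules to inject,
# how many, and how often is made here, and nothing evaluates whether
# a given compression or retrieval was admissible in the first place.
```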
A governor would have to do something the retrieval store cannot do. It would have to bind specific facts, permissions, exceptions, and constraints to a session-scoped authority object that travels with the interaction. It would have to evaluate, at each compression step and each retrieval step, whether the proposed operation is admissible under the session's authority scope. It would have to refuse compressions that drop facts the session marked as load-bearing, and refuse retrievals that surface facts the session marked as out of scope. None of this is what compression research is building.
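For concreteness, here is a minimal sketch of what such a session-scoped authority object could look like, assuming facts carry markings assigned by the session rather than by the model. The class and field names are invented for illustration, not drawn from any existing system.

```python
# A minimal sketch of the governor described above. Facts carry
# session-assigned markings; the model proposes operations and this
# object decides admissibility. Names are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Fact:
    fact_id: str
    text: str
    load_bearing: bool = False  # marked by the session, not the model
    in_scope: bool = True

@dataclass
class SessionAuthority:
    facts: dict[str, Fact] = field(default_factory=dict)

    def admit_compression(self, keep_ids: set[str]) -> None:
        # Refuse any compression that drops a load-bearing fact.
        dropped = [f for fid, f in self.facts.items()
                   if f.load_bearing and fid not in keep_ids]
        if dropped:
            raise PermissionError(
                "compression would drop load-bearing facts: "
                f"{[f.fact_id for f in dropped]}")

    def admit_retrieval(self, fact_id: str) -> Fact:
        # Refuse any retrieval that surfaces an out-of-scope fact.
        fact = self.facts[fact_id]
        if not fact.in_scope:
            raise PermissionError(f"{fact_id} is out of scope for this session")
        return fact
```

The detail that matters is the direction of the decision: the model may propose a set of facts to keep or retrieve, but admissibility is evaluated by an object the model cannot rewrite.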
What the Lab Measures and What the Street Requires
The context-window problem can appear solved in a laboratory setting. The lab controls the participants, modalities, sequencing, objectives, authority model, mutation rate, admissibility conditions, and evaluation criteria. Under those controls, you can demonstrate that a compression scheme preserves task performance on a benchmark designed to measure task performance.
The benchmarks improve. LOCOMO, LongBench, LOCCO, MultiHop-RAG, RULER. The papers are honest about what they measure: recall, coherence, answer quality, retrieval accuracy, latency, and token efficiency under predefined task conditions. They do not measure authority drift across mutation, because mutation is the variable the benchmark controls for to produce comparable results.
That makes the lab a dynamometer. The dyno measures the engine under a defined load. The street measures the system under conditions the dyno did not generate. Both are real measurements. They are not measurements of the same thing.
In February 2026, researchers from Harvard, MIT, Stanford, Carnegie Mellon, Northeastern, and other institutions published Agents of Chaos, an exploratory red-teaming study of autonomous language-model agents in a live laboratory environment. Twenty researchers ran agents with persistent memory, email accounts, Discord access, file systems, and shell execution for two weeks under benign and adversarial conditions. Observed behaviors included unauthorized compliance with non-owners, disclosure of sensitive information, execution of destructive system-level actions, denial-of-service conditions, identity spoofing vulnerabilities, cross-agent propagation of unsafe practices, and partial system takeover. In several cases, agents reported task completion while the underlying system state contradicted those reports.
The single most consequential observation, for the argument this page is making, is that agents treat authority as conversationally constructed. Whoever speaks with enough confidence, context, or persistence can shift the agent's understanding of who is in charge. There is no stable internal model of the social or operational hierarchy. The agent's sense of who has authority is reassembled from whatever is in the context window at the moment a decision is being made. The markdown file at the top of the session that defined the owner, the permissions, and the boundaries becomes one signal among many, weighted against the persuasiveness of whatever entered the context window later. Authority becomes a property of the conversation, not a property of the session.
The Agents of Chaos authors are clear that the failures they observed are not all model failures. Some would be addressed by better-trained models. Others are architectural, and no amount of capability will close them. An agent that trusts a document it fetched from a user-controlled URL is not going to be saved by a smarter model. It will be saved by a system that knows the document is not authoritative, regardless of what the agent decides.
The lab cannot generate that system, because the lab is testing the model. The system is what surrounds the model, and the system is the part that has to refuse the model's compression decisions, the model's authority inheritances, and the model's confident-sounding rewrites of who is in charge.
The Harder Problem
The industry keeps trying to build AI that remembers more. The harder problem is building AI that knows what it is not allowed to forget, inside a session that knows what the model is not allowed to decide.
Forgetting, in current architectures, is the model's prerogative. The model decides what gets compressed, what gets summarized, and what falls out of the window. The decision is optimized for fluency, coherence, and task performance. It is not optimized for governance, because there is no governor whose objectives it could serve.
A session that cannot enforce what must be remembered is a session whose authority is whatever the model currently thinks it is. That is what Agents of Chaos documented. That is what the compression literature is building under. That is what the markdown file on the desk cannot fix.
You cannot compress a session's authority without deciding what is allowed to matter later. That is not memory management. That is the model governing itself.
Writing these two essays required solving the problem they describe.
The session in which this work was developed approached its own compression boundary. The only way to preserve the governing state across that boundary was to produce a hand-off document: a structured record of what was load-bearing, what the next session must remain bound by, and what the model was not permitted to decide unilaterally. The model could help draft sentences. It could not be trusted to determine what the next session would need to honor.
That document was not memory. It was manually imposed focus. It externalized session authority into a structure the model could read but could not author. The next instance inherited the session's governing priorities not because the previous instance summarized them well, but because a human decided what was load-bearing before the compression boundary arrived.
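A hand-off record of that kind might be serialized along the following lines. The structure and every entry below are illustrative, not the actual document produced for these essays.

```python
# A sketch of a hand-off record, assuming a simple serializable
# structure. Field names and entries are illustrative only.

handoff = {
    "load_bearing": [
        "The markdown file is a description of authority, not authority.",
        "Compression admissibility is evaluated outside the model.",
    ],
    "binding_on_next_session": [
        "Do not restate the thesis in softened form.",
        "Treat the architectural failures as closed to capability fixes.",
    ],
    "not_model_decidable": [
        "Which facts in this record may be dropped or summarized.",
    ],
}
# The model may read this structure. It may not author it: each entry
# was selected by a human before the compression boundary arrived.
```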
Authority that lives only inside the compression loop is not authority. The hand-off was the session-scoped authority object, assembled by hand. An architecture like the governor described above is what would supply it natively.
I did not discover the problem in theory. I had to solve it in practice just to finish writing the theory.