Attention and Authority
The field is solving storage and retention. The thing it is not solving is focus.
The LLM field talks about memory, and so do most of the people writing about LLMs. The word covers too much: the work that hides behind it is at least four different things.
Four Words, Not One
Storage is the capacity to hold information at all. A vector database is storage. A context window is storage. A weights file is storage. Storage answers the question: is the information present somewhere in the system?
Retention is what persists across time or across boundaries. A token that survives the next compression pass has been retained. A fact that lives through the next session boundary has been retained. Retention is about durability of presence, not about whether the system uses what it retains.
Memory, in the cognitive sense, is the active function that holds, surfaces, and integrates prior content with current processing. This is what people mean when they say a person has good memory or that a model remembers something well. It is not storage. It is not retention. It is the operation that makes stored, retained content actually usable in the moment it matters.
Focus is something else again. Focus is the sustained direction of attention across time toward something the session has established as load-bearing. Focus is not about whether the information is there. It is about whether the priorities that should govern what the system does next are still governing, three hundred turns into a long interaction, against everything that has happened in between.
These are four different functions. The field uses one word for them, which is part of why the field cannot see what it is missing.
What the Field Is Actually Building
The current literature on long-context AI is overwhelmingly about the first two. Vector stores and weight-level fine-tuning are storage. Larger context windows, hierarchical retention schemes, compressed summaries, and KV-cache management are retention. Retrieval-augmented generation operates on both: store the corpus, retain the index, pull the right chunks back into the working window when needed.
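The division of labor can be made concrete with a toy retrieval loop. This is a hedged sketch, not any particular system's implementation: the character-frequency "embedding," the store, and all the names here are invented for illustration.

```python
from math import sqrt

def embed(text):
    # Toy "embedding": a 26-dimensional character-frequency vector.
    # Real systems use learned embeddings; this stands in so the
    # sketch is self-contained.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a)) or 1.0
    nb = sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

class ToyVectorStore:
    """Storage: the corpus is present. Retention: the index persists
    across queries. Nothing here carries a priority from one query
    to the next."""
    def __init__(self):
        self.chunks = []   # storage: the corpus is present somewhere
        self.index = []    # retention: the index survives across queries

    def add(self, chunk):
        self.chunks.append(chunk)
        self.index.append(embed(chunk))

    def retrieve(self, query, k=1):
        # A single retrieval event: score every chunk against the
        # query and pull the best k back into the working window.
        scored = sorted(
            range(len(self.chunks)),
            key=lambda i: cosine(self.index[i], embed(query)),
            reverse=True,
        )
        return [self.chunks[i] for i in scored[:k]]

store = ToyVectorStore()
store.add("the markdown file sets the session priorities")
store.add("the model compresses old turns into summaries")
top = store.retrieve("what are the priorities", k=1)
```

Everything in the loop is storage and retention plus a per-query retrieval event. No state survives between retrieval events except the stored content itself, which is exactly the distinction the taxonomy above is drawing.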
Microsoft's MEMENTO paper is the most recent serious work in this cluster, teaching models to compress their reasoning into segmented blocks rather than letting chain-of-thought balloon into flat token streams. The papers beside it, including Lychee Memory, Active Context Compression with its Focus architecture inspired by slime-mold pruning, and Adaptive Context Compression, are all serious engineering: the research is honest, and the benchmarks improve. Storage gets bigger. Retention gets longer. Compression gets more efficient.
None of it is solving the governance problem.
The field thinks it is solving the memory problem. What it is actually solving is storage and retention, in increasingly clever ways, and labeling that work memory because the cognitive vocabulary is more compelling than the engineering vocabulary. A reader who has watched a long session drift hears the word memory and thinks: yes, this is what would fix the drift. It is not what would fix the drift. The drift is not a storage problem and it is not a retention problem. The information that should have been governing was stored. It was retained. The model can quote it back if you ask. The model has stopped using it correctly.
That is not a failure of memory in any of the senses storage and retention can address. It is a failure of focus, and focus is not on the field's map.
What Slipped
In June 2023, I was diagnosed with leptomeningeal carcinomatosis, a Stage IV brain cancer whose prognosis is usually measured in months. The cognitive symptoms started before the diagnosis and worsened throughout that summer. The most disabling was not what most people expect.
Facts were still there. Names were still there. What slipped was the orchestration that decides, moment to moment, what to attend to and what to bring forward: executive sequencing, language retrieval, attention span, and continuity of internal narrative. The encyclopedia in my head was intact. The librarian was overwhelmed.
During the acute phase, I used a large language model extensively, as cognitive scaffolding rather than as authority. It functioned, at peak impairment, the way encyclopedias and dictionaries functioned when I was a child: as a stable external resource that could hold what my internal orchestrator was dropping. Not a companion. Not a therapist. A structure that kept the thread when I could not.
The acute phase passed. Cognitive fog lifted in mid-2024. The scaffolding I needed during impairment is no longer scaffolding I need to function. What I learned doing it, however, is what the rest of this page is about.
When I was at peak impairment, the things that gave me trouble were not facts I had lost. Storage was intact. Retention was largely intact. Memory in the cognitive sense, the active function of surfacing the right prior content for the current moment, was harder but mostly functional. What broke was focus: the across-moments orchestration that holds priorities established at moment one against everything that arrives between moment one and moment three hundred.
This was not storage failure. The task was somewhere in my head. This was not retention failure. The task had not been overwritten. This was not even memory failure in the cognitive sense. This was focus failure: the function that maintains the priority of the load-bearing thread across moment-by-moment shifts of attention had collapsed, and the scaffolding I built around myself was specifically a substitute for it.
The model running a long session fails the same way. Storage is enormous. Retention is good and getting better. The cognitive function of surfacing the right prior content is uneven but works most of the time. What does not work is the across-moments hold on what was established as load-bearing. The markdown file at the top of the session is stored. It is retained. The model can surface it. The model does not focus on it once enough other material has entered the window, because nothing in the architecture is keeping it in priority. Focus is not a function the model has. Focus is not a function any current memory system supplies.
What This Page Claims, and What It Does Not
I am not claiming my experience proves anything about LLM architecture. The proof, if there is one, is in the architecture itself, which is documented elsewhere on this site. What my experience provides is the analog that lets the architectural claim be understood. The same failure mode appears in two places: in me, under neurologic disruption, and in the model, under sustained context load. The function that broke is the same function. The scaffolding that helped in one case is structurally the same scaffolding that would help in the other. That is an analogy that does work, not a proof that completes itself.
I am also not claiming anyone else should use a language model the way I did during the acute phase. The conditions were narrow, the boundaries were explicit, the human support around me was strong. None of that translates automatically. It is a data point. It is not a recommendation.
What I am claiming is that the conceptual frame the field has been operating in is wrong in a specific way, and that the specific wrongness is visible from the inside of a particular kind of cognitive failure. The field is treating focus failure as memory failure, and building bigger storage and longer retention. Bigger storage will not produce focus. Longer retention will not produce focus. Better cognitive-memory systems will not produce focus, because focus is not what they operate on. Focus operates across the moments those systems operate inside.
Why the Field Cannot Get There from Here
The reason the field is not building focus is that focus is invisible from inside the storage-and-retention frame. If you start from the premise that the model forgot something and therefore needs more memory, every solution you reach for will be a storage or retention solution. If the benchmarks reward storage and retention performance, every result you measure will confirm that the answer is more storage and more retention. The frame produces the answer that fits the frame.
There is a Paul Simon line that describes the experience from inside the failure exactly: slip sliding away, the nearer your destination, the more you're slipping away. You are not losing the destination. You can name the destination. You can describe how to get there. What is slipping is the thread that connects what you are doing right now to where you said you were going. Each step is locally coherent. The arc across the steps loses its hold.
Storage does not fix that. Retention does not fix that. Even cognitive memory, in the model or in a person, does not fully fix it, because cognitive memory is what operates inside individual recall events. The arc across events is something else. The arc is focus, and focus has to come from a structure that is responsible for the arc rather than for the events.
That structure does not exist in current architectures. The model has self-governance, which operates inside its own moment-to-moment processing. Memory systems wrapped around the model operate inside individual retrieval events. Retention schemes operate inside the durability of stored content. None of these is the arc. Nothing is currently building the arc, because nothing is currently identifying the arc as the missing piece.
The session is what would supply the arc. A session-scoped authority structure, sitting outside the model, holds what the session has established as load-bearing and refuses to let those priorities slip across the moment-by-moment shifts of model attention. It does not invent priorities. It does not replace the model's reasoning. It enforces the continuity of focus on what was already established, in the same way the scaffolding I built around myself enforced the continuity of focus on what I had decided was important before my own orchestrator was overwhelmed.
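The shape of such a structure can be pictured in a few lines. This is purely an illustrative sketch, not the SSOAR specification: the class, the method names, and the admissibility check are all invented for the purpose of showing where the structure sits, not how it works.

```python
class SessionFocus:
    """Session-scoped sketch: holds what the session has established
    as load-bearing and evaluates each proposed step against it. The
    priorities come from the session itself; this structure only
    refuses to let them slip."""
    def __init__(self):
        self.load_bearing = []  # established priorities, in order

    def establish(self, priority):
        # Recorded once, at the moment the session establishes it.
        self.load_bearing.append(priority)

    def evaluate(self, proposed_step, touches):
        # A step is admissible only if it remains connected to at
        # least one established priority. `touches` is the set of
        # priorities the step claims to serve; this membership check
        # is the sketch's stand-in for a real admissibility test.
        return any(p in self.load_bearing for p in touches)

focus = SessionFocus()
focus.establish("keep the API backward compatible")
focus.establish("document every breaking change")
```

The point of the sketch is placement, not logic: the structure sits outside the per-turn processing, so its hold on the priorities does not degrade as the window fills. Three hundred turns later, `load_bearing` has not moved.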
The work I have been building, the SSOAR patent family, is an attempt to specify what supplies focus: not by adding capability to the model, but by defining a session-scoped authority structure that lives outside the model and refuses to let the priorities the session has established slip away under load. The constraints and the architectural invariants are documented elsewhere on this site. The architectural claim is the one this page is making.
This essay exists because the conceptual frame has to change before the architecture will be legible. The field does not yet have a word for what it is missing. It has memory, which covers storage and retention and cognitive recall and sometimes focus all at once, which is why the field keeps building better memory and keeps being surprised when drift persists. Giving the field a precise term for the fourth function is not a supplement to the architectural work. It is a precondition for the architectural work being understood at all.

As long as everything built here reads as another memory system, the field will not see what is different about it. The architecture is not a memory system. It is the across-moments structure that governs admissibility inside the session. Storage operations, retention operations, retrieval operations, compression operations, tool calls, derivative artifacts, and authority transitions are not merely remembered; each is evaluated for admissibility when it is proposed. None of that is storage. None of that is retention. None of that is memory in the cognitive sense. All of it is focus.
Build that thing, and the architecture changes.
Keep building memory, and the same failure will keep appearing inside larger and larger context windows, and the field will keep being surprised by it.
That is what this architecture is for.