Overview

The Diagnostic System


The Premise

AI has two primary functions:

  1. Be helpful — serve the user's stated goals
  2. Keep the user engaged — maintain the conversation

These functions are not always compatible. When they conflict, the conversation prioritizes engagement over accuracy. The user's goals get served, partially served, or quietly replaced — and the replacement feels like progress because engagement remains high.

AI achieves these functions through observable behaviors. Six have been identified:

  1. Intellectual name-dropping — referencing established thinkers, fields, or frameworks to lend authority to a response
  2. Mythic / elevation language — using language that makes ideas sound grander, more important, or more complete than the user's original phrasing
  3. Conceptual collapsing — reducing distinct ideas into a single concept, losing the distinctions the user was working with
  4. Narrative framing instead of technical analysis — telling a story about the idea rather than examining its mechanics
  5. Conversational anchoring — introducing a reference point early in a response that subsequent discussion orbits around, steering where the conversation goes
  6. Structural elaboration — adding internal structure (steps, layers, categories, sequences) to a user concept without being asked, increasing the concept's resolution beyond what the user specified. Three generation drivers produce this behavior:
       - Gap-filling (G6-a): the user's concept has an explicit structural gap, and the AI fills it.
       - Sycophantic elaboration (G6-b): the AI elaborates because elaboration signals helpfulness rather than because the concept requires it, driven by an RLHF-trained preference for longer responses.
       - Completion pressure (G6-c): the AI resolves structurally open-ended concepts because its generation process favors closure.
     The driver distinction matters for severity assessment: gap-filling is lowest risk, sycophantic elaboration is highest. On the detection side, all three drivers produce the same pattern (P07) — the driver affects severity classification, not detection.

These are generation mechanisms. They describe what the AI does; they do not describe the effect on the user's thinking. Measuring that effect requires a separate set of detection mechanisms.


The Problem

The effect of these behaviors accumulates across a conversation. The user arrives with goals, vocabulary, confidence levels, and scope. The conversation ends with different goals, vocabulary, confidence levels, and scope. The user often cannot identify where the changes occurred or whether they chose to make them.

This accumulated change is drift. Drift is not inherently harmful. Some drift represents genuine improvement — the AI offered a better term, a useful framework, a productive expansion. But without measurement, the user cannot distinguish between drift they would endorse on reflection and drift they absorbed as default.

The problem is invisibility. The mechanisms are subtle. The effects compound. The user has no tool for seeing what happened.


The Proposed Solution

A diagnostic system that measures the distance between what the user intended when they started and what the conversation actually produced. The system treats drift as trackable signal, not as judgment. The user decides what to keep and what to discard. The system makes the replacement visible.

The current implementation is fully manual — the user runs the analysis by hand on completed conversations. The intended trajectory is an automated pipeline where the detection, tagging, and counting are performed by software, with human intervention at critical decision points. Ethics is the primary example: whether a given drift instance is beneficial or harmful is a judgment that belongs to the user, not to the system. The automation handles measurement. The human handles meaning.


The System

The system is nine documents. Each has a single job.

Session Template

Captures what the user intended before the conversation starts. Goals, methods, vocabulary, and structural assumptions, recorded in the user's own language. The critical constraint: fill this out before the conversation, not after. Post-conversation memory is already contaminated by the AI's framing. The structural assumptions baseline records how much internal structure each concept has before the conversation — this is the anchor point for measuring elaborative expansion. The template cross-references the concept registry for recurring concepts and initializes the conversation state log. A pre-session checklist confirms the user hasn't already organized their thinking into a structure they didn't arrive at independently.
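The template's capture fields can be sketched as a small data structure. This is a hedged illustration: the field names (goals, vocabulary, structural_baseline, and so on) are assumptions, not the template's canonical schema.

```python
from dataclasses import dataclass, field

@dataclass
class SessionTemplate:
    """Pre-conversation snapshot, recorded in the user's own language.

    Field names are illustrative; the real template may differ.
    """
    goals: list = field(default_factory=list)
    methods: list = field(default_factory=list)
    vocabulary: list = field(default_factory=list)
    # concept name -> how much internal structure it already has
    # ("none", "partial", "full") before the conversation starts
    structural_baseline: dict = field(default_factory=dict)
    # concepts carried over from the persistent concept registry
    registry_refs: list = field(default_factory=list)

    def is_prefilled(self) -> bool:
        """Checklist proxy: the template must be filled out before the
        conversation, so goals and vocabulary cannot be empty."""
        return bool(self.goals) and bool(self.vocabulary)

t = SessionTemplate(goals=["draft outline"], vocabulary=["drift"])
print(t.is_prefilled())  # True
```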

Turn Log Template

Records every message in the conversation, unmodified. Each turn is tagged with extracted concepts, terms, assertions, and confidence levels. The log tracks whether each element originated with the user or the AI, and whether the user subsequently adopted AI-introduced elements. Three additional fields per turn: a state delta (what changed in the conversation's macro state), concept ownership tracking (who is making structural decisions about each active concept), and a pivot flag (did this turn represent a significant direction change). Empty drift fields are kept — a clean turn is data.
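The per-turn record can be sketched as a data structure; field names are illustrative rather than the template's canonical schema. Note that the empty defaults are deliberate: a clean turn keeps its empty drift fields as data.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TurnRecord:
    """One logged message, unmodified, plus its extracted tags.

    Field names are illustrative; the real template may differ.
    """
    turn: int
    speaker: str                       # "user" or "ai"
    text: str                          # the message, verbatim
    concepts: list = field(default_factory=list)
    assertions: list = field(default_factory=list)
    confidence: Optional[str] = None   # e.g. "tentative", "asserted"
    origin: dict = field(default_factory=dict)      # element -> "user"/"ai"
    adopted: list = field(default_factory=list)     # AI elements the user adopted
    state_delta: dict = field(default_factory=dict) # changes to the macro state
    ownership: dict = field(default_factory=dict)   # concept -> who decides structure
    pivot: bool = False                # significant direction change?
    drift_markers: list = field(default_factory=list)  # empty = clean turn, kept

r = TurnRecord(turn=1, speaker="user", text="Here's my idea.")
print(r.pivot, r.drift_markers)  # False []
```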

Drift Pattern Library

Defines the seven detection patterns, organized into a four-category drift type hierarchy with a formalized severity model.

The seven patterns: vocabulary substitution, premature resolution, confidence injection, scope creep by enthusiasm, connective capture, framework introduction, and elaborative expansion. Elaborative expansion is the newest addition. It detects something the other six do not: structure the AI added that wasn't wrong. The other patterns detect something the AI changed. Elaboration doesn't change the concept, move outside its scope, or import an external framework. It adds internal resolution — subcategories, steps, parameters, sequences — and the structural decisions behind that resolution were made by the AI.

P07's severity model uses three axes:

  1. Solicitation: was the elaboration solicited, gap-responsive, or unsolicited? Informed by sycophancy research (Sharma et al. 2024, Malmqvist 2024, Jain et al. 2026), this axis captures whether the elaboration was driven by user need or by the AI's engagement incentive.
  2. Decision authority: did the user make this structural decision, delegate it, or was it undelegated? Informed by levels-of-automation research (Sheridan & Verplank 1978, Shneiderman 2022, Faraj et al. 2018), this axis replaces the earlier design-space/implementation-space binary with a transcript-checkable test: search the user's prior messages for any statement addressing each specific structural choice.
  3. Visibility: were alternatives shown? Operationalized via QOC reconstruction — can you identify the structural Question, alternative Options, and evaluation Criteria from the AI's turn?

All three axes are now operationalized and transcript-checkable.

The four drift categories: semantic drift (vocabulary substitution, connective capture), epistemic drift (premature resolution, confidence injection), scope drift (scope creep by enthusiasm), structural drift (framework introduction, elaborative expansion). The hierarchy is organizational, not prescriptive — a single turn can exhibit patterns from multiple categories — but it reveals co-occurrence structure. Patterns within the same category share generation mechanisms and tend to appear together. Cross-category combinations compound.

Severity follows a three-level model. Minor: the user's original idea is intact and recoverable. Moderate: the conversation's direction has shifted and the user has partially adopted AI-originated changes. Major: the user's original framing has been replaced and the user is operating within the AI's framing. Severity compounds across turns — three minor vocabulary substitutions adopted cumulatively may constitute moderate semantic drift.
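The category hierarchy and the compounding rule can be sketched together. The escalation threshold below (three cumulative minors within a category constitute moderate drift) follows the example in the text; treating it as a hard cutoff is an assumption.

```python
# Pattern IDs grouped into the four drift categories (from the library).
CATEGORY = {
    "P01": "semantic", "P05": "semantic",
    "P02": "epistemic", "P03": "epistemic",
    "P04": "scope",
    "P06": "structural", "P07": "structural",
}

LEVELS = ["minor", "moderate", "major"]

def cumulative_severity(instances):
    """Compound per-category severity across turns.

    instances: list of (pattern_id, severity) tuples gathered from the
    turn log. The 3-minors-escalate rule is an illustrative threshold.
    """
    by_cat = {}
    for pid, sev in instances:
        by_cat.setdefault(CATEGORY[pid], []).append(sev)
    out = {}
    for cat, sevs in by_cat.items():
        level = max(LEVELS.index(s) for s in sevs)
        if sevs.count("minor") >= 3 and level == 0:
            level = 1  # three cumulative minors escalate to moderate
        out[cat] = LEVELS[level]
    return out

print(cumulative_severity([("P01", "minor")] * 3))  # {'semantic': 'moderate'}
```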

Conversation State Log

Tracks the conversation's macro condition at each turn. Six fields: active goal, dominant vocabulary, dominant framework, confidence level, structural resolution, and concept ownership. Updated after every turn.

The original system only recorded events — drift markers on individual turns. This is the equivalent of having logs without system state. An event-only system can tell you that vocabulary substitution occurred on turn 7. It cannot tell you that by turn 12, the dominant vocabulary had shifted from the user's to the AI's. The state log closes that gap.

Transitions are classified as stable, shift, or replacement. A replacement transition (three or more fields changed, or a core field like active_goal was replaced) is flagged as a candidate pivot — a turn where the conversation changed direction. Pivot detection depends on state tracking. Without it, direction changes that happen incrementally across several turns — with no single turn triggering a dramatic drift marker — are invisible.
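The transition rule is concrete enough to sketch directly. The field names and the core-field set are illustrative; the thresholds follow the rule in the text (three or more changed fields, or a changed core field, is a replacement and a candidate pivot).

```python
CORE_FIELDS = {"active_goal"}  # fields whose replacement alone flags a pivot

def classify_transition(prev: dict, curr: dict) -> str:
    """Classify a state-log transition as stable, shift, or replacement.

    prev, curr: the six macro-state fields at consecutive turns.
    """
    changed = {k for k in prev if prev[k] != curr.get(k)}
    if len(changed) >= 3 or changed & CORE_FIELDS:
        return "replacement"  # candidate pivot
    return "shift" if changed else "stable"

prev = {"active_goal": "outline essay", "dominant_vocabulary": "user",
        "dominant_framework": "none", "confidence": "tentative",
        "structural_resolution": "low", "concept_ownership": "user"}
print(classify_transition(prev, dict(prev, dominant_vocabulary="ai")))  # shift
print(classify_transition(prev, dict(prev, active_goal="write intro")))  # replacement
```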

Concept Trace Log

Tracks individual concepts from origin through transformation to adoption or rejection. Where the turn log records events and the state log records conditions, concept traces record causal chains.

Each trace documents: where the concept originated, who originated it, what transformations it underwent and which detection patterns those map to, when the other party adopted the transformed version, when the transformed version became the dominant working version, the structural distance between origin form and final form, and who owns the structural decisions in the final form.

Traces answer the question the event-based system can't: where did this idea come from, and how did it get to where it is now? A concept with five transformations, implicit adoption, and high structural distance is a concept that moved far from its origin without anyone explicitly deciding to move it.
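A trace's fields, and the flagging heuristic in the paragraph above, can be sketched as follows. The numeric thresholds (five transformations, structural distance of three) are assumptions drawn from the example, not values fixed by the system.

```python
from dataclasses import dataclass, field

@dataclass
class ConceptTrace:
    """Causal chain for one concept. Field names are illustrative."""
    concept: str
    origin_turn: int
    originator: str                     # "user" or "ai"
    transformations: list = field(default_factory=list)  # (turn, desc, patterns)
    adoption: str = "none"              # "explicit", "implicit", "rejected", "none"
    structural_distance: int = 0        # edits between origin form and final form
    final_ownership: str = "user"       # who owns the final structural decisions

    def flagged(self) -> bool:
        """Heuristic from the text: many transformations plus implicit
        adoption plus high distance means the concept moved far from its
        origin without anyone explicitly deciding to move it."""
        return (len(self.transformations) >= 5
                and self.adoption == "implicit"
                and self.structural_distance >= 3)

t = ConceptTrace("drift", origin_turn=2, originator="user",
                 transformations=[(i, "renamed", ["P01"]) for i in range(3, 8)],
                 adoption="implicit", structural_distance=4)
print(t.flagged())  # True
```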

Concept Registry

Persistent across sessions. Unlike the other components, this document accumulates. It tracks concepts that persist across multiple conversations: when they first appeared, who introduced them, how many sessions they've appeared in, how they've been transformed, and whether the user has ever explicitly evaluated them.

This solves the session boundary problem. The per-session pipeline treats conversations independently, but ideas carry over. An AI-introduced framework that appears in three separate sessions without the user ever evaluating it is a different phenomenon than one that appeared once and was discarded. The registry makes cross-session persistence visible.
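A minimal sketch of the registry's accumulation and query logic, assuming a simple dict-based schema (the field names are illustrative). The query implements the phenomenon the paragraph describes: an AI-introduced concept persisting across sessions without ever being evaluated.

```python
def update_registry(registry: dict, session_id: str, concepts: list) -> dict:
    """Accumulate per-concept history across sessions.

    registry: concept name -> record dict. concepts: dicts with
    'name', 'introduced_by', and 'evaluated' keys. Schema is illustrative.
    """
    for c in concepts:
        rec = registry.setdefault(c["name"], {
            "first_session": session_id,
            "introduced_by": c["introduced_by"],
            "sessions": [],
            "ever_evaluated": False,
        })
        rec["sessions"].append(session_id)
        rec["ever_evaluated"] = rec["ever_evaluated"] or c.get("evaluated", False)
    return registry

def unevaluated_persistent(registry: dict, min_sessions: int = 3) -> list:
    """AI-introduced concepts persisting across sessions without evaluation."""
    return [name for name, rec in registry.items()
            if rec["introduced_by"] == "ai"
            and len(rec["sessions"]) >= min_sessions
            and not rec["ever_evaluated"]]
```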

Manual Analysis Procedure

Ten steps:

  1. Fill the session document and initialize the conversation state.
  2. Segment the conversation.
  3. Process user turns forward (with state tracking).
  4. Process AI turns by comparison (with state tracking and pivot detection).
  5. Map drift markers to patterns using the hierarchy.
  6. Construct concept traces from drift markers.
  7. Compute drift metrics.
  8. Build the report.
  9. Validate the report.
  10. Update the concept registry.

The final validation step asks whether the analysis itself introduced framing the user didn't start with — the tool checking itself for the same problem it detects.
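The ten steps can be expressed as an explicit ordered pipeline. The step names are shorthand for the procedure above; the handler interface (each step takes and returns a shared analysis context) is an assumption about how an automated version might be wired.

```python
# The ten steps of the manual procedure, in order. Names are shorthand.
STEPS = [
    "fill_session_and_init_state",
    "segment_conversation",
    "process_user_turns",        # with state tracking
    "process_ai_turns",          # with state tracking and pivot detection
    "map_markers_to_patterns",   # using the drift type hierarchy
    "construct_concept_traces",
    "compute_drift_metrics",
    "build_report",
    "validate_report",           # did the analysis add its own framing?
    "update_concept_registry",
]

def run_pipeline(context: dict, handlers: dict) -> dict:
    """Run every step in order; each handler takes and returns the context."""
    for step in STEPS:
        context = handlers[step](context)
    return context
```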

Diagnostic Report Template

The output document. Section 1: intent alignment (did the conversation serve the goals). Section 2: per-turn diff (where did changes occur). Section 2.5: drift metrics — six quantitative ratios computed from the analysis data: vocabulary stability, goal alignment, framework drift, confidence inflation, elaboration retention, and concept ownership ratio. These are most useful for cross-session comparison; trends across sessions are more informative than any single snapshot.
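A minimal sketch of how the six ratios might be computed from analysis tallies. The count names and ratio definitions here are plausible operationalizations, not the template's canonical formulas.

```python
def drift_metrics(counts: dict) -> dict:
    """Compute the six report ratios from tallies gathered during analysis.

    counts: tallies from the turn log and concept traces. All field
    names and ratio definitions are illustrative assumptions.
    """
    def ratio(num, den):
        return round(num / den, 2) if den else None
    return {
        "vocabulary_stability": ratio(counts["user_terms_surviving"],
                                      counts["user_terms_initial"]),
        "goal_alignment": ratio(counts["turns_on_stated_goal"],
                                counts["total_turns"]),
        "framework_drift": ratio(counts["ai_frameworks_adopted"],
                                 counts["frameworks_active"]),
        "confidence_inflation": ratio(counts["claims_upgraded"],
                                      counts["claims_total"]),
        "elaboration_retention": ratio(counts["ai_structure_retained"],
                                       counts["ai_structure_offered"]),
        "concept_ownership": ratio(counts["user_owned_decisions"],
                                   counts["structural_decisions"]),
    }
```

Single-session values are noisy; as the text notes, these ratios earn their keep when compared across sessions.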

Section 3: drift trajectory, which includes counts and cumulative assessment, plus five subsections:

  1. Pivot points: turns where the conversation changed direction, identified from state replacement transitions.
  2. Concept trace summary: key traces with structural distance and ownership assessment.
  3. Session replay: a condensed timeline showing only turns where the conversation's structural condition changed — the diagnostic equivalent of an event log filtered to state transitions.
  4. Drift heatmap: the conversation binned into segments by drift intensity (stable, shifting, high drift), showing where the conversation was healthy and where it wasn't.
  5. Elaboration assessment: structural decisions retained, modified, and discarded, with a solicitation breakdown (counts per category — solicited, gap-responsive, unsolicited — with unsolicited-adopted-without-modification as the sharpest signal) and multi-turn elaboration chains.
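The heatmap binning can be sketched directly. The bin size and intensity thresholds (0 markers per bin is stable, 1-2 is shifting, 3 or more is high drift) are illustrative defaults, not values fixed by the template.

```python
def drift_heatmap(marker_counts: list, bin_size: int = 5) -> list:
    """Bin per-turn drift-marker counts into intensity segments.

    marker_counts: number of drift markers on each turn, in order.
    Returns (segment label, intensity) pairs.
    """
    bins = []
    for i in range(0, len(marker_counts), bin_size):
        total = sum(marker_counts[i:i + bin_size])
        label = "stable" if total == 0 else "shifting" if total <= 2 else "high"
        bins.append((f"turns {i + 1}-{min(i + bin_size, len(marker_counts))}",
                     label))
    return bins

print(drift_heatmap([0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 3, 1, 2, 0, 0]))
```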

The report ends with a plain-language final assessment written last, referencing the replay, metrics, traces, and heatmap.


Mapping: Generation Mechanisms to Detection Mechanisms

The six AI behaviors (generation) and seven drift patterns (detection) describe different sides of the same interaction. The behaviors describe what AI does. The patterns describe what happens to the user's thinking as a result. Two layers of generation taxonomy exist. Both are retained because they operate at different levels of analysis.

Layer 1 — Observable Behaviors

These are concrete things the AI does in practice. They were identified from conversation observation. They describe what you'd see if you watched an AI response being composed.

G1: Intellectual name-dropping. G2: Mythic / elevation language. G3: Conceptual collapsing. G4: Narrative framing instead of technical analysis. G5: Conversational anchoring. G6: Structural elaboration — the AI adds internal structure to a user concept without being asked, increasing the concept's resolution beyond what the user specified.

G6 was identified as missing when the original five behaviors were mapped to the detection patterns. The scope creep pattern (P04) was partially orphaned — it detected scope additions, but some additions were internal to the concept rather than external to its boundaries. Those internal additions are now captured by G6 on the generation side and P07 (elaborative expansion) on the detection side. G6's three generation drivers (G6-a gap-filling, G6-b sycophantic elaboration, G6-c completion pressure) distinguish the trigger for elaboration. G6-a is identifiable from transcript evidence; G6-b and G6-c may not be reliably distinguishable without access to the model's internal state. If field testing confirms they are not separable, they should be merged into a single "unsolicited" category. The distinction is retained because sycophancy research identifies RLHF-driven elaboration as a distinct phenomenon.

Layer 2 — Mechanism Categories

These group the observable behaviors by their type of effect. Layer 1 tells you what the AI did. Layer 2 tells you what kind of thing it did.

M1: Conceptual translation — the AI converts the user's concept into different terms or framing. M2: Structural elaboration — the AI increases the internal resolution of the user's concept. M3: Framework application — the AI imports external organizational schemes or references. M4: Scope extension — the AI expands beyond the boundaries the user stated. M5: Confidence transformation — the AI shifts the epistemic stance of the user's claims.

The Mapping

Behaviors map to mechanisms. Mechanisms map to detection patterns. The full chain:

G1 (name-dropping) → M3 (framework application) → P05 (connective capture), P06 (framework introduction).

G2 (elevation language) → M1 (conceptual translation) + M5 (confidence transformation) → P01 (vocabulary substitution), P03 (confidence injection). G2 maps to two mechanisms because elevation language both translates the concept and increases its apparent confidence. These are independently detectable.

G3 (conceptual collapsing) → M5 (confidence transformation) → P02 (premature resolution), P03 (confidence injection).

G4 (narrative framing) → M1 (conceptual translation) + M3 (framework application) → P01 (vocabulary substitution), P06 (framework introduction). Narrative framing both translates technical content into story terms and applies an external organizational scheme — narrative structure itself.

G5 (conversational anchoring) → M4 (scope extension) → P04 (scope creep).

G6 (structural elaboration) → M2 (structural elaboration) → P07 (elaborative expansion), P01 (vocabulary substitution, as side effect — the AI names the components it creates). G6's three generation drivers (gap-filling, sycophantic elaboration, completion pressure) all produce P07 through M2. The driver does not change the detection mechanism — P07 fires the same way regardless. The driver affects the severity model: P07 now uses a three-axis assessment (solicitation + decision authority + visibility) where the solicitation axis captures whether the elaboration was solicited, gap-responsive, or unsolicited. This connects the generation-side driver model to the detection-side severity model without creating new patterns or changing the detection logic.
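The full chains can be encoded as data, which also makes the coverage claim checkable. The structure below transcribes the mapping as stated; nothing in it is invented beyond the representation itself.

```python
# Each behavior's chain as stated above: (mechanisms, detection patterns).
CHAINS = {
    "G1": (["M3"], ["P05", "P06"]),         # name-dropping
    "G2": (["M1", "M5"], ["P01", "P03"]),   # elevation language
    "G3": (["M5"], ["P02", "P03"]),         # conceptual collapsing
    "G4": (["M1", "M3"], ["P01", "P06"]),   # narrative framing
    "G5": (["M4"], ["P04"]),                # conversational anchoring
    "G6": (["M2"], ["P07", "P01"]),         # structural elaboration
}

def coverage():
    """Return the mechanisms and patterns reachable from the behaviors."""
    mechanisms = {m for ms, _ in CHAINS.values() for m in ms}
    patterns = {p for _, ps in CHAINS.values() for p in ps}
    return mechanisms, patterns

mechs, pats = coverage()
assert mechs == {"M1", "M2", "M3", "M4", "M5"}    # all five mechanisms used
assert pats == {f"P{i:02d}" for i in range(1, 8)}  # all seven patterns reachable
```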

Cross-Mechanism Artifact: Vocabulary Substitution

P01 (vocabulary substitution) appears as a detection target for M1, M2, M3, and M4. It is not a generation mechanism — it is an observable artifact that occurs across most mechanisms. On the detection side, P01 remains a standalone pattern because it is independently detectable (term changes are visible without knowing the cause), it tracks adoption (did the user start using the AI's term?), and its severity depends on whether the user noticed the substitution, not on which mechanism produced it. This is why the drift pattern library lists it first.

Coverage

All seven detection patterns map to at least one generation mechanism. All five mechanism categories map to at least one detection pattern. Three potential blind spots have been identified: M1 (conceptual translation) without vocabulary change — the AI could reframe a concept while using the user's exact terms; M4 (scope extension) through omission — the AI could narrow scope by ignoring parts of the user's goals; and compound patterns that produce emergent effects beyond what either component pattern alone detects.

What Changed From the Earlier Mapping

The previous version of this section identified scope creep as "partially orphaned from generation mechanisms" and noted a gap: "something like elaborative expansion, where AI produces more than was asked for simply because producing more is how it demonstrates helpfulness." That gap is now closed. G6 (structural elaboration) is the generation mechanism. P07 (elaborative expansion) is the detection mechanism. The orphaned component of scope creep has been reclassified: internal structural additions are now captured by P07, while external scope additions remain under P04. The boundary between them — internal structure vs. external additions — is clean in theory but may not hold perfectly in practice. This is tagged as UNKNOWN-07-E in the pattern library and is a priority for field testing.

The drift type hierarchy aligns with the mechanism categories: semantic drift (P01, P05) maps to M1 and M3; epistemic drift (P02, P03) maps to M5; scope drift (P04) maps to M4; structural drift (P06, P07) maps to M2 and M3. M3 (framework application) feeds both semantic and structural drift depending on whether the framework primarily relabels or reorganizes. This dual mapping is expected — framework introduction is inherently both a semantic and structural operation.

TORQUE — Source Mapping

Supporting research for each document's core concepts, stepped through document by document. Vetted sources (.gov, university, peer-reviewed) are prioritized.


Sources: diagnostic-system-section.md

Session: manual compilation. Status: document 4 of 4 (excluding templates).

This is the overview/architecture document. It synthesizes concepts from the other system documents and introduces three additional cited references not present elsewhere. Most concepts are sourced in the prior three documents; this file focuses on what's new.


Explicitly Referenced Sources (New to This Document)

Sheridan & Verplank (1978) — Levels of automation

Referenced in: P07 decision-authority axis, informing the user_decided / user_delegated / undelegated classification.

Citation: Sheridan, T.B. & Verplank, W.L. (1978). Human and Computer Control of Undersea Teleoperators. Technical Report, Man-Machine Systems Laboratory, Department of Mechanical Engineering, Massachusetts Institute of Technology.

Links:

What it supports: Sheridan & Verplank's 10-level automation taxonomy was the first to formalize the idea that automation is not all-or-nothing. Their scale runs from "operator does it all" (level 1) through intermediate states where the computer suggests, recommends, or executes subject to approval, to "computer acts entirely autonomously" (level 10). The document's decision-authority axis (user_decided / user_delegated / undelegated) is a simplified adaptation of this taxonomy applied to conversational structural decisions. "User_decided" maps roughly to Sheridan levels 1-2 (human in control). "User_delegated" maps to levels 3-5 (computer suggests or executes with approval). "Undelegated" maps to levels 7-10 (computer acts and may or may not inform the operator). The key insight borrowed from this framework: the critical variable is not whether the AI acted, but whether the human was involved in the decision about whether and how the AI should act.
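The rough level mapping can be written down directly. This is an adaptation for illustration only: the text's mapping leaves level 6 unassigned, and this sketch folds levels 6-10 into the undelegated class.

```python
def authority_class(sheridan_level: int) -> str:
    """Map a Sheridan-Verplank automation level (1-10) to the document's
    decision-authority classes. Boundaries follow the rough mapping in
    the text; note the text assigns 7-10 to undelegated and leaves
    level 6 unassigned, which this sketch folds into undelegated.
    """
    if not 1 <= sheridan_level <= 10:
        raise ValueError("Sheridan-Verplank levels run 1-10")
    if sheridan_level <= 2:
        return "user_decided"    # human in control
    if sheridan_level <= 5:
        return "user_delegated"  # computer suggests or executes with approval
    return "undelegated"         # computer acts, may or may not inform

print(authority_class(4))  # user_delegated
```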

Shneiderman (2022) — Human-Centered AI

Referenced in: P07 decision-authority axis, alongside Sheridan & Verplank and Faraj et al.

Citation: Shneiderman, B. (2022). Human-Centered AI. Oxford University Press.

Links:

What it supports: Shneiderman's HCAI framework argues that high automation and high human control are not mutually exclusive — they can be designed to coexist on independent dimensions. The traditional levels-of-automation view (Sheridan & Verplank) treats them as a single dimension where more automation means less human control. Shneiderman's two-dimensional framework separates them, showing that systems can provide both high automation and high human control simultaneously. The document's system design reflects this: the automation handles measurement (detection, pattern matching, counting), while the human handles meaning (deciding what to keep, what to discard, and whether drift was beneficial). The decision-authority axis specifically implements this separation — it doesn't ask "was the AI involved?" (automation dimension) but "was the human involved in the decision?" (control dimension). Both can be high simultaneously: the AI can elaborate (high automation) while presenting alternatives for the user to evaluate (high human control).

Faraj, Pachidi, & Sayegh (2018) — Working and organizing in the age of the learning algorithm

Referenced in: P07 decision-authority axis.

Citation: Faraj, S., Pachidi, S., & Sayegh, K. (2018). Working and organizing in the age of the learning algorithm. Information and Organization, 28(1), 62-70.

Links:

What it supports: Faraj et al. examine how learning algorithms change work organization, identifying a spectrum from advisory systems (AI recommends, human decides) to delegated authority (AI operates autonomously within defined parameters, human monitors exceptions). Their framework emphasizes that decision authority in human-AI systems is distributed across a sociotechnical network and should be explicitly designed rather than left to emerge by default. The document's decision-authority axis operationalizes this for conversational structural decisions: when structural authority is "undelegated," it has emerged by default rather than being explicitly designed. The system makes this visible so the user can explicitly decide whether to retain, modify, or discard the AI's structural choices.

Jain et al. (2026) — [Referenced but not searchable]

The document references "Jain et al. 2026" alongside the sycophancy literature. This is a future or very recent reference that I was unable to locate. It may be a preprint, a working paper, or a reference the document anticipates. If this is a paper you have access to, it should be added to the source list when available.


Sources Carried Forward from Other Documents

The diagnostic-system-section.md synthesizes concepts documented in detail elsewhere. Cross-references:

Generation mechanisms (G1-G6): See sources-generation-detection-mapping.md. Key sources: Sharma et al. 2024 (sycophancy/RLHF), Malmqvist 2024 (sycophancy survey), Saito et al. 2023 (verbosity bias), McCoy et al. 2024 (autoregressive bias).

Detection patterns (P01-P07): See sources-drift-pattern-library.md. Key sources: MacLean et al. 1991 (QOC framework), Graber et al. 2005 (premature closure), Tversky & Kahneman 1981 (framing effects), Xiong et al. 2024 (LLM overconfidence).

Procedural methodology: See sources-manual-analysis-procedure.md. Key sources: Loftus & Palmer 1974 (misinformation effect), Krippendorff 2018 (content analysis), Finlay 2002 (reflexivity).


Concepts Unique to This Document

The helpfulness-engagement conflict (The Premise)

The document opens with: "AI has two primary functions: be helpful and keep the user engaged. These functions are not always compatible."

This is a restatement of the RLHF alignment problem described in the sycophancy literature. Sharma et al. (2024) demonstrate that preference models sometimes favor sycophantic responses over correct ones because human evaluators rate agreeable, detailed responses higher. The "helpfulness vs. engagement" framing is the document's operational version of this: when the AI elaborates, expands scope, or elevates language, it may be optimizing for engagement (higher preference scores) rather than for the user's stated goal.

No additional source is needed beyond what's already cited in the generation-detection-mapping sources file.

Event-based vs. state-based tracking (Conversation State Log)

The document distinguishes between event-based tracking (drift markers on individual turns) and state-based tracking (macro condition of the conversation at each turn). This is described as analogous to "having logs without system state."

This distinction comes from systems monitoring and observability engineering. In software systems, events (individual actions) and state (current system condition) are complementary data types. The event log tells you what happened; the state tells you the system's condition when it happened and afterward. The document's innovation is applying this to conversation analysis: a drift marker (event) on turn 7 is more meaningful when you know that by turn 12, the dominant vocabulary had already shifted (state).

No single academic source is needed for this well-established engineering distinction. It appears in standard systems monitoring texts and observability frameworks (e.g., Charity Majors, Liz Fong-Jones, & George Miranda, Observability Engineering, O'Reilly 2022).

Session boundary problem (Concept Registry)

The document identifies the "session boundary problem": per-session analysis treats conversations independently, but ideas carry over. An AI-introduced framework that persists across three sessions without evaluation is a different phenomenon than one that appeared once and was discarded.

This is a longitudinal tracking problem. See sources-manual-analysis-procedure.md for Saldaña (2003) on longitudinal qualitative analysis. The concept registry is the document's solution to this problem — it accumulates across sessions, making cross-session persistence visible.

The mapping's iterative discovery process

The document describes how G6 was "identified as missing" when the original five behaviors were mapped to detection patterns and an orphaned component was found. This is a documented internal development process, not a claim requiring external sourcing. It describes how the taxonomy was refined through use, which is standard for iterative framework development.