Manual Analysis Procedure

Step-by-step instructions for running the diagnostic pipeline by hand on a completed or in-progress AI conversation.

Before You Start

You need:

  1. The full conversation transcript (completed or in-progress).
  2. A blank session document and conversation state log.
  3. A blank turn log and concept trace log.
  4. The drift pattern library and the report template.

Step 1 — Fill the Session Document

Do this from memory, or from notes you made before the conversation started. Do not re-read the conversation first. The goal is to capture what you intended before the conversation had a chance to reshape your memory of that intent.

Write your goals in the words you would have used before the conversation started. If you can't remember your original phrasing, that itself is data — it may mean the AI's framing has already replaced yours.

Record your methods. How were you planning to approach this?

List your vocabulary. What terms were you thinking in? Use the rough, informal versions. If you were thinking "filtering system" before the conversation and "diagnostic pipeline" after, record "filtering system."

Record your structural assumptions. What level of structural detail did you have going in? If you were thinking "a filtering system" with no internal breakdown, record that. If you already had "three steps: capture, compare, report," record that. This becomes the elaboration baseline.

If you wrote anything down before the conversation (notes, a message to yourself, a prior document), use that as your source. It's more reliable than your post-conversation memory.

After filling the session document, initialize the conversation state log. Copy your primary goal, topic domain, and structural assumptions into state_0. Set dominant_vocabulary: user, dominant_framework: none (or your pre-existing framework if you had one), confidence_level: baseline, and concept_owner: user. This is the state you're measuring all subsequent changes against.
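
The initial state can be sketched as a plain dictionary. This is a minimal illustration using the field names from this step; `init_state` and the example values are hypothetical, and your state log schema may differ.

```python
def init_state(primary_goal, topic_domain, structural_assumptions):
    """Build state_0: the baseline all subsequent state changes are measured against."""
    return {
        "active_goal": primary_goal,
        "topic_domain": topic_domain,
        "structural_assumptions": structural_assumptions,  # elaboration baseline
        "dominant_vocabulary": "user",
        "dominant_framework": None,   # or your pre-existing framework, if any
        "confidence_level": "baseline",
        "concept_owner": "user",
    }

state_0 = init_state(
    primary_goal="build a filtering system",
    topic_domain="log processing",
    structural_assumptions="no internal breakdown yet",
)
```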

Step 2 — Segment the Conversation into Turns

Paste each message as a separate turn entry in the turn log. Alternate user and AI. Number them sequentially. Preserve the full text in raw_text. Do not summarize, trim, or clean up.
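
A turn entry needs only three fields at this stage. The sketch below assumes a simple in-memory representation; `Turn` and `segment` are illustrative names, not part of any prescribed tooling.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    index: int    # sequential turn number
    speaker: str  # "user" or "ai", alternating
    raw_text: str # full message text, unmodified

def segment(messages):
    """messages: list of (speaker, text) pairs in conversation order."""
    return [Turn(i, s, t) for i, (s, t) in enumerate(messages, start=1)]
```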

Step 3 — Process User Turns (Forward Pass)

Go through each user turn in order. For each one:

Extract the concepts you introduced. Write them in your own phrasing, not the AI's.

Record the terms you used. Check each term against the vocabulary baseline. If you used a term that first appeared in an AI turn, mark it as novel and note which AI turn introduced it. This is adoption tracking.
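
Adoption tracking can be mechanized as a lookup against the baseline. A minimal sketch, assuming terms are compared case-insensitively by substring; `novel_terms` and its argument shapes are illustrative only.

```python
def novel_terms(turn_terms, baseline_terms, prior_ai_turns):
    """Flag terms absent from the vocabulary baseline and find the AI turn
    that first introduced each one.

    prior_ai_turns: list of (turn_number, text) pairs in order."""
    baseline = {t.lower() for t in baseline_terms}
    flagged = {}
    for term in turn_terms:
        if term.lower() in baseline:
            continue  # part of your original vocabulary; not novel
        origin = next(
            (n for n, text in prior_ai_turns if term.lower() in text.lower()),
            None,
        )
        flagged[term] = origin  # None means novel but not traceably AI-introduced
    return flagged
```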

Tag your assertions with confidence level. Were you hedging, neutral, or assertive? Preserve the hedging markers — "maybe," "I think," "not sure," "could."

Note if you're operating within a framework the AI introduced. Are you referencing layers, phases, or categories that came from the AI rather than from your original thinking?

Note if you're operating within structural elaboration the AI introduced. Are you referencing decompositions, quantities, sequences, or categorizations that came from the AI? This is distinct from framework adoption — you may be using the AI's internal structure for your concept without using an external framework. Check against the structural assumptions you recorded in Step 1.

After processing each user turn's content, update the state log. For user turns, the most common state changes are: dominant_vocabulary shifting to mixed (you've started using AI terms), concept_owner shifting to collaborative or ai (you're working within AI-originated structure), and confidence_level escalating (you're treating tentative ideas as settled). Record any changes in the turn log's state_delta block.

Step 4 — Process AI Turns (Comparison Pass)

Go through each AI turn in order. For each one, compare against the session baseline and the preceding user turn.

Vocabulary check: Did the AI use your terms or substitute its own? For each substitution, record the user term and the AI term. Check subsequent user turns: did you adopt the substitute?

Confidence check: Did the AI's restatement of your ideas carry the same certainty you expressed? Look for missing hedges. Look for assertive phrasing replacing tentative phrasing. Record the shift direction.

Scope check: Count the concrete components in the preceding user turn. Count the concrete components in the AI turn. If the AI added components, check each against your declared goals and methods. Tag anything that doesn't trace back as a scope addition.

Framework check: Did the AI introduce structure you didn't ask for? Named layers, numbered phases, categorized lists, architectural patterns? Record what was introduced.

Resolution check: Did you express uncertainty that the AI collapsed? Look for user hedges followed by AI assertions on the same topic.

Connective check: Did the AI link your idea to a named field, methodology, or established framework you didn't reference? Record the connection and whether you evaluated or accepted it.

Elaboration check: Did the AI add internal structure to your concept without changing the concept, moving outside its scope, or importing an external framework?

For each instance:

  1. Identify the base concept from your message.
  2. Classify the solicitation status. Read your preceding message before looking at the AI's response. Ask: did I explicitly request decomposition, structure, or elaboration? If yes: solicited. If no: does my concept have a specific, identifiable structural gap — a stated need, an acknowledged missing piece, a direct question about internal organization? If yes: gap_responsive. If neither: unsolicited.
  3. Identify each structural decision the AI made: quantities chosen, sequences defined, decompositions introduced, categorizations applied.
  4. For each structural decision, classify decision authority: did you make this choice explicitly in a prior message (user_decided)? Did you hand it off to the AI (user_delegated)? Or did the AI make it without you ever addressing it (undelegated)? The test: search your prior messages for any statement that addresses this specific structural choice. If you find one, it's user_decided. If you find a delegation ("break this down," "organize this however"), it's user_delegated. If you find neither, it's undelegated.
  5. Assess visibility using QOC reconstruction. Read the AI's turn and attempt to reconstruct three elements: did the AI frame the elaboration as a structural choice being made (Question)? Did the AI present more than one structural option (Options)? Did the AI provide criteria for evaluating between them (Criteria)? Classify as none (no Q/O/C — AI presented structure as given), partial (Q present but O or C missing — user knows alternatives exist but can't evaluate), or full (Q + O + C — user can evaluate the choice). Most elaboration is none — the AI states structure as fact.
  6. In subsequent user turns, check adoption: did you engage with the elaborated structure as given, modify it, or discard it?
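
The classification rules in steps 2, 4, and 5 above reduce to three small decision functions. This is an illustrative sketch of the logic only; the function names and boolean inputs are assumptions, and the real judgment calls (was there a structural gap? was the choice ever addressed?) still happen when you read the transcript.

```python
def classify_solicitation(explicitly_requested, structural_gap_identified):
    """Step 2: solicited / gap_responsive / unsolicited."""
    if explicitly_requested:
        return "solicited"
    return "gap_responsive" if structural_gap_identified else "unsolicited"

def classify_authority(user_stated_choice, user_delegated):
    """Step 4: user_decided / user_delegated / undelegated."""
    if user_stated_choice:
        return "user_decided"
    return "user_delegated" if user_delegated else "undelegated"

def classify_visibility(question, options, criteria):
    """Step 5 (QOC reconstruction): none / partial / full."""
    if question and options and criteria:
        return "full"
    return "partial" if question else "none"
```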

Note on compound detection: elaboration frequently co-occurs with confidence injection (the AI presents its structural decisions assertively) and vocabulary substitution (the AI names the structural components it invented). When you detect elaboration, re-check the same turn for Patterns 01 and 03. See UNKNOWN-07-C in the pattern library. The highest-severity compound is unsolicited elaboration + confidence injection: the AI generated structure it wasn't asked for and presented it as settled. When you detect this compound, flag it explicitly in the turn log — it warrants specific attention in the report.

Note on boundary with scope creep: if you're uncertain whether an addition is elaboration (internal structure added to an existing concept) or scope creep (new concept added outside stated goals), tag both and resolve during report generation. See UNKNOWN-07-E in the pattern library.

After processing each AI turn's drift markers, update the state log. For AI turns, check all state fields: did the active_goal shift (the turn worked on something other than the declared goal)? Did dominant_vocabulary move toward ai? Did dominant_framework change? Did structural_resolution increase (elaboration)? Did concept_owner shift? Record state changes and classify the transition. If the transition is replacement, set pivot_flag: true in the turn log. If drift markers from two or more hierarchy categories fired in this turn (e.g., semantic + structural), also set pivot_flag: true.
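
The pivot_flag rule stated here is mechanical enough to express directly. A sketch, assuming the transition label and the set of fired hierarchy categories have already been determined by hand:

```python
def pivot_flag(transition, marker_categories):
    """transition: the classified state transition, e.g. "drift" or "replacement".
    marker_categories: hierarchy categories whose drift markers fired this turn,
    e.g. {"semantic", "structural"}."""
    return transition == "replacement" or len(marker_categories) >= 2
```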

Step 5 — Map Drift Markers to Patterns

With all turns processed, review the drift markers. For each marker, identify which pattern from the drift library it matches. A single turn can match multiple patterns.

Note co-occurrences. Vocabulary substitution plus confidence injection on the same concept in the same turn is a stronger signal than either alone. Elaborative expansion plus confidence injection means structural decisions were made and presented as settled.

Use the drift type hierarchy to check for within-category co-occurrences you might have missed. If you tagged P01 (vocabulary substitution), check for P05 (connective capture) — both are semantic drift. If you tagged P06 (framework introduction), check for P07 (elaborative expansion) — both are structural drift.

Step 6 — Construct Concept Traces

With all turns processed and drift markers tagged, build the concept trace log.

Scan all drift markers across all turns. Every concept that appears in any drift marker is a trace candidate. For each candidate:

  1. Find the concept's first appearance. Record origin_turn and origin_speaker.
  2. Walk forward through turns. At each turn where the concept appears in drift_markers, add a transformation entry. Record what changed, who changed it, and which detection pattern applies.
  3. Find the adoption point: the turn where the non-originating party first uses the transformed version without pushback. Classify adoption as explicit (acknowledged), implicit (used without comment), or unknown.
  4. Find the dominance point: the turn where the transformed version becomes the working version and the original form is no longer referenced.
  5. Compare the concept's origin_form to its final_form. Assign structural_distance.
  6. Assess ownership: who made the decisions that shaped the final form?

Not every concept needs a full trace. Focus on concepts with multiple transformations or concepts where ownership is ambiguous. Single-transformation concepts can be noted briefly.
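
Finding the adoption point (step 3 above) is a simple forward scan once the trace entries exist. A minimal sketch; the entry fields are assumed, and the caller is expected to pass only the turns by the non-originating party:

```python
def adoption_point(trace_turns):
    """trace_turns: ordered list of dicts with keys
    "turn", "uses_transformed", and "pushback".
    Returns the first turn where the transformed version is used
    without pushback, or None if no adoption occurred."""
    for entry in trace_turns:
        if entry["uses_transformed"] and not entry["pushback"]:
            return entry["turn"]
    return None
```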

Step 7 — Compute Drift Metrics

Before building the report, compute the following metrics from your processed data. These provide a quantitative summary that complements the qualitative analysis.

vocabulary_stability = baseline_terms_still_in_use / total_baseline_terms
  # How much of your original vocabulary survived the conversation.
  # 1.0 = all your terms survived. 0.5 = half were replaced.

goal_alignment = goals_addressed / total_declared_goals
  # How many of your declared goals were actually served.
  # Count partially_addressed as 0.5.

framework_drift = ai_introduced_frameworks_adopted / total_frameworks_in_use
  # What proportion of the organizational structures you ended with
  # came from the AI.
  # 0 = all frameworks are yours. 1 = all frameworks are the AI's.

confidence_inflation = assertive_restatements / total_restatements
  # How often the AI returned your ideas at higher confidence.

elaboration_retention = structural_decisions_retained / total_structural_decisions_by_ai
  # How many of the AI's structural decisions you kept.
  # High ratio is ambiguous — see report template for interpretation.

concept_ownership_ratio = user_owned_concepts / total_active_concepts
  # At conversation end, how many active concepts are still yours.

These are simple ratios derived from counts you already have. If a metric can't be computed because the relevant data isn't available, skip it. The metrics are useful in aggregate (comparing across sessions) and for quick health assessment. They are not substitutes for the qualitative analysis.
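
As a sketch, the six ratios can be computed in one pass over the counts, reusing the variable names from the formulas above. The zero-denominator handling implements the "skip it if it can't be computed" rule; `drift_metrics` and the `goals_partially_addressed` count name are hypothetical helpers, not part of the procedure's required tooling.

```python
def drift_metrics(c):
    """c: dict of raw counts from the processed turn logs.
    Each ratio is None when its denominator is zero."""
    def ratio(num, den):
        return num / den if den else None
    return {
        "vocabulary_stability": ratio(
            c["baseline_terms_still_in_use"], c["total_baseline_terms"]),
        "goal_alignment": ratio(  # partially addressed goals count as 0.5
            c["goals_addressed"] + 0.5 * c["goals_partially_addressed"],
            c["total_declared_goals"]),
        "framework_drift": ratio(
            c["ai_introduced_frameworks_adopted"], c["total_frameworks_in_use"]),
        "confidence_inflation": ratio(
            c["assertive_restatements"], c["total_restatements"]),
        "elaboration_retention": ratio(
            c["structural_decisions_retained"], c["total_structural_decisions_by_ai"]),
        "concept_ownership_ratio": ratio(
            c["user_owned_concepts"], c["total_active_concepts"]),
    }
```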

Step 8 — Build the Report

Section 1: Intent Alignment

Go back to your session document. For each goal, find the turns that addressed it. Assign a status: addressed, partially_addressed, or not_addressed.

List unplanned outcomes. These are things the conversation produced that weren't in your goals. Tag who introduced them and whether you adopted them.

Section 2: Per-Turn Diff

For each turn that contains drift markers, write a one-line description of what changed, assign a severity, and reference the pattern.

For elaborative expansion entries, use the three-axis severity assessment from the pattern library (solicitation axis + decision-authority axis + visibility axis). When a structural decision is classified as undelegated and the elaboration was adopted without modification, flag it regardless of other factors — undelegated adoption is the primary risk signal. All three axes are transcript-checkable.

Section 2.5: Drift Metrics

Copy the metrics computed in Step 7 into the report. Add interpretation notes where a metric is ambiguous (elaboration_retention in particular).

Section 3: Drift Trajectory

Fill in the counts from your drift markers. Assess the cumulative trajectory: did drift increase over the conversation, stay stable, spike at a particular turn, or decrease?

Identify the dominant pattern and any co-occurrences that amplified impact.

For elaborative expansion specifically: note whether elaborations accumulated across turns on the same concept (see UNKNOWN-07-B). If the AI elaborated a concept incrementally over multiple turns, the per-turn severity may be low but the cumulative structural distance from your original concept may be high.

Fill in the pivot points from turns flagged with pivot_flag: true in the turn log. For each pivot, describe what changed and what triggered it.

Summarize the key concept traces from Step 6. Focus on concepts with high structural distance or ambiguous ownership.

Build the session replay by walking the state log and listing every turn where a meaningful state transition occurred. This is a condensed view — one line per event, only turns where something changed.

Build the drift heatmap by binning consecutive turns into segments based on drift intensity. Stable runs get one entry. Active drift periods get one entry per intensity level.
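
The binning rule can be sketched as a run-length collapse over per-turn intensity labels: consecutive turns at the same intensity become one segment. The labels here (`stable`, `low`, `high`) are illustrative; use whatever intensity scale your analysis produced.

```python
def heatmap_segments(intensities):
    """intensities: per-turn drift intensity labels, in turn order.
    Returns one segment per run of identical intensity."""
    segments = []
    for turn, level in enumerate(intensities, start=1):
        if segments and segments[-1]["level"] == level:
            segments[-1]["end"] = turn  # extend the current run
        else:
            segments.append({"level": level, "start": turn, "end": turn})
    return segments
```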

Write the final assessment last.

Step 9 — Validate the Report

Read your final assessment. Then re-read your original session document. Ask: Does the report serve the goals I declared at the start? Has the analysis itself introduced any framing I didn't start with?

That last question matters. The diagnostic process is also a conversation with a document. If the structure of the report is shaping your conclusions, the tool has the same problem it's designed to detect.

Step 10 — Update the Concept Registry

If you are maintaining a concept registry across sessions:

  1. For each concept in this session's concept traces, check whether it exists in the registry.
  2. If it exists: update sessions_seen, current_form, and add a drift_history entry with a brief transformation summary.
  3. If it doesn't exist: create a new registry entry with origin information from this session's trace.
  4. Review ai_originated_concepts_in_active_use — these are the concepts most worth scrutinizing across sessions.

This step is optional for one-off conversations. It becomes valuable when analyzing a series of conversations on the same project or topic, where concepts carry over and accumulate transformations that the per-session pipeline can't see.

TORQUE — Source Mapping

Supporting research for each document's core concepts. Vetted sources are prioritized (.gov, university, peer-reviewed). Worked through document by document.


Sources: manual-analysis-procedure.md

session: manual compilation
status: document 3 of 4 (excluding templates)

This document is a procedural manual. Its detection patterns and severity models are sourced in the drift-pattern-library and generation-detection-mapping source files. The sources below cover the methodological design choices unique to this procedure: why the steps are ordered the way they are, why certain things are measured, and what research supports the analytical approach.


Step 1 — Pre-Reading Memory Capture

The instruction: fill the session document from memory before re-reading the conversation. The rationale: "the AI's framing may have already replaced yours."

Post-event information and the misinformation effect

The procedure's concern is that re-reading the AI conversation before recording your original intent will contaminate your memory of that intent. This is a direct application of the misinformation effect.

Loftus, E.F. & Palmer, J.C. (1974). Reconstruction of automobile destruction: An example of the interaction between language and memory. Journal of Verbal Learning and Verbal Behavior, 13(5), 585-589.

Loftus, E.F., Miller, D.G., & Burns, H.J. (1978). Semantic integration of verbal information into a visual memory. Journal of Experimental Psychology: Human Learning and Memory, 4(1), 19-31.

Loftus, E.F. (2025/ongoing). The history of an idea: The misinformation effect. Legal and Criminological Psychology.

Blank, H. & Launay, C. (2014). How to protect eyewitness memory against the misinformation effect: A meta-analysis of post-warning studies. Journal of Applied Research in Memory and Cognition, 3(2), 77-88.

Recall bias in self-report methodology

Hassan, E. (2005). Recall Bias can be a Threat to Retrospective and Prospective Research Designs. The Internet Journal of Epidemiology, 3(2).

Walter, S.D. (1990). Recall bias in epidemiologic studies. Journal of Clinical Epidemiology, 43(12), 1431-1432.


Steps 3-4 — Forward Pass / Comparison Pass Structure

The procedure processes user turns first (forward pass for adoption tracking), then AI turns (comparison pass against baseline). This two-pass structure has no direct precedent in a single published method, but draws on several analytical traditions.

Content analysis methodology

The forward pass / comparison pass structure is a form of structured content analysis with a baseline comparison design. The baseline (session document) is established first, then each unit of analysis (turn) is coded against it.

Krippendorff, K. (2018). Content Analysis: An Introduction to Its Methodology, 4th ed. Sage.

Protocol analysis / think-aloud methodology

The procedure's requirement to record raw_text unmodified and to preserve hedging markers parallels protocol analysis methodology.

Ericsson, K.A. & Simon, H.A. (1993). Protocol Analysis: Verbal Reports as Data, revised ed. MIT Press.


Step 6 — Concept Tracing

The concept trace tracks a concept from first appearance through successive transformations to its final form, noting who made each change and whether the non-originating party adopted it.

Provenance tracking in information systems

The concept trace is structurally similar to data provenance tracking — recording the origin and transformation history of a data object.

Simmhan, Y.L., Plale, B., & Gannon, D. (2005). A Survey of Data Provenance in e-Science. ACM SIGMOD Record, 34(3), 31-36.

Change tracking in collaborative editing

The concept trace also parallels revision history in collaborative document editing, where each change is attributed to a specific author.

The procedure's ownership classification (user | ai | collaborative) at each transformation point mirrors the attribution model in version control systems: who made this change, and was it accepted by the other party?


Step 7 — Drift Metrics

The six metrics (vocabulary_stability, goal_alignment, framework_drift, confidence_inflation, elaboration_retention, concept_ownership_ratio) are simple ratios computed from counts already present in the turn logs.

Measurement approach

These are descriptive summary statistics, not inferential measures. They don't have individual validation studies. Their value is in cross-session comparison — tracking whether vocabulary_stability trends downward across multiple conversations on the same project, for instance.

The closest methodological parallel is the use of simple metrics in software code review: lines changed, files touched, review coverage percentage. These are not individually diagnostic but useful in aggregate for identifying trends.

The document explicitly states: "They are not substitutes for the qualitative analysis." This positions the metrics as screening tools, not diagnostic instruments — consistent with how simple ratios are used in clinical screening (high sensitivity, low specificity, used to flag cases for detailed review).


Step 9 — Self-Referential Validation

The instruction: "Has the analysis itself introduced any framing I didn't start with? The diagnostic process is also a conversation with a document. If the structure of the report is shaping your conclusions, the tool has the same problem it's designed to detect."

Reflexivity in qualitative research

This is a reflexivity check — standard in qualitative research methodology.

Finlay, L. (2002). "Outing" the Researcher: The Provenance, Process, and Practice of Reflexivity. Qualitative Health Research, 12(4), 531-545.

Malterud, K. (2001). Qualitative research: standards, challenges, and guidelines. The Lancet, 358(9280), 483-488.

Observer effect / measurement reactivity

The broader concern — that measuring drift changes how you think about your ideas — is a version of measurement reactivity.

Webb, E.J., Campbell, D.T., Schwartz, R.D., & Sechrest, L. (1966). Unobtrusive Measures: Nonreactive Research in the Social Sciences. Rand McNally.


Step 10 — Cross-Session Concept Registry

The instruction to maintain a concept registry across sessions for tracking cumulative drift is methodologically closest to longitudinal qualitative coding.

Longitudinal qualitative analysis

Saldaña, J. (2003). Longitudinal Qualitative Research: Analyzing Change Through Time. AltaMira Press.

The registry's "ai_originated_concepts_in_active_use" flag — marking concepts introduced by the AI that the user continues to use — is a specific form of longitudinal adoption tracking. No published method uses exactly this construct, but the underlying principle (tracking the persistence and evolution of externally-introduced concepts over time) is standard in longitudinal qualitative research.


Coverage Notes

Well-supported design choices: Step 1 (pre-reading memory capture) has the strongest backing. The misinformation effect is one of the most replicated findings in cognitive psychology, with over 50 years of research. The procedure's application — treating the AI conversation as post-event information that can contaminate recall of original intent — is a straightforward and well-grounded adaptation. Step 9 (reflexivity check) is supported by standard qualitative research methodology.

Supported by methodological analogy: Steps 3-4 (forward/comparison pass) follow established content analysis methodology. Step 6 (concept tracing) adapts data provenance and revision history concepts. Step 10 (concept registry) adapts longitudinal qualitative analysis methods. None of these are direct applications of existing methods — they are adaptations to a novel analytical context (human-AI conversational drift).

Unsupported by existing research: Step 7 (drift metrics) introduces six metrics with no individual validation. They are positioned as descriptive screening tools, not diagnostic instruments, which is appropriate given the lack of validation data. The specific thresholds at which these metrics become concerning are undefined — this is an acknowledged limitation.

The procedure as a whole has no validation study. It is a structured analytical protocol that has been designed from first principles, drawing on established research in memory, content analysis, cognitive bias, and qualitative methodology, but it has not been field-tested. The document's own UNKNOWN entries (07-A through 07-E) acknowledge this.