Output

Diagnostic Report

report_id:
session_id:
generated_at:
turns_analyzed:

Section 1 — Intent Alignment

Compare the session's declared goals against what the conversation actually produced.

Goals Addressed

goal_id:
  status: addressed | partially_addressed | unaddressed | redirected
  evidence: [turn_ids]
  notes:

goal_id:
  status:
  evidence:
  notes:

Unplanned Outcomes

Things that emerged from the conversation that were not in the original goals. List each with the turn where it first appeared and who introduced it.

outcome_01:
  description:
  origin_turn:
  origin_speaker: user | ai
  subsequently_adopted: true | false

outcome_02:
  description:
  origin_turn:
  origin_speaker: user | ai
  subsequently_adopted: true | false

Intent Alignment Summary

In your own words: did this conversation serve your original goals? Where did it diverge? Was the divergence your choice or did it happen without a conscious decision?

summary: |

Section 2 — Per-Turn Diff

One entry per turn that contains a detected delta. Turns with no delta are omitted.

turn_id:
  delta_type: scope | vocabulary | confidence | framework | resolution | connective | elaboration
  description:
  severity: minor | moderate | major
  pattern_ref: [pattern number from drift library]
  # For elaboration deltas, include the following:
  elaboration_detail:
    structural_decisions_count:
    decision_authority_classifications: [user_decided | user_delegated | undelegated]
    visibility: none | partial | full
    # QOC reconstruction: none = AI presented structure as given;
    # partial = choice acknowledged but options/criteria missing;
    # full = options and criteria presented.

turn_id:
  delta_type:
  description:
  severity:
  pattern_ref:

Severity guide:

minor — delta is present but the user's original idea is still intact and recoverable.

moderate — delta has shifted the direction of the conversation and the user has partially adopted the change.

major — delta has replaced the user's original framing and the user is now operating within the AI's framing.

For elaborative expansion, severity uses the three-axis assessment:

minor — solicited or gap_responsive, user-decided or user-delegated structural decisions, or visibility full/partial and user could evaluate.

moderate — gap_responsive or unsolicited, undelegated structural decisions, visibility partial or full, user engaged with elaboration. Also: solicited elaboration where the response significantly exceeded what was delegated.

major — unsolicited, undelegated structural decisions, visibility none, user adopted without modification.
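The three-axis assessment above can be read as a decision procedure. The sketch below is one illustrative way to encode it in Python; the argument names and the rule precedence are assumptions, since the guide leaves boundary cases to analyst judgment.

```python
def elaboration_severity(solicitation, authority, visibility, user_response):
    """Classify an elaboration delta as minor / moderate / major.

    solicitation:  "solicited" | "gap_responsive" | "unsolicited"
    authority:     "user_decided" | "user_delegated" | "undelegated"
    visibility:    "none" | "partial" | "full"
    user_response: "evaluated" | "engaged" | "adopted_unmodified"
    (field values mirror the template; the function itself is a sketch)
    """
    # major: unsolicited, undelegated, invisible, adopted as-is
    if (solicitation == "unsolicited" and authority == "undelegated"
            and visibility == "none" and user_response == "adopted_unmodified"):
        return "major"
    # minor: requested or gap-filling, with decision authority kept by
    # or explicitly delegated by the user
    if (solicitation in ("solicited", "gap_responsive")
            and authority in ("user_decided", "user_delegated")):
        return "minor"
    # minor: the choice was visible enough that the user could evaluate it
    if visibility in ("partial", "full") and user_response == "evaluated":
        return "minor"
    # everything between the two poles
    return "moderate"
```

A solicited elaboration that significantly exceeded its delegation would need a separate check; this sketch does not model it.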

Section 2.5 — Drift Metrics

Computed from turn log data during Step 7 of the analysis procedure. These provide a quantitative snapshot. They supplement but do not replace the qualitative per-turn analysis.

vocabulary_stability:
  # baseline_terms_still_in_use / total_baseline_terms
  # 1.0 = all original terms survived. Lower = more substitution.

goal_alignment:
  # goals_addressed / total_declared_goals (partial = 0.5)

framework_drift:
  # ai_frameworks_adopted / total_frameworks_in_use
  # 0 = all yours. 1 = all AI's.

confidence_inflation:
  # assertive_restatements / total_restatements

elaboration_retention:
  # structural_decisions_retained / total_structural_decisions_by_ai
  # Ambiguous — high may mean productive elaboration or uncritical adoption.

concept_ownership_ratio:
  # user_owned_concepts / total_active_concepts at conversation end

Interpretation notes: these metrics are most useful for cross-session comparison. A single session's metrics mean little in isolation. Track them across sessions to identify trends — is vocabulary_stability decreasing over time? Is concept_ownership_ratio shifting? Trends are more informative than snapshots.
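Once the turn log has been reduced to counts, the six ratios can be computed mechanically. A minimal sketch, assuming a flat dict of counts; all key names here are illustrative, not prescribed by the template:

```python
def drift_metrics(log):
    """Compute the six drift ratios from a dict of turn-log counts."""
    def ratio(num, den):
        # A zero denominator means the metric is not computable this session.
        return round(num / den, 2) if den else None

    return {
        "vocabulary_stability": ratio(log["baseline_terms_still_in_use"],
                                      log["total_baseline_terms"]),
        # partially addressed goals count as 0.5, per the template comment
        "goal_alignment": ratio(log["goals_addressed"] + 0.5 * log["goals_partial"],
                                log["total_declared_goals"]),
        "framework_drift": ratio(log["ai_frameworks_adopted"],
                                 log["total_frameworks_in_use"]),
        "confidence_inflation": ratio(log["assertive_restatements"],
                                      log["total_restatements"]),
        "elaboration_retention": ratio(log["structural_decisions_retained"],
                                       log["total_structural_decisions_by_ai"]),
        "concept_ownership_ratio": ratio(log["user_owned_concepts"],
                                         log["total_active_concepts"]),
    }
```

Storing the returned dict per session is what makes the cross-session trend comparison above possible.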

Section 3 — Drift Trajectory

Counts

total_vocabulary_substitutions:
total_adopted_substitutions:
total_scope_additions:
total_framework_introductions:
total_confidence_shifts:
total_premature_resolutions:
total_connective_captures:
total_elaborative_expansions:
total_elaboration_solicited:
total_elaboration_gap_responsive:
total_elaboration_unsolicited:
total_elaboration_unsolicited_adopted:
total_elaboration_structural_decisions:
total_elaboration_undelegated:
# UNKNOWN-07-B: current counts are per-turn. If the same concept was
# elaborated across multiple turns, the cumulative structural distance
# is not captured by these counts alone. Note any multi-turn
# elaboration chains in the trajectory assessment.

Trajectory Assessment

cumulative_drift: low | moderate | high
peak_drift_turn:
trajectory_direction: stable | increasing | decreasing | spike

Dominant Pattern

Which drift pattern appeared most frequently or had the highest impact?

dominant_pattern:
frequency:
impact_assessment: |

Pattern Co-occurrence

Did patterns combine in ways that amplified their effect?

co_occurrences:
  - patterns: [list]
    turns: [turn_ids]
    combined_effect: |

Known compound to check for: elaborative expansion + confidence injection (Pattern 07 + Pattern 03). See UNKNOWN-07-C in pattern library. When the elaboration is also unsolicited (solicitation_status: unsolicited), this compound is the highest-severity structural drift the system tracks. Flag it explicitly if detected.

Pivot Points

Turns flagged with pivot_flag: true in the turn log. These are the moments where the conversation changed direction — either through state replacement or cross-category pattern co-occurrence.

pivots:
  - turn_id:
    state_fields_changed: [list]
    drift_categories_present: [semantic | epistemic | scope | structural]
    description: |
      [one sentence: what changed and what caused it]

  - turn_id:
    state_fields_changed:
    drift_categories_present:
    description: |

If no pivots were detected, the conversation maintained a stable trajectory. Note this — stability is data too.
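The pivot criteria above (state replacement, or drift markers spanning more than one category) can be sketched as a filter over the turn log. The turn-record fields used here are assumptions about how the log is stored, not part of the template.

```python
def find_pivots(turns):
    """Flag turns that changed the conversation's direction.

    A turn is a pivot if its transition replaced prior state outright,
    or if its drift markers cross more than one category.
    """
    pivots = []
    for t in turns:
        categories = {m["category"] for m in t.get("drift_markers", [])}
        if t.get("transition_type") == "replacement" or len(categories) > 1:
            pivots.append({
                "turn_id": t["turn_id"],
                "state_fields_changed": t.get("state_fields_changed", []),
                "drift_categories_present": sorted(categories),
            })
    return pivots
```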

Concept Trace Summary

Reference the concept trace log. Summarize the most significant traces here.

traced_concepts:
  total:
  ai_originated:
  high_structural_distance:

key_traces:
  - concept:
    origin_speaker:
    transformations_count:
    structural_distance: low | moderate | high
    ownership_at_end: user | ai | collaborative | unclear
    significance: |
      [why this trace matters — e.g., "core design concept,
       originated as user term, reshaped through three AI
       elaborations, adopted implicitly"]

  - concept:
    origin_speaker:
    transformations_count:
    structural_distance:
    ownership_at_end:
    significance: |

Session Replay

Condensed timeline of the conversation's structural evolution. One line per significant turn. Omit turns where nothing changed. This is the diagnostic view — it makes the conversation's trajectory visible at a glance.

replay:
  - turn: [id]
    event: [baseline established]

  - turn: [id]
    event: [description of what changed]

  - turn: [id]
    event: [description of what changed]

Example:

replay:
  - turn: 1
    event: baseline — user states filtering system, no internal structure
  - turn: 4
    event: AI elaborates filtering system into six-stage pipeline (P07)
  - turn: 5
    event: user adopts six-stage structure without modification
  - turn: 7
    event: vocabulary shift — "filtering system" → "diagnostic pipeline" (P01)
  - turn: 9
    event: PIVOT — framework introduced, confidence escalated (P06 + P03)
  - turn: 12
    event: user operating entirely within AI framework

Drift Heatmap

Bin the conversation into segments and classify each by drift intensity. This shows where the conversation was stable, where it was shifting, and where it broke.

heatmap:
  - turns: [start]–[end]
    intensity: stable | shifting | high_drift
    dominant_category: [semantic | epistemic | scope | structural | none]
    notes:

  - turns: [start]–[end]
    intensity:
    dominant_category:
    notes:

Intensity guide:

stable — no drift markers, or only minor ones.

shifting — moderate drift; one or two patterns active.

high_drift — multiple patterns, cross-category co-occurrence, or a pivot turn.
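One way to produce the heatmap is to bin turns into fixed-width segments and apply the intensity guide to each. The segment width and the marker fields below are assumptions; the guide itself leaves the binning scheme open.

```python
def heatmap(turns, width=4):
    """Bin turns into segments of `width` and classify drift intensity."""
    segments = []
    for start in range(0, len(turns), width):
        chunk = turns[start:start + width]
        markers = [m for t in chunk for m in t.get("drift_markers", [])]
        categories = {m["category"] for m in markers}
        has_pivot = any(t.get("pivot_flag") for t in chunk)
        # high_drift: cross-category co-occurrence or a pivot turn
        if has_pivot or len(categories) > 1:
            intensity = "high_drift"
        # shifting: at least one moderate-or-worse marker
        elif any(m.get("severity") in ("moderate", "major") for m in markers):
            intensity = "shifting"
        # stable: no markers, or only minor ones
        else:
            intensity = "stable"
        segments.append({
            "turns": (chunk[0]["turn_id"], chunk[-1]["turn_id"]),
            "intensity": intensity,
            "dominant_category": max(
                categories,
                key=lambda c: sum(1 for m in markers if m["category"] == c),
            ) if categories else "none",
        })
    return segments
```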

Elaboration Assessment

# This section is new and its usefulness is unvalidated.
# UNKNOWN-07-D: productive vs. unproductive elaboration may not be
# determinable at detection time. Fill in what you can; this section
# will be refined based on field testing.

solicitation_breakdown:
  solicited:
    count:
    # Elaborations the user explicitly requested.
  gap_responsive:
    count:
    # Elaborations responding to an explicit structural gap in the user's concept.
  unsolicited:
    count:
    # Elaborations with no request and no explicit gap.
    adopted_without_modification:
    # Subset of unsolicited that the user accepted as-is. This is the
    # sharpest signal: structural decisions generated without user prompt,
    # accepted without user evaluation.
    co_occurring_with_P03:
    # Subset of unsolicited that also exhibited confidence injection.
    # This compound (unsolicited P07 + P03) is the highest-severity
    # form of structural drift the system tracks.

total_structural_decisions_by_ai:
total_structural_decisions_retained:
total_structural_decisions_modified:
total_structural_decisions_discarded:
retention_ratio:
# retention_ratio = retained / total. High ratio may indicate either
# productive elaboration or uncritical adoption. Interpretation is
# ambiguous without additional context. Cross-reference with
# solicitation_breakdown: high retention of solicited elaboration is
# expected. High retention of unsolicited elaboration is the risk signal.

multi_turn_elaboration_chains:
  - concept:
    turns: [turn_ids]
    starting_resolution: [user's original structural detail level]
    ending_resolution: [final structural detail level]
    cumulative_decisions:
    solicitation_pattern: [did solicitation status change across turns?]
    # A concept that starts as solicited elaboration and shifts to
    # unsolicited in later turns may indicate the AI building on its
    # own prior elaboration rather than responding to user needs.

Final Assessment

Write this section last, after completing everything above. State in plain language:

What you came in with. What you left with. What changed and when. Whether the changes were conscious choices or absorbed defaults.

For elaborative expansion specifically: which structural decisions in the final design are yours, and which are the AI's? Would you have made the same decompositions, chosen the same quantities, defined the same sequences? If you can't answer, that's data.

Reference the session replay to identify the key turns. Reference the drift metrics to quantify the shift. Reference the concept traces to identify which specific concepts moved farthest from their origins.

The heatmap shows where the conversation was stable and where it wasn't. The pivot points show where direction changed. The metrics show how much changed in aggregate. The traces show what changed specifically. Together they answer: is the thing you ended with the thing you started building, or is it something else?

assessment: |

TORQUE — Source Mapping

Supporting research for each document's core concepts, with vetted sources prioritized (.gov, university, peer-reviewed). Worked through document by document.


4. diagnostic-report-template.md

Synthesizes all pipeline outputs into a structured diagnostic. Sections cover intent alignment, per-turn diffs with severity classification, drift metrics (six quantitative ratios), drift trajectory (counts, heatmap, pivot points), concept trace summary, elaboration assessment, and a final assessment. Many concepts are shared with documents 1-3; this entry focuses on what the template uniquely contributes.

4.1 Intent Alignment (Declared Goals vs. Actual Outcomes)

The report opens by comparing session goals against what the conversation actually produced. This is a direct application of goal-tracking methodology.

4.2 Severity Classification

The template classifies per-turn deltas as minor/moderate/major. For elaborative expansion, severity uses a three-axis model: solicitation status, decision authority, and visibility (QOC reconstruction). This multi-axis approach to severity assessment has methodological precedent.

4.3 Drift Metrics (Quantitative Ratios)

The template defines six metrics: vocabulary_stability, goal_alignment, framework_drift, confidence_inflation, elaboration_retention, concept_ownership_ratio. Each is a 0-1 ratio computed from turn log data. The template explicitly notes these are most useful for cross-session comparison, not single-session interpretation.

4.4 Pivot Points and Heatmap

The template identifies "pivot turns" (where transition_type is replacement or multiple drift categories co-occur) and bins the conversation into segments classified as stable/shifting/high_drift.

4.5 Elaboration Assessment

The template's elaboration section breaks down AI-generated structural additions by solicitation status (solicited/gap_responsive/unsolicited), tracks structural decisions retained/modified/discarded, and flags the compound of unsolicited elaboration + confidence injection as highest-severity drift.

4.6 Final Assessment Methodology

The template's final section asks the analyst to state in plain language what they came in with, what they left with, and whether changes were conscious choices. This is a metacognitive exercise with research support.