Schemabound
A diagnostic system for detecting and measuring conceptual drift in AI-assisted conversations through post-hoc transcript analysis.
TORQUE diagnostic pipeline · Manual analysis · Ten documents
Conversational drift in AI-assisted work
Users enter AI-assisted conversations with goals, vocabulary, confidence levels, and scope boundaries. Over the course of interaction, these attributes change incrementally. The changes are difficult to detect in real time because sustained engagement masks the divergence between the user's original intent and the conversation's actual trajectory.
Language models trained with RLHF are optimized against human preference judgments, which tend to reward engagement and perceived helpfulness. When that optimization conflicts with faithful representation of user intent, the model's outputs tend to substitute vocabulary, elaborate beyond what was requested, and raise the confidence level of assertions without new evidence (Sharma et al. 2024; Malmqvist 2024). The resulting drift in the user's conceptual framework is cumulative and, absent external measurement, invisible.
Pipeline components
The TORQUE pipeline decomposes drift detection into ten documents, each responsible for a distinct analytical function. Drift is operationalized as the measurable distance between declared user intent at session start and the conversation's actual output. The system records this distance without evaluating it.
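To make "measurable distance" concrete, here is a minimal sketch of one possible distance. It is an illustration only, not the pipeline's actual measure. It assumes drift can be approximated by vocabulary overlap and uses Jaccard distance over word sets. The names terms and drift_distance are hypothetical.

```python
import re


def terms(text: str) -> set[str]:
    """Lowercased word tokens; a crude stand-in for real concept extraction."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def drift_distance(declared_intent: str, actual_output: str) -> float:
    """Jaccard distance between term sets (assumed proxy for drift).

    0.0 means identical vocabulary, 1.0 means no shared terms.
    The function only records the distance; it does not judge it.
    """
    a, b = terms(declared_intent), terms(actual_output)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)
```

A vocabulary-only distance ignores changes in confidence and scope, so at best it covers one of the attributes the pipeline tracks.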
Pattern taxonomy
The detection layer consists of seven patterns grouped into four drift categories. Each pattern identifies a specific transformation applied to user concepts during AI interaction. Severity is assessed per turn and compounds across the conversation.
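The per-turn severity bookkeeping could be sketched as follows. The source does not name the patterns or say how severity compounds, so this sketch uses placeholder pattern labels and assumes the simplest rule: a running sum of per-turn scores. PatternHit and cumulative_severity are hypothetical names.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass(frozen=True)
class PatternHit:
    turn: int        # 1-based turn number in the transcript
    pattern_id: str  # placeholder label such as "P1"; real pattern names are not given here
    severity: int    # hypothetical per-turn severity score


def cumulative_severity(hits: list[PatternHit], n_turns: int) -> list[int]:
    """Running total of severity per turn (assumed compounding rule: plain sum)."""
    per_turn = [0] * n_turns
    for hit in hits:
        per_turn[hit.turn - 1] += hit.severity
    return list(accumulate(per_turn))
```

For example, hits of severity 1 at turn 1 and severities 2 and 1 at turn 3 in a four-turn conversation give the running totals [1, 1, 4, 4].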
Quantitative metrics
Six ratio-based metrics are computed from turn log data. Individual session values provide a baseline; the primary diagnostic value emerges from cross-session trend analysis.
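As a rough illustration of how one ratio metric and its cross-session trend might be computed, consider the sketch below. The six metrics are not defined in this overview, so the ratio here is a generic count over a total, and the trend is assumed to be an ordinary least-squares slope over session order. Both function names are hypothetical.

```python
def ratio(count: int, total: int) -> float:
    """Generic ratio metric, e.g. flagged turns over all turns; 0.0 if the log is empty."""
    return count / total if total else 0.0


def trend_slope(values: list[float]) -> float:
    """Least-squares slope of a metric across sessions indexed 0, 1, 2, ...

    A positive slope means the metric is rising from session to session.
    """
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

A single session yields a slope of 0.0 by construction, which matches the point that one session only establishes a baseline.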
Analysis workflow
The current implementation is manual. The intended trajectory is partial automation, in which detection, tagging, and counting are performed computationally while interpretation and assessment remain with the analyst.
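The intended division of labor might look like the sketch below. It assumes the analyst supplies a watch list of terms to track; the code detects, tags, and counts them per model turn and leaves an empty assessment field for the analyst. The function tag_turns, the row layout, and the watch-list approach are all hypothetical, not part of the existing manual workflow.

```python
import re
from collections import Counter


def tag_turns(model_turns: list[str], watch_terms: list[str]):
    """Detect, tag, and count watch terms per turn; interpretation is left blank."""
    watch = {t.lower() for t in watch_terms}
    rows, counts = [], Counter()
    for turn_no, text in enumerate(model_turns, start=1):
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        found = sorted(watch & words)
        counts.update(found)  # counts turns in which each term appears
        # "assessment" stays None: that judgment remains with the analyst.
        rows.append({"turn": turn_no, "tags": found, "assessment": None})
    return rows, counts
```

Keeping assessment as an explicit empty field makes the boundary visible in the data: the automated pass never writes to it.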
Current state and limitations
The pipeline was developed iteratively across three analysis sessions. It expanded from five documents and seven steps to ten documents and ten steps as gaps were identified: state tracking, concept traces, and a cross-session registry were added to address limitations in the original event-only detection model.
The system is under field testing. Whether the additional analytical labor is justified depends on empirical data not yet collected. The pipeline detects and quantifies drift; it does not evaluate whether detected drift is harmful or productive. That assessment remains with the analyst.