Real-Time Introspection for Qwen3 235B
An SRT adapter is now available for Qwen3 235B, a large open Mixture-of-Experts model. The adapter adds a lightweight monitoring layer that reports signals about the model's internal state during generation without modifying any of the original weights.
The adapter indicates whether the model is operating in a more stable or a more divergent internal condition. It also estimates how self-referential the model's processing is at each step and tracks shifts in representation across layers. In addition, it reports the kind of discourse style the model appears to be using at any given moment.
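To make "shifts in representation across layers" concrete, here is a minimal sketch of one common way to quantify such a shift: the cosine distance between a token's hidden state at consecutive layers. This is an illustrative assumption, not the SRT adapter's documented method. The function names `cosine_distance` and `layer_shifts` and the toy values are hypothetical, and real hidden states would come from the base model rather than from hand-written lists.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def layer_shifts(hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Cosine distance between each pair of consecutive layers for one token.

    hidden_states[i] is the (hypothetical) hidden vector at layer i.
    """
    return [
        cosine_distance(prev, curr)
        for prev, curr in zip(hidden_states, hidden_states[1:])
    ]


# Toy hidden states for one token at three layers (made-up values):
# no change between layers 0 and 1, an orthogonal jump between layers 1 and 2.
toy_states = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
shifts = layer_shifts(toy_states)
print(shifts)  # [0.0, 1.0]
```

A value near 0 means the representation barely moved between two layers, while a value near 1 means it rotated to a nearly orthogonal direction. The computation only reads the hidden states and never writes to them, which fits the monitoring role described above.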
This is the first adapter of its kind released for a fully open-weight model of this scale. It was trained in a read-only configuration, so the base model remains unchanged. The strongest result is in regime detection, where the adapter achieved very high accuracy on held-out data in distinguishing between different internal operating states.
For researchers focused on model interpretability, this provides a practical way to inspect internal dynamics at frontier scale. For those working in semiotics, it offers empirical access to how processes of meaning and interpretation unfold inside large contemporary language models.
Model: RiverRider/srt-adapter-qwen3-235b
Repository and documentation: https://github.com/space-bacon/SRT