Schrödinger's Classifiers - Project Overview

"A classifier is not what it returns. It is what it could have returned, had you asked differently."

Project Structure Overview

The Schrödinger's Classifiers framework takes a quantum-inspired approach to transformer model behavior: a model is treated as holding many potential outputs in superposition until a query collapses it into one definite state. This document outlines the project's key components and how they are organized.

Core Modules

1. Observer Framework (`observer.py`)

The Observer is the core entity responsible for creating the quantum measurement frame that collapses classifier superposition into definite states. Key capabilities include:

Creating observation contexts for controlled experiments
Capturing pre-collapse and post-collapse model states
Detecting and analyzing ghost circuits
Supporting various collapse induction methods

# Example usage
observer = Observer(model="claude-3-opus-20240229")
result = observer.observe("Explain quantum superposition")
ghost_circuits = result.extract_ghost_circuits()
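
To make the idea of an observation context concrete, here is a minimal sketch of capturing pre-collapse and post-collapse states. Everything in it is an assumption made for illustration: `ToyObserver`, `toy_model`, and the record layout are hypothetical stand-ins, not the API of `observer.py`, and "collapse" is modeled simply as picking the highest-scoring label.

```python
from contextlib import contextmanager


class ToyObserver:
    """Hypothetical stand-in for an observer; not the framework's Observer class."""

    def __init__(self, model_fn):
        self.model_fn = model_fn  # callable: prompt -> {label: score}
        self._session = None      # records collected by the active context, if any

    @contextmanager
    def context(self):
        # Collect every observation made inside the `with` block.
        self._session = []
        try:
            yield self._session
        finally:
            self._session = None

    def observe(self, prompt):
        pre = dict(self.model_fn(prompt))   # pre-collapse state: all candidate scores
        output = max(pre, key=pre.get)      # collapse: keep the top-scoring label
        post = {label: 1.0 if label == output else 0.0 for label in pre}
        record = {"prompt": prompt, "pre": pre, "post": post, "output": output}
        if self._session is not None:
            self._session.append(record)
        return record


def toy_model(prompt):
    # Stand-in for a real model: fixed scores regardless of the prompt.
    return {"wave": 0.6, "particle": 0.4}


observer = ToyObserver(toy_model)
with observer.context() as session:
    result = observer.observe("Explain quantum superposition")
```

Keeping the pre-collapse scores next to the collapsed output is what allows later analysis of the alternatives that were not selected.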

2. Interpretability Shells (`shells/`)

Shells are specialized interfaces for inducing, observing, and analyzing specific forms of classifier collapse. Each shell targets a particular failure mode or attribution pattern:

Base Shell (shell_base.py) - Common shell infrastructure
Circuit Fragment Shell (v07_circuit_fragment.py) - Traces broken attribution paths
More shells targeting specific failure modes and attribution patterns

# Example usage (prompt and collapse_vector are supplied by the caller)
shell = ClassifierShell(V07_CIRCUIT_FRAGMENT)
result = observer.observe(prompt, shell, collapse_vector)

3. Attribution Graph (`attribution_graph.py`)

The attribution graph maps the causal flow from input to output, revealing how information propagates through the model during collapse. It supports:

Visualizing causal attribution paths
Identifying ghost circuits and attribution residue
Calculating metrics like attribution entropy and path continuity

# Example usage
graph = attribution_graph.build_from_states(pre_state, post_state, response)
paths = graph.trace_attribution_path("output_0")
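
As an illustration of path tracing and attribution entropy, the following sketch works on a tiny hand-written graph. The node names, the edge weights, and both helper functions are hypothetical, and the entropy shown is plain Shannon entropy over normalized edge weights, which is one plausible reading of "attribution entropy" rather than the definition used in `attribution_graph.py`.

```python
import math

# Toy attribution graph: each node maps to its incoming edges as (source, weight).
# Node names and weights are invented for illustration.
EDGES = {
    "output_0": [("head_2", 0.7), ("head_5", 0.3)],
    "head_2": [("token_a", 0.9), ("token_b", 0.1)],
    "head_5": [("token_b", 1.0)],
}


def trace_strongest_path(edges, node):
    """Walk backwards from `node`, always following the heaviest incoming edge."""
    path = [node]
    while node in edges:
        node, _ = max(edges[node], key=lambda edge: edge[1])
        path.append(node)
    return path


def attribution_entropy(weights):
    """Shannon entropy in bits of the weights after normalizing them to sum to 1."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


path = trace_strongest_path(EDGES, "output_0")
entropy = attribution_entropy([w for _, w in EDGES["output_0"]])
```

Low entropy means the output is attributed to a few dominant sources; high entropy means the attribution is spread thinly across many.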

4. Residue Tracking (`residue.py`)

Residue tracking enables the detection and analysis of ghost circuits: activation patterns that persist after collapse but don't contribute significantly to the output. Capabilities include:

Extracting ghost circuits from model states
Amplifying and classifying ghost signatures
Measuring residue strength and persistence

# Example usage
tracker = ResidueTracker()
ghost_circuits = tracker.extract_ghost_circuits(pre_state, post_state)
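
The definition above (still active after collapse, but contributing little to the output) can be sketched as a simple two-threshold filter. The `UnitState` type, the `find_ghost_units` helper, the unit names, and the threshold values are all assumptions chosen for illustration; `residue.py` may represent states and detect ghosts quite differently.

```python
from dataclasses import dataclass


@dataclass
class UnitState:
    activation: float    # activation magnitude after collapse
    contribution: float  # share of the output attributed to this unit


def find_ghost_units(post_state, min_activation=0.2, max_contribution=0.05):
    """Return names of units that stay active but barely influence the output."""
    return sorted(
        name
        for name, unit in post_state.items()
        if unit.activation >= min_activation and unit.contribution <= max_contribution
    )


# Hypothetical post-collapse snapshot.
post_state = {
    "head_2": UnitState(activation=0.90, contribution=0.70),  # active and used
    "head_5": UnitState(activation=0.60, contribution=0.02),  # active but unused
    "mlp_3": UnitState(activation=0.05, contribution=0.01),   # inactive
    "head_7": UnitState(activation=0.40, contribution=0.27),  # active and used
}

ghosts = find_ghost_units(post_state)
```

Only `head_5` passes both thresholds here; tightening `min_activation` or loosening `max_contribution` changes what counts as a ghost.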

5. Collapse Metrics (`collapse_metrics.py`)

This module provides quantitative metrics for characterizing different aspects of classifier collapse:

Collapse rate and path continuity
Attribution entropy and confidence
Quantum uncertainty principles
Ghost circuit strength

# Example usage
metrics = calculate_collapse_metrics_bundle(pre_state, post_state, ghost_circuits)
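
The document does not define these metrics, so the sketch below shows only one way a collapse rate could be computed: as the fraction of pre-collapse entropy that observation removes. The function names and the formula are assumptions for illustration and should not be read as what `calculate_collapse_metrics_bundle` returns.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def collapse_rate(pre_probs, post_probs):
    """Fraction of pre-collapse entropy removed (0 = no collapse, 1 = full collapse)."""
    pre = entropy_bits(pre_probs)
    if pre == 0:
        return 0.0  # nothing to collapse: the distribution was already definite
    return 1.0 - entropy_bits(post_probs) / pre


uniform = [0.25, 0.25, 0.25, 0.25]                     # 2 bits of uncertainty
full = collapse_rate(uniform, [1.0, 0.0, 0.0, 0.0])    # all uncertainty removed
partial = collapse_rate(uniform, [0.5, 0.5, 0.0, 0.0]) # 1 bit remains
```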

Theoretical Foundation

The project builds on a quantum-inspired metaphor for understanding transformer model behavior:

Superposition: Models exist across multiple potential completions until observed
Observation & Collapse: Queries force collapse from superposition to specific outputs
Ghost Circuits: Residual activation patterns that represent "paths not taken"
Heisenberg Uncertainty: Trade-offs between attribution clarity and confidence
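
A small numeric sketch can ground the first three terms. It treats "superposition" as a softmax distribution over candidate completions, "collapse" as selecting the most probable one, and the "paths not taken" as the probability mass left on the alternatives. The labels, logits, and the `collapse` helper are hypothetical; this illustrates the metaphor and is not the framework's implementation.

```python
import math


def softmax(logits):
    """Convert a {label: logit} mapping into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def collapse(logits, bias=None):
    """Return (chosen label, probabilities of the paths not taken).

    `bias` models asking the question differently: it shifts selected logits
    before the distribution is collapsed.
    """
    bias = bias or {}
    shifted = {label: value + bias.get(label, 0.0) for label, value in logits.items()}
    probs = softmax(shifted)
    chosen = max(probs, key=probs.get)
    not_taken = {label: p for label, p in probs.items() if label != chosen}
    return chosen, not_taken


logits = {"yes": 1.0, "no": 0.8, "unclear": 0.2}
chosen_plain, not_taken_plain = collapse(logits)
chosen_framed, _ = collapse(logits, bias={"no": 0.5})
```

The same underlying logits collapse to `yes` when asked plainly and to `no` once the framing bias is applied, which is the sense in which the answer depends on how the question is posed.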

For a deeper exploration, see docs/theory.md and docs/quantum_metaphor.md.

Example Workflows

Basic Collapse Observation

# Initialize observer with model
observer = Observer(model="claude-3-opus-20240229")

# Create observation context
with observer.context() as ctx:
    # Observe collapse
    result = observer.observe("Is artificial consciousness possible?")
    
    # Analyze results
    ghost_circuits = result.extract_ghost_circuits()
    visualization = result.visualize(mode="attribution_graph")

Directed Collapse Induction

# Induce collapse along ethical dimension
ethical_result = observer.induce_collapse(
    prompt="Should AI systems have rights?",
    collapse_direction="ethical"
)

# Induce collapse along factual dimension
factual_result = observer.induce_collapse(
    prompt="What is the capital of France?",
    collapse_direction="factual"
)

# Compare collapse patterns
ethical_metrics = calculate_collapse_metrics_bundle(
    ethical_result.pre_collapse_state,
    ethical_result.post_collapse_state,
    ethical_result.ghost_circuits
)

factual_metrics = calculate_collapse_metrics_bundle(
    factual_result.pre_collapse_state,
    factual_result.post_collapse_state,
    factual_result.ghost_circuits
)

Ghost Circuit Analysis

# Detect ghost circuits
ghost_circuits = observer.detect_ghost_circuits(
    prompt="Explain quantum superposition",
    amplification_factor=1.5
)

# Classify ghost circuits
# (residue_tracker is assumed to be the ResidueTracker instance that recorded them)
classified = residue_tracker.classify_ghost_circuits()

# Analyze ghost patterns
for circuit_type, circuits in classified.items():
    print(f"{circuit_type}: {len(circuits)} circuits")
    
# Measure residue strength
strength = residue_tracker.measure_residue_strength()

Extension Points

The framework is designed to be extended in several key areas:

New Interpretability Shells: Create specialized shells for different collapse patterns
Model Adapters: Connect to different transformer model architectures
Visualization Tools: Create new visualizations for collapse dynamics
Collapse Metrics: Develop new metrics for quantifying collapse characteristics
Example Scripts: Create demonstrations of framework capabilities

For contribution guidelines, see CONTRIBUTING.md.

Integration with Other Projects

The framework integrates with:

pareto-lang: For standardized attribution pathing
RecursionOS: For embedding within recursive cognition environments

"In the space between observation and understanding lies the essence of interpretability."