LLM Complexity Router
A fine-tuned DeBERTa-v3-small classifier that routes queries between gpt-4o-mini (cheap) and gpt-4o (expensive), cutting cost by ~41% while slightly improving response quality compared with always using the expensive model.
Performance (WildBench, 200 real user queries)
| Strategy | Quality (1-10) | Cost/1K | % Cheap | Quality Δ | Cost Saved |
|---|---|---|---|---|---|
| always_expensive | 8.11 | $6.00 | 0% | baseline | baseline |
| length_based | 8.02 | $3.35 | 47% | -0.09 | +44.2% |
| deberta_router | 8.24 | $3.55 | 43.5% | +0.13 | +40.9% |
| routellm_mf | 7.96 | $3.60 | 42.5% | -0.15 | +39.9% |
deberta_router is the only strategy here that both beats the always_expensive baseline on quality and reduces cost.
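The Cost Saved column appears to be the relative reduction in Cost/1K against the always_expensive baseline. The sketch below recomputes it from the table under that assumption; the helper name `cost_saved_pct` is illustrative and not part of the model's code. The recomputed values can differ from the table by about 0.1 percentage points, presumably because the displayed costs are rounded.

```python
# Cost/1K values copied from the performance table above.
BASELINE_COST = 6.00  # always_expensive
COSTS = {
    "length_based": 3.35,
    "deberta_router": 3.55,
    "routellm_mf": 3.60,
}


def cost_saved_pct(cost: float, baseline: float = BASELINE_COST) -> float:
    """Relative cost reduction versus the baseline, in percent."""
    return (1 - cost / baseline) * 100


if __name__ == "__main__":
    for name, cost in COSTS.items():
        print(f"{name}: {cost_saved_pct(cost):.1f}% saved")
```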
Category Breakdown (vs always_expensive)
| Category | Router | Baseline | Δ |
|---|---|---|---|
| Advice seeking | 9.50 | 9.00 | +0.50 |
| Brainstorming | 8.40 | 8.20 | +0.20 |
| Coding & Debugging | 7.73 | 7.89 | -0.16 |
| Creative Writing | 8.00 | 7.74 | +0.26 |
| Data Analysis | 9.20 | 9.00 | +0.20 |
| Editing | 8.40 | 8.40 | +0.00 |
| Information seeking | 8.11 | 7.83 | +0.28 |
| Math | 8.33 | 7.83 | +0.50 |
| Planning | 8.59 | 8.77 | -0.18 |
| Reasoning | 8.82 | 8.61 | +0.21 |
| Role playing | 7.00 | 6.71 | +0.29 |
Usage
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="your-username/complexity-router",
)

result = classifier("What is the capital of France?")
# → [{'label': 'SIMPLE', 'score': 0.98}] → route to gpt-4o-mini

result = classifier("Prove the Riemann hypothesis step by step")
# → [{'label': 'COMPLEX', 'score': 0.95}] → route to gpt-4o
```
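The classifier only returns a label; turning that label into a model choice is left to the caller. A minimal sketch of that step follows, assuming the SIMPLE / COMPLEX labels shown above. The function name `choose_model` and the `min_confidence` parameter are hypothetical, and falling back to the expensive model on a low-confidence SIMPLE prediction is one possible policy, not something the model card prescribes.

```python
from typing import Callable, Dict, List

CHEAP_MODEL = "gpt-4o-mini"
EXPENSIVE_MODEL = "gpt-4o"

# Anything that maps a query to [{'label': ..., 'score': ...}],
# e.g. the transformers text-classification pipeline.
Classifier = Callable[[str], List[Dict[str, object]]]


def choose_model(
    query: str,
    classifier: Classifier,
    min_confidence: float = 0.5,  # assumed default, tune for your traffic
) -> str:
    """Return the model name to send `query` to."""
    top = classifier(query)[0]
    # Route to the cheap model only on a sufficiently confident SIMPLE label.
    if top["label"] == "SIMPLE" and float(top["score"]) >= min_confidence:
        return CHEAP_MODEL
    return EXPENSIVE_MODEL
```

Raising `min_confidence` sends more borderline queries to gpt-4o, trading some of the cost saving for a lower risk of under-serving a hard query.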
Training
- Base model: microsoft/deberta-v3-small
- Training data: proprietary (not released)
- Labels: SIMPLE / COMPLEX
- Benchmarked against: RouteLLM mf router, length-based baseline
Limitations
- Weaker on Coding & Debugging (-0.16) and Planning (-0.18)
- Optimized for gpt-4o vs gpt-4o-mini routing specifically
- Training data distribution may not match all use cases