---
license: cc-by-nc-4.0
language:
- en
- fr
tags:
- complexity-deep
- transformer
- moe
- token-routed
- inl-dynamics
- mu-guided
- causal-lm
- chat
- conversational
- sft
pipeline_tag: text-generation
library_name: complexity-deep
base_model: Pacific-Prime/pacific-prime
model-index:
- name: chat-node
  results: []
---

# Chat-Node 1.5B

> **Conversational chat model built on Pacific-Prime 1.5B with Mu-Guided Attention and Token-Routed MLP**

Chat-Node is a conversational variant of [Pacific-Prime 1.5B](https://huggingface.co/Pacific-Prime/pacific-prime), fine-tuned for general-purpose chat using the Alpaca-Cleaned dataset. It is part of the Pacific-Prime node architecture for modular AI agents.

## Generation Example (Epoch 350)

![Chat-Node generation example at epoch 350](generation_sample.png)

---

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | Pacific-Prime 1.5B v0.13.0 |
| Parameters | ~1.52B |
| Fine-tuning | SFT (Supervised Fine-Tuning) |
| Base Checkpoint | pacific-prime-python epoch 450 |
| Dataset | [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) (20K samples) |
| Current Epoch | 350 |
| Precision | F32 |
| Hardware | H100 80GB |
| Context Length | 2048 tokens |

### Training Hyperparameters

| Parameter | Value |
|-----------|-------|
| Learning Rate | 2e-5 |
| Batch Size | 4 |
| Gradient Accumulation | 8 steps (effective batch size: 32) |
| Weight Decay | 0.01 |
| Warmup Ratio | 3% |
| Gradient Checkpointing | Enabled |
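
For reference, a minimal sketch of how the accumulation setting above yields the effective batch size of 32 (toy model and data stand in for the real 1.5B training loop, which is not published here):

```python
import torch
from torch import nn

# Toy stand-ins for the real model, optimizer, and Alpaca dataloader.
model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
micro_batches = [(torch.randn(4, 16), torch.randn(4, 1)) for _ in range(16)]

ACCUM_STEPS = 8  # 4 samples per micro-batch x 8 steps = effective batch of 32

optimizer.zero_grad()
for step, (x, y) in enumerate(micro_batches):
    # Scale the loss so accumulated gradients average over the effective batch.
    loss = nn.functional.mse_loss(model(x), y) / ACCUM_STEPS
    loss.backward()  # gradients accumulate until the optimizer step
    if (step + 1) % ACCUM_STEPS == 0:
        optimizer.step()
        optimizer.zero_grad()
```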

---

## Chat Format

Chat-Node uses a simple User / Assistant prompt format with an optional system message:

```
User: Give three tips for staying healthy.

Assistant:
```

### Chat Template (Jinja)

The model includes a chat template compatible with Hugging Face's `apply_chat_template`:

```jinja
{% if messages[0]['role'] == 'system' %}{{ messages[0]['content'] }}
{% set messages = messages[1:] %}{% endif %}
{% for message in messages %}
{% if message['role'] == 'user' %}User: {{ message['content'] }}
{% elif message['role'] == 'assistant' %}Assistant: {{ message['content'] }}
{% endif %}
{% endfor %}
```
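
A minimal usage sketch with `transformers` (this assumes the repo's tokenizer config registers the template above; loading through the `complexity-deep` runtime may differ):

```python
from transformers import AutoTokenizer

# Assumes tokenizer_config.json in the repo registers the chat template.
tokenizer = AutoTokenizer.from_pretrained("Pacific-Prime/chat-node")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give three tips for staying healthy."},
]

# The template renders the system text followed by "User: ..." turns;
# appending "Assistant:" cues the model to answer.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
prompt += "Assistant:"
print(prompt)
```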

---

## Architecture

| Parameter | Value |
|-----------|-------|
| Hidden Size | 2048 |
| Intermediate Size | 5632 |
| Layers | 24 |
| Attention Heads | 16 |
| KV Heads (GQA) | 8 |
| Max Position | 2048 |
| Vocab Size | 32,000 |
| Experts (Token-Routed MLP) | 4 |
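
As a back-of-the-envelope check on what the GQA setting buys, here is the KV-cache footprint implied by the table above (assuming the standard layout of one K and one V cache per layer, with head_dim = 2048 / 16 = 128):

```python
# KV-cache footprint implied by the architecture table (F32 values).
hidden, n_heads, n_kv_heads, n_layers = 2048, 16, 8, 24
head_dim = hidden // n_heads       # 128
bytes_per_value = 4                # F32

per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
print(per_token)                   # 196608 bytes = 192 KiB per token
print(per_token * 2048 // 2**20)   # 384 MiB for a full 2048-token context
```

Full multi-head attention with 16 KV heads would double this; GQA halves the cache.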

### Key Innovations (v0.13.0)

- **Mu-Guided KQV** - A learned equilibrium parameter (mu) biases the K, Q, and V projections
- **Mu-Guided Expert Routing** - Mu influences MLP expert selection
- **Mu Residual Highway** - Mu context accumulated across layers
- **Token-Routed MLP** - Deterministic 4-expert MoE with zero routing overhead (see the sketch after this list)
- **INL Dynamics** - Velocity tracking for temporal coherence (alpha=0.9, beta=0.1)
- **Grouped Query Attention** - 16 query heads / 8 KV heads for efficient inference
- **QK Normalization** + **Flash Attention (SDPA)**
- **RoPE** positional embeddings
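
A minimal sketch of the token-routed idea, assuming the router keys on raw token IDs (e.g., `token_id % num_experts`); the actual complexity-deep routing function and its mu-guided variant may differ:

```python
import torch
from torch import nn

class TokenRoutedMLP(nn.Module):
    """Deterministic MoE sketch: each token ID maps to a fixed expert, so there
    is no learned router, no load-balancing loss, and zero routing overhead."""

    def __init__(self, hidden=2048, intermediate=5632, num_experts=4):
        super().__init__()
        self.num_experts = num_experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, intermediate), nn.SiLU(), nn.Linear(intermediate, hidden))
            for _ in range(num_experts)
        )

    def forward(self, x, token_ids):
        # x: (seq, hidden); token_ids: (seq,)
        expert_idx = token_ids % self.num_experts  # fixed mapping, no softmax router
        out = torch.empty_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                out[mask] = expert(x[mask])
        return out

mlp = TokenRoutedMLP(hidden=64, intermediate=128)
tokens = torch.randint(0, 32000, (10,))
print(mlp(torch.randn(10, 64), tokens).shape)  # torch.Size([10, 64])
```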

---

## Usage

### CLI (generate.py)

```bash
python generate.py -c ./checkpoints/pacific-prime-chat -m 300 -t 0.3 \
  $'User: Give three tips for staying healthy.\n\nAssistant:'
```

### Python

```python
import torch
from complexity_deep import DeepForCausalLM
from tokenizers import Tokenizer

model = DeepForCausalLM.from_pretrained("Pacific-Prime/chat-node")
tokenizer = Tokenizer.from_file("tokenizer.json")  # tokenizer.json from this repo, saved locally

prompt = "User: Explain what a neural network is.\n\nAssistant:"

input_ids = torch.tensor([tokenizer.encode(prompt).ids])
output = model.generate(input_ids, max_new_tokens=300, temperature=0.3)
print(tokenizer.decode(output[0].tolist()))
```

---

## Files

| File | Description |
|------|-------------|
| `checkpoint_epoch350.pt` | Model weights (F32) |
| `config.json` | Architecture configuration |
| `tokenizer.json` | BPE tokenizer (32K vocab) |
| `tokenizer_config.json` | Tokenizer settings |
| `special_tokens_map.json` | Special tokens |
| `chat_template.jinja` | Chat prompt template |

---

## Limitations

- **In development**: Training ongoing, not yet production-ready
- **English-focused**: The Alpaca dataset is primarily English
- **Instruction following**: May overshoot requested list lengths
- **Context window**: Limited to 2048 tokens

---

## Links

- [Paper - Zenodo](https://zenodo.org/records/18293026)
- [Base Model - Pacific-Prime 1.5B](https://huggingface.co/Pacific-Prime/pacific-prime)
- [GitHub - complexity-deep](https://github.com/Complexity-ML/complexity-deep)
- [PyPI - complexity-deep](https://pypi.org/project/complexity-deep/)
- [GitHub - mu-inference](https://github.com/Complexity-ML/mu-inference)

---

## License

**CC-BY-NC-4.0** (Creative Commons Attribution-NonCommercial 4.0)

---

## Citation

```bibtex
@misc{chat-node-2025,
  title={Chat-Node: A Conversational 1.5B Model with Mu-Guided Attention},
  author={Boris Peyriguere},
  year={2025},
  url={https://huggingface.co/Pacific-Prime/chat-node}
}
```