# Chat-Node 1.5B

*A conversational chat model built on Pacific-Prime 1.5B with Mu-Guided Attention and a Token-Routed MLP.*

Chat-Node is a conversational variant of Pacific-Prime 1.5B, fine-tuned for general-purpose chat on the Alpaca-Cleaned dataset. It is part of the Pacific-Prime node architecture for modular AI agents.
## Model Details
| Attribute | Value |
|---|---|
| Base Model | Pacific-Prime 1.5B v0.13.0 |
| Parameters | ~1.52B |
| Fine-tuning | SFT (Supervised Fine-Tuning) |
| Base Checkpoint | pacific-prime-python epoch 450 |
| Dataset | yahma/alpaca-cleaned (20K samples) |
| Current Epoch | 350 |
| Precision | F32 |
| Hardware | H100 80GB |
| Context Length | 2048 tokens |
## Training Hyperparameters
| Parameter | Value |
|---|---|
| Learning Rate | 2e-5 |
| Batch Size | 4 |
| Gradient Accumulation | 8 (effective batch: 32) |
| Weight Decay | 0.01 |
| Warmup Ratio | 3% |
| Gradient Checkpointing | Enabled |
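As a rough illustration, the table maps onto standard `transformers` `TrainingArguments` along these lines (the training script itself is not published on this card, so the exact setup is an assumption):

```python
from transformers import TrainingArguments

# Hypothetical mapping of the hyperparameter table onto TrainingArguments;
# the actual training script may differ.
args = TrainingArguments(
    output_dir="./checkpoints/pacific-prime-chat",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,   # effective batch size: 4 * 8 = 32
    weight_decay=0.01,
    warmup_ratio=0.03,
    gradient_checkpointing=True,
)
```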
## Chat Format

Chat-Node uses a simple User / Assistant prompt format with an optional system message:

```
User: Give three tips for staying healthy.

Assistant:
```
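When a system message is present, the template below prints it on its own line before the first turn, so the rendered prompt looks roughly like this (exact whitespace depends on the template):

```
You are a helpful assistant.
User: Give three tips for staying healthy.

Assistant:
```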
### Chat Template (Jinja)

The model ships with a chat template compatible with HuggingFace's `apply_chat_template`:

```jinja
{% if messages[0]['role'] == 'system' %}{{ messages[0]['content'] }}
{% set messages = messages[1:] %}{% endif %}
{% for message in messages %}
{% if message['role'] == 'user' %}User: {{ message['content'] }}
{% elif message['role'] == 'assistant' %}Assistant: {{ message['content'] }}
{% endif %}
{% endfor %}
```
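For quick inspection outside of `transformers`, the template can also be rendered with `jinja2` directly. A minimal sketch, assuming `chat_template.jinja` sits in the working directory; the trailing `Assistant:` is appended by hand because the template has no generation-prompt branch:

```python
from jinja2 import Template

with open("chat_template.jinja") as f:
    template = Template(f.read())

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give three tips for staying healthy."},
]

# Render the conversation, then append the generation prompt manually.
prompt = template.render(messages=messages) + "Assistant:"
print(prompt)
```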
## Architecture
| Parameter | Value |
|---|---|
| Hidden Size | 2048 |
| Intermediate Size | 5632 |
| Layers | 24 |
| Attention Heads | 16 |
| KV Heads (GQA) | 8 |
| Max Position | 2048 |
| Vocab Size | 32,000 |
| Experts (Token-Routed MLP) | 4 |
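These dimensions imply a head dimension of 2048 / 16 = 128, with each of the 8 KV heads shared by two query heads under GQA.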
## Key Innovations (v0.13.0)

- **Mu-Guided KQV** - A learned equilibrium parameter biases the K, Q, and V projections
- **Mu-Guided Expert Routing** - mu influences MLP expert selection
- **Mu Residual Highway** - Accumulated context across layers
- **Token-Routed MLP** - Deterministic 4-expert MoE with zero routing overhead (see the sketch after this list)
- **INL Dynamics** - Velocity tracking for temporal coherence (alpha=0.9, beta=0.1)
- **Grouped Query Attention** - 16 heads / 8 KV heads for efficient inference
- **QK Normalization + Flash Attention (SDPA)**
- **RoPE positional embeddings**
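A deterministic token-routed MoE needs no learned gate, which is where the "zero routing overhead" comes from. The sketch below illustrates the idea with a token-id-modulo-experts rule; the model's actual routing function is not documented on this card, so treat the rule as an assumption:

```python
import torch
import torch.nn as nn

class TokenRoutedMLP(nn.Module):
    """Illustrative deterministic MoE: each token is mapped to exactly one
    expert by a fixed rule, so there are no router logits to compute."""

    def __init__(self, hidden_size=2048, intermediate_size=5632, num_experts=4):
        super().__init__()
        self.num_experts = num_experts
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, intermediate_size),
                nn.SiLU(),
                nn.Linear(intermediate_size, hidden_size),
            )
            for _ in range(num_experts)
        )

    def forward(self, hidden_states, token_ids):
        # Assumed routing rule: expert index = token id modulo num_experts.
        expert_ids = token_ids % self.num_experts
        out = torch.zeros_like(hidden_states)
        for i, expert in enumerate(self.experts):
            mask = expert_ids == i          # boolean mask over (batch, seq)
            if mask.any():
                out[mask] = expert(hidden_states[mask])
        return out
```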
## Usage

### CLI (generate.py)

```bash
python generate.py -c ./checkpoints/pacific-prime-chat -m 300 -t 0.3 \
  $'User: Give three tips for staying healthy.\n\nAssistant:'
```
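Here `-c` points at the checkpoint directory, and, judging from the matching values in the Python example below, `-m` caps the number of new tokens and `-t` sets the sampling temperature.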
### Python

```python
from complexity_deep import DeepForCausalLM
from tokenizers import Tokenizer
import torch

# Load the model and the 32K-vocab BPE tokenizer
model = DeepForCausalLM.from_pretrained("Pacific-Prime/chat-node")
tokenizer = Tokenizer.from_file("tokenizer.json")

# Build a prompt in the User / Assistant chat format
prompt = "User: Explain what a neural network is.\n\nAssistant:"
input_ids = torch.tensor([tokenizer.encode(prompt).ids])

# Sample at low temperature for focused answers
output = model.generate(input_ids, max_new_tokens=300, temperature=0.3)
print(tokenizer.decode(output[0].tolist()))
```
## Files

| File | Description |
|---|---|
| `checkpoint_epoch350.pt` | Model weights (F32) |
| `config.json` | Architecture configuration |
| `tokenizer.json` | BPE tokenizer (32K vocab) |
| `tokenizer_config.json` | Tokenizer settings |
| `special_tokens_map.json` | Special tokens |
| `chat_template.jinja` | Chat prompt template |
## Limitations

- **In development**: training is ongoing; the model is not yet production-ready
- **English-focused**: the Alpaca dataset is primarily English
- **Instruction following**: may overshoot requested list lengths
- **Context window**: limited to 2048 tokens
## Links
- Paper - Zenodo
- Base Model - Pacific-Prime 1.5B
- GitHub - complexity-deep
- PyPI - complexity-deep
- GitHub - mu-inference
## License

CC-BY-NC-4.0 (Creative Commons Attribution-NonCommercial 4.0)
## Citation

```bibtex
@misc{chat-node-2025,
  title={Chat-Node: A Conversational 1.5B Model with Mu-Guided Attention},
  author={Boris Peyriguere},
  year={2025},
  url={https://huggingface.co/Pacific-Prime/chat-node}
}
```