# Chat-Node 1.5B

*A conversational chat model built on Pacific-Prime 1.5B with Mu-Guided Attention and a Token-Routed MLP.*

Chat-Node is a conversational variant of Pacific-Prime 1.5B, fine-tuned for general-purpose chat on the Alpaca-Cleaned dataset. It is part of the Pacific-Prime node architecture for modular AI agents.
## Model Details
| Attribute | Value |
|---|---|
| Base Model | Pacific-Prime 1.5B v0.13.0 |
| Parameters | ~1.52B |
| Fine-tuning | SFT (Supervised Fine-Tuning) |
| Base Checkpoint | pacific-prime-python epoch 450 |
| Dataset | yahma/alpaca-cleaned (20K samples) |
| Current Epoch | 350 |
| Precision | F32 |
| Hardware | H100 80GB |
| Context Length | 2048 tokens |
## Training Hyperparameters
| Parameter | Value |
|---|---|
| Learning Rate | 2e-5 |
| Batch Size | 4 |
| Gradient Accumulation | 8 (effective batch: 32) |
| Weight Decay | 0.01 |
| Warmup Ratio | 3% |
| Gradient Checkpointing | Enabled |
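As a rough illustration, the table maps onto standard `transformers` `TrainingArguments` along these lines (the training script itself is not published on this card, so the exact setup is an assumption):

```python
from transformers import TrainingArguments

# Hypothetical mapping of the hyperparameter table onto TrainingArguments;
# the actual training script may differ.
args = TrainingArguments(
    output_dir="./checkpoints/pacific-prime-chat",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,   # effective batch size: 4 * 8 = 32
    weight_decay=0.01,
    warmup_ratio=0.03,
    gradient_checkpointing=True,
)
```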
## Chat Format

Chat-Node uses a simple User / Assistant prompt format with an optional system message:

```
User: Give three tips for staying healthy.

Assistant:
```
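When a system message is present, the template below prints it on its own line before the first turn, so the rendered prompt looks roughly like this (exact whitespace depends on the template):

```
You are a helpful assistant.
User: Give three tips for staying healthy.

Assistant:
```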
### Chat Template (Jinja)

The model ships with a chat template compatible with HuggingFace's `apply_chat_template`:

```jinja
{% if messages[0]['role'] == 'system' %}{{ messages[0]['content'] }}
{% set messages = messages[1:] %}{% endif %}
{% for message in messages %}
{% if message['role'] == 'user' %}User: {{ message['content'] }}
{% elif message['role'] == 'assistant' %}Assistant: {{ message['content'] }}
{% endif %}
{% endfor %}
```
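For quick inspection outside of `transformers`, the template can also be rendered with `jinja2` directly. A minimal sketch, assuming `chat_template.jinja` sits in the working directory; the trailing `Assistant:` is appended by hand because the template has no generation-prompt branch:

```python
from jinja2 import Template

with open("chat_template.jinja") as f:
    template = Template(f.read())

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give three tips for staying healthy."},
]

# Render the conversation, then append the generation prompt manually.
prompt = template.render(messages=messages) + "Assistant:"
print(prompt)
```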
## Architecture
| Parameter | Value |
|---|---|
| Hidden Size | 2048 |
| Intermediate Size | 5632 |
| Layers | 24 |
| Attention Heads | 16 |
| KV Heads (GQA) | 8 |
| Max Position | 2048 |
| Vocab Size | 32,000 |
| Experts (Token-Routed MLP) | 4 |
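These dimensions imply a head dimension of 2048 / 16 = 128, with each of the 8 KV heads shared by two query heads under GQA.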
## Key Innovations (v0.13.0)

- **Mu-Guided KQV** - A learned equilibrium parameter biases the K, Q, and V projections
- **Mu-Guided Expert Routing** - mu influences MLP expert selection
- **Mu Residual Highway** - Accumulated context across layers
- **Token-Routed MLP** - Deterministic 4-expert MoE with zero routing overhead (see the sketch after this list)
- **INL Dynamics** - Velocity tracking for temporal coherence (alpha=0.9, beta=0.1)
- **Grouped Query Attention** - 16 heads / 8 KV heads for efficient inference
- **QK Normalization + Flash Attention (SDPA)**
- **RoPE positional embeddings**
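A deterministic token-routed MoE needs no learned gate, which is where the "zero routing overhead" comes from. The sketch below illustrates the idea with a token-id-modulo-experts rule; the model's actual routing function is not documented on this card, so treat the rule as an assumption:

```python
import torch
import torch.nn as nn

class TokenRoutedMLP(nn.Module):
    """Illustrative deterministic MoE: each token is mapped to exactly one
    expert by a fixed rule, so there are no router logits to compute."""

    def __init__(self, hidden_size=2048, intermediate_size=5632, num_experts=4):
        super().__init__()
        self.num_experts = num_experts
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, intermediate_size),
                nn.SiLU(),
                nn.Linear(intermediate_size, hidden_size),
            )
            for _ in range(num_experts)
        )

    def forward(self, hidden_states, token_ids):
        # Assumed routing rule: expert index = token id modulo num_experts.
        expert_ids = token_ids % self.num_experts
        out = torch.zeros_like(hidden_states)
        for i, expert in enumerate(self.experts):
            mask = expert_ids == i          # boolean mask over (batch, seq)
            if mask.any():
                out[mask] = expert(hidden_states[mask])
        return out
```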
## Usage

### CLI (generate.py)

```bash
python generate.py -c ./checkpoints/pacific-prime-chat -m 300 -t 0.3 \
  $'User: Give three tips for staying healthy.\n\nAssistant:'
```
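Here `-c` points at the checkpoint directory, and, judging from the matching values in the Python example below, `-m` caps the number of new tokens and `-t` sets the sampling temperature.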
### Python

```python
from complexity_deep import DeepForCausalLM
from tokenizers import Tokenizer
import torch

# Load the model and the 32K-vocab BPE tokenizer
model = DeepForCausalLM.from_pretrained("Pacific-Prime/chat-node")
tokenizer = Tokenizer.from_file("tokenizer.json")

# Build a prompt in the User / Assistant chat format
prompt = "User: Explain what a neural network is.\n\nAssistant:"
input_ids = torch.tensor([tokenizer.encode(prompt).ids])

# Sample at low temperature for focused answers
output = model.generate(input_ids, max_new_tokens=300, temperature=0.3)
print(tokenizer.decode(output[0].tolist()))
```
## Files

| File | Description |
|---|---|
| `checkpoint_epoch350.pt` | Model weights (F32) |
| `config.json` | Architecture configuration |
| `tokenizer.json` | BPE tokenizer (32K vocab) |
| `tokenizer_config.json` | Tokenizer settings |
| `special_tokens_map.json` | Special tokens |
| `chat_template.jinja` | Chat prompt template |
## Limitations

- **In development**: training is ongoing; the model is not yet production-ready
- **English-focused**: the Alpaca dataset is primarily English
- **Instruction following**: may overshoot requested list lengths
- **Context window**: limited to 2048 tokens
## Links
- Paper - Zenodo
- Base Model - Pacific-Prime 1.5B
- GitHub - complexity-deep
- PyPI - complexity-deep
- GitHub - mu-inference
## License

CC-BY-NC-4.0 (Creative Commons Attribution-NonCommercial 4.0)
## Citation

```bibtex
@misc{chat-node-2025,
  title={Chat-Node: A Conversational 1.5B Model with Mu-Guided Attention},
  author={Boris Peyriguere},
  year={2025},
  url={https://huggingface.co/Pacific-Prime/chat-node}
}
```