---
license: cc-by-nc-4.0
language:
- en
- fr
tags:
- complexity-deep
- transformer
- moe
- token-routed
- inl-dynamics
- mu-guided
- causal-lm
- chat
- conversational
- sft
pipeline_tag: text-generation
library_name: complexity-deep
base_model: Pacific-Prime/pacific-prime
model-index:
- name: chat-node
  results: []
---

# Chat-Node 1.5B

> **Conversational chat model built on Pacific-Prime 1.5B with Mu-Guided Attention and Token-Routed MLP**

Chat-Node is a conversational variant of [Pacific-Prime 1.5B](https://huggingface.co/Pacific-Prime/pacific-prime), fine-tuned for general-purpose chat on the Alpaca-Cleaned dataset. It is part of the Pacific-Prime node architecture for modular AI agents.

## Generation Example (Epoch 350)

![Generation at epoch 350](image.png)

---

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | Pacific-Prime 1.5B v0.13.0 |
| Parameters | ~1.52B |
| Fine-tuning | SFT (Supervised Fine-Tuning) |
| Base Checkpoint | pacific-prime-python epoch 450 |
| Dataset | [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) (20K samples) |
| Current Epoch | 350 |
| Precision | F32 |
| Hardware | H100 80GB |
| Context Length | 2048 tokens |

### Training Hyperparameters

| Parameter | Value |
|-----------|-------|
| Learning Rate | 2e-5 |
| Batch Size | 4 |
| Gradient Accumulation | 8 (effective batch: 32) |
| Weight Decay | 0.01 |
| Warmup Ratio | 3% |
| Gradient Checkpointing | Enabled |

---

## Chat Format

Chat-Node uses a simple User / Assistant prompt format with an optional system message:

```
User: Give three tips for staying healthy.

Assistant:
```

### Chat Template (Jinja)

The model includes a chat template compatible with HuggingFace's `apply_chat_template`:

```jinja
{% if messages[0]['role'] == 'system' %}{{ messages[0]['content'] }}
{% set messages = messages[1:] %}{% endif %}
{% for message in messages %}
{% if message['role'] == 'user' %}User: {{ message['content'] }}
{% elif message['role'] == 'assistant' %}Assistant: {{ message['content'] }}
{% endif %}
{% endfor %}
```

A plain-Python equivalent is sketched under Usage below.

---

## Architecture

| Parameter | Value |
|-----------|-------|
| Hidden Size | 2048 |
| Intermediate Size | 5632 |
| Layers | 24 |
| Attention Heads | 16 |
| KV Heads (GQA) | 8 |
| Max Position | 2048 |
| Vocab Size | 32,000 |
| Experts (Token-Routed MLP) | 4 |

### Key Innovations (v0.13.0)

- **Mu-Guided KQV** - Learned equilibrium parameter biases the K, Q, and V projections
- **Mu-Guided Expert Routing** - mu influences MLP expert selection
- **Mu Residual Highway** - Accumulated context across layers
- **Token-Routed MLP** - Deterministic 4-expert MoE with zero routing overhead (an illustrative sketch appears under Usage below)
- **INL Dynamics** - Velocity tracking for temporal coherence (alpha=0.9, beta=0.1)
- **Grouped Query Attention** - 16 heads / 8 KV heads for efficient inference
- **QK Normalization** + **Flash Attention (SDPA)**
- **RoPE** positional embeddings

---

## Usage

### CLI (generate.py)

```bash
python generate.py -c ./checkpoints/pacific-prime-chat -m 300 -t 0.3 \
  $'User: Give three tips for staying healthy.\n\nAssistant:'
```

### Python

```python
from complexity_deep import DeepForCausalLM
from tokenizers import Tokenizer
import torch

# Load the checkpoint and the raw BPE tokenizer shipped with it
model = DeepForCausalLM.from_pretrained("Pacific-Prime/chat-node")
tokenizer = Tokenizer.from_file("tokenizer.json")

prompt = "User: Explain what a neural network is.\n\nAssistant:"
input_ids = torch.tensor([tokenizer.encode(prompt).ids])

output = model.generate(input_ids, max_new_tokens=300, temperature=0.3)
print(tokenizer.decode(output[0].tolist()))
```
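### Multi-Turn Prompt Construction

When working with the raw `tokenizers` API rather than `apply_chat_template`, the chat template above can be mirrored in plain Python. This is a minimal sketch: the `build_prompt` helper is illustrative (not part of the `complexity-deep` API), and the blank-line separator between turns is inferred from the CLI example above.

```python
def build_prompt(messages):
    """Assemble a Chat-Node prompt from {role, content} dicts,
    mirroring chat_template.jinja: an optional system line first,
    then alternating User:/Assistant: turns."""
    parts = []
    if messages and messages[0]["role"] == "system":
        parts.append(messages[0]["content"])
        messages = messages[1:]
    for m in messages:
        prefix = {"user": "User: ", "assistant": "Assistant: "}.get(m["role"])
        if prefix:
            parts.append(prefix + m["content"])
    parts.append("Assistant:")  # generation cue, as in the CLI example
    return "\n\n".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give three tips for staying healthy."},
]
prompt = build_prompt(messages)  # feed to the tokenizer/model as shown above
```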
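### Token Routing (Illustrative Sketch)

The exact routing rule of the Token-Routed MLP lives in the `complexity-deep` source, not in this card. The sketch below only illustrates why deterministic routing has zero overhead, assuming a hypothetical token-id-modulo rule; the expert sizes follow the Architecture table, while the SiLU activation is an assumption. It is not the library's implementation.

```python
import torch
import torch.nn as nn

class TokenRoutedMLP(nn.Module):
    """Deterministic MoE sketch: each token id maps to a fixed expert,
    so there is no gating network, no routing logits, and no
    load-balancing loss at train or inference time."""

    def __init__(self, hidden=2048, intermediate=5632, num_experts=4):
        super().__init__()
        self.num_experts = num_experts
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden, intermediate),
                nn.SiLU(),  # activation assumed for this sketch
                nn.Linear(intermediate, hidden),
            )
            for _ in range(num_experts)
        )

    def forward(self, x, token_ids):
        # x: (batch, seq, hidden); token_ids: (batch, seq)
        route = token_ids % self.num_experts  # assumed rule, for illustration
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = route == i
            if mask.any():
                out[mask] = expert(x[mask])
        return out
```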
---

## Files

| File | Description |
|------|-------------|
| `checkpoint_epoch350.pt` | Model weights (F32) |
| `config.json` | Architecture configuration |
| `tokenizer.json` | BPE tokenizer (32K vocab) |
| `tokenizer_config.json` | Tokenizer settings |
| `special_tokens_map.json` | Special tokens |
| `chat_template.jinja` | Chat prompt template |

---

## Limitations

- **In development**: Training ongoing, not yet production-ready
- **English-focused**: Alpaca dataset is primarily English
- **Instruction following**: May overshoot requested list lengths
- **Context window**: Limited to 2048 tokens

---

## Links

- [Paper - Zenodo](https://zenodo.org/records/18293026)
- [Base Model - Pacific-Prime 1.5B](https://huggingface.co/Pacific-Prime/pacific-prime)
- [GitHub - complexity-deep](https://github.com/Complexity-ML/complexity-deep)
- [PyPI - complexity-deep](https://pypi.org/project/complexity-deep/)
- [GitHub - mu-inference](https://github.com/Complexity-ML/mu-inference)

---

## License

**CC-BY-NC-4.0** (Creative Commons Attribution-NonCommercial 4.0)

---

## Citation

```bibtex
@misc{chat-node-2025,
  title={Chat-Node: A Conversational 1.5B Model with Mu-Guided Attention},
  author={Boris Peyriguere},
  year={2025},
  url={https://huggingface.co/Pacific-Prime/chat-node}
}
```