---
license: cc-by-nc-4.0
language:
- en
- fr
tags:
- complexity-deep
- transformer
- moe
- token-routed
- inl-dynamics
- mu-guided
- causal-lm
- chat
- conversational
- sft
pipeline_tag: text-generation
library_name: complexity-deep
base_model: Pacific-Prime/pacific-prime
model-index:
- name: chat-node
  results: []
---

# Chat-Node 1.5B

> **Conversational chat model built on Pacific-Prime 1.5B with Mu-Guided Attention and Token-Routed MLP**

Chat-Node is a conversational variant of [Pacific-Prime 1.5B](https://huggingface.co/Pacific-Prime/pacific-prime), fine-tuned for general-purpose chat using the Alpaca-Cleaned dataset. It is part of the Pacific-Prime node architecture for modular AI agents.

## Generation Example (Epoch 350)

![Chat-Node generation example at epoch 350](generation_sample.png)

---

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | Pacific-Prime 1.5B v0.13.0 |
| Parameters | ~1.52B |
| Fine-tuning | SFT (Supervised Fine-Tuning) |
| Base Checkpoint | pacific-prime-python epoch 450 |
| Dataset | [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) (20K samples) |
| Current Epoch | 350 |
| Precision | F32 |
| Hardware | H100 80GB |
| Context Length | 2048 tokens |

### Training Hyperparameters

| Parameter | Value |
|-----------|-------|
| Learning Rate | 2e-5 |
| Batch Size | 4 |
| Gradient Accumulation | 8 steps (effective batch size: 32) |
| Weight Decay | 0.01 |
| Warmup Ratio | 3% |
| Gradient Checkpointing | Enabled |
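
For reference, a minimal sketch of how the accumulation setting above yields the effective batch size of 32 (toy model and data stand in for the real 1.5B training loop, which is not published here):

```python
import torch
from torch import nn

# Toy stand-ins for the real model, optimizer, and Alpaca dataloader.
model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
micro_batches = [(torch.randn(4, 16), torch.randn(4, 1)) for _ in range(16)]

ACCUM_STEPS = 8  # 4 samples per micro-batch x 8 steps = effective batch of 32

optimizer.zero_grad()
for step, (x, y) in enumerate(micro_batches):
    # Scale the loss so accumulated gradients average over the effective batch.
    loss = nn.functional.mse_loss(model(x), y) / ACCUM_STEPS
    loss.backward()  # gradients accumulate until the optimizer step
    if (step + 1) % ACCUM_STEPS == 0:
        optimizer.step()
        optimizer.zero_grad()
```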

---

## Chat Format

Chat-Node uses a simple User / Assistant prompt format with an optional system message:

```
User: Give three tips for staying healthy.

Assistant:
```

### Chat Template (Jinja)

The model includes a chat template compatible with Hugging Face's `apply_chat_template`:

```jinja
{% if messages[0]['role'] == 'system' %}{{ messages[0]['content'] }}
{% set messages = messages[1:] %}{% endif %}
{% for message in messages %}
{% if message['role'] == 'user' %}User: {{ message['content'] }}
{% elif message['role'] == 'assistant' %}Assistant: {{ message['content'] }}
{% endif %}
{% endfor %}
```
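
A minimal usage sketch with `transformers` (this assumes the repo's tokenizer config registers the template above; loading through the `complexity-deep` runtime may differ):

```python
from transformers import AutoTokenizer

# Assumes tokenizer_config.json in the repo registers the chat template.
tokenizer = AutoTokenizer.from_pretrained("Pacific-Prime/chat-node")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give three tips for staying healthy."},
]

# The template renders the system text followed by "User: ..." turns;
# appending "Assistant:" cues the model to answer.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
prompt += "Assistant:"
print(prompt)
```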

---

## Architecture

| Parameter | Value |
|-----------|-------|
| Hidden Size | 2048 |
| Intermediate Size | 5632 |
| Layers | 24 |
| Attention Heads | 16 |
| KV Heads (GQA) | 8 |
| Max Position | 2048 |
| Vocab Size | 32,000 |
| Experts (Token-Routed MLP) | 4 |
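
As a back-of-the-envelope check on what the GQA setting buys, here is the KV-cache footprint implied by the table above (assuming the standard layout of one K and one V cache per layer, with head_dim = 2048 / 16 = 128):

```python
# KV-cache footprint implied by the architecture table (F32 values).
hidden, n_heads, n_kv_heads, n_layers = 2048, 16, 8, 24
head_dim = hidden // n_heads       # 128
bytes_per_value = 4                # F32

per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
print(per_token)                   # 196608 bytes = 192 KiB per token
print(per_token * 2048 // 2**20)   # 384 MiB for a full 2048-token context
```

Full multi-head attention with 16 KV heads would double this; GQA halves the cache.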

### Key Innovations (v0.13.0)

- **Mu-Guided KQV** - A learned equilibrium parameter (mu) biases the K, Q, and V projections
- **Mu-Guided Expert Routing** - Mu influences MLP expert selection
- **Mu Residual Highway** - Mu context accumulated across layers
- **Token-Routed MLP** - Deterministic 4-expert MoE with zero routing overhead (see the sketch after this list)
- **INL Dynamics** - Velocity tracking for temporal coherence (alpha=0.9, beta=0.1)
- **Grouped Query Attention** - 16 query heads / 8 KV heads for efficient inference
- **QK Normalization** + **Flash Attention (SDPA)**
- **RoPE** positional embeddings
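
A minimal sketch of the token-routed idea, assuming the router keys on raw token IDs (e.g., `token_id % num_experts`); the actual complexity-deep routing function and its mu-guided variant may differ:

```python
import torch
from torch import nn

class TokenRoutedMLP(nn.Module):
    """Deterministic MoE sketch: each token ID maps to a fixed expert, so there
    is no learned router, no load-balancing loss, and zero routing overhead."""

    def __init__(self, hidden=2048, intermediate=5632, num_experts=4):
        super().__init__()
        self.num_experts = num_experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, intermediate), nn.SiLU(), nn.Linear(intermediate, hidden))
            for _ in range(num_experts)
        )

    def forward(self, x, token_ids):
        # x: (seq, hidden); token_ids: (seq,)
        expert_idx = token_ids % self.num_experts  # fixed mapping, no softmax router
        out = torch.empty_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                out[mask] = expert(x[mask])
        return out

mlp = TokenRoutedMLP(hidden=64, intermediate=128)
tokens = torch.randint(0, 32000, (10,))
print(mlp(torch.randn(10, 64), tokens).shape)  # torch.Size([10, 64])
```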

---

## Usage

### CLI (generate.py)

```bash
python generate.py -c ./checkpoints/pacific-prime-chat -m 300 -t 0.3 \
  $'User: Give three tips for staying healthy.\n\nAssistant:'
```

### Python

```python
import torch
from complexity_deep import DeepForCausalLM
from tokenizers import Tokenizer

model = DeepForCausalLM.from_pretrained("Pacific-Prime/chat-node")
tokenizer = Tokenizer.from_file("tokenizer.json")  # tokenizer.json from this repo, saved locally

prompt = "User: Explain what a neural network is.\n\nAssistant:"

input_ids = torch.tensor([tokenizer.encode(prompt).ids])
output = model.generate(input_ids, max_new_tokens=300, temperature=0.3)
print(tokenizer.decode(output[0].tolist()))
```

---

## Files

| File | Description |
|------|-------------|
| `checkpoint_epoch350.pt` | Model weights (F32) |
| `config.json` | Architecture configuration |
| `tokenizer.json` | BPE tokenizer (32K vocab) |
| `tokenizer_config.json` | Tokenizer settings |
| `special_tokens_map.json` | Special tokens |
| `chat_template.jinja` | Chat prompt template |

---

## Limitations

- **In development**: Training ongoing, not yet production-ready
- **English-focused**: The Alpaca dataset is primarily English
- **Instruction following**: May overshoot requested list lengths
- **Context window**: Limited to 2048 tokens

---

## Links

- [Paper - Zenodo](https://zenodo.org/records/18293026)
- [Base Model - Pacific-Prime 1.5B](https://huggingface.co/Pacific-Prime/pacific-prime)
- [GitHub - complexity-deep](https://github.com/Complexity-ML/complexity-deep)
- [PyPI - complexity-deep](https://pypi.org/project/complexity-deep/)
- [GitHub - mu-inference](https://github.com/Complexity-ML/mu-inference)

---

## License

**CC-BY-NC-4.0** (Creative Commons Attribution-NonCommercial 4.0)

---

## Citation

```bibtex
@misc{chat-node-2025,
  title={Chat-Node: A Conversational 1.5B Model with Mu-Guided Attention},
  author={Boris Peyriguere},
  year={2025},
  url={https://huggingface.co/Pacific-Prime/chat-node}
}
```