Text Generation
Transformers
Safetensors
llama
math
combinatorics
permutations
algebraic-combinatorics
causal-lm
text-generation-inference
Instructions to use ACDRepo/PermuFormer with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ACDRepo/PermuFormer with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="ACDRepo/PermuFormer")# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("ACDRepo/PermuFormer") model = AutoModelForCausalLM.from_pretrained("ACDRepo/PermuFormer") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use ACDRepo/PermuFormer with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "ACDRepo/PermuFormer" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ACDRepo/PermuFormer", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/ACDRepo/PermuFormer
- SGLang
How to use ACDRepo/PermuFormer with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "ACDRepo/PermuFormer" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ACDRepo/PermuFormer", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "ACDRepo/PermuFormer" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ACDRepo/PermuFormer", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use ACDRepo/PermuFormer with Docker Model Runner:
docker model run hf.co/ACDRepo/PermuFormer
File size: 5,876 Bytes
24f62d5 eb03713 fbc206b 24f62d5 fbc206b e694085 fbc206b ccfa311 fbc206b eb03713 fbc206b eb03713 fbc206b eb03713 fbc206b e694085 fbc206b eb03713 fbc206b eb03713 fbc206b eb03713 fbc206b eb03713 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 | ---
library_name: transformers
pipeline_tag: text-generation
tags:
- math
- combinatorics
- permutations
- algebraic-combinatorics
- llama
- causal-lm
---
# PermuFormer
PermuFormer is a small Llama-style causal language model trained on symbolic permutation tasks from algebraic combinatorics. It is intended as a specialist base model for permutation representation, reasoning, and finetuning experiments rather than as a general natural-language assistant.
The model operates on a compact word-level vocabulary for permutation syntax. Training examples are stored as pre-tokenized lists of tokens; at inference time, the Hugging Face tokenizer can also consume equivalent whitespace-separated strings. Prompts are formulaic equations: the left side specifies a permutation task and generation begins after the `=` token.
## Model Details
- **Architecture:** `LlamaForCausalLM`
- **Parameters:** about 75.7M
- **Layers:** 12
- **Hidden size:** 768
- **Attention heads:** 12 query heads, 4 key/value heads
- **MLP intermediate size:** 2048
- **Activation:** SiLU/SwiGLU
- **Position encoding:** RoPE, theta 10000
- **Vocabulary size:** 186
- **Context length used by tokenizer:** 1000 tokens
- **Checkpoint:** `step_2600000`
## Training Data
PermuFormer was trained autoregressively on synthetic permutation examples generated with exact combinatorial algorithms. The paper describes a dataset of 39.8M instances, approximately 2.66B tokens, over the symmetric groups `S_2` through `S_11`.
Training tasks cover three broad families:
- **Translation between encodings:** one-line notation, cycle notation, reduced Coxeter expressions, RSK tableaux, inversion vectors, and Lehmer codes.
- **Permutation statistics and properties:** length, descents, fixed points, sign/parity, cycle type, RSK shape, pattern avoidance, longest increasing/decreasing subsequences, and related statistics.
- **Algebraic operations and comparisons:** product/composition, inverse, powers, conjugation, commutator, relative products, multiplication by simple transpositions, complement, reverse, descent tests, and Bruhat order.
Some targets include computational witnesses before the final answer, for example inversion lists before a length answer or pattern witnesses before an avoidance answer.
## Usage
Use deterministic decoding for most evaluation-style tasks. Make sure special token IDs come from the tokenizer.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_id = "YOUR_ORG/permuformer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()
prompt = (
"<|endoftext|> n3 "
"1linebegin [ 3 , 1 , 2 ] 1lineend "
"in cyclenotationmake ="
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
output_ids = model.generate(
**inputs,
max_new_tokens=80,
do_sample=False,
eos_token_id=tokenizer.eos_token_id,
pad_token_id=tokenizer.pad_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```
### Prompt Format
Training data is represented as lists of token strings. When writing prompts as plain text, separate every token with spaces. Multi-digit integers, delimiters, and task names are individual tokens. A typical example starts with `<|endoftext|>`, then a size token such as `n7`, then the task expression, then `=`.
Translation example:
```text
<|endoftext|> n3 1linebegin [ 3 , 1 , 2 ] 1lineend in cyclenotationmake =
```
Property example:
```text
<|endoftext|> n3 1linebegin [ 3 , 2 , 1 ] 1lineend property lengthmake =
```
Algebraic operation example:
```text
<|endoftext|> n3 1linebegin [ 2 , 1 , 3 ] 1lineend inversemake =
```
## Evaluation Notes
The training code evaluates by exact match on the generated right-hand side after `=`. The local training log for this repository reports, at step 2,522,000 on a 2,560-example stratified evaluation sample:
- Overall exact match: **98.44%**
- Translation: **97.78%**
- Property/statistic tasks: **99.17%**
- Algebraic tasks: **98.36%**
These figures are from the local log and should be treated as checkpoint-adjacent repository metadata, not a full benchmark report for every downstream setting.
The paper also reports that PermuFormer is substantially more accurate than frontier general-purpose LLMs on a small held-out sample from the model's symbolic test distribution, while noting that the comparison is imperfect because PermuFormer was trained directly in this syntax.
## Finetuning
PermuFormer is designed to be finetuned on specialized permutation tasks. Experiments in the paper include:
- 231-avoidance and 2143-avoidance
- mHeight
- Schubert polynomial structure constants
- Kazhdan-Lusztig polynomial degree prediction
The repository's finetuning scripts compare starting from this pretrained checkpoint with training the same architecture from scratch.
## Limitations
- This is a specialist symbolic model. It expects the exact whitespace-tokenized syntax used during training and is brittle to natural-language paraphrases or malformed prompts.
- The model is trained on permutations of sizes represented in the training data, primarily `S_2` through `S_11`; behavior outside that regime is not guaranteed.
- Exact-match accuracy depends on canonical output formatting. Some mathematical tasks may have multiple valid answers, but evaluation expects the chosen canonical form.
- The model focuses on permutations. It does not natively handle broader combinatorial structures such as arbitrary graphs or partitions unless encoded through the supported task syntax.
- Outputs should be verified by exact combinatorial software for research-critical use.
## Citation
If you use this model, please cite the accompanying PermuFormer paper once citation details are available.
|