Instructions to use ACDRepo/PermuFormer with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ACDRepo/PermuFormer with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="ACDRepo/PermuFormer")# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("ACDRepo/PermuFormer") model = AutoModelForCausalLM.from_pretrained("ACDRepo/PermuFormer") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use ACDRepo/PermuFormer with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "ACDRepo/PermuFormer" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ACDRepo/PermuFormer", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/ACDRepo/PermuFormer
- SGLang
How to use ACDRepo/PermuFormer with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "ACDRepo/PermuFormer" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ACDRepo/PermuFormer", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "ACDRepo/PermuFormer" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ACDRepo/PermuFormer", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use ACDRepo/PermuFormer with Docker Model Runner:
docker model run hf.co/ACDRepo/PermuFormer
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("ACDRepo/PermuFormer")
model = AutoModelForCausalLM.from_pretrained("ACDRepo/PermuFormer")PermuFormer
PermuFormer is a small Llama-style causal language model trained on symbolic permutation tasks from algebraic combinatorics. It is intended as a specialist base model for permutation representation, reasoning, and finetuning experiments rather than as a general natural-language assistant.
The model operates on a compact word-level vocabulary for permutation syntax. Training examples are stored as pre-tokenized lists of tokens; at inference time, the Hugging Face tokenizer can also consume equivalent whitespace-separated strings. Prompts are formulaic equations: the left side specifies a permutation task and generation begins after the = token.
Model Details
- Architecture:
LlamaForCausalLM - Parameters: about 75.7M
- Layers: 12
- Hidden size: 768
- Attention heads: 12 query heads, 4 key/value heads
- MLP intermediate size: 2048
- Activation: SiLU/SwiGLU
- Position encoding: RoPE, theta 10000
- Vocabulary size: 186
- Context length used by tokenizer: 1000 tokens
- Checkpoint:
step_2600000
Training Data
PermuFormer was trained autoregressively on synthetic permutation examples generated with exact combinatorial algorithms. The paper describes a dataset of 39.8M instances, approximately 2.66B tokens, over the symmetric groups S_2 through S_11.
Training tasks cover three broad families:
- Translation between encodings: one-line notation, cycle notation, reduced Coxeter expressions, RSK tableaux, inversion vectors, and Lehmer codes.
- Permutation statistics and properties: length, descents, fixed points, sign/parity, cycle type, RSK shape, pattern avoidance, longest increasing/decreasing subsequences, and related statistics.
- Algebraic operations and comparisons: product/composition, inverse, powers, conjugation, commutator, relative products, multiplication by simple transpositions, complement, reverse, descent tests, and Bruhat order.
Some targets include computational witnesses before the final answer, for example inversion lists before a length answer or pattern witnesses before an avoidance answer.
Usage
Use deterministic decoding for most evaluation-style tasks. Make sure special token IDs come from the tokenizer.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_id = "YOUR_ORG/permuformer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()
prompt = (
"<|endoftext|> n3 "
"1linebegin [ 3 , 1 , 2 ] 1lineend "
"in cyclenotationmake ="
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
output_ids = model.generate(
**inputs,
max_new_tokens=80,
do_sample=False,
eos_token_id=tokenizer.eos_token_id,
pad_token_id=tokenizer.pad_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
Prompt Format
Training data is represented as lists of token strings. When writing prompts as plain text, separate every token with spaces. Multi-digit integers, delimiters, and task names are individual tokens. A typical example starts with <|endoftext|>, then a size token such as n7, then the task expression, then =.
Translation example:
<|endoftext|> n3 1linebegin [ 3 , 1 , 2 ] 1lineend in cyclenotationmake =
Property example:
<|endoftext|> n3 1linebegin [ 3 , 2 , 1 ] 1lineend property lengthmake =
Algebraic operation example:
<|endoftext|> n3 1linebegin [ 2 , 1 , 3 ] 1lineend inversemake =
Evaluation Notes
The training code evaluates by exact match on the generated right-hand side after =. The local training log for this repository reports, at step 2,522,000 on a 2,560-example stratified evaluation sample:
- Overall exact match: 98.44%
- Translation: 97.78%
- Property/statistic tasks: 99.17%
- Algebraic tasks: 98.36%
These figures are from the local log and should be treated as checkpoint-adjacent repository metadata, not a full benchmark report for every downstream setting.
The paper also reports that PermuFormer is substantially more accurate than frontier general-purpose LLMs on a small held-out sample from the model's symbolic test distribution, while noting that the comparison is imperfect because PermuFormer was trained directly in this syntax.
Finetuning
PermuFormer is designed to be finetuned on specialized permutation tasks. Experiments in the paper include:
- 231-avoidance and 2143-avoidance
- mHeight
- Schubert polynomial structure constants
- Kazhdan-Lusztig polynomial degree prediction
The repository's finetuning scripts compare starting from this pretrained checkpoint with training the same architecture from scratch.
Limitations
- This is a specialist symbolic model. It expects the exact whitespace-tokenized syntax used during training and is brittle to natural-language paraphrases or malformed prompts.
- The model is trained on permutations of sizes represented in the training data, primarily
S_2throughS_11; behavior outside that regime is not guaranteed. - Exact-match accuracy depends on canonical output formatting. Some mathematical tasks may have multiple valid answers, but evaluation expects the chosen canonical form.
- The model focuses on permutations. It does not natively handle broader combinatorial structures such as arbitrary graphs or partitions unless encoded through the supported task syntax.
- Outputs should be verified by exact combinatorial software for research-critical use.
Citation
If you use this model, please cite the accompanying PermuFormer paper once citation details are available.
- Downloads last month
- 1
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="ACDRepo/PermuFormer")