---
library_name: transformers
pipeline_tag: text-generation
tags:
- math
- combinatorics
- permutations
- algebraic-combinatorics
- llama
- causal-lm
---

# PermuFormer

PermuFormer is a small Llama-style causal language model trained on symbolic permutation tasks from algebraic combinatorics. It is intended as a specialist base model for permutation representation, reasoning, and finetuning experiments rather than as a general natural-language assistant.

The model operates on a compact word-level vocabulary for permutation syntax. Training examples are stored as pre-tokenized lists of tokens; at inference time, the Hugging Face tokenizer can also consume equivalent whitespace-separated strings. Prompts are formulaic equations: the left side specifies a permutation task and generation begins after the `=` token.

## Model Details

- **Architecture:** `LlamaForCausalLM`
- **Parameters:** about 75.7M
- **Layers:** 12
- **Hidden size:** 768
- **Attention heads:** 12 query heads, 4 key/value heads
- **MLP intermediate size:** 2048
- **Activation:** SiLU/SwiGLU
- **Position encoding:** RoPE, theta 10000
- **Vocabulary size:** 186
- **Context length used by tokenizer:** 1000 tokens
- **Checkpoint:** `step_2600000`

## Training Data

PermuFormer was trained autoregressively on synthetic permutation examples generated with exact combinatorial algorithms. The paper describes a dataset of 39.8M instances, approximately 2.66B tokens, over the symmetric groups `S_2` through `S_11`.

Training tasks cover three broad families:

- **Translation between encodings:** one-line notation, cycle notation, reduced Coxeter expressions, RSK tableaux, inversion vectors, and Lehmer codes.
- **Permutation statistics and properties:** length, descents, fixed points, sign/parity, cycle type, RSK shape, pattern avoidance, longest increasing/decreasing subsequences, and related statistics.
- **Algebraic operations and comparisons:** product/composition, inverse, powers, conjugation, commutator, relative products, multiplication by simple transpositions, complement, reverse, descent tests, and Bruhat order.

Some targets include computational witnesses before the final answer, for example inversion lists before a length answer or pattern witnesses before an avoidance answer.

## Usage

Use deterministic decoding for most evaluation-style tasks. Make sure special token IDs come from the tokenizer.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "YOUR_ORG/permuformer"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

prompt = (
    "<|endoftext|> n3 "
    "1linebegin [ 3 , 1 , 2 ] 1lineend "
    "in cyclenotationmake ="
)

inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=80,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```

### Prompt Format

Training data is represented as lists of token strings. When writing prompts as plain text, separate every token with spaces. Multi-digit integers, delimiters, and task names are individual tokens.

A typical example starts with `<|endoftext|>`, then a size token such as `n7`, then the task expression, then `=`.

Translation example:

```text
<|endoftext|> n3 1linebegin [ 3 , 1 , 2 ] 1lineend in cyclenotationmake =
```

Property example:

```text
<|endoftext|> n3 1linebegin [ 3 , 2 , 1 ] 1lineend property lengthmake =
```

Algebraic operation example:

```text
<|endoftext|> n3 1linebegin [ 2 , 1 , 3 ] 1lineend inversemake =
```

## Evaluation Notes

The training code evaluates by exact match on the generated right-hand side after `=`.
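
A minimal sketch of exact-match scoring on the right-hand side, written for illustration and not taken from the repository's training code. It assumes the decoded output is a whitespace-separated token string, that the first `=` token ends the prompt, and that a later `<|endoftext|>` marks the end of the answer. The helper names `right_hand_side`, `exact_match`, and `exact_match_rate` are hypothetical, and the one-line-notation answer in the example is a placeholder rather than a confirmed target format.

```python
EOS = "<|endoftext|>"  # assumed end-of-sequence marker, as used in the prompt examples


def right_hand_side(decoded: str) -> list[str]:
    """Return the tokens after the first `=`, cut at the next EOS token if present."""
    tokens = decoded.split()
    if "=" not in tokens:
        return []
    rhs = tokens[tokens.index("=") + 1:]
    if EOS in rhs:
        rhs = rhs[:rhs.index(EOS)]
    return rhs


def exact_match(decoded: str, reference: str) -> bool:
    """Compare the generated right-hand side with a reference, token by token."""
    return right_hand_side(decoded) == reference.split()


def exact_match_rate(pairs) -> float:
    """Fraction of (decoded, reference) pairs that match exactly."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(exact_match(d, r) for d, r in pairs) / len(pairs)


# Illustrative strings only: the answer format below is an assumption.
decoded = (
    "<|endoftext|> n3 1linebegin [ 2 , 1 , 3 ] 1lineend inversemake = "
    "1linebegin [ 2 , 1 , 3 ] 1lineend <|endoftext|>"
)
reference = "1linebegin [ 2 , 1 , 3 ] 1lineend"
print(exact_match(decoded, reference))  # True
```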

The local training log for this repository reports, at step 2,522,000 on a 2,560-example stratified evaluation sample:

- Overall exact match: **98.44%**
- Translation: **97.78%**
- Property/statistic tasks: **99.17%**
- Algebraic tasks: **98.36%**

These figures come from the local log at a step near, but not identical to, the released checkpoint. Treat them as repository metadata, not as a full benchmark report for every downstream setting.

The paper also reports that PermuFormer is substantially more accurate than frontier general-purpose LLMs on a small held-out sample from the model's symbolic test distribution, while noting that the comparison is imperfect because PermuFormer was trained directly in this syntax.

## Finetuning

PermuFormer is designed to be finetuned on specialized permutation tasks. Experiments in the paper include:

- 231-avoidance and 2143-avoidance
- mHeight
- Schubert polynomial structure constants
- Kazhdan-Lusztig polynomial degree prediction

The repository's finetuning scripts compare starting from this pretrained checkpoint with training the same architecture from scratch.

## Limitations

- This is a specialist symbolic model. It expects the exact whitespace-tokenized syntax used during training and is brittle to natural-language paraphrases or malformed prompts.
- The model is trained on permutations of sizes represented in the training data, primarily `S_2` through `S_11`; behavior outside that regime is not guaranteed.
- Exact-match accuracy depends on canonical output formatting. Some mathematical tasks may have multiple valid answers, but evaluation expects the chosen canonical form.
- The model focuses on permutations. It does not natively handle broader combinatorial structures such as arbitrary graphs or partitions unless encoded through the supported task syntax.
- Outputs should be verified by exact combinatorial software for research-critical use.

## Citation

If you use this model, please cite the accompanying PermuFormer paper once citation details are available.