---
license: cc-by-sa-4.0
library_name: transformers
tags:
- oeis
- qwen3
- causal-lm
- checkpoint
---

# NextTerm-440M Checkpoints

Transformers-compatible checkpoints from the OEIS NextTerm-440M run.

These checkpoints use a Qwen3-style causal LM architecture with a 16-token OEIS digit vocabulary. They were converted from the training checkpoints by remapping the custom interleaved RoPE basis into the Hugging Face / Qwen split-half RoPE basis, so they can be loaded directly with `AutoModelForCausalLM`.

## Checkpoints

| Folder | Tokens trained | Notes |
| --- | ---: | --- |
| `checkpoints/final_latest` | 13,999,999,995 | Final checkpoint; recommended default |
| `checkpoints/best_val` | 9,500,200,875 | Best validation-loss checkpoint |
| `checkpoints/checkpoint_tokens_012000258345` | 12,000,258,345 | Historical checkpoint |
| `checkpoints/checkpoint_tokens_012500265837` | 12,500,265,837 | Historical checkpoint |
| `checkpoints/checkpoint_tokens_013000266889` | 13,000,266,889 | Historical checkpoint |
| `checkpoints/checkpoint_tokens_013500289737` | 13,500,289,737 | Historical checkpoint |

## OEIS Vocab

The model is token-ID based; no text tokenizer is included.

| Token ID | Meaning |
| ---: | --- |
| `0`-`9` | decimal digits |
| `10` | negative sign |
| `11` | term separator |
| `12` | BOS |
| `13` | EOS |
| `14` | PAD |
| `15` | reserved |

For next-term generation, stop on any of `[11, 13, 14]`.

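Because no tokenizer ships with the checkpoints, a small helper for converting between integer terms and token IDs can be handy. The following is a minimal sketch, not part of this repository: the function names `encode_terms` and `decode_next_term` are hypothetical, and it assumes that the negative sign (`10`) comes immediately before a term's digits and that every term is followed by the separator (`11`), which matches the `1, 2, 3` example in this card.

```python
# Token IDs from the vocab table.
NEG, SEP, BOS, EOS, PAD = 10, 11, 12, 13, 14
STOP_IDS = {SEP, EOS, PAD}


def encode_terms(terms):
    """Encode integer terms as BOS followed by each term's digits and a separator.

    Assumption: a negative term is written as the sign token, then its digits.
    """
    ids = [BOS]
    for term in terms:
        if term < 0:
            ids.append(NEG)
        ids.extend(int(d) for d in str(abs(term)))
        ids.append(SEP)
    return ids


def decode_next_term(new_ids):
    """Decode newly generated token IDs into one integer, stopping at the first stop token.

    Returns None if no digit was produced before a stop token.
    """
    sign, digits = 1, []
    for tok in new_ids:
        if tok in STOP_IDS:
            break
        if tok == NEG:
            sign = -1
        elif 0 <= tok <= 9:
            digits.append(str(tok))
        else:
            raise ValueError(f"unexpected token ID: {tok}")
    return sign * int("".join(digits)) if digits else None


print(encode_terms([1, 2, 3]))       # [12, 1, 11, 2, 11, 3, 11]
print(decode_next_term([4, 11]))     # 4
```

`decode_next_term` expects only the newly generated IDs, for example the part of a `generate` output that follows the prompt.
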
## Loading

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "N8Programs/NextTerm-440M-Checkpoints",
    subfolder="checkpoints/final_latest",
    dtype=torch.bfloat16,
    device_map="auto",
)
```

Example input IDs for the prefix `1, 2, 3, ...`:

```python
# BOS (12), then each term's digits followed by the term separator (11).
input_ids = torch.tensor([[12, 1, 11, 2, 11, 3, 11]], device=model.device)

# Greedy decoding; stop at the first separator, EOS, or PAD so that
# exactly one new term is generated.
out = model.generate(
    input_ids,
    max_new_tokens=192,
    do_sample=False,
    eos_token_id=[11, 13, 14],
    pad_token_id=14,
)
```

## Evaluation Notes

OEIS Eval Neo excludes exact packed-sequence overlaps with the training data and uses `max_new_tokens=192`, which is sufficient for every answer in that eval set.

Known OEIS Eval Neo results:

| Checkpoint | Accuracy |
| --- | ---: |
| `final_latest` | 6545 / 19034 = 34.39% |
| `best_val` | 6477 / 19034 = 34.03% |

Each checkpoint folder includes an `oeis_checkpoint_meta.json` file with training tokens, source checkpoint path, and conversion details.
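
The reported OEIS Eval Neo percentages can be recomputed from the raw counts. This is a small sanity-check sketch: the counts are copied from the results table, and the helper name `accuracy_pct` is hypothetical.

```python
# (correct, total) counts copied from the OEIS Eval Neo results table.
results = {
    "final_latest": (6545, 19034),
    "best_val": (6477, 19034),
}


def accuracy_pct(correct, total):
    """Return accuracy as a percentage."""
    return 100.0 * correct / total


for name, (correct, total) in results.items():
    print(f"{name}: {correct} / {total} = {accuracy_pct(correct, total):.2f}%")

# Difference in correctly answered sequences between the two checkpoints.
gap = results["final_latest"][0] - results["best_val"][0]
print(f"final_latest answers {gap} more sequences correctly than best_val")
```
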