---
license: cc-by-sa-4.0
library_name: transformers
tags:
- oeis
- qwen3
- causal-lm
- checkpoint
---
# NextTerm-440M Checkpoints

Transformers-compatible checkpoints from the OEIS NextTerm-440M run.

These checkpoints use a Qwen3-style causal LM architecture with a 16-token OEIS digit vocabulary. They were converted from the training checkpoints by remapping the custom interleaved RoPE basis into the Hugging Face / Qwen split-half RoPE basis, so they can be loaded directly with `AutoModelForCausalLM`.
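
To illustrate the RoPE remapping described above, here is a minimal sketch of the index permutation that turns an interleaved pairing (dimensions `2k` and `2k+1` rotate together) into a split-half pairing (dimensions `k` and `k + d/2` rotate together). The actual conversion script is not part of this card, so the function names below are hypothetical, and applying the permutation per attention head to the rows of the query and key projections is an assumption about how such a conversion is usually done.

```python
def split_half_order(head_dim):
    """Return source indices such that new[i] = old[order[i]].

    Interleaved RoPE rotates pairs (0, 1), (2, 3), ...; split-half RoPE
    rotates pairs (0, d/2), (1, d/2 + 1), ...  Taking all even indices
    first and all odd indices second moves each pair into place.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even for RoPE")
    return list(range(0, head_dim, 2)) + list(range(1, head_dim, 2))


def permute_head_rows(rows):
    """Reorder the rows of one head's projection matrix (hypothetical helper)."""
    return [rows[i] for i in split_half_order(len(rows))]


# Toy head with head_dim = 8; each "row" is labelled by a letter.
print(split_half_order(8))                   # [0, 2, 4, 6, 1, 3, 5, 7]
print(permute_head_rows(list("abcdefgh")))   # ['a', 'c', 'e', 'g', 'b', 'd', 'f', 'h']
```

Because the same permutation would be applied to both queries and keys, their dot products are unchanged; only the pairing of dimensions seen by the rotation differs.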
## Checkpoints

| Folder | Tokens trained | Notes |
| --- | ---: | --- |
| `checkpoints/final_latest` | 13,999,999,995 | Final checkpoint; recommended default |
| `checkpoints/best_val` | 9,500,200,875 | Best validation-loss checkpoint |
| `checkpoints/checkpoint_tokens_012000258345` | 12,000,258,345 | Historical checkpoint |
| `checkpoints/checkpoint_tokens_012500265837` | 12,500,265,837 | Historical checkpoint |
| `checkpoints/checkpoint_tokens_013000266889` | 13,000,266,889 | Historical checkpoint |
| `checkpoints/checkpoint_tokens_013500289737` | 13,500,289,737 | Historical checkpoint |
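
The historical folder names in the table encode the token count as a zero-padded integer. The sketch below recovers that count from a folder name, which can be handy when sorting or selecting checkpoints programmatically. The helper name `tokens_from_folder` is hypothetical, and it returns `None` for `final_latest` and `best_val`, whose names do not carry a count.

```python
import re


def tokens_from_folder(folder):
    """Return the token count encoded in a `checkpoint_tokens_*` folder name, else None."""
    match = re.search(r"checkpoint_tokens_(\d+)$", folder)
    return int(match.group(1)) if match else None


folders = [
    "checkpoints/final_latest",
    "checkpoints/best_val",
    "checkpoints/checkpoint_tokens_012000258345",
    "checkpoints/checkpoint_tokens_013500289737",
]

for folder in folders:
    print(folder, tokens_from_folder(folder))
```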
## OEIS Vocab

The model is token-ID based; no text tokenizer is included.

| Token ID | Meaning |
| ---: | --- |
| `0`-`9` | decimal digits |
| `10` | negative sign |
| `11` | term separator |
| `12` | BOS |
| `13` | EOS |
| `14` | PAD |
| `15` | reserved |

For next-term generation, stop on any of `[11, 13, 14]`.
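
Since no tokenizer is shipped, input IDs have to be built by hand from the table above. The following is a minimal sketch of an encoder; `encode_terms` is a hypothetical helper, and it assumes that digits are written most-significant first and that the negative sign precedes the digits of a term. Neither assumption is stated explicitly in this card, so check them against the training data format before relying on them.

```python
BOS, EOS, PAD = 12, 13, 14
NEG, SEP = 10, 11


def encode_terms(terms):
    """Encode a list of integers as BOS followed by each term and a separator."""
    ids = [BOS]
    for term in terms:
        if term < 0:
            ids.append(NEG)  # assumed: sign token comes before the digits
        ids.extend(int(d) for d in str(abs(term)))  # assumed: most-significant digit first
        ids.append(SEP)
    return ids


print(encode_terms([1, 2, 3]))    # [12, 1, 11, 2, 11, 3, 11]
print(encode_terms([-12, 305]))   # [12, 10, 1, 2, 11, 3, 0, 5, 11]
```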
## Loading

```python
import torch
from transformers import AutoModelForCausalLM

# `subfolder` selects one of the checkpoint folders listed in the table above.
model = AutoModelForCausalLM.from_pretrained(
    "N8Programs/NextTerm-440M-Checkpoints",
    subfolder="checkpoints/final_latest",
    dtype=torch.bfloat16,
    device_map="auto",
)
```
Example input IDs for the prefix `1, 2, 3, ...` (BOS, then each term followed by the term separator):

```python
# 12 = BOS, 11 = term separator; the trailing separator asks for the next term.
input_ids = torch.tensor([[12, 1, 11, 2, 11, 3, 11]], device=model.device)

# Greedy decoding that stops at the first separator, EOS, or PAD token.
out = model.generate(
    input_ids,
    max_new_tokens=192,
    do_sample=False,
    eos_token_id=[11, 13, 14],
    pad_token_id=14,
)
```
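
The generated IDs still need to be turned back into an integer. Here is a minimal sketch of a decoder for the newly generated tokens; `decode_next_term` is a hypothetical helper, and it assumes the same digit order and sign placement as the prompt (most-significant digit first, sign before the digits). It returns `None` for output it cannot interpret rather than guessing.

```python
NEG, SEP, EOS, PAD = 10, 11, 13, 14
STOP_IDS = {SEP, EOS, PAD}


def decode_next_term(new_ids):
    """Turn generated token IDs into an int, or None if they are malformed."""
    digits = []
    negative = False
    for pos, tok in enumerate(new_ids):
        if tok in STOP_IDS:
            break
        if tok == NEG and pos == 0:
            negative = True
        elif 0 <= tok <= 9:
            digits.append(str(tok))
        else:
            return None  # misplaced sign, BOS, or the reserved token
    if not digits:
        return None
    value = int("".join(digits))
    return -value if negative else value


print(decode_next_term([4, 11]))           # 4
print(decode_next_term([10, 1, 2, 13]))    # -12
print(decode_next_term([11]))              # None
```

With the tensors from the example above, the new tokens can be obtained as `out[0, input_ids.shape[1]:].tolist()` before being passed to the helper.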
## Evaluation Notes

OEIS Eval Neo excludes exact packed-sequence overlaps with the training data and uses `max_new_tokens=192`, which is sufficient for every answer in that eval set.

Known OEIS Eval Neo results:

| Checkpoint | Accuracy |
| --- | ---: |
| `final_latest` | 6545 / 19034 = 34.39% |
| `best_val` | 6477 / 19034 = 34.03% |

Each checkpoint folder includes an `oeis_checkpoint_meta.json` file with training tokens, source checkpoint path, and conversion details.
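
The accuracy percentages in the results table above can be rechecked from the raw counts with a few lines of arithmetic. The counts are taken from the table; the helper name `accuracy_percent` is only illustrative.

```python
# (correct, total) pairs copied from the results table.
results = {
    "final_latest": (6545, 19034),
    "best_val": (6477, 19034),
}


def accuracy_percent(correct, total):
    """Return accuracy as a percentage rounded to two decimals."""
    return round(100 * correct / total, 2)


for name, (correct, total) in results.items():
    print(f"{name}: {correct} / {total} = {accuracy_percent(correct, total):.2f}%")
```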