NextTerm-440M Checkpoints
Transformers-compatible checkpoints from the OEIS NextTerm-440M run.
These checkpoints use a Qwen3-style causal LM architecture with a 16-token OEIS digit vocabulary. They were converted from the training checkpoints by remapping the custom interleaved RoPE basis into the Hugging Face / Qwen split-half RoPE basis, so they can be loaded directly with AutoModelForCausalLM.
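The relationship between the two RoPE layouts can be illustrated with a small pure-Python sketch. This is not the conversion script used for these checkpoints; the function names, the `base=10000.0` default, and the head dimension of 8 are illustrative assumptions. It shows that reordering an interleaved-layout vector with a fixed permutation lets the split-half rotation produce the same result.

```python
import math

def interleaved_to_split_half(head_dim):
    """Index order that moves interleaved pairs (2i, 2i+1) to split-half pairs (i, i + head_dim // 2)."""
    return list(range(0, head_dim, 2)) + list(range(1, head_dim, 2))

def rope_interleaved(x, pos, base=10000.0):
    """Rotate adjacent pairs (x[2i], x[2i+1]). base is a placeholder value, not taken from the checkpoints."""
    d = len(x)
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out

def rope_split_half(x, pos, base=10000.0):
    """Rotate pairs (x[i], x[i + d/2]), the rotate_half convention used by Hugging Face Qwen-style models."""
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        theta = pos * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out

perm = interleaved_to_split_half(8)
print(perm)  # [0, 2, 4, 6, 1, 3, 5, 7]

# Rotating in the interleaved layout and then permuting matches
# permuting first and then rotating in the split-half layout.
x = [0.1 * k for k in range(8)]
lhs = [rope_interleaved(x, pos=5)[j] for j in perm]
rhs = rope_split_half([x[j] for j in perm], pos=5)
assert all(abs(p - q) < 1e-12 for p, q in zip(lhs, rhs))
```

In a weight conversion, a permutation of this kind would typically be applied per attention head to the output rows of the query and key projections, so that the stored weights already produce vectors in the split-half layout.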
Checkpoints
| Folder | Tokens trained | Notes |
|---|---|---|
| checkpoints/final_latest | 13,999,999,995 | Final checkpoint; recommended default |
| checkpoints/best_val | 9,500,200,875 | Best validation-loss checkpoint |
| checkpoints/checkpoint_tokens_012000258345 | 12,000,258,345 | Historical checkpoint |
| checkpoints/checkpoint_tokens_012500265837 | 12,500,265,837 | Historical checkpoint |
| checkpoints/checkpoint_tokens_013000266889 | 13,000,266,889 | Historical checkpoint |
| checkpoints/checkpoint_tokens_013500289737 | 13,500,289,737 | Historical checkpoint |
OEIS Vocab
The model is token-ID based; no text tokenizer is included.
| Token ID | Meaning |
|---|---|
| 0-9 | decimal digits |
| 10 | negative sign |
| 11 | term separator |
| 12 | BOS |
| 13 | EOS |
| 14 | PAD |
| 15 | reserved |
For next-term generation, stop on any of the token IDs [11, 13, 14] (term separator, EOS, or PAD).
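Because no tokenizer is shipped, input IDs have to be built by hand. The helper below is a minimal sketch of one way to do that, following the layout of the example prefix on this card (BOS, then each term followed by a separator). The function name `encode_terms` is hypothetical, and two details are assumptions not stated here: digits are written most significant first, and the negative-sign token precedes the digits.

```python
BOS, SEP, NEG = 12, 11, 10  # token IDs from the vocabulary table

def encode_terms(terms):
    """Encode integer terms as BOS, then each term's digits followed by a separator."""
    ids = [BOS]
    for term in terms:
        if term < 0:
            ids.append(NEG)  # assumption: the sign token comes before the digits
        # assumption: digits are emitted most significant first
        ids.extend(int(ch) for ch in str(abs(term)))
        ids.append(SEP)
    return ids

print(encode_terms([1, 2, 3]))  # [12, 1, 11, 2, 11, 3, 11]
```

The result can be wrapped as `torch.tensor([encode_terms(terms)])` before calling the model.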
Loading
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "N8Programs/NextTerm-440M-Checkpoints",
    subfolder="checkpoints/final_latest",
    dtype=torch.bfloat16,
    device_map="auto",
)
Example input IDs for the prefix 1, 2, 3, ...:
input_ids = torch.tensor([[12, 1, 11, 2, 11, 3, 11]], device=model.device)

out = model.generate(
    input_ids,
    max_new_tokens=192,
    do_sample=False,
    eos_token_id=[11, 13, 14],
    pad_token_id=14,
)
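The generated IDs still need to be turned back into an integer. The sketch below shows one way to do it on a plain Python list of token IDs. `decode_next_term` is a hypothetical helper, and treating the sign token as a leading minus and ignoring the reserved token 15 are assumptions.

```python
STOP_IDS = {11, 13, 14}  # term separator, EOS, PAD
NEG = 10                 # negative sign

def decode_next_term(generated_ids, prompt_len):
    """Return the first generated term as an int, or None if no digits were produced."""
    digits = []
    negative = False
    for tok in generated_ids[prompt_len:]:
        if tok in STOP_IDS:
            break
        if tok == NEG:
            negative = True  # assumption: the sign token marks a negative term
        elif 0 <= tok <= 9:
            digits.append(str(tok))
        # any other token (for example the reserved ID 15) is skipped
    if not digits:
        return None
    value = int("".join(digits))
    return -value if negative else value
```

With the tensors from the example above, this would be called as `decode_next_term(out[0].tolist(), input_ids.shape[1])`.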
Evaluation Notes
OEIS Eval Neo excludes exact packed-sequence overlaps with the training data and uses max_new_tokens=192, which is sufficient for every answer in that eval set.
Known OEIS Eval Neo results:
| Checkpoint | Accuracy |
|---|---|
| final_latest | 6545 / 19034 = 34.39% |
| best_val | 6477 / 19034 = 34.03% |
Each checkpoint folder includes an oeis_checkpoint_meta.json file with training tokens, source checkpoint path, and conversion details.
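A minimal sketch for inspecting that metadata, assuming the checkpoint folder has already been downloaded locally. `read_checkpoint_meta` is a hypothetical helper, and no key names are assumed; it simply returns whatever the file contains.

```python
import json
from pathlib import Path

def read_checkpoint_meta(checkpoint_dir):
    """Load oeis_checkpoint_meta.json from a locally downloaded checkpoint folder."""
    meta_path = Path(checkpoint_dir) / "oeis_checkpoint_meta.json"
    with meta_path.open(encoding="utf-8") as f:
        return json.load(f)

# Example usage (the local path is a placeholder):
# meta = read_checkpoint_meta("checkpoints/final_latest")
# for key, value in meta.items():
#     print(f"{key}: {value}")
```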