NextTerm-440M

Model Summary

NextTerm-440M is a 440M parameter causal transformer trained to continue integer sequences. It uses a Qwen3 architecture with a compact 16-token digit vocabulary: decimal digits, negative sign, comma separator, BOS, EOS, PAD, and one unused token.

The model was trained on an extended OEIS corpus: many OEIS sequences were extended with additional terms from b-files (supplemental files provided with OEIS entries), and the data was then further augmented with a variety of prefix-preserving transforms selected empirically through small pilot experiments. Training ran for 14B tokens with sequence prefixes preserved rather than with distinct documents concatenated, since this improved performance in pilot experiments.

NextTerm-440M improves dramatically over NextTerm-47M on long-context sequence continuation (it was trained with a context length of 4096), innate OEIS knowledge, and long-range in-context learning. The 47M model, however, remains ahead on very short prefixes that require simple rule induction without much context, possibly because the 440M model was trained on longer contexts and more complex sequences.

The tokenizer accepts integer sequences formatted as comma-separated values, for example:

1,-2,3,-4,

The tokenizer ignores characters other than digits, commas, and -. Digits are tokenized individually, so there is no fixed integer-magnitude limit, but large integers consume more context. The model was not trained on numbers with leading zeros, so strings like 01,02,03, should be treated as out of distribution.
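The input rules above can be mirrored in a small pre-processing helper. This is a minimal sketch, not the model's tokenizer: `normalize_prompt`, `has_leading_zero`, and `approx_token_count` are hypothetical helper names, and the token count assumes that each kept character maps to exactly one token, with special tokens such as BOS and EOS not counted.

```python
ALLOWED = set("0123456789,-")

def normalize_prompt(text: str) -> str:
    """Drop every character outside the digit vocabulary (digits, ',', '-')."""
    return "".join(ch for ch in text if ch in ALLOWED)

def has_leading_zero(prompt: str) -> bool:
    """Return True if any term has a leading zero, such as '01' or '-007'."""
    for term in prompt.split(","):
        digits = term.lstrip("-")
        if len(digits) > 1 and digits.startswith("0"):
            return True
    return False

def approx_token_count(prompt: str) -> int:
    """Assumed count: one token per kept character, special tokens excluded."""
    return len(normalize_prompt(prompt))

prompt = normalize_prompt("1, -2, 3, -4, ")
print(prompt)                         # 1,-2,3,-4,
print(has_leading_zero("01,02,03,"))  # True
print(approx_token_count(prompt))     # 10
```

Checking for leading zeros before inference is a cheap way to catch out-of-distribution prompts, and the character count gives a rough sense of how much of the 4096-token context a prompt with large integers will use.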

Training Details

| Field | Value |
| --- | --- |
| Parameters | 440,500,224 |
| Architecture | Qwen3-style causal LM |
| Layers | 28 |
| Hidden size | 1024 |
| FFN size | 3072 |
| Attention heads | 16 |
| KV heads | 8 |
| Vocabulary size | 16 |
| Training tokens | 13,999,999,995 |
| Sequence length cap | 4096 training tokens per sequence |
| Batch mode | Length-bucketed sequence batches |
| Optimizer | Muon/AdamW hybrid |
| LR schedule | Linear warmup to `1e-2` for Muon, `1e-4` for AdamW, cosine decay to 0.1x, final cooldown to 0 |
| Training hardware | Single H100 |
| Export dtype | bfloat16 |
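
As a sanity check, the parameter count can be reproduced from the other table values. The sketch below relies on assumptions that the model card does not state: a per-head dimension of 128, untied input and output embeddings, no bias terms, and per-head RMSNorm weights on queries and keys, as in the standard Qwen3 layout. Under those assumptions the arithmetic lands exactly on the reported total.

```python
# Values taken from the Training Details table.
layers, hidden, ffn = 28, 1024, 3072
heads, kv_heads, vocab = 16, 8, 16

# Assumption (not stated in the model card): head_dim = 128 as in Qwen3.
head_dim = 128

attn = (
    hidden * heads * head_dim            # q_proj
    + 2 * hidden * kv_heads * head_dim   # k_proj and v_proj
    + heads * head_dim * hidden          # o_proj
)
mlp = 3 * hidden * ffn                   # gate_proj, up_proj, down_proj
norms = 2 * hidden + 2 * head_dim        # two layer norms plus q_norm and k_norm

per_layer = attn + mlp + norms
# Final norm, plus separate (assumed untied) embedding and output matrices.
total = layers * per_layer + hidden + 2 * vocab * hidden

print(f"{total:,}")  # 440,500,224
```

With a 16-token vocabulary the embedding and output matrices contribute only 32,768 parameters, so almost the entire budget sits in the transformer blocks.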

A classic Muon/AdamW hybrid was used: Muon for 2D weight matrices and AdamW for 1D parameters and embedding matrices.
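
The routing rule above can be sketched as a partition over parameter names and shapes. This is an illustration under stated assumptions, not the training code: `split_for_hybrid` is a hypothetical helper, the parameter names and shapes in the example are illustrative, and treating `lm_head` as an embedding matrix is an assumption.

```python
from typing import Dict, List, Tuple

def split_for_hybrid(shapes: Dict[str, Tuple[int, ...]]) -> Tuple[List[str], List[str]]:
    """Return (muon_names, adamw_names) following the rule in the text."""
    muon, adamw = [], []
    for name, shape in shapes.items():
        # Name-based embedding detection is an assumption for this sketch.
        is_embedding = "embed" in name or "lm_head" in name
        if len(shape) == 2 and not is_embedding:
            muon.append(name)    # 2D weight matrices
        else:
            adamw.append(name)   # 1D parameters and embedding matrices
    return muon, adamw

example = {
    "model.embed_tokens.weight": (16, 1024),
    "model.layers.0.self_attn.q_proj.weight": (2048, 1024),
    "model.layers.0.mlp.down_proj.weight": (1024, 3072),
    "model.layers.0.input_layernorm.weight": (1024,),
    "lm_head.weight": (16, 1024),
}
muon, adamw = split_for_hybrid(example)
print(muon)
print(adamw)
```

In a real training loop, each group would then be handed to its own optimizer instance with the peak learning rate listed in the table.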

The model was trained on the following files in the N8Programs/oeis-massive dataset, randomly mixed:

oeis_train_bfile_prefix4096.packed
oeis_synth_aug0_inv_len_13245370099_seed0.packed

Evaluation Results

Main Benchmarks

| Model | OEIS-Eval-Neo | Ryskina & Knight | M1 Competition 111 MAPE |
| --- | --- | --- | --- |
| NextTerm-440M | 34.43% | 52.63% | 17.6239 |
| NextTerm-47M | 29.49% | 70.18% | 18.7621 |
| Qwen3-0.6B | 18.44% | 33.33% | 22.7984 |
| Qwen3-1.7B | 20.77% | 49.12% | 22.2411 |
| Qwen3-4B | 23.74% | 63.16% | 19.1731 |
| Qwen3-8B | 24.62% | 57.89% | 18.4027 |
| Qwen3-14B | 26.00% | 59.65% | 17.9837 |

OEIS-Eval-Neo is a decontaminated held-out OEIS next-term evaluation. M1 Competition 111 reports macro MAPE, where lower is better. Ryskina & Knight (2021) is a 57-sequence next-term benchmark based on psychometrics and puzzles. Note that the 47M model's strong performance on Ryskina & Knight is indicative of its strength on short-prefix sequences and rule induction.
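
For readers unfamiliar with the metric, here is a minimal sketch of macro MAPE. It assumes that "macro" means an unweighted average of per-series MAPE values, which may differ in detail from what `eval_m1_competition_mape_mlx.py` computes. The helper names are hypothetical, and series containing a zero actual value are not handled.

```python
from typing import Iterable, Sequence, Tuple

def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error (in percent) for one series."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def macro_mape(pairs: Iterable[Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Average the per-series MAPE so that every series has equal weight."""
    scores = [mape(actual, predicted) for actual, predicted in pairs]
    return sum(scores) / len(scores)

pairs = [
    ([100, 200], [110, 180]),  # per-series MAPE: 10%
    ([50], [40]),              # per-series MAPE: 20%
]
print(round(macro_mape(pairs), 4))  # 15.0
```

Averaging per series rather than per point keeps long series from dominating the score.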

Polynomial Continuation

The polynomial continuation evaluation samples integer sequences from polynomials of degree 1 through 4 and asks for the next term. Accuracy is exact match across 200 samples for each prompt length k.

| Model | Arithmetic | Quadratic | Cubic | Quartic |
| --- | --- | --- | --- | --- |
| NextTerm-440M | 94.38% | 86.39% | 75.20% | 67.83% |
| NextTerm-47M | 94.15% | 81.07% | 37.43% | 15.17% |
| Qwen3-0.6B | 90.31% | 8.72% | 0.30% | 0.02% |
| Qwen3-1.7B | 93.10% | 41.57% | 5.36% | 0.71% |
| Qwen3-4B | 93.90% | 77.26% | 28.18% | 5.98% |
| Qwen3-8B | 96.10% | 80.59% | 32.93% | 7.95% |
| Qwen3-14B | 95.60% | 84.61% | 49.16% | 14.98% |
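
A polynomial continuation item of this kind can be sketched as follows. This is an illustration rather than the logic of `arithmetic_eval.py`: the coefficient range, the sampling scheme, and all function names are hypothetical. A finite-difference extrapolator is included as a reference predictor, since it recovers the next term exactly whenever the prompt has more terms than the polynomial's degree.

```python
import random
from typing import List

def sample_polynomial_sequence(degree: int, k: int, rng: random.Random) -> List[int]:
    """Return k + 1 terms of a random integer polynomial; the last is the target."""
    # Coefficient range is a hypothetical choice; the leading one is non-zero.
    coeffs = [rng.randint(-9, 9) for _ in range(degree)]
    coeffs.append(rng.choice([c for c in range(-9, 10) if c != 0]))
    return [sum(c * n**i for i, c in enumerate(coeffs)) for n in range(k + 1)]

def to_prompt(terms: List[int]) -> str:
    """Format terms as a comma-terminated prompt."""
    return ",".join(str(t) for t in terms) + ","

def finite_difference_next(terms: List[int]) -> int:
    """Reference predictor: extrapolate with repeated finite differences."""
    rows = [list(terms)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    # The next term is the sum of the last entry of every difference row.
    return sum(row[-1] for row in rows)

rng = random.Random(0)
seq = sample_polynomial_sequence(degree=3, k=8, rng=rng)
prompt, target = to_prompt(seq[:-1]), seq[-1]
print(prompt, "->", target)
print(finite_difference_next(seq[:-1]) == target)  # True
```

Exact-match accuracy is then the fraction of sampled items for which the model's parsed next term equals `target`.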

Usage

MLX

```
mlx_lm.generate --model N8Programs/NextTerm-440M --prompt "1,2,3,"
```

Hugging Face Transformers

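```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the bfloat16 weights, placing them on the available device.
model_name = "N8Programs/NextTerm-440M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prompts are comma-separated integers with a trailing comma.
prompt = "1,2,3,4,5,"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding that stops at the first generated comma or at EOS,
# so a single next term is produced.
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=False,
    eos_token_id=[tokenizer.convert_tokens_to_ids(","), tokenizer.eos_token_id],
    pad_token_id=tokenizer.pad_token_id,
)

# The decoded text contains the prompt followed by the predicted term.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```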

For strict next-term evaluation, stop generation on comma or EOS and parse the text before the first comma as the predicted integer.
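
The parsing step can be sketched as below. `parse_next_term` is a hypothetical helper. It assumes the decoded text begins with the prompt, as when decoding the full output of `model.generate`, and it returns `None` when no well-formed integer precedes the first comma.

```python
import re
from typing import Optional

def parse_next_term(decoded: str, prompt: str) -> Optional[int]:
    """Extract the first predicted integer from decoded text that starts with the prompt."""
    continuation = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    first = continuation.split(",", 1)[0].strip()
    # Accept only a well-formed integer such as "6" or "-12".
    if re.fullmatch(r"-?\d+", first):
        return int(first)
    return None

print(parse_next_term("1,2,3,4,5,6,", "1,2,3,4,5,"))  # 6
print(parse_next_term("1,-2,3,-4,5", "1,-2,3,-4,"))   # 5
print(parse_next_term("1,2,3,", "1,2,3,"))            # None
```

Returning `None` for empty or malformed continuations lets an evaluation loop count them as misses instead of raising an error.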

Reproducibility

This repository contains the local evaluation scripts and artifacts used for the results above, including the small evaluation datasets needed to rerun them:

oeis_eval_mlx_neo.py for OEIS-Eval-Neo with MLX batch generation.
arithmetic_eval.py for arithmetic/quadratic/cubic/quartic continuation.
eval_m1_competition_mape_mlx.py for M1 Competition 111 MAPE.
oeis_val_neo.jsonl for OEIS-Eval-Neo.
m1_competition_111.jsonl for M1 Competition 111.
eval_results.txt for the compact result table.

The last three training checkpoints are available separately at N8Programs/NextTerm-440M-Checkpoints. The released final_latest checkpoint was trained for 14B tokens. The checkpoint with the best validation loss is also available, although it is not included in the main results table because it performed worse on downstream evaluations.

The .packed files used for training are binary files containing the tokenized and augmented OEIS data, with tokens encoded as nibbles. A dedicated decoder is provided in this repo as decode_packed_oeis.py.
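
Because the vocabulary has 16 entries, each token id fits in 4 bits, so two tokens fit in one byte. The sketch below illustrates nibble packing in general. It is not the repository's decoder: the nibble order, padding, and any header layout of the actual .packed files are assumptions here, and `decode_packed_oeis.py` should be used for real data.

```python
from typing import List

def unpack_nibbles(data: bytes, high_first: bool = True) -> List[int]:
    """Split each byte into two 4-bit token ids (values 0 to 15)."""
    tokens: List[int] = []
    for byte in data:
        hi, lo = byte >> 4, byte & 0x0F
        tokens.extend((hi, lo) if high_first else (lo, hi))
    return tokens

def pack_nibbles(tokens: List[int]) -> bytes:
    """Inverse of unpack_nibbles(high_first=True); pads odd lengths with 0."""
    padded = list(tokens) + [0] * (len(tokens) % 2)
    return bytes((a << 4) | b for a, b in zip(padded[0::2], padded[1::2]))

packed = pack_nibbles([1, 10, 2, 15])
print(packed.hex())            # 1a2f
print(unpack_nibbles(packed))  # [1, 10, 2, 15]
```

Storing tokens this way halves the on-disk size compared with one byte per token.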

Citation

@misc{nextterm440m2026,
  author       = {Nathan Breslow},
  title        = {NextTerm-440M: A Pretrained Transformer for Integer Sequence Prediction},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/N8Programs/NextTerm-440M}},
  note         = {440.5M parameter model trained on augmented OEIS data}
}

Attribution

This model and dataset were trained and created using data from the On-Line Encyclopedia of Integer Sequences (OEIS).

Source: https://oeis.org/
License: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)
OEIS End-User License Agreement: https://oeis.org/wiki/The_OEIS_End-User_License_Agreement
