Instructions to use N8Programs/NextTerm-440M with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use N8Programs/NextTerm-440M with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="N8Programs/NextTerm-440M")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("N8Programs/NextTerm-440M")
model = AutoModelForCausalLM.from_pretrained("N8Programs/NextTerm-440M")

MLX

How to use N8Programs/NextTerm-440M with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm
# if on a CUDA device, also pip install mlx[cuda]

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("N8Programs/NextTerm-440M")

# The tokenizer keeps only digits, "-" and ",", so prompt with comma-separated integer terms
prompt = "1,1,2,3,5,8,"
text = generate(model, tokenizer, prompt=prompt, verbose=True)

vLLM

How to use N8Programs/NextTerm-440M with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "N8Programs/NextTerm-440M"
# Call the server using curl (OpenAI-compatible API).
# The tokenizer keeps only digits, "-" and ",", so the prompt is a comma-separated integer sequence:
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "N8Programs/NextTerm-440M",
		"prompt": "1,1,2,3,5,8,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/N8Programs/NextTerm-440M

SGLang

How to use N8Programs/NextTerm-440M with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "N8Programs/NextTerm-440M" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API).
# The tokenizer keeps only digits, "-" and ",", so the prompt is a comma-separated integer sequence:
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "N8Programs/NextTerm-440M",
		"prompt": "1,1,2,3,5,8,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "N8Programs/NextTerm-440M" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API).
# The tokenizer keeps only digits, "-" and ",", so the prompt is a comma-separated integer sequence:
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "N8Programs/NextTerm-440M",
		"prompt": "1,1,2,3,5,8,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

MLX LM

How to use N8Programs/NextTerm-440M with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Generate the continuation of an integer sequence (the tokenizer keeps only digits, "-" and ",")
mlx_lm.generate --model "N8Programs/NextTerm-440M" --prompt "1,1,2,3,5,8,"

Docker Model Runner
How to use N8Programs/NextTerm-440M with Docker Model Runner:
```
docker model run hf.co/N8Programs/NextTerm-440M
```

N8Programs committed on May 28

Commit fc79e8b (verified) · 1 Parent(s): ef65102

Upload folder using huggingface_hub

Files changed (7)

config.json +65 -0
generation_config.json +8 -0
model.safetensors +3 -0
oeis_checkpoint_meta.json +141 -0
special_tokens_map.json +9 -0
tokenizer.json +159 -0
tokenizer_config.json +47 -0

config.json ADDED

	@@ -0,0 +1,65 @@

+{
+  "architectures": [
+    "Qwen3ForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": 12,
+  "dtype": "bfloat16",
+  "eos_token_id": 13,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 1024,
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "layer_types": [
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention"
+  ],
+  "max_position_embeddings": 40960,
+  "max_window_layers": 28,
+  "model_type": "qwen3",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 28,
+  "num_key_value_heads": 8,
+  "pad_token_id": 14,
+  "rms_norm_eps": 1e-06,
+  "rope_parameters": {
+    "rope_theta": 1000000.0,
+    "rope_type": "default"
+  },
+  "sliding_window": null,
+  "tie_word_embeddings": false,
+  "transformers_version": "5.9.0",
+  "use_cache": true,
+  "use_sliding_window": false,
+  "vocab_size": 16,
+  "rope_theta": 1000000.0,
+  "torch_dtype": "bfloat16"
+}
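
The shapes in config.json are enough to estimate the parameter count behind the "440M" in the model name. The sketch below assumes the usual Qwen3 decoder layout (q/k/v/o projections without bias, per-head q_norm and k_norm, a gated MLP with three projections, two RMSNorms per layer, and one final norm). The helper name `count_parameters` is illustrative and not part of any library.

```python
# Values copied from config.json in this commit.
cfg = {
    "hidden_size": 1024,
    "intermediate_size": 3072,
    "num_hidden_layers": 28,
    "num_attention_heads": 16,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 16,
    "tie_word_embeddings": False,
}


def count_parameters(c):
    """Estimate the parameter count of a Qwen3-style decoder (assumed layout)."""
    h, d = c["hidden_size"], c["head_dim"]
    q_out = c["num_attention_heads"] * d    # query projection width
    kv_out = c["num_key_value_heads"] * d   # key/value projection width (GQA)

    attn = h * q_out + 2 * h * kv_out + q_out * h   # q, k, v, o projections
    qk_norm = 2 * d                                 # per-head q_norm and k_norm
    mlp = 3 * h * c["intermediate_size"]            # gate, up, down projections
    norms = 2 * h                                   # two RMSNorms per layer
    per_layer = attn + qk_norm + mlp + norms

    embed = c["vocab_size"] * h
    lm_head = 0 if c["tie_word_embeddings"] else c["vocab_size"] * h
    final_norm = h
    return c["num_hidden_layers"] * per_layer + embed + lm_head + final_norm


total = count_parameters(cfg)
print(total)      # 440500224
print(total * 2)  # 881000448 bytes at 2 bytes per bfloat16 weight
```

Under these assumptions the weights take 881,000,448 bytes, which is 35,128 bytes less than the 881,035,576-byte model.safetensors in this commit. That gap is plausibly the safetensors header, though nothing on this page confirms it.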

generation_config.json ADDED

	@@ -0,0 +1,8 @@

+{
+  "bos_token_id": 12,
+  "do_sample": false,
+  "eos_token_id": 13,
+  "pad_token_id": 14,
+  "transformers_version": "5.9.0",
+  "use_cache": true
+}
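
With `do_sample` set to false, generation is greedy: the highest-scoring token is taken at each step. The sketch below shows that loop in plain Python, stopping on the `stop_token_ids` (11, 13, 14) and using the 192-token budget recorded in oeis_checkpoint_meta.json in this commit. `make_toy_model` is a hypothetical stand-in that replays a fixed script; it is not the real model or any library API.

```python
STOP_TOKEN_IDS = {11, 13, 14}  # term separator, EOS, PAD


def greedy_next_term(next_token_logits, prompt_ids, max_new_tokens=192):
    """Pick the argmax token each step until a stop token or the budget is hit."""
    ids = list(prompt_ids)
    generated = []
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)  # one score per vocabulary entry
        token = max(range(len(logits)), key=logits.__getitem__)
        if token in STOP_TOKEN_IDS:
            break
        generated.append(token)
        ids.append(token)
    return generated


def make_toy_model(script, vocab_size=16):
    """Hypothetical stand-in for the model: replays a fixed token script."""
    it = iter(script)

    def next_token_logits(ids):
        target = next(it)
        return [1.0 if i == target else 0.0 for i in range(vocab_size)]

    return next_token_logits


# Token ids for "<bos>1,1,2,3,5,8," under the 16-entry vocabulary.
prompt = [12, 1, 11, 1, 11, 2, 11, 3, 11, 5, 11, 8, 11]
print(greedy_next_term(make_toy_model([1, 3, 11]), prompt))  # [1, 3]
```

Stopping on the separator (id 11) returns exactly one term, here the digits of 13. Stopping only on EOS (id 13), as generation_config.json alone would, lets the model continue on to further terms.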

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:84fe6511b007c9c85706f563f547525c7cd57571227940d91b1c514041316128
+size 881035576
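
The three lines above are a Git LFS pointer, not the weights themselves: they record the hash algorithm, the digest, and the byte size of the real file. As a rough sketch, a downloaded copy can be checked against the pointer with the standard library alone. The function names and the local file path in the usage note are assumptions for illustration.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and hash both match the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Stream in chunks so a large file never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:84fe6511b007c9c85706f563f547525c7cd57571227940d91b1c514041316128
size 881035576
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])  # sha256 881035576
```

For example, `matches_pointer("model.safetensors", pointer)` would check a local copy, assuming the file was saved under that name in the working directory.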

oeis_checkpoint_meta.json ADDED

	@@ -0,0 +1,141 @@

+{
+  "name": "final_latest",
+  "source_checkpoint": "/root/oeis_runs/oeis-440m-14b-full-20260525_025101/latest.pt",
+  "trained_tokens": 13999999995,
+  "trainer_state": {
+    "completed_steps": 581678,
+    "train_tokens_seen": 13999999995,
+    "last_loss": 0.5624260902404785
+  },
+  "checkpoint_args": {
+    "data": "/root/oeis-massive/packed_data/oeis_train_full_synth_plus_organic_13999999995.packed",
+    "model_backend": "custom",
+    "param_dtype": "fp32",
+    "weight_update_mode": "bf16_live_fp32_master",
+    "batch_size": 1,
+    "seq_len": 4096,
+    "steps": 5,
+    "warmup_steps": 2,
+    "target_tokens": 13999999995,
+    "max_steps": 0,
+    "batch_mode": "bucketed",
+    "pad_to_seq_len": false,
+    "index_dir": "",
+    "bucket_tokens_per_batch": 16384,
+    "bucket_token_budget_spec": "512:32768,*:24576",
+    "bucket_max_batch_size": 512,
+    "bucket_pad_multiple": 8,
+    "bucket_pad_to_upper": true,
+    "bucket_replacement": false,
+    "bucket_repeat_epochs": true,
+    "bucket_sampling": "token_mass",
+    "vocab_mode": "oeis",
+    "hidden_size": 1024,
+    "intermediate_size": 3072,
+    "num_hidden_layers": 28,
+    "num_attention_heads": 16,
+    "num_key_value_heads": 8,
+    "head_dim": 0,
+    "compile": true,
+    "compile_mode": "reduce-overhead",
+    "compile_dynamic": true,
+    "compile_skip_dynamic_cudagraphs": true,
+    "prewarm_bucket_shapes": true,
+    "prewarm_bucket_passes": 2,
+    "prewarm_restore_state": true,
+    "prewarm_verify_restore": true,
+    "prewarm_update_optimizer": true,
+    "prewarm_materialize_optimizer_state": true,
+    "gradient_checkpointing": false,
+    "native_gqa": true,
+    "amp": true,
+    "lr": 0.0003,
+    "optimizer_mode": "torch_muon_hybrid",
+    "adamw_lr": 0.0001,
+    "muon_lr": 0.01,
+    "body_lr_mult": 1.0,
+    "adamw_weight_decay": 0.01,
+    "muon_weight_decay": 0.0,
+    "muon_momentum": 0.95,
+    "muon_ns_steps": 5,
+    "muon_adjust_lr_fn": "",
+    "no_muon_nesterov": false,
+    "lr_schedule": "warmup_cosine_cooldown",
+    "lr_total_tokens": 13999999995,
+    "lr_warmup_tokens": 0,
+    "lr_warmup_fraction": 0.005,
+    "lr_decay_end_fraction": 0.95,
+    "lr_min_factor": 0.1,
+    "lr_final_factor": 0.0,
+    "checkpoint_dir": "/root/oeis_runs/oeis-440m-14b-full-20260525_025101",
+    "checkpoint_every_tokens": 500000000,
+    "checkpoint_every_steps": 0,
+    "keep_last_checkpoints": 4,
+    "resume": "/root/oeis_runs/oeis-440m-14b-full-20260525_025101/latest.pt",
+    "save_final": true,
+    "trim_final_batch": true,
+    "allow_token_overshoot": false,
+    "val_data": "/root/oeis_val_decontam.jsonl",
+    "val_format": "auto",
+    "val_index_dir": "",
+    "val_every_tokens": 500000000,
+    "val_every_steps": 0,
+    "val_batches": 256,
+    "val_batch_size": 32,
+    "val_max_examples": 0,
+    "val_max_context_tokens": 0,
+    "oeis_eval_data": "",
+    "oeis_eval_every_tokens": 0,
+    "oeis_eval_every_steps": 0,
+    "oeis_eval_batch_size": 64,
+    "oeis_eval_max_examples": 0,
+    "oeis_eval_max_new_tokens": 20,
+    "oeis_eval_max_context_tokens": 0,
+    "oeis_eval_collect_examples": 3,
+    "oeis_eval_generation_backend": "legacy",
+    "expected_loss_tokens": 0,
+    "safe_preflight": true,
+    "allow_synthetic_only": false,
+    "allow_replacement_sampling": false,
+    "preflight_only": false,
+    "wandb": true,
+    "wandb_project": "oeis-massive",
+    "wandb_entity": "n8programs",
+    "wandb_run_name": "oeis-440m-14b-full-20260525_025101",
+    "wandb_id": "oeis440m14b_20260525_025101",
+    "wandb_resume": "allow",
+    "wandb_mode": "online",
+    "wandb_tags": "full,440m,14b,resume_skipdyn",
+    "log_every_steps": 10,
+    "seed": 0,
+    "report_json": "/root/oeis_runs/oeis-440m-14b-full-20260525_025101/report_resume_skipdyn_20260526_114400.json"
+  },
+  "transformers_compatible": true,
+  "rope_basis": "HF/Qwen split-half RoPE basis",
+  "source_rope_basis": "custom interleaved even/odd RoPE basis",
+  "conversion": "q_proj/k_proj rows and q_norm/k_norm weights permuted with run_oeis_nextterm_eval_torch._map_custom_state_to_transformers",
+  "oeis_vocab": {
+    "0-9": "digit tokens",
+    "10": "negative sign",
+    "11": "term separator",
+    "12": "BOS",
+    "13": "EOS",
+    "14": "PAD",
+    "15": "reserved"
+  },
+  "generation_defaults": {
+    "max_context_tokens": 4096,
+    "recommended_max_new_tokens_for_oeis_eval_neo": 192,
+    "stop_token_ids": [
+      11,
+      13,
+      14
+    ]
+  },
+  "dtype": "bfloat16",
+  "elapsed_seconds": 7.9373568799346685,
+  "local_conversion": {
+    "source": "/Users/natebreslow/Documents/khashabiLab/bigOEIS/hf_downloads/NextTerm-440M-Checkpoints/checkpoints/final_latest",
+    "conversion": "float tensors downcast from fp32 safetensors to bf16 safetensors for local MLX inference throughput"
+  }
+}
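
The `rope_basis`, `source_rope_basis`, and `conversion` fields describe a change of RoPE layout: the training code rotated adjacent (even, odd) pairs, while the Transformers Qwen implementation rotates position i together with position i + head_dim/2. The source of `_map_custom_state_to_transformers` is not shown here, so the following is only a sketch of the standard permutation that makes the two layouts agree, written in plain Python on a single head vector. All function names are illustrative.

```python
import math


def interleaved_to_split_half(head_dim):
    """Index order that moves even positions to the first half, odd to the second."""
    return list(range(0, head_dim, 2)) + list(range(1, head_dim, 2))


def angles(pos, head_dim, theta=1_000_000.0):
    """Rotation angle for each of the head_dim/2 frequency pairs."""
    return [pos * theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def rope_interleaved(x, pos, theta=1_000_000.0):
    """Rotate adjacent pairs (x[2i], x[2i+1])."""
    out = list(x)
    for i, a in enumerate(angles(pos, len(x), theta)):
        c, s = math.cos(a), math.sin(a)
        out[2 * i] = x[2 * i] * c - x[2 * i + 1] * s
        out[2 * i + 1] = x[2 * i] * s + x[2 * i + 1] * c
    return out


def rope_split_half(x, pos, theta=1_000_000.0):
    """Rotate pairs (x[i], x[i + d/2]), the rotate_half formulation."""
    half = len(x) // 2
    out = list(x)
    for i, a in enumerate(angles(pos, len(x), theta)):
        c, s = math.cos(a), math.sin(a)
        out[i] = x[i] * c - x[i + half] * s
        out[i + half] = x[i] * s + x[i + half] * c
    return out


head_dim = 8  # small for illustration; config.json uses 128
x = [float(v) for v in range(1, head_dim + 1)]
perm = interleaved_to_split_half(head_dim)
y = [x[p] for p in perm]  # the same reordering applied to a head's output rows

a = rope_split_half(y, pos=5)
b = [rope_interleaved(x, pos=5)[p] for p in perm]
print(all(math.isclose(u, v, abs_tol=1e-9) for u, v in zip(a, b)))  # True
```

Because queries and keys are reordered identically, their dot products, and therefore the attention scores, are unchanged. Applying the same index order to each head's rows of q_proj and k_proj and to the q_norm and k_norm weights is what the `conversion` field appears to describe.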

special_tokens_map.json ADDED

	@@ -0,0 +1,9 @@

+{
+  "additional_special_tokens": [
+    "<unused>"
+  ],
+  "bos_token": "<bos>",
+  "eos_token": "<eos>",
+  "pad_token": "<pad>",
+  "unk_token": "<pad>"
+}

tokenizer.json ADDED

	@@ -0,0 +1,159 @@

+{
+  "version": "1.0",
+  "truncation": null,
+  "padding": null,
+  "added_tokens": [
+    {
+      "id": 12,
+      "content": "<bos>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    {
+      "id": 13,
+      "content": "<eos>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    {
+      "id": 14,
+      "content": "<pad>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    {
+      "id": 15,
+      "content": "<unused>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    }
+  ],
+  "normalizer": {
+    "type": "Sequence",
+    "normalizers": [
+      {
+        "type": "Replace",
+        "pattern": {
+          "Regex": "[^0-9,-]+"
+        },
+        "content": ""
+      },
+      {
+        "type": "Replace",
+        "pattern": {
+          "Regex": ",+"
+        },
+        "content": ","
+      },
+      {
+        "type": "Strip",
+        "strip_left": true,
+        "strip_right": true
+      },
+      {
+        "type": "Replace",
+        "pattern": {
+          "Regex": "^,+"
+        },
+        "content": ""
+      },
+  { "type": "Replace", "pattern": { "Regex": ",{2,}$" }, "content": "," }
+    ]
+  },
+  "pre_tokenizer": {
+    "type": "Split",
+    "pattern": {
+      "Regex": ""
+    },
+    "behavior": "Isolated",
+    "invert": false
+  },
+  "post_processor": {
+    "type": "TemplateProcessing",
+    "single": [
+      {
+        "SpecialToken": {
+          "id": "<bos>",
+          "type_id": 0
+        }
+      },
+      {
+        "Sequence": {
+          "id": "A",
+          "type_id": 0
+        }
+      }
+    ],
+    "pair": [
+      {
+        "Sequence": {
+          "id": "A",
+          "type_id": 0
+        }
+      },
+      {
+        "Sequence": {
+          "id": "B",
+          "type_id": 1
+        }
+      }
+    ],
+    "special_tokens": {
+      "<bos>": {
+        "id": "<bos>",
+        "ids": [
+          12
+        ],
+        "tokens": [
+          "<bos>"
+        ]
+      }
+    }
+  },
+  "decoder": {
+    "type": "Sequence",
+    "decoders": [
+      {
+        "type": "Replace",
+        "pattern": {
+          "String": " "
+        },
+        "content": ""
+      }
+    ]
+  },
+  "model": {
+    "type": "WordLevel",
+    "vocab": {
+      "0": 0,
+      "1": 1,
+      "2": 2,
+      "3": 3,
+      "4": 4,
+      "5": 5,
+      "6": 6,
+      "7": 7,
+      "8": 8,
+      "9": 9,
+      "-": 10,
+      ",": 11,
+      "<bos>": 12,
+      "<eos>": 13,
+      "<pad>": 14,
+      "<unused>": 15
+    },
+    "unk_token": "<pad>"
+  }
+}
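
To see what this tokenizer does to an input string, the normalizer steps and the character-level vocabulary can be mirrored in a few lines of plain Python. This is a sketch rather than the `tokenizers` library itself: it assumes the empty-regex `Split` pre-tokenizer isolates every character, which appears to be the intent, and the function names are illustrative.

```python
import re

# Digit, sign, and separator entries from the "vocab" table above.
VOCAB = {**{str(d): d for d in range(10)}, "-": 10, ",": 11}
BOS_ID = 12
ID_TO_CHAR = {v: k for k, v in VOCAB.items()}


def normalize(text):
    """Apply the normalizer steps from tokenizer.json in order."""
    text = re.sub(r"[^0-9,-]+", "", text)  # drop everything but digits, ',' and '-'
    text = re.sub(r",+", ",", text)        # collapse runs of commas
    text = text.strip()                    # Strip (left and right)
    text = re.sub(r"^,+", "", text)        # drop leading commas
    text = re.sub(r",{2,}$", ",", text)    # collapse a trailing run of commas
    return text


def encode(text):
    """One token per character, with <bos> prepended (single-sequence template)."""
    return [BOS_ID] + [VOCAB[ch] for ch in normalize(text)]


def decode(ids):
    """Map ids back to characters, skipping the special tokens 12 to 15."""
    return "".join(ID_TO_CHAR[i] for i in ids if i in ID_TO_CHAR)


print(normalize("1, 1, 2, 3, 5, 8,"))        # 1,1,2,3,5,8,
print(encode("0,-1,12,"))                    # [12, 0, 11, 10, 1, 11, 1, 2, 11]
print(repr(normalize("Once upon a time,")))  # ''
```

The last line shows why a natural-language prompt is ineffective with this model: every character outside digits, "-" and "," is removed, so the text reduces to an empty string and only `<bos>` reaches the model.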

tokenizer_config.json ADDED

	@@ -0,0 +1,47 @@

+{
+  "added_tokens_decoder": {
+    "12": {
+      "content": "<bos>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "13": {
+      "content": "<eos>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "14": {
+      "content": "<pad>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "15": {
+      "content": "<unused>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "additional_special_tokens": [
+    "<unused>"
+  ],
+  "bos_token": "<bos>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<eos>",
+  "extra_special_tokens": {},
+  "model_max_length": 40960,
+  "pad_token": "<pad>",
+  "tokenizer_class": "PreTrainedTokenizerFast",
+  "unk_token": "<pad>"
+}