
AGILLM 4.3 - Autoregressive + DiffusionBlock + MoE Language Model

Single-file implementation: agillm41.py

Parameters: 1.22B (1,221,580,802)

Architecture: d_model=1280, layers=28, heads=20, d_k=64, rank=160 (2.5× expansion), tied weights


โš ๏ธ CHECKPOINT PROVENANCE โ€” READ FIRST

Checkpoint filenames (e.g. pretrain_step00050650.pt) reflect the step counter within the current training run, NOT total training steps.

This repo contains multiple checkpoint lineages. The 2026-06-24 pretrain_step00050650.pt artifact was warm-started from step 2,182,564 (~2.18M) of a prior run, but that is historical provenance, not the current recovery base. Do not restart the current AGILLM4.3 recovery from the raw pretrain_step02182564.pt unless you are explicitly running a clean historical rollback experiment.

| Artifact / run | Meaning |
|---|---|
| pretrain_step00050650.pt | Historical artifact: run-local step 50,650 after the warm-start from step 2,182,564. |
| pretrain_step00243186_from00050650_20260630T1811Z.pt | Later-lineage v100a0 checkpoint selected for the 2026-07-01 recovery because its June 30 inference was materially better than the July 1 latest delta. |
| pretrain_step00359091.pt + FedC delta pretrain_delta_step00030961_from00359091_20260701T0522Z.pt | July 1 path that produced a fragment/date-token regression in AR/SAT/NAT smoke tests; do not report this quality as healthy. |
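Because the step in a filename is run-local, it can help to split names into their parts before comparing checkpoints. The following is a minimal sketch, not part of agillm41.py: the regular expression is inferred only from the filenames listed above, and the reading of the `_from` field as the step of the checkpoint the run started from is an assumption. `parse_ckpt_name` and `CkptName` are hypothetical names.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# Assumed pattern, inferred only from the filenames shown in this README.
_CKPT_RE = re.compile(
    r"^pretrain_(?P<delta>delta_)?step(?P<step>\d+)"
    r"(?:_from(?P<base>\d+))?"
    r"(?:_(?P<stamp>\d{8}T\d{4}Z))?\.pt$"
)


class CkptName(NamedTuple):
    step: int                     # step counter within the run that wrote the file
    base_step: Optional[int]      # value after "_from" (assumed: starting checkpoint step)
    saved_at: Optional[datetime]  # UTC timestamp from the filename, if present
    is_delta: bool                # True for "pretrain_delta_..." files


def parse_ckpt_name(filename: str) -> CkptName:
    """Split a checkpoint filename into its run-local step and lineage hints."""
    m = _CKPT_RE.match(filename)
    if m is None:
        raise ValueError(f"unrecognised checkpoint name: {filename!r}")
    stamp = m.group("stamp")
    saved_at = (
        datetime.strptime(stamp, "%Y%m%dT%H%MZ").replace(tzinfo=timezone.utc)
        if stamp
        else None
    )
    base = m.group("base")
    return CkptName(
        step=int(m.group("step")),
        base_step=int(base) if base is not None else None,
        saved_at=saved_at,
        is_delta=m.group("delta") is not None,
    )


if __name__ == "__main__":
    for name in (
        "pretrain_step00050650.pt",
        "pretrain_step00243186_from00050650_20260630T1811Z.pt",
        "pretrain_delta_step00030961_from00359091_20260701T0522Z.pt",
    ):
        print(name, "->", parse_ckpt_name(name))
```

The parsed `step` is still only a run-local counter; total training steps cannot be recovered from the filename alone.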

Current recovery checkpoint on HF:

checkpoints/pretrain_step00243186_from00050650_20260630T1811Z.pt

Architecture

| Component | Value |
|---|---|
| Backbone | Autoregressive transformer (AR) |
| DiffusionBlocks | Active; layers cycle AR/SAT/NAT objectives |
| Mixture-of-Experts | Active; 14 slots per block |
| d_model | 1280 |
| Layers | 28 |
| Attention heads | 20 |
| Tied weights | Yes |
| Tokenizer | Llama-compatible (from checkpoint) |
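The stated dimensions can be checked against each other with a little arithmetic. The sketch below is illustrative only: `ArchConfig` is a hypothetical container, not a class from agillm41.py, and reading "2.5× expansion" as rank divided by d_k is an assumption that happens to match the numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchConfig:
    """Hypothetical container for the values listed in this README."""
    d_model: int = 1280
    layers: int = 28
    heads: int = 20
    d_k: int = 64
    rank: int = 160
    moe_slots_per_block: int = 14
    tied_weights: bool = True


cfg = ArchConfig()

# Per-head width times head count should reproduce the model width.
assert cfg.heads * cfg.d_k == cfg.d_model

# Assumption: the "2.5x expansion" is rank relative to d_k.
expansion = cfg.rank / cfg.d_k
print(f"heads * d_k = {cfg.heads * cfg.d_k}, rank / d_k = {expansion}")
```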

Training Fleet (as of 2026-06-24)

  • FedA (41441116): 2× V100-SXM2-32GB, ssh2.vast.ai:11116, $0.0593/hr
    • a0: role=coverage, B=56, L=1536
    • a1: role=hard-blocks, B=48, L=1536
  • Target: 67.2B tokens total
  • Budget runway: ~Jul 24, 2026
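For a rough sense of scale, the per-worker settings above can be turned into tokens per step. This is a back-of-envelope sketch under stated assumptions: B is taken to be the batch size in sequences, L the sequence length in tokens, each worker is assumed to consume B × L tokens per step, and both workers are assumed to count toward the same 67.2B target. None of this is confirmed by the training code, and gradient accumulation or padding would change the numbers.

```python
# Values from the fleet list above; their interpretation is an assumption.
WORKERS = {
    "a0": {"B": 56, "L": 1536},  # role=coverage
    "a1": {"B": 48, "L": 1536},  # role=hard-blocks
}
TARGET_TOKENS = 67_200_000_000  # 67.2B tokens


def tokens_per_step(batch_size: int, seq_len: int) -> int:
    """Tokens consumed by one step, assuming B full-length sequences of L tokens."""
    return batch_size * seq_len


per_worker = {name: tokens_per_step(w["B"], w["L"]) for name, w in WORKERS.items()}
fleet_tokens_per_step = sum(per_worker.values())

# Ceiling division: steps needed if both workers step in lockstep.
steps_to_target = -(-TARGET_TOKENS // fleet_tokens_per_step)

for name, toks in per_worker.items():
    print(f"{name}: {toks:,} tokens/step")
print(f"fleet: {fleet_tokens_per_step:,} tokens/step")
print(f"steps to reach target under these assumptions: {steps_to_target:,}")
```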

Current Recovery Run (2026-07-01)

  • FedC Vast host: ae2bb300509f / RTX 3090 Ti.
  • Live recovery PID at verification: 7100.
  • Warm-start: checkpoints/pretrain_step00243186_from00050650_20260630T1811Z.pt (v100a0 later-lineage checkpoint, SHA256 e65d65ba82239f28e10188767fe16ba091dad11c60bb57aac346ded684604349).
  • Corrected source mix: FineWeb, FineWeb-Edu sample-10BT, Wikipedia 20231101.en, C4 en, OpenWebText, Falcon-RefinedWeb, Proof-Pile-2.
  • Excluded from AGILLM4.3 pretraining: local AGILLM3 numeracy JSONL (/workspace/agillm_math_numeracy_synth/train.jsonl) and Dolma sample source.
  • Initial corrected validation: ce=9.1199; first stable progress line step=101, 61962.18 tok/s, loss=6.818.
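Since the warm-start checkpoint is listed with a SHA256 digest, a downloaded copy can be checked before it is used. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and the local path is whatever location the file was downloaded to.

```python
import hashlib
from pathlib import Path
from typing import Union

# Digest quoted in the warm-start bullet above.
EXPECTED_SHA256 = "e65d65ba82239f28e10188767fe16ba091dad11c60bb57aac346ded684604349"


def sha256_of_file(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large checkpoints never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path: Union[str, Path], expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of_file(path) == expected.strip().lower()
```

A typical call would be `verify_checkpoint("checkpoints/pretrain_step00243186_from00050650_20260630T1811Z.pt")`, refusing to launch a run when it returns False.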

Inference

# AR mode (standard autoregressive)
python3 agillm41.py infer \
  --ckpt checkpoints/warmstart_step2182564__current_step50650/pretrain_step00050650.pt \
  --prompt "Your prompt here" \
  --mode ar --max_new 100 --plain-output --block_stream

# SAT mode (score-and-threshold diffusion)
python3 agillm41.py infer ... --mode sat

# NAT mode (non-autoregressive diffusion)
python3 agillm41.py infer ... --mode nat

Note: If both GPUs are busy with training, set CUDA_VISIBLE_DEVICES="" in the command's environment to force CPU inference (slow but functional: ~1.2 tok/s).

Dependency: agillm_checkpoint_provenance.py must be in the same directory as agillm41.py.
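The two notes above (CPU fallback and the same-directory dependency) can be checked before launching inference. This is a hedged sketch, not part of the repository: `cpu_only_env` and `missing_dependencies` are hypothetical helpers, and the only facts taken from this README are the two filenames and the CUDA_VISIBLE_DEVICES setting.

```python
import os
from pathlib import Path
from typing import Dict, List, Mapping, Optional, Union

REQUIRED_FILES = ("agillm41.py", "agillm_checkpoint_provenance.py")


def cpu_only_env(base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy an environment and hide all CUDA devices so inference falls back to CPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def missing_dependencies(script_dir: Union[str, Path]) -> List[str]:
    """List the required files that are absent from the directory holding agillm41.py."""
    root = Path(script_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

The returned environment can be passed as `env=cpu_only_env()` to `subprocess.run` when invoking the inference command, and an empty list from `missing_dependencies(".")` indicates both files are in place.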


Current Inference Quality / Recovery Status (2026-07-01)

See INFERENCE_QUALITY.md for AR/SAT/NAT benchmark outputs and regression notes.

The July 1 FedC latest-delta smoke test was not healthy: AR/SAT/NAT outputs were dominated by date/number/token fragments. Treat that as a quality regression, not as a pass.

A corrected FedC recovery run is live from later-lineage v100a0 checkpoint pretrain_step00243186_from00050650_20260630T1811Z.pt, whose archived June 30 AR sample was materially better than the regressed July 1 delta. At launch, the corrected run used the language/generic-math mix only and validated with language_mix=True numeracy=False.

Before reporting model quality healthy again, run AR + SAT + NAT inference on the next saved checkpoint from the recovery run and record it in INFERENCE_QUALITY.md.


Repositories

| Repo | Type | Notes |
|---|---|---|
| Marxist-Leninist/agillm4.3-private | GitHub private | Source of truth for code |
| Marxist-Leninist/AGILLM4.3 | GitHub public | Mirror |
| Marxist-Leninist/AGILLM4.1 | GitHub public | Mirror (same codebase) |
| Marxist-Leninist/agillm4.1-private | GitHub private | Mirror |
| OpenTransformer/AGILLM-4.3 | HuggingFace public | Code, inference artifacts, and active recovery checkpoints |
| OpenTransformer/agillm4.3-private | HuggingFace private | Historical/private mirror; do not use for active recovery checkpoint uploads unless explicitly requested |

For Future Claude/AI Agents

MCP memory (Silicon Goddess) slot index for AGILLM4.3 state: slots 42, 95, 481–525+. Standing instruction: always run AR + SAT + NAT inference checks before reporting training healthy. See INFERENCE_QUALITY.md.

Latest Inference Smoke Test - 2026-06-26

Latest smoke-test artifacts were uploaded under training/agillm43_shared/inference/20260626T183400Z/.

  • Monolithic latest-checkpoint AR: /workspace/agillm4_v100a0_ckpts/pretrain_step00065633_from00050650_20260626T1811Z.pt, 32 tokens at 5.0 tok/s on CPU.
  • Distributed AR: existing 2026-06-06 split packages across GETH/MCP/Prime/communist-web, 32 tokens at 1.504 tok/s.
  • Status aliases: training/agillm43_shared/status/latest_inference.md and .json.