SentinelBrain 14B MoE v0.1 - Frankenstein Realignment v2

This repository now includes SentinelBrain Frankenstein realignment v2 artifacts from the AMD MI300X run completed on 2026-05-03.

v2 Training Update

  • Architecture: custom SentinelBrain sparse MoE decoder, approximately 14.4B stored parameters, 4 experts, top-2 routing, 24 layers, d_model 4096, seq_len 4096.
  • Hardware: AMD Instinct MI300X via ROCm/HIP.
  • Run: Frankenstein realignment v2 from raw merged checkpoint.
  • Completed steps: 5,000.
  • Total training tokens during realignment: approximately 0.98B.
  • Best validation loss observed: 5.1359.
  • Final checkpoint: checkpoints/frankenstein_v2_final.pt.
  • Best checkpoint: checkpoints/frankenstein_v2_best.pt.
  • EMA best checkpoint: checkpoints/frankenstein_v2_ema_best.pt.
  • Previous Hugging Face version preserved on branch: previous-before-v2-realign-5000-20260503-103121.
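The top-2 routing mentioned in the architecture bullet can be sketched in plain Python. This is a minimal illustration of the general technique, not SentinelBrain's actual router; the gate logits and scalar expert outputs here are toy stand-ins (real expert outputs are d_model-sized vectors).

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def top2_route(gate_logits, expert_outputs):
    """Combine the outputs of the top-2 experts, weighted by renormalized gate probabilities."""
    # Pick the two experts with the highest gate logits.
    top2 = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:2]
    # Softmax over just the two selected logits so their weights sum to 1.
    w = softmax([gate_logits[i] for i in top2])
    return sum(wi * expert_outputs[i] for wi, i in zip(w, top2))

# 4 experts, as in the config above; only experts 1 and 3 contribute here.
logits = [0.1, 2.0, -1.0, 1.5]
outs = [10.0, 20.0, 30.0, 40.0]
y = top2_route(logits, outs)
```

Because only 2 of the 4 experts run per token, the active parameter count per forward pass is well below the roughly 14.4B stored parameters.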

Included Files

  • checkpoints/frankenstein_v2_final.pt: full final checkpoint at step 5000, including optimizer/progress state.
  • checkpoints/frankenstein_v2_best.pt: best model-only checkpoint by validation loss.
  • checkpoints/frankenstein_v2_ema_best.pt: EMA best checkpoint from the v2 run.
  • checkpoints/sentinelbrain_pretrain_step2471_hf.pt: pretrain anchor used for comparison.
  • logs/realign_v2.log: full realignment console log.
  • logs/realign_v2_metrics.jsonl: step metrics emitted during training.
  • reports/train_metrics_final.json: final dashboard training metrics snapshot.
  • reports/conductor_state_final.json: final dashboard/conductor state.
  • reports/sft_combined_ready_report.*: cleaned SFT dataset preflight report.
  • reports/sentinelbrain_quality_stub_full_fixed.json: MI300X executable-code benchmark report.

A full progress archive containing all v2 milestones and optimizer-bearing checkpoints is backed up off-Hub on the Azure VM at /home/msrusu/sentinelbrain_backups/v2_realign_5000/sentinelbrain_v2_realign_full_20260503.tar.zst. A SHA256 sidecar is generated at archive completion.
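Verifying the archive against its SHA256 sidecar can be done with the standard library. This is a generic sketch: the sidecar's exact filename and line format are assumptions, not confirmed details of the backup script.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA256 so large .tar.zst archives fit in constant memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_sidecar(archive_path, sidecar_path):
    """Compare the archive's digest to the hex digest stored first in the sidecar file."""
    expected = Path(sidecar_path).read_text().split()[0].lower()
    return sha256_of(archive_path) == expected
```

The sidecar is assumed to follow the common `sha256sum` layout (hex digest, whitespace, filename), so `split()[0]` picks out the digest.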

Current Evaluation Notes

MI300X executable-code tests show that the v2 realignment checkpoints are not yet ready for use as a coding assistant:

Checkpoint                             Pass@1  Syntax Rate  Notes
frankenstein_v2_best.pt                0.0%    62.5%        Failed all 8 HumanEval-style stub tasks.
frankenstein_v2_final.pt               0.0%    75.0%        Failed all 8 HumanEval-style stub tasks.
sentinelbrain_pretrain_step2471_hf.pt  0.0%    87.5%        Failed all 8 tasks but produced the most syntactically valid Python.

Interpretation: v2 completed corpus realignment and preserved all progress artifacts, but a focused next phase of executable-code SFT, function-call/chat formatting, and auto-critic rejection sampling is needed before any quality claims can be made.
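A syntax-rate metric like the one in the table above can be measured by attempting to parse each generated completion. This is an assumed methodology sketch, since the actual benchmark harness is not included in this repository.

```python
import ast

def syntax_rate(completions):
    """Fraction of generated Python snippets that parse without a SyntaxError."""
    ok = 0
    for code in completions:
        try:
            ast.parse(code)
            ok += 1
        except SyntaxError:
            pass
    return ok / len(completions) if completions else 0.0

samples = [
    "def add(a, b):\n    return a + b\n",  # valid Python
    "def broken(:\n    pass\n",            # invalid: stray colon in signature
]
rate = syntax_rate(samples)  # 0.5 for this toy pair
```

Pass@1 is the stricter metric on top of this: a completion must not only parse but also pass the task's unit tests when executed.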

Dataset Preparation Status

The next SFT combined dataset was cleaned non-destructively on the MI300X server:

  • Input rows: 42,138.
  • Kept rows: 32,996 (78.3%).
  • Removed rows: 9,142.
  • Max estimated tokens: 3,072.
  • Main removals: short assistant/user outputs, garbage responses, repetitive responses, and over-length samples.
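The cleaning pass described above can be approximated with simple heuristics. This is a sketch only: the thresholds, the chars-per-token estimate, and the row schema are assumptions, not the actual preflight script.

```python
def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate; a real pipeline would use the model's tokenizer."""
    return len(text) // chars_per_token

def repetition_ratio(text):
    """Share of the most common non-empty line; high values flag repetitive responses."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return 0.0
    top = max(lines.count(ln) for ln in set(lines))
    return top / len(lines)

def keep_row(row, min_chars=20, max_tokens=3072, max_repetition=0.5):
    """Drop short, repetitive, or over-length user/assistant pairs."""
    text = row["assistant"]
    if len(text) < min_chars or len(row["user"]) < min_chars:
        return False
    if estimate_tokens(row["user"] + text) > max_tokens:
        return False
    if repetition_ratio(text) > max_repetition:
        return False
    return True

rows = [
    {"user": "Explain how Python's list.sort differs from sorted().",
     "assistant": "list.sort sorts in place and returns None; sorted() returns a new list."},
    {"user": "hi", "assistant": "ok"},  # too short: removed
]
kept = [r for r in rows if keep_row(r)]
```

Filtering is non-destructive when the kept rows are written to a new file, leaving the input dataset untouched, which matches how the cleaning pass above is described.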

Loading

These are custom SentinelBrain PyTorch checkpoints, not standard Hugging Face AutoModelForCausalLM weights. Load with the SentinelBrain code from /workspace/sentinelprime or the matching source package.

import torch
from config import ModelConfig
from model.sentinel import SentinelBrain

# Full checkpoints store optimizer/progress state alongside the weights,
# so weights_only=False is required; only load checkpoints you trust.
ckpt = torch.load("checkpoints/frankenstein_v2_best.pt", map_location="cpu", weights_only=False)
model = SentinelBrain(ModelConfig())

# Weights may sit under different keys depending on the run; fall back to
# treating the whole object as a state dict.
state = ckpt.get("model_state_dict") or ckpt.get("model") or ckpt
model.load_state_dict(state, strict=False)  # strict=False tolerates extra/missing aux keys
model.eval()

Next Phase Direction

The recommended next phase is a controlled SFT/auto-critic cycle:

  • Train from the pretrain anchor plus selected v2 weights, only after they pass format probes.
  • Prioritize executable Python/TypeScript/code-repair datasets.
  • Reject non-compiling generations.
  • Benchmark every 250-500 steps before continuing.
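The compile-gate step of the cycle above can be sketched as a filter over sampled candidates. This is an illustrative rejection-sampling filter under assumed inputs, not the planned pipeline itself.

```python
def compiles(code):
    """True if the candidate is at least syntactically valid Python."""
    try:
        compile(code, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

def reject_noncompiling(candidates):
    """Keep only candidates that compile; survivors feed the next SFT round."""
    return [c for c in candidates if compiles(c)]

candidates = [
    "def square(x):\n    return x * x\n",
    "def square(x)\n    return x * x\n",  # missing colon: rejected
]
accepted = reject_noncompiling(candidates)
```

A stronger gate would also execute each survivor against unit tests in a sandbox, which is what the auto-critic step implies; the syntax gate shown here is just the cheapest first filter.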
