---
license: other
license_name: msai-sovereign
license_link: LICENSE
language:
- en
- cy
- gd
- ga
pipeline_tag: text-generation
tags:
- mother-core
- msai
- sovereign-ai
- united-kingdom
- causal-lm
library_name: transformers
---

# MOTHER CORE V2: chunk 600 (W2.8 cutover base)

**Sovereign UK AI built from scratch by [MediaStream AI Limited (MSAI)](https://mediastreamai.com).**

This is **MOTHER CORE BASE**, the frozen foundation checkpoint at chunk 600 of the W2.7 → W2.8 training programme. All downstream MOTHER models (DEFENCE, ROBOTICS, LLM, CODE) build on this base.

- **Founder & CEO and Lead AI Architect:** Christopher Kenna
- **Parameters:** 6.88B (FP32 source, BF16 weights here)
- **Architecture:** 48 layers, dim 3072, 24 heads, 6 KV heads (GQA 4:1), RoPE θ=10000, RMS norm, tied embeddings
- **Context:** 4096 tokens
- **Training:** From-scratch sovereign UK build; no fine-tuning of external models
- **Source SHA256:** `0b1ef35ec60af4a7ad0648498de8526cb775a19501dda94dfbda1713e0475b60`

## Training journey

| Milestone | Eval (105-question harness) |
|---|---|
| Chunk 450 (initial W2.7 baseline) | 47/105 (45%) |
| Chunk 506 (post LR-fix rollback) | 44/105 (42%) |
| Chunk 550 (recovery, LR-capped) | 46/105 (44%) |
| **Chunk 600 (BASE freeze)** | **49/105 (47%)** |

## Scope

**MOTHER CORE handles:** math, science, reasoning, chain-of-thought, UK knowledge, MOTHER identity, tool calling (agents, RAG, memory, workflows), multilingual responses (English, Welsh, Irish, Scottish Gaelic), safety refusals.


**MOTHER CORE does NOT handle (separate sister models):**

- **MOTHER CODE**: software engineering, code generation
- **MOTHER LLM**: long-form creative writing, instruction-tuned content
- **MOTHER DEFENCE**: defence reasoning and strategy (W3 programme, builds on this BASE)
- **MOTHER ROBOTICS**: humanoid robot embodiment (W4 programme, builds on this BASE)

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the tokenizer and the BF16 weights.
tok = AutoTokenizer.from_pretrained("MediaStreamAI/MOTHER_CORE_V2")
model = AutoModelForCausalLM.from_pretrained(
    "MediaStreamAI/MOTHER_CORE_V2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Wrap the question in the required prompt template (exact whitespace).
prompt = "Question:\n\nWhat is the capital of Wales?\n\nAnswer:"
inputs = tok(prompt, return_tensors="pt", add_special_tokens=True).to(model.device)

# Greedy decoding with the repetition controls listed below.
out = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    repetition_penalty=1.3,
    no_repeat_ngram_size=4,
    pad_token_id=tok.pad_token_id,
)
print(tok.decode(out[0], skip_special_tokens=True))
```

**Critical inference rules:**

- Prompt wrap: `"Question:\n\n{q}\n\nAnswer:"` (exact whitespace)
- BOS token: 1 (required, `add_bos_token=True`)
- EOS token: 2
- PAD token: 0
- **Use greedy decoding only.** Sampling produces gibberish.
- Repetition penalty: 1.3, frequency-scaled
- No-repeat n-gram size: 4

## Programme context

- **W2.7 (complete)**: Core capability training: math, science, reasoning, identity, UK knowledge, multilingual, agent tool-calling, RAG, chat, memory, workflows
- **W2.8 (in progress)**: Document routing, argument validation, agent verifier loops, multi-step orchestration
- **W3**: MOTHER DEFENCE (defence reasoning and strategy)
- **W4**: MOTHER ROBOTICS (embodied awareness for humanoid platforms)

UK sovereign infrastructure: Manchester (HQ), Dundee (flagship DC), Durham. Phase 2 expansion in H2 2026 to Düsseldorf, South Africa, Jamaica.

## License

MSAI Sovereign License. See LICENSE file.
Built sovereign in the UK, not derived from any externally-licensed pre-trained model.

## Contact

MediaStream AI Limited

West Tower, 371 Deansgate, Manchester M15 4UR, United Kingdom

[mediastreamai.com](https://mediastreamai.com)
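
## Appendix: prompt wrap sketch

The inference rules in the Usage section require the exact wrap `"Question:\n\n{q}\n\nAnswer:"`, which is easy to get wrong when prompts are assembled by hand. The following is a minimal sketch of one way to keep the whitespace consistent. The names `PROMPT_TEMPLATE`, `wrap_prompt`, and `extract_answer` are hypothetical helpers for illustration, not part of the model repository, and stripping the question and splitting the output on the `Answer:` marker are assumptions rather than documented behaviour.

```python
# Template copied from the inference rules; the whitespace must stay exact.
PROMPT_TEMPLATE = "Question:\n\n{q}\n\nAnswer:"


def wrap_prompt(question: str) -> str:
    """Wrap a question in the Question/Answer template.

    Assumption: leading and trailing whitespace in the question is not
    meaningful, so it is stripped to keep the template's newlines intact.
    """
    return PROMPT_TEMPLATE.format(q=question.strip())


def extract_answer(decoded: str) -> str:
    """Return the text after the first 'Answer:' marker.

    Assumption: the decoded output echoes the prompt, as it does when
    decoding the full generated sequence. If the marker is missing,
    the stripped input is returned unchanged.
    """
    _, sep, tail = decoded.partition("Answer:")
    return tail.strip() if sep else decoded.strip()
```

With these helpers, `wrap_prompt("What is the capital of Wales?")` produces the same string as the hard-coded `prompt` in the Usage example, and `extract_answer` can be applied to the decoded output to drop the echoed prompt.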