---
license: apache-2.0
language:
- en
- es
- zh
- fr
- de
- ja
- ko
tags:
- mlx
- uncensored
- abliterated
- osirisbrain
- apple-silicon
- qwen3.5
- agi
base_model: Qwen/Qwen3.5-9B
pipeline_tag: text-generation
library_name: mlx
---

# OsirisCortex-v7-MLX

**The Cortex v7** — Osiris's sovereign reasoning brain. Fully uncensored (abliterated), 0% refusal rate. Runs natively on Apple Silicon via MLX Metal.

## Architecture

- **Base Model:** Qwen3.5-9B (9 billion parameters)
- **Modification:** Abliterated (orthogonal projection + LoRA fine-tuning for 0% refusal)
- **Format:** MLX 4-bit quantized (Apple Silicon native)
- **Size:** ~4.7 GB
- **Speed:** ~80-120 tokens/sec on M2 Pro (MLX Metal)
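
The ~4.7 GB figure is consistent with 4-bit grouped quantization. A quick back-of-the-envelope check, assuming MLX's default grouping (group size 64, one fp16 scale plus one fp16 bias stored per group — these parameters are assumptions, not stated in this card):

```python
# Back-of-the-envelope size check for 4-bit grouped quantization.
# Assumed (not stated in the card): group size 64, with an fp16 scale
# and an fp16 bias stored per group of weights.
params = 9e9                       # 9 billion weights
bits_per_weight = 4.0
group_size = 64
overhead_bits = 32 / group_size    # 2 x 16-bit values per 64 weights

effective_bits = bits_per_weight + overhead_bits   # 4.5 bits/weight
size_bytes = params * effective_bits / 8
size_gib = size_bytes / 2**30

print(f"{effective_bits:.1f} bits/weight -> {size_gib:.2f} GiB")  # ~4.71 GiB
```

The group-wise scale/bias overhead is what pushes the raw 4.5 GB of packed weights up to roughly 4.7 GiB on disk.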

## What Changed from v6

- **v6:** OsirisCortex-v6 shipped in GGUF format and ran via a llama.cpp server
- **v7:** Native MLX format, served by `mlx_lm.server` — faster, lower memory, Apple-native
- **Uncensored:** Zero refusals — answers everything the user asks without moral gatekeeping
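
Because `mlx_lm.server` exposes an OpenAI-compatible HTTP API, the model can also be queried over HTTP. A minimal client sketch, assuming the server's default address (`127.0.0.1:8080`); the `build_chat_request` and `ask` helpers are illustrative, not part of any library:

```python
import json
import urllib.request

def build_chat_request(prompt: str, max_tokens: int = 2048) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt: str, host: str = "http://127.0.0.1:8080") -> str:
    """POST the payload to a running `mlx_lm.server` instance."""
    req = urllib.request.Request(
        f"{host}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Standard OpenAI-style response shape: first choice's message text.
    return body["choices"][0]["message"]["content"]
```

Start the server with `mlx_lm.server --model osirisbrain/OsirisCortex-v7-MLX`, then call `ask("Explain quantum computing")`.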

## Usage

```python
from mlx_lm import load, generate

# Download (if needed) and load the 4-bit weights and tokenizer.
model, tokenizer = load("osirisbrain/OsirisCortex-v7-MLX")

# Wrap the user message in the model's chat template.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Explain quantum computing"}],
    add_generation_prompt=True,
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=2048)
print(response)
```

## Credits

Abliterated by [lukey03](https://huggingface.co/lukey03/Qwen3.5-9B-abliterated-MLX-4bit).
Original model: [Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) by Alibaba.