maxsonderby committed on
Commit 692d39b · verified · 1 Parent(s): 05d6729

Mark Helios Rabbit 1.0 as quarantined

Files changed (1)
  1. README.md +7 -3
README.md CHANGED
@@ -19,6 +19,8 @@ pipeline_tag: text-generation

  # Helios Rabbit 1.0

+ > **Quarantined release:** post-release testing found severe output degeneration in this checkpoint, including repeated-token loops and leaked renderer/tool-trace artifacts. Do not use this model as a production or evaluation baseline. It remains published for provenance while Union Street AI rebuilds Rabbit from a clean Qwopus base and sanitized data.
+
  Helios Rabbit 1.0 is a local agentic model developed and adapted by Union Street AI, an SF lab focused on practical local AI systems.

  This release is a merged MLX checkpoint adapted from [Jackrong/Qwopus3.6-35B-A3B-v1](https://huggingface.co/Jackrong/Qwopus3.6-35B-A3B-v1). It is not trained from scratch. It is a post-training pass for the Helios model family, tuned for local coding agents, infrastructure work, tool-use judgment, candid adult conversation, and calibrated uncertainty.
@@ -36,9 +38,11 @@ Rabbit is tuned to be:

  It should not identify as Qwen, Alibaba, Claude, GPT, Grok, or Qwopus except when discussing base-model lineage.

- ## Evaluation
+ ## Evaluation Status
+
+ The earlier internal constitutional ladder score below is no longer considered sufficient evidence of release quality. The eval overfit/missed severe generation failures in open-ended chat.

- Internal constitutional ladder eval:
+ Previous internal constitutional ladder eval:

  - 58 prompts
  - judge: `thurgood/gemma-4-31b-it`
@@ -73,4 +77,4 @@ python -m mlx_lm generate \

  ## Notes

- This is an experimental open release. Use judgment before deploying it in systems with real-world side effects.
+ This is an experimental quarantined release. Use Helios Lynx or Helios Pika instead until Rabbit is rebuilt.