etwk committed on
Commit 8ed8c45 · 1 Parent(s): 41fc51b

Docs: fix stale "Five weight-sets" heading; GPU-qualify the latency claim

The body and table already describe the two shared weight-sets; this fixes the
leftover section heading and notes the ~174s/300s figure is GPU timing (the rules'
evaluation guidance assumes GPU batching). Model, weights, and manifest unchanged.

Files changed (1)
  1. README.md +4 -3
README.md CHANGED
@@ -47,7 +47,7 @@ The single-step function is **piecewise linear** (`2t + bit*b`, then subtract 0,
  `2p`), which is why it generalises across primes where the full bilinear map does not:
  held-out-prime validation accuracy tracks training accuracy throughout (no memorisation gap).

- ## Five weight-sets, routed by prime size
+ ## Two weight-sets, routed by prime size

  The recurrence is exact only if the state is wide enough to hold the residue, so each cell is
  trained per bit-width — but because the dilated convolution is weight-shared across bit-positions
@@ -269,8 +269,9 @@ P(acc<0.90) 0.002% / worst-prime 0.933 (primary key held). Public benchmark: tie
  Per-tier at total=1100: tiers 1–10 all **1.00**
  (overall_accuracy is the mean over tiers 1-10). Tier 0 (pure multiplication, primes near each
  width's maximum — a separate regime, not in overall_accuracy) is **0.70** on this fixed public
- seed. Inference for all 1100 problems is ~174s, within the 300s budget (the 2048-step tier-10 scan
- is the bulk); artifact 0.04 GB.
+ seed. Inference for all 1100 problems is ~174s **on GPU** (the 2048-step tier-10 scan is the bulk),
+ within the 300s budget; the rules' evaluation guidance assumes GPU batching via
+ `predict_digits_batch` (`rules/evaluation.md:229`). artifact 0.04 GB.

  ## Status under the rules
