etwk committed
Commit · 8ed8c45 · 1 Parent(s): 41fc51b
Docs: fix stale "Five weight-sets" heading; GPU-qualify the latency claim
The body and table already describe the two shared weight-sets; this fixes the
leftover section heading and notes the ~174s/300s figure is GPU timing (the rules'
evaluation guidance assumes GPU batching). Model, weights, and manifest unchanged.
README.md CHANGED
@@ -47,7 +47,7 @@ The single-step function is **piecewise linear** (`2t + bit*b`, then subtract 0,
 `2p`), which is why it generalises across primes where the full bilinear map does not:
 held-out-prime validation accuracy tracks training accuracy throughout (no memorisation gap).
 
-##
+## Two weight-sets, routed by prime size
 
 The recurrence is exact only if the state is wide enough to hold the residue, so each cell is
 trained per bit-width — but because the dilated convolution is weight-shared across bit-positions
@@ -269,8 +269,9 @@ P(acc<0.90) 0.002% / worst-prime 0.933 (primary key held). Public benchmark: tie
 Per-tier at total=1100: tiers 1–10 all **1.00**
 (overall_accuracy is the mean over tiers 1-10). Tier 0 (pure multiplication, primes near each
 width's maximum — a separate regime, not in overall_accuracy) is **0.70** on this fixed public
-seed. Inference for all 1100 problems is ~174s
-
+seed. Inference for all 1100 problems is ~174s **on GPU** (the 2048-step tier-10 scan is the bulk),
+within the 300s budget; the rules' evaluation guidance assumes GPU batching via
+`predict_digits_batch` (`rules/evaluation.md:229`). artifact 0.04 GB.
 
 ## Status under the rules
 
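
The hunk context above calls the single-step function piecewise linear: form `2t + bit*b`, then subtract a multiple of the prime. A minimal sketch of that recurrence in plain integer arithmetic may make the claim concrete. It assumes the scan is an MSB-first double-and-add for `a*b mod p` and that the middle branch, which the truncated hunk header elides, subtracts `p`. The names `step` and `mulmod_scan` are hypothetical, and this is not the repository's model code.

```python
def step(t: int, bit: int, b: int, p: int) -> int:
    """One scan step: reduce 2t + bit*b back into [0, p).

    With 0 <= t, b < p the sum is at most 3p - 3, so subtracting
    0, p, or 2p always suffices. Each branch is linear in (t, bit*b),
    which is what makes the step piecewise linear.
    """
    s = 2 * t + bit * b
    if s >= 2 * p:
        return s - 2 * p
    if s >= p:          # assumed middle branch (elided in the source)
        return s - p
    return s


def mulmod_scan(a: int, b: int, p: int, width: int) -> int:
    """Compute a*b mod p by scanning the bits of a, most significant first."""
    assert 0 <= a < (1 << width), "a must fit in `width` bits"
    b %= p
    t = 0
    for i in reversed(range(width)):
        t = step(t, (a >> i) & 1, b, p)
    return t
```

Under these assumptions the number of steps equals the bit-width of `a`, which would be consistent with the widest tier dominating the runtime.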