etwk committed on
Commit e6d48a0 · 1 parent: fc0c8f4
docs: clarify the two-step artifact reduction (0.77 -> ~0.13 -> 0.08 GB)
The 16/32-bit MLP->TCN shrink took the artifact to ~0.13 GB; the later mid-cell
collapse to one shared weight-set brought the total to 0.08 GB. Avoids implying
the small-cell change alone produced 0.08 GB.
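
The two-step reduction described above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the three sizes come from the commit message (the middle one is approximate, "~0.13 GB"), while the helper name `reduction_steps` and the stage labels are hypothetical and not part of the repository.

```python
# Artifact sizes in GB as quoted in the commit message; 0.13 is approximate.
SIZES_GB = [
    ("original", 0.77),
    ("after 16/32-bit MLP->TCN shrink", 0.13),
    ("after mid-cell collapse", 0.08),
]


def reduction_steps(sizes):
    """Return (label, size, saved_vs_previous, fraction_of_original) per stage."""
    base = sizes[0][1]
    prev = base
    rows = []
    for label, size in sizes:
        saved = round(prev - size, 2)      # GB removed by this step
        fraction = round(size / base, 3)   # remaining share of the original
        rows.append((label, size, saved, fraction))
        prev = size
    return rows


for row in reduction_steps(SIZES_GB):
    print(row)
```

Under these figures the small-cell change accounts for most of the saving (about 0.64 GB), and the mid-cell collapse removes roughly another 0.05 GB, which is why the message attributes 0.08 GB to both steps together.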
README.md
CHANGED
@@ -83,7 +83,8 @@ makes the 128/256/512/1024-step chains hold up.
 **Every cell — including the 16- and 32-bit small-prime cells — is now this same architecture.**
 The two small cells were originally width-4096/6144 MLPs (660 MB combined); replacing them with
 the carry-aware TCN, trained width-matched (bit-length-uniform over the cell's whole range),
-shrank the artifact from 0.77 GB to
+shrank the artifact from 0.77 GB to ~0.13 GB (the later mid-cell collapse then brought the total
+to **0.08 GB**), raised tier 4 from 0.99 to **1.00**, and made
 the small-prime tiers width-robust — a TCN trained near-max-width only has a short-prime blind
 spot (see the audit note below), which the width-matched training removes.

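
The README text describes width-matched training as "bit-length-uniform over the cell's whole range". One plausible reading of that phrase, sketched below, is to draw the bit length uniformly first and then draw a value of exactly that length. The function name `sample_bit_length_uniform`, the `min_bits` default, and the use of plain integers rather than primes are assumptions made for illustration; the repository's actual sampler is not shown in this diff.

```python
import random


def sample_bit_length_uniform(max_bits, rng, min_bits=2):
    """Pick a bit length uniformly, then a uniform integer of exactly that length.

    Hypothetical helper: the real training sampler may differ (for example,
    it may filter for primes or use a different minimum width).
    """
    bits = rng.randint(min_bits, max_bits)  # inclusive on both ends
    lo = 1 << (bits - 1)                    # smallest value with `bits` bits
    hi = (1 << bits) - 1                    # largest value with `bits` bits
    return rng.randint(lo, hi)


rng = random.Random(0)
samples = [sample_bit_length_uniform(16, rng) for _ in range(10_000)]

# Count how often each bit length occurs in the sample.
counts = {}
for n in samples:
    counts[n.bit_length()] = counts.get(n.bit_length(), 0) + 1

for bits in sorted(counts):
    print(bits, counts[bits])
```

For contrast, sampling uniformly over values in a 16-bit range would put about three quarters of the draws at 15 or 16 bits, so short inputs would be rare. That is consistent with the short-prime blind spot the README attributes to near-max-width training.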