Commit 0479392 (verified) by SlitherCode · 1 Parent(s): 2b83ce3

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -31,7 +31,7 @@ Custom decoder-only transformer:
  - **Tokens seen:** ~4B
  - **Steps:** 30,000
  - **Optimizer:** AdamW (lr=3e-4, cosine decay to 3e-5)
- - **Hardware:** Single A100 80GB
+ - **Hardware:** Single A100 40GB
 
  ## Installation
 
@@ -57,4 +57,5 @@ print(tokenizer.decode(out[0], skip_special_tokens=True))
 
  ## License
 
- Model weights: MIT. Training data: ODC-By 1.0.
+ Model weights: MIT.
+ Training data: This work uses the FineWeb-Edu dataset, available under the Open Data Commons Attribution License (ODC-By 1.0).