# Welcome to htmLLM v2 124M!
With this LLM, we wanted to see how well tiny LLMs with just 124 million parameters can perform on coding tasks.
The model was also lightly fine-tuned by mixing html_alpaca directly into the pretraining data.
If you want to try it, use htmllm.ipynb from the model files and download the model weights from this repository.
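For fetching the checkpoint programmatically, a minimal sketch is below; the repo id shown is a placeholder assumption (it is not stated on this card), and htmllm.ipynb remains the intended way to run the model.

```python
# Minimal download sketch. Assumptions: the repo id below is a placeholder,
# and the checkpoint file is named ckpt.pt as stated in the Weights section.
import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(repo_id="<user>/htmLLM-v2-124M", filename="ckpt.pt")
checkpoint = torch.load(ckpt_path, map_location="cpu")
# nanoGPT-style checkpoints typically expose keys like 'model' and 'model_args'.
print(list(checkpoint.keys()))
```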
## Code
All code is available in the file htmllm_v2_124m.ipynb in this repository.
## Weights
The final base model checkpoint will be downloadable from the files list as ckpt.pt. It will be available soon!
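Once ckpt.pt is up, loading it should follow the usual nanoGPT pattern the model is built on. A minimal sketch, assuming a standard nanoGPT checkpoint dict (keys 'model' and 'model_args'), nanoGPT's model.py on the import path, and the GPT-2 BPE tokenizer; these are assumptions based on nanoGPT defaults, not confirmed details of this checkpoint:

```python
# Minimal loading + sampling sketch (assumptions noted above).
import torch
import tiktoken
from model import GPT, GPTConfig  # model.py from Karpathy's nanoGPT repo

checkpoint = torch.load("ckpt.pt", map_location="cpu")
model = GPT(GPTConfig(**checkpoint["model_args"]))

# nanoGPT prefixes state-dict keys with "_orig_mod." when the model was
# trained under torch.compile; strip the prefix if present.
state_dict = {k.removeprefix("_orig_mod."): v for k, v in checkpoint["model"].items()}
model.load_state_dict(state_dict)
model.eval()

enc = tiktoken.get_encoding("gpt2")
idx = torch.tensor([enc.encode("<html>")], dtype=torch.long)
out = model.generate(idx, max_new_tokens=100, temperature=0.8, top_k=200)
print(enc.decode(out[0].tolist()))
```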
## Training
We trained our model on a single Kaggle T4 GPU.
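For reference, a single-T4 setup in the nanoGPT style looks roughly like the sketch below. The hyperparameters are illustrative assumptions, not our exact settings (those are in htmllm_v2_124m.ipynb); the T4 lacks bfloat16 support, so float16 mixed precision with loss scaling is the natural choice.

```python
# Illustrative single-T4 training step (assumptions: nanoGPT's model.py is
# importable; hyperparameters below are placeholders, not our real config).
import torch
from model import GPT, GPTConfig  # model.py from Karpathy's nanoGPT repo

config = GPTConfig(n_layer=12, n_head=12, n_embd=768)  # the ~124M GPT-2 shape
model = GPT(config).to("cuda")
optimizer = torch.optim.AdamW(model.parameters(), lr=6e-4)
scaler = torch.cuda.amp.GradScaler()  # fp16 needs loss scaling on a T4

def train_step(x, y):
    # Mixed-precision step: fp16 forward/backward, scaled optimizer update.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        _, loss = model(x, y)  # nanoGPT's GPT returns (logits, loss)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    optimizer.zero_grad(set_to_none=True)
    return loss.item()
```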
Thanks to:
- Andrej Karpathy and his nanoGPT code
- Kaggle for the free T4 GPU hours used for training
- All of you for your support on Reddit