shibatch committed on
Commit 8f66657 · verified · 1 Parent(s): 807ecce

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +2 -4
README.md CHANGED
@@ -16,7 +16,7 @@ tags:
  This repository provides ultra-lightweight Llama2 model files across various formats (both **GGUF** and **Hugging Face / Safetensors**), trained on the TinyStories dataset and optimized for compatibility with Andrej Karpathy's `llama2.c` and `llama.cpp`.
 
  ### Why this repository exists
- When developing a custom LLM inference engine from scratch (C/C++, Vulkan, WebAssembly, etc.) or testing custom hardware kernels, debugging with a full-sized model is slow. This suite offers a true **1M parameter scale model** (~1MB to ~4MB depending on the quantization format), allowing developers to validate their loaders, serialization, quantization kernels, and inference logic step-by-step with maximum efficiency.
+ When developing a custom LLM inference engine, debugging with a full-sized model is slow. This suite offers a true **1M parameter scale model** (~1MB to ~4MB depending on the quantization format), allowing developers to validate their loaders, serialization, quantization kernels, and inference logic step-by-step with maximum efficiency.
 
  ---
 
@@ -68,8 +68,6 @@ To verify your local setup or compare tokens using the official native utilities
 
  The `model.bin` is fully compatible with the 512-vocab `tokenizer.bin` derived from the `stories260k` asset pipeline.
 
- > ⚠️ **Important Note for `llama2.c/run`:** When passing a prompt to the `run` binary, you must use the **`-i`** option. Do not use `-p`, as `-p` is reserved for the Top-p sampling threshold in `llama2.c`, which will cause the prompt to be ignored.
-
  ```bash
  ./run model.bin -z tokenizer.bin -i "Tom and Jerry are " -n 64
 
@@ -83,7 +81,7 @@ You can import the Hugging Face variant directly into Python using the `transfor
  import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM
 
- repo_id = "your-username/your-repo-name"
+ repo_id = "shibatch/tiny1m"
 
  # The library automatically looks into the hf/ folder using the subfolder argument
  tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="hf")