Rohanify committed on
Commit
34186d6
·
verified ·
1 Parent(s): 313038f

Update README.md

  1. README.md +32 -7
README.md CHANGED
@@ -19,17 +19,20 @@ ollama:
19
  ### Response:
20
  params:
21
  temperature: 0.1
22
- top_p: 0.9
23
- repeat_penalty: 1.6
 
 
24
  stop:
25
  - "### Instruction:"
 
 
26
  ---
27
 
28
  # 🚀 Indenta-13M-Python (GGUF)
29
 
30
- An optimized from-scratch model made with a custom tokenizer and gpt-2 architecture.
31
- This model is made for python code completions and basic python code generation.
32
- This model has ~13M parameters, making it ideal for almost any machine!
33
 
34
  ---
35
 
@@ -37,7 +40,29 @@ This model has ~13M parameters, making it ideal for almost any machine!
37
 
38
  Because the system configuration is baked directly into this Hugging Face repository card, nobody needs to manually create a local `Modelfile`.
39
 
40
- You or anyone else can spin it up immediately by targeting the Hugging Face repository link. Run this command in your terminal:
 
 
 
 
 
 
 
41
 
42
  ```bash
43
- ollama run hf.co/Rohanify/Indenta-13M-Python
19
  ### Response:
20
  params:
21
  temperature: 0.1
22
+ top_p: 0.8
23
+ repeat_penalty: 2.0
24
+ repeat_last_n: 64
25
+ num_ctx: 256
26
  stop:
27
  - "### Instruction:"
28
+ - "### Response:"
29
+ - "<|endoftext|>"
30
  ---
31
 
32
  # 🚀 Indenta-13M-Python (GGUF)
33
 
34
+ An optimized from-scratch model made with a custom tokenizer and GPT-2 architecture.
35
+ This model is built specifically for fast Python code completion and basic code generation. At ~13M parameters, it is small enough to run with low latency on almost any machine.
 
36
 
37
  ---
38
 
 
40
 
41
  Because the system configuration is baked directly into this Hugging Face repository card, nobody needs to manually create a local `Modelfile`.
42
 
43
+ ### ⚠️ Critical Usage Note for 13M Parameters
44
+ Because this model is ultra-lightweight (~13M parameters), it is limited by design to **single-turn tasks**. It has no mechanism for tracking multi-turn chat history.
45
+ Letting Ollama accumulate message history can quickly fill the 256-token context window (`num_ctx: 256`) and cause the output to break down.
46
+
47
+ To avoid these failures, use either of the two methods below:
48
+
49
+ ### Method 1: Stateless Mode (Recommended 🚀)
50
+ Pass your prompt as a quoted argument on the command line. Ollama then runs a single, stateless generation with no prior message history:
51
 
52
  ```bash
53
+ ollama run hf.co/Rohanify/Indenta-13M-Python "write a function to reverse a list"
54
+ ```
55
+
56
+ ### Method 2: Interactive Terminal Mode
57
+ If you are using the interactive chat loop (`ollama run hf.co/Rohanify/Indenta-13M-Python`), clear the session context by entering `/clear` before typing each new prompt:
58
+
59
+ ```text
60
+ >>> write a for loop
61
+ [Model generates code safely]
62
+
63
+ >>> /clear
64
+ Cleared session context
65
+
66
+ >>> reverse a linked list
67
+ [Model generates next code safely]
68
+ ```
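
Method 1 extends naturally to several prompts: start one `ollama run` invocation per prompt so that no history carries over between them. The following is a minimal sketch, not part of the original README. The names `OLLAMA_BIN`, `MODEL`, `run_stateless`, and `run_all` are hypothetical helpers introduced for illustration, and it assumes the `ollama` CLI is installed and on your `PATH`.

```shell
#!/usr/bin/env bash
# Sketch only: OLLAMA_BIN, MODEL, run_stateless, and run_all are
# hypothetical names introduced for this example.
OLLAMA_BIN="${OLLAMA_BIN:-ollama}"
MODEL="${MODEL:-hf.co/Rohanify/Indenta-13M-Python}"

# Run one prompt as its own `ollama run` invocation (Method 1),
# so no earlier exchange is carried into the request.
run_stateless() {
  "$OLLAMA_BIN" run "$MODEL" "$1"
}

# Run several prompts, one fresh invocation per prompt.
run_all() {
  local prompt
  for prompt in "$@"; do
    run_stateless "$prompt"
  done
}

# Usage (requires a local Ollama install):
#   run_all "write a for loop" "reverse a linked list"
```

Passing prompts as arguments rather than piping them through standard input keeps each call independent of whatever else is on stdin.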