---
license: cc-by-nc-4.0
datasets:
- flytech/python-codes-25k
language:
- en
- py
tags:
- gpt2
- gguf
- python
- code-generation
pipeline_tag: text-generation
ollama:
  template: |
    {{- if .Prompt }}## Instruction:
    {{ .Prompt }}
    ### Input:
    Processing your request...
    {{ end -}}
    ### Response:
    {{- if .Response }}{{ .Response }}{{ end -}}
  params:
    temperature: 0.1
    top_p: 0.85
    repeat_penalty: 2.0
    repeat_last_n: 32
    num_ctx: 512
    stop:
      - "## Instruction:"
      - "### Input:"
      - "### Response:"
      - "<|endoftext|>"
---
# 🚀 Indenta-13M-Python (GGUF)
A small model trained from scratch with a custom tokenizer and the GPT-2 architecture.
It is built specifically for fast Python code completion and basic code generation. At ~13M parameters, it is small enough to run with very low latency on modest hardware, including CPU-only machines.
---
## 🛠️ Web UI & Multi-Turn Stability Note
With only ~13M parameters, this model is optimized for single-task completions that follow the instruction template it was trained on.
It does not have the capacity to track long conversational context, so it can lose its output formatting when a Web UI packs a long multi-message chat history into its context window (`num_ctx: 512` in the metadata above).
### For Best Performance in Web UIs:
1. **Use New Chat Threads:** Click the "New Chat" or "Clear" button in your user interface between coding tasks. This keeps earlier turns out of the context so the model sees only your active prompt.
2. **Synchronized Template:** This model card defines a prompt template (`## Instruction:`, `### Input:`, `### Response:`) that matches the training data, plus matching stop sequences, so generation ends at the turn boundary instead of bleeding into the next turn.
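
If you call the model from your own script instead of a Web UI, both tips can be applied by hand. The sketch below is illustrative only: the helper names (`build_prompt`, `latest_user_message`, `truncate_at_stop`) are hypothetical and not part of this repo, and the rendered layout is an approximation of what the template in the metadata should produce for a single turn. It uses plain Python with no inference library, so you would still pass the prompt to whatever runtime you use.

```python
# Illustrative sketch: these helpers are hypothetical, not shipped with the model.

# Stop sequences copied from the `stop` list in the model card metadata.
STOP_SEQUENCES = ("## Instruction:", "### Input:", "### Response:", "<|endoftext|>")


def build_prompt(instruction: str) -> str:
    """Render one single-turn prompt in the layout used by the card's template."""
    return (
        "## Instruction:\n"
        f"{instruction.strip()}\n"
        "### Input:\n"
        "Processing your request...\n"
        "### Response:"
    )


def latest_user_message(history: list) -> str:
    """Return the newest user message and ignore every earlier turn.

    Assumes each history entry is a dict with "role" and "content" keys.
    """
    for message in reversed(history):
        if message.get("role") == "user":
            return message["content"]
    raise ValueError("history contains no user message")


def truncate_at_stop(text: str, stops=STOP_SEQUENCES) -> str:
    """Cut generated text at the earliest stop sequence, if one appears."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut].rstrip()


if __name__ == "__main__":
    history = [
        {"role": "user", "content": "Write a function that adds two numbers."},
        {"role": "assistant", "content": "def add(a, b):\n    return a + b"},
        {"role": "user", "content": "Reverse a string."},
    ]
    # Only the latest request is sent, which mimics starting a new chat thread.
    print(build_prompt(latest_user_message(history)))
```

Sending only the latest user message has the same effect as clicking "New Chat", and trimming at the stop sequences is a fallback for runtimes that do not apply the `stop` list themselves.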