---
license: cc-by-nc-4.0
datasets:
- flytech/python-codes-25k
language:
- en
- py
tags:
- gpt2
- gguf
- python
- code-generation
pipeline_tag: text-generation
ollama:
  template: |
    {{- if .Prompt }}## Instruction:
    {{ .Prompt }}
    ### Input:
    Processing your request...
    {{ end -}}
    ### Response:
    {{- if .Response }}{{ .Response }}{{ end -}}
  params:
    temperature: 0.1
    top_p: 0.85
    repeat_penalty: 2.0
    repeat_last_n: 32
    num_ctx: 512
    stop:
    - "## Instruction:"
    - "### Input:"
    - "### Response:"
    - "<|endoftext|>"
---

# 🚀 Indenta-13M-Python (GGUF)

A model trained from scratch with a custom tokenizer and the GPT-2 architecture. It is built specifically for fast Python code completion and basic code generation. At ~13M parameters, it is small enough to run with very low latency on almost any hardware.

---

## 🛠️ Web UI & Multi-Turn Stability Note

This model has a very small footprint (~13M parameters) and is optimized for single-task completions that follow a fixed prompt template. Because it lacks the capacity needed to track conversational context, it can lose its output formatting when a graphical Web UI pushes a long, multi-message chat history into its context window (`num_ctx` is set to 512).

### For Best Performance in Web UIs:

1. **Use New Chat Threads:** Click the "New Chat" or "Clear" button in your user interface between coding tasks. This keeps the model focused on your active prompt.
2. **Synchronized Template:** This model card includes a prompt template (`## Instruction:` and `### Input:`) that matches the training data, which helps prevent token bleeding across chat turns.
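
If you call the model from your own script rather than a Web UI, the single-turn layout described above can be approximated as in the sketch below. This is only an illustration: `build_prompt` and `truncate_at_stop` are hypothetical helper names, and the exact line breaks between the markers are an assumption, since only the marker strings and stop sequences are given in this card.

```python
# Stop sequences copied from the `stop` list in this card's metadata.
STOP_SEQUENCES = ["## Instruction:", "### Input:", "### Response:", "<|endoftext|>"]


def build_prompt(instruction: str) -> str:
    """Build one fresh single-turn prompt (hypothetical helper).

    The newline placement between markers is an assumption; only the
    marker strings themselves come from the model card.
    """
    return (
        f"## Instruction:\n{instruction.strip()}\n"
        "### Input:\nProcessing your request...\n"
        "### Response:\n"
    )


def truncate_at_stop(text: str) -> str:
    """Cut generated text at the earliest stop sequence, if any appears."""
    cut = len(text)
    for stop in STOP_SEQUENCES:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


if __name__ == "__main__":
    # Each task gets its own prompt; no earlier turns are carried over.
    print(build_prompt("Write a function that reverses a string"))
```

Building a new prompt per task, instead of appending to a running transcript, is the scripted equivalent of clicking "New Chat" between coding tasks.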