---
license: mit
---
# Model Card
# Qwen2-0.5B-Python-SFT (LoRA)
## Overview
This model is a Supervised Fine-Tuned (SFT) version of **Qwen/Qwen2-0.5B**, adapted for Python instruction-following tasks.
The fine-tuning was performed using QLoRA (4-bit quantization + LoRA adapters) on a curated Python instruction dataset to improve structured code generation and instruction alignment.
This repository contains **LoRA adapter weights**, not the full base model.
## Base Model
* Base: `Qwen/Qwen2-0.5B`
* Architecture: Decoder-only Transformer
* Parameters: 0.5B
* License: Refer to original Qwen license
The base model must be loaded separately.
## Training Dataset
* Dataset: `iamtarun/python_code_instructions_18k_alpaca`
* Size: ~18,000 instruction-output pairs
* Format: Alpaca-style instruction → response
* Domain: Python programming tasks
Each training sample followed this template:
```
Below is an instruction that describes a task.
Write a response that appropriately completes the request.
### Instruction:
...
### Response:
...
```
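The template above can be rendered with a small helper. This is a minimal sketch, not the preprocessing code used for training: the name `format_alpaca` is hypothetical, it assumes each record has already been reduced to an instruction string and a response string, and the exact whitespace between sections is an assumption.

```python
from typing import Optional

ALPACA_HEADER = (
    "Below is an instruction that describes a task.\n"
    "Write a response that appropriately completes the request.\n"
)


def format_alpaca(instruction: str, response: Optional[str] = None) -> str:
    """Render one example in the Alpaca-style template.

    With response=None the text stops right after the response marker,
    which is the form used as an inference prompt.
    """
    prompt = (
        f"{ALPACA_HEADER}"
        f"### Instruction:\n{instruction.strip()}\n"
        "### Response:\n"
    )
    if response is None:
        return prompt
    # Training text: prompt followed by the target response.
    return prompt + response.strip()
```

Calling `format_alpaca("Reverse a list.")` yields a prompt that ends at `### Response:`, while passing a second argument yields a full training string.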
## Training Details
* Method: QLoRA (4-bit)
* Quantization: NF4
* Compute dtype: FP16
* Optimizer: paged_adamw_8bit
* Sequence length: 384–512
* Epochs: 1
* Final training loss: ~0.2–0.3
* Hardware: Tesla P100 (16GB)
* Frameworks:
  * transformers
  * peft
  * trl
  * bitsandbytes
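The quantization settings listed above map onto `BitsAndBytesConfig` roughly as follows. This is a sketch under stated assumptions: the card does not give any LoRA hyperparameters, so the rank, alpha, dropout, and target modules below are illustrative placeholders, and `build_configs` is a hypothetical helper name.

```python
# Values stated in the training details.
QUANT_SETTINGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_compute_dtype": "float16",
}

# Assumed for illustration only; the card does not state LoRA hyperparameters.
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",
}


def build_configs():
    """Return (BitsAndBytesConfig, LoraConfig).

    Imports are local so the settings above can be inspected without
    torch, transformers, or peft installed.
    """
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    quant = dict(QUANT_SETTINGS)
    # Convert the dtype name into the actual torch dtype object.
    quant["bnb_4bit_compute_dtype"] = getattr(torch, quant["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**quant), LoraConfig(**LORA_SETTINGS)
```

The quantization config would typically be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained`, and the optimizer selected with `optim="paged_adamw_8bit"` in the training arguments.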
## Intended Use
This model is designed for:
* Python code generation
* Simple algorithm implementation
* Educational coding tasks
* Instruction-following code responses
It performs best when prompted in Alpaca-style format:
```
Below is an instruction that describes a task.
### Instruction:
Write a Python function to reverse a linked list.
### Response:
```
## How to Use
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model first; this repository only holds the LoRA adapter.
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")
tokenizer = AutoTokenizer.from_pretrained("NNEngine/qwen2-0.5b-python-lora")
# Attach the adapter weights on top of the base model.
model = PeftModel.from_pretrained(base_model, "NNEngine/qwen2-0.5b-python-lora")
model.eval()  # inference mode: disables dropout
```
Example generation:
```python
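prompt = """Below is an instruction that describes a task.
### Instruction:
Write a Python function to check if a number is prime.
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens=256 is an illustrative cap, not a value taken from training.
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))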
```
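Text decoded from a causal language model includes the prompt itself, so the answer usually has to be cut out after the `### Response:` marker. The helper below is a hypothetical sketch of that post-processing step; truncating at a following `### Instruction:` block is an assumption about how a small model might run on, not documented behavior of this adapter.

```python
RESPONSE_MARKER = "### Response:"
INSTRUCTION_MARKER = "### Instruction:"


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker.

    If the marker is absent, the input is returned (stripped) unchanged.
    """
    _, found, tail = decoded.partition(RESPONSE_MARKER)
    text = tail if found else decoded
    if found:
        # Assumption: drop any extra instruction block the model invents afterwards.
        text = text.split(INSTRUCTION_MARKER)[0]
    return text.strip()
```

The first marker is used rather than the last because the prompt contains exactly one `### Response:` line, and anything after it belongs to the generated continuation.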
## Observed Behavior
The model demonstrates:
* Improved Python code structuring
* Better adherence to instruction-response formatting
* Faster convergence for common programming tasks
Limitations:
* Small model size (0.5B) limits reasoning depth
* May hallucinate under high-temperature decoding
* Works best with explicit language specification ("Write a Python function")
## Limitations
* Not suitable for production-critical systems
* Limited mathematical and multi-step reasoning capability
* Sensitive to prompt formatting
* Performance depends heavily on decoding strategy
## Future Improvements
Potential enhancements:
* Mask instruction tokens during SFT
* Increase model size (1.5B+)
* Train on more diverse programming datasets
* Evaluate with pass@k benchmarks
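Two of the items above can be sketched in plain Python. Masking instruction tokens is commonly done by setting their labels to `-100`, the default ignore index of PyTorch's cross-entropy loss, so that only response tokens contribute to the loss. For pass@k, the commonly used unbiased estimator is $1 - \binom{n-c}{k} / \binom{n}{k}$ for $n$ samples per problem of which $c$ pass the tests. The function names are hypothetical and none of this reflects how the current adapter was trained or evaluated.

```python
from math import comb

IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def mask_prompt_labels(prompt_ids, response_ids):
    """Build (input_ids, labels) so only response tokens are scored."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: n samples per problem, c of them correct."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Probability that a random size-k subset contains no correct sample,
    # subtracted from 1. comb() returns 0 when k > n - c.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Per-problem estimates from `pass_at_k` would be averaged over the benchmark to obtain the reported score.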
## Acknowledgements
* Base model by Qwen team
* Dataset by `iamtarun`