---
license: mit
---
# Model Card: Qwen2-0.5B-Python-SFT (LoRA)
## Overview
This model is a Supervised Fine-Tuned (SFT) version of **Qwen/Qwen2-0.5B**, adapted for Python instruction-following tasks.
The fine-tuning was performed using QLoRA (4-bit quantization + LoRA adapters) on a curated Python instruction dataset to improve structured code generation and instruction alignment.
This repository contains **LoRA adapter weights**, not the full base model.
## Base Model
* Base: `Qwen/Qwen2-0.5B`
* Architecture: Decoder-only Transformer
* Parameters: 0.5B
* License: Refer to original Qwen license
The base model must be loaded separately; this repository does not include its weights.
## Training Dataset
* Dataset: `iamtarun/python_code_instructions_18k_alpaca`
* Size: ~18,000 instruction-output pairs
* Format: Alpaca-style instruction → response
* Domain: Python programming tasks
Each training sample followed this template:
```
Below is an instruction that describes a task.
Write a response that appropriately completes the request.
### Instruction:
...
### Response:
...
```
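The template above can be filled in programmatically. The following is a minimal sketch, not the author's training code: the helper name `build_prompt` and the exact newline placement are assumptions, since the card shows only the section headers and their order.

```python
# Hypothetical helper that renders the Alpaca-style template shown above.
# The exact whitespace between sections is an assumption.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task.\n"
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "{instruction}\n"
    "### Response:\n"
    "{response}"
)


def build_prompt(instruction: str, response: str = "") -> str:
    """Return a training sample (with response) or an inference prompt (without)."""
    return ALPACA_TEMPLATE.format(
        instruction=instruction.strip(),
        response=response.strip(),
    )


if __name__ == "__main__":
    # Inference prompt: the response slot is left empty for the model to fill.
    print(build_prompt("Write a Python function to reverse a string."))
```

Leaving `response` empty yields a prompt that ends right after `### Response:`, which is where generation would begin.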
## Training Details
* Method: QLoRA (4-bit)
* Quantization: NF4
* Compute dtype: FP16
* Optimizer: paged_adamw_8bit
* Maximum sequence length: 384–512 tokens
* Epochs: 1
* Final training loss: ~0.2–0.3
* Hardware: Tesla P100 (16GB)
* Frameworks:
  * transformers
  * peft
  * trl
  * bitsandbytes
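To see why 4-bit quantization fits comfortably on a 16GB card, here is a rough back-of-the-envelope estimate of weight storage alone. It is a sketch under simplifying assumptions: it uses the nominal 0.5B parameter count, treats every parameter as quantized (in practice some layers, such as embeddings, typically stay in higher precision), and ignores quantization constants, LoRA adapter weights, activations, and optimizer state.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate storage for model weights in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NOMINAL_PARAMS = 0.5e9  # nominal size from the card; not an exact count

fp16_gib = weight_memory_gib(NOMINAL_PARAMS, 16)  # half-precision baseline
nf4_gib = weight_memory_gib(NOMINAL_PARAMS, 4)    # 4-bit NF4 storage

if __name__ == "__main__":
    print(f"FP16 weights: ~{fp16_gib:.2f} GiB")
    print(f"NF4 weights:  ~{nf4_gib:.2f} GiB")
```

Under these assumptions the quantized weights take about a quarter of the FP16 footprint, leaving most of the GPU memory for activations and optimizer state.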
## Intended Use
This model is designed for:
* Python code generation
* Simple algorithm implementation
* Educational coding tasks
* Instruction-following code responses
It performs best when prompted in Alpaca-style format:
```
Below is an instruction that describes a task.
### Instruction:
Write a Python function to reverse a linked list.
### Response:
```
## How to Use
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model first; this repository holds only the LoRA adapter.
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")
tokenizer = AutoTokenizer.from_pretrained("NNEngine/qwen2-0.5b-python-lora")

# Attach the adapter weights to the base model and switch to inference mode.
model = PeftModel.from_pretrained(base_model, "NNEngine/qwen2-0.5b-python-lora")
model.eval()
```
Example generation:
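```python
prompt = """Below is an instruction that describes a task.
### Instruction:
Write a Python function to check if a number is prime.
### Response:
"""
# Greedy decoding; max_new_tokens=256 is an illustrative value, not a tuned setting.
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```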
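Text decoded from a causal language model usually contains the prompt followed by the completion, so the answer has to be separated from the template. The sketch below shows one way to do that; `extract_response` is a hypothetical helper, and cutting at a repeated `### Instruction:` header is an assumption about how a small model might run on, not documented behavior of this adapter.

```python
RESPONSE_MARKER = "### Response:"
INSTRUCTION_MARKER = "### Instruction:"


def extract_response(generated_text: str) -> str:
    """Return only the text that follows the first '### Response:' marker."""
    _, marker, tail = generated_text.partition(RESPONSE_MARKER)
    if not marker:
        # No marker found: fall back to the whole text.
        return generated_text.strip()
    # If the model starts a new instruction block, keep only what precedes it.
    tail = tail.split(INSTRUCTION_MARKER, 1)[0]
    return tail.strip()


if __name__ == "__main__":
    sample = (
        "Below is an instruction that describes a task.\n"
        "### Instruction:\nAdd two numbers.\n"
        "### Response:\ndef add(a, b):\n    return a + b\n"
    )
    print(extract_response(sample))
```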
## Observed Behavior
The model demonstrates:
* Improved Python code structuring
* Better adherence to instruction-response formatting
* Faster convergence for common programming tasks
Observed limitations:
* Small model size (0.5B) limits reasoning depth
* May hallucinate under high-temperature decoding
* Works best with explicit language specification ("Write a Python function")
## Limitations
* Not suitable for production-critical systems
* Limited mathematical and multi-step reasoning capability
* Sensitive to prompt formatting
* Performance depends heavily on decoding strategy
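The sensitivity to decoding strategy can be illustrated with temperature scaling, which divides the logits by a temperature before the softmax. The sketch below uses made-up logits (not values from this model) to show how a higher temperature shifts probability mass toward unlikely tokens, which is one plausible reason sampled output degrades.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    example_logits = [4.0, 2.0, 1.0]  # hypothetical next-token scores
    for t in (0.2, 1.0, 1.5):
        probs = softmax_with_temperature(example_logits, t)
        print(t, [round(p, 3) for p in probs])
```

Low temperatures concentrate nearly all probability on the top token, approaching greedy decoding, while higher temperatures flatten the distribution.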
## Future Improvements
Potential enhancements:
* Mask instruction tokens during SFT
* Increase model size (1.5B+)
* Train on more diverse programming datasets
* Evaluate with pass@k benchmarks
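Two of these items can be sketched in a few lines. The first function masks prompt tokens by setting their labels to `-100`, the value that PyTorch's cross-entropy loss ignores by default, so only response tokens contribute to the loss. The second computes the commonly used unbiased pass@k estimate from `n` generated samples of which `c` pass the tests. Both function names are hypothetical, and neither reflects how this adapter was actually trained or evaluated.

```python
from math import comb

IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def mask_prompt_labels(input_ids, prompt_len):
    """Copy input_ids as labels, hiding the first prompt_len tokens from the loss."""
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])


def pass_at_k(n, c, k):
    """Estimate pass@k as 1 - C(n - c, k) / C(n, k) for n samples with c correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb returns 0 when k > n - c, which gives an estimate of 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Toy token ids: the first three belong to the prompt.
    print(mask_prompt_labels([11, 12, 13, 21, 22], prompt_len=3))
    print(pass_at_k(n=10, c=3, k=1))
```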
## Acknowledgements
* Base model by Qwen team
* Dataset by `iamtarun`