---
license: mit
---

# Qwen2-0.5B-Python-SFT (LoRA)

## Overview

This model is a supervised fine-tuned (SFT) version of **Qwen/Qwen2-0.5B**, adapted for Python instruction-following tasks. Fine-tuning was performed with QLoRA (4-bit quantization plus LoRA adapters) on a curated Python instruction dataset to improve structured code generation and instruction alignment.

This repository contains **LoRA adapter weights only**, not the full base model.

## Base Model

* Base: `Qwen/Qwen2-0.5B`
* Architecture: decoder-only Transformer
* Parameters: 0.5B
* License: refer to the original Qwen license

The base model must be loaded separately.

## Training Dataset

* Dataset: `iamtarun/python_code_instructions_18k_alpaca`
* Size: ~18,000 instruction-output pairs
* Format: Alpaca-style instruction → response
* Domain: Python programming tasks

Each training sample followed this template:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
...

### Response:
...
```

## Training Details

* Method: QLoRA (4-bit)
* Quantization: NF4
* Compute dtype: FP16
* Optimizer: `paged_adamw_8bit`
* Sequence length: 384–512 tokens
* Epochs: 1
* Final training loss: ~0.2–0.3
* Hardware: Tesla P100 (16 GB)
* Frameworks:
  * `transformers`
  * `peft`
  * `trl`
  * `bitsandbytes`

## Intended Use

This model is designed for:

* Python code generation
* Simple algorithm implementation
* Educational coding tasks
* Instruction-following code responses

It performs best when prompted in the Alpaca-style format:

```
Below is an instruction that describes a task.

### Instruction:
Write a Python function to reverse a linked list.
### Response:
```

## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the LoRA adapter weights on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-0.5B", torch_dtype=torch.float16
)
tokenizer = AutoTokenizer.from_pretrained("NNEngine/qwen2-0.5b-python-lora")
model = PeftModel.from_pretrained(base_model, "NNEngine/qwen2-0.5b-python-lora")
model.eval()
```

Example generation:

```python
prompt = """Below is an instruction that describes a task.

### Instruction:
Write a Python function to check if a number is prime.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

## Observed Behavior

The model demonstrates:

* Improved Python code structuring
* Better adherence to the instruction-response format
* Faster convergence on common programming tasks

Observed limitations:

* The small model size (0.5B) limits reasoning depth
* It may hallucinate under high-temperature decoding
* It works best with an explicit language specification ("Write a Python function")

## Limitations

* Not suitable for production-critical systems
* Limited mathematical and multi-step reasoning capability
* Sensitive to prompt formatting
* Performance depends heavily on the decoding strategy

## Future Improvements

Potential enhancements:

* Mask instruction tokens during SFT so the loss is computed only on responses
* Scale to a larger base model (1.5B+)
* Train on more diverse programming datasets
* Evaluate with pass@k benchmarks

## Acknowledgements

* Base model by the Qwen team
* Dataset by `iamtarun`
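
## Appendix: Prompt Formatting

For reference, a minimal sketch of how each dataset row can be rendered into the Alpaca training template described above. The helper name `format_alpaca` and the exact whitespace are assumptions for illustration, not part of the released training code:

```python
def format_alpaca(instruction: str, response: str) -> str:
    # Render one instruction-output pair in the Alpaca template
    # used during supervised fine-tuning (name/whitespace assumed).
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )

sample = format_alpaca(
    "Write a Python function to add two numbers.",
    "def add(a, b):\n    return a + b",
)
print(sample)
```

At inference time, the same template is used with the `### Response:` section left empty, as in the generation example above.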