---
license: mit
---

# Model Card: Qwen2-0.5B-Python-SFT (LoRA)

## Overview

This model is a Supervised Fine-Tuned (SFT) version of **Qwen/Qwen2-0.5B**, adapted for Python instruction-following tasks.

The fine-tuning was performed using QLoRA (4-bit quantization + LoRA adapters) on a curated Python instruction dataset to improve structured code generation and instruction alignment.

This repository contains **LoRA adapter weights**, not the full base model.


## Base Model

* Base: `Qwen/Qwen2-0.5B`
* Architecture: Decoder-only Transformer
* Parameters: 0.5B
* License: Refer to original Qwen license

The base model must be loaded separately; the adapter weights in this repository are applied on top of it.


## Training Dataset

* Dataset: `iamtarun/python_code_instructions_18k_alpaca`
* Size: ~18,000 instruction-output pairs
* Format: Alpaca-style instruction → response
* Domain: Python programming tasks

Each training sample followed this template:

```
Below is an instruction that describes a task.
Write a response that appropriately completes the request.

### Instruction:
...

### Response:
...
```
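As a minimal sketch, the template above can be filled in programmatically. The names `ALPACA_TEMPLATE` and `format_alpaca` are hypothetical helpers introduced here for illustration; they are not part of the dataset or of the training code.

```python
# Template copied from the training format shown above.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task.\n"
    "Write a response that appropriately completes the request.\n"
    "\n"
    "### Instruction:\n"
    "{instruction}\n"
    "\n"
    "### Response:\n"
    "{response}"
)


def format_alpaca(instruction: str, response: str = "") -> str:
    """Fill the template. Leave `response` empty to build an inference prompt."""
    return ALPACA_TEMPLATE.format(
        instruction=instruction.strip(),
        response=response.strip(),
    )


# Inference-style prompt: ends right after the "### Response:" header.
prompt = format_alpaca("Write a Python function to add two numbers.")
print(prompt)
```

Passing a non-empty `response` yields a full training-style sample; leaving it empty yields a prompt for the model to complete.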


## Training Details

* Method: QLoRA (4-bit)
* Quantization: NF4
* Compute dtype: FP16
* Optimizer: paged_adamw_8bit
* Sequence length: 384–512
* Epochs: 1
* Final training loss: ~0.2–0.3
* Hardware: Tesla P100 (16GB)
* Frameworks:

  * transformers
  * peft
  * trl
  * bitsandbytes
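
The quantization settings listed above map onto `transformers` and `peft` configuration objects roughly as follows. This is a sketch, not the actual training script: the LoRA rank, alpha, dropout, and target modules are not stated in this card, so the values below are hypothetical placeholders.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with FP16 compute, matching the list above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# ASSUMPTION: these LoRA hyperparameters are illustrative only;
# the values used for this adapter are not documented here.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Optimizer name from the list above, as it would be passed to the trainer's `optim` argument.
OPTIMIZER_NAME = "paged_adamw_8bit"
```

The actual adapter hyperparameters can be read from the `adapter_config.json` file that `peft` saves alongside adapter weights.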


## Intended Use

This model is designed for:

* Python code generation
* Simple algorithm implementation
* Educational coding tasks
* Instruction-following code responses

It performs best when prompted in the Alpaca-style format:

```
Below is an instruction that describes a task.

### Instruction:
Write a Python function to reverse a linked list.

### Response:
```


## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model first; this repository only holds the LoRA adapter.
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")
tokenizer = AutoTokenizer.from_pretrained("NNEngine/qwen2-0.5b-python-lora")

# Attach the adapter weights on top of the base model.
model = PeftModel.from_pretrained(base_model, "NNEngine/qwen2-0.5b-python-lora")

# Switch to inference mode (disables dropout).
model.eval()
```

Example generation:

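```python
prompt = """Below is an instruction that describes a task.

### Instruction:
Write a Python function to check if a number is prime.

### Response:
"""

# Tokenize the prompt and generate greedily; max_new_tokens=256 is an arbitrary cap.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```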


## Observed Behavior

The model demonstrates:

* Improved Python code structuring
* Better adherence to instruction-response formatting
* Faster convergence for common programming tasks

Observed limitations:

* Small model size (0.5B) limits reasoning depth
* May hallucinate under high-temperature decoding
* Needs the target language stated explicitly in the prompt ("Write a Python function") for best results


## Limitations

* Not suitable for production-critical systems
* Limited mathematical and multi-step reasoning capability
* Sensitive to prompt formatting
* Performance depends heavily on decoding strategy
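
To illustrate why the decoding strategy matters, the sketch below applies temperature scaling to a softmax over made-up logits. The function name `softmax_with_temperature` and the logit values are hypothetical; real decoding works over the model's full vocabulary, but the effect is the same: a low temperature concentrates probability on the top token, while a high temperature spreads it over less likely ones.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax probabilities of `logits` after dividing by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens.
logits = [4.0, 2.0, 1.0]

sharp = softmax_with_temperature(logits, temperature=0.2)  # near-greedy
flat = softmax_with_temperature(logits, temperature=2.0)   # more random

print("T=0.2:", [round(p, 4) for p in sharp])
print("T=2.0:", [round(p, 4) for p in flat])
```

With sampling enabled, the flatter distribution makes lower-ranked tokens far more likely to be picked, which is consistent with the hallucination risk noted above for high-temperature decoding.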

## Future Improvements

Potential enhancements:

* Mask instruction tokens during SFT
* Increase model size (1.5B+)
* Train on more diverse programming datasets
* Evaluate with pass@k benchmarks
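
Two of the items above can be sketched in a few lines of plain Python. The first helper shows the idea behind masking instruction tokens: labels for the prompt positions are set to `-100`, the index that PyTorch's cross-entropy loss ignores by default, so only response tokens contribute to the loss. The second computes the commonly used unbiased pass@k estimate, `1 - C(n-c, k) / C(n, k)`, from `n` generated samples of which `c` pass the tests. Both function names are hypothetical, and neither is taken from this model's training or evaluation code.

```python
from math import comb

IGNORE_INDEX = -100  # ignored by PyTorch's cross-entropy loss by default


def mask_prompt_labels(input_ids, prompt_len):
    """Build labels from input_ids, hiding the first `prompt_len` tokens from the loss."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must be between 0 and len(input_ids)")
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate given n samples, c of which are correct."""
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) is 0 when fewer than k samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Toy token ids: the first two belong to the prompt, the rest to the response.
labels = mask_prompt_labels([5, 6, 7, 8], prompt_len=2)
print(labels)

# 10 samples per problem, 3 correct, estimate for k = 1.
print(pass_at_k(n=10, c=3, k=1))
```

In practice `prompt_len` would come from tokenizing the prompt portion of each sample separately, and pass@k would be averaged over all problems in the benchmark.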


## Acknowledgements

* Base model by Qwen team
* Dataset by `iamtarun`