JoeStrout committed · Commit 572de6e · verified · 1 Parent(s): 5dda1aa

Update README.md
Files changed (1): README.md (+82 −3)
---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
pipeline_tag: text-generation
library_name: peft
tags:
- lora
- peft
- qwen2.5
- miniscript
- code
---

# miniscript-code-helper-lora

This repository contains a LoRA adapter for `Qwen/Qwen2.5-Coder-7B-Instruct`, fine-tuned to help answer questions about the MiniScript programming language.

The adapter was trained on a small MiniScript Q&A corpus. On its own, it improves MiniScript awareness somewhat, but best results come when it is used together with a RAG pipeline over MiniScript reference materials.
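
As one minimal sketch of what such a RAG pipeline might look like on the prompt side, the snippet below assembles retrieved reference excerpts into the system message before generation. The `build_messages` helper and the example snippets are illustrative, not part of this repo; any retrieval backend that returns relevant MiniScript reference text would slot in the same way.

```python
# Hypothetical sketch: stuff retrieved MiniScript doc excerpts into the
# system prompt so the adapter answers with reference text in context.

def build_messages(question, doc_snippets):
    """Build a chat message list whose system prompt includes retrieved docs."""
    context = "\n\n".join(doc_snippets)
    system = (
        "You are a helpful assistant specializing in MiniScript programming.\n"
        "Use the following MiniScript reference excerpts when answering:\n\n"
        + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# Example snippets a retriever might return for a map-iteration question.
snippets = [
    'Maps are written with braces: m = {"a": 1}.',
    "for kv in someMap iterates key/value pairs; use kv.key and kv.value.",
]
messages = build_messages("How do I iterate over a map in MiniScript?", snippets)
```

The resulting `messages` list can be passed directly to `tokenizer.apply_chat_template` as in the usage example below.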
## Base model

- Qwen/Qwen2.5-Coder-7B-Instruct

## What this repo contains

- PEFT/LoRA adapter weights only
- Not the full base model

## Intended use

- Answering questions about MiniScript
- Assisting with MiniScript syntax and examples
- Best used with retrieval augmentation (RAG)

## Limitations

- The adapter alone is not fully reliable
- It may still fall back to Python-flavored assumptions from the base model
- For best accuracy, pair it with a MiniScript documentation retriever
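To make the retriever pairing concrete, here is a deliberately minimal keyword-overlap retriever over a few MiniScript doc snippets. It is an assumption-laden toy (a real pipeline would use BM25 or embeddings, and the `score`/`retrieve` names and sample docs are invented for illustration), but it shows the shape of the component the limitations above call for.

```python
# Toy keyword-overlap retriever sketch; punctuation handling is omitted
# for brevity, and a production retriever would use BM25 or embeddings.

def score(query, doc):
    """Count shared whitespace-delimited words between query and doc."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, docs, k=2):
    """Return the k docs with the highest keyword overlap."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

# Illustrative stand-ins for real MiniScript reference excerpts.
docs = [
    "Use for kv in someMap to iterate a map; kv.key and kv.value hold each pair.",
    "Lists support push, pop, and indexing with zero-based indices.",
    "Functions are declared with function ... end function.",
]
top = retrieve("how to iterate over a map", docs, k=1)
```

The retrieved snippets would then be prepended to the system prompt before calling the model.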
## Example usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter_id = "YOUR_USERNAME/miniscript-code-helper-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Attach the LoRA adapter to the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

messages = [
    {"role": "system", "content": "You are a helpful assistant specializing in MiniScript programming."},
    {"role": "user", "content": "How do I iterate over a map in MiniScript?"},
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    output[0][len(inputs.input_ids[0]):],
    skip_special_tokens=True,
)

print(response)
```