qwen3-1.7b-lamini-qlora-instruction-tuned
Instruction-tuned Qwen3-1.7B-Base using SFT (QLoRA/LoRA) on MBZUAI/LaMini-instruction, then merged into a single model checkpoint for easy deployment (single-turn Q/A).
Model Details
- Base model:
Qwen/Qwen3-1.7B-Base - Finetuning: Supervised Fine-Tuning (SFT) with QLoRA
- Dataset:
MBZUAI/LaMini-instruction(we utilized half of the data) - Output: LoRA merged into base weights and saved as standard HF causal LM weights.
Prompt Format (Important)
This model was trained with the following text instruction format:
Instruction:
{instruction}
Input:
{input}
Response:
{model generates here}
If you omit ### Input:, use:
Instruction:
{instruction}
Response:
{model generates here}
For convenience, this repository may include helper utilities such as prompt_format.py.
Quickstart (Transformers)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
MODEL_ID = "ericoh929/qwen3-1.7b-lamini-qlora-instruction-tuned"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
MODEL_ID,
trust_remote_code=True,
device_map="auto",
torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)
model.eval()
instruction = "Answer the question concisely."
inp = "If Tom has 3 apples and buys 4 more, how many apples does he have?"
prompt = (
f"### Instruction:\n{instruction}\n\n"
f"### Input:\n{inp}\n\n"
f"### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
out = model.generate(
**inputs,
max_new_tokens=128,
do_sample=False,
pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
eos_token_id=tokenizer.eos_token_id,
)
# Decode only the generated continuation (recommended)
gen_ids = out[0][inputs["input_ids"].shape[1]:]
answer = tokenizer.decode(gen_ids, skip_special_tokens=True).strip()
print(answer)
Intended Use
• Single-turn instruction following
• General Q/A, short reasoning, summarization style tasks
Limitations
• Trained on a synthetic/large instruction dataset; outputs can contain hallucinations.
• Best results are achieved when using the training prompt format shown above.
• This is a 1.7B model; complex reasoning / long-context tasks may be limited.
Training Notes
• Method: QLoRA (4-bit base during training) + LoRA adapters
• Merge: Loaded base model in fp16/bf16 and merged adapters with merge_and_unload()
• max_seq_len: 2048
- Downloads last month
- 790
Model tree for ericoh929/qwen3-1.7b-lamini-qlora-instruction-tuned
Base model
Qwen/Qwen3-1.7B-Base