qwen3-1.7b-lamini-qlora-instruction-tuned

Instruction-tuned Qwen3-1.7B-Base using SFT (QLoRA/LoRA) on MBZUAI/LaMini-instruction, then merged into a single model checkpoint for easy deployment (single-turn Q/A).

Model Details

  • Base model: Qwen/Qwen3-1.7B-Base
  • Finetuning: Supervised Fine-Tuning (SFT) with QLoRA
  • Dataset: MBZUAI/LaMini-instruction (we utilized half of the data)
  • Output: LoRA merged into base weights and saved as standard HF causal LM weights.

Prompt Format (Important)

This model was trained with the following text instruction format:

Instruction:

{instruction}

Input:

{input}

Response:

{model generates here}

If you omit ### Input:, use:

Instruction:

{instruction}

Response:

{model generates here}

For convenience, this repository may include helper utilities such as prompt_format.py.

Quickstart (Transformers)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ericoh929/qwen3-1.7b-lamini-qlora-instruction-tuned"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)
model.eval()

instruction = "Answer the question concisely."
inp = "If Tom has 3 apples and buys 4 more, how many apples does he have?"

prompt = (
    f"### Instruction:\n{instruction}\n\n"
    f"### Input:\n{inp}\n\n"
    f"### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# Decode only the generated continuation (recommended)
gen_ids = out[0][inputs["input_ids"].shape[1]:]
answer = tokenizer.decode(gen_ids, skip_special_tokens=True).strip()
print(answer)

Intended Use

•	Single-turn instruction following

•	General Q/A, short reasoning, summarization style tasks

Limitations

•	Trained on a synthetic/large instruction dataset; outputs can contain hallucinations.

•	Best results are achieved when using the training prompt format shown above.

•	This is a 1.7B model; complex reasoning / long-context tasks may be limited.

Training Notes

•	Method: QLoRA (4-bit base during training) + LoRA adapters

•	Merge: Loaded base model in fp16/bf16 and merged adapters with merge_and_unload()

•	max_seq_len: 2048
Downloads last month
790
Safetensors
Model size
2B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ericoh929/qwen3-1.7b-lamini-qlora-instruction-tuned

Adapter
(27)
this model

Dataset used to train ericoh929/qwen3-1.7b-lamini-qlora-instruction-tuned