🌀 Z-Image-Turbo-Booster-v1 (Dimensional Burden Edition)

Base model: Tongyi-MAI/Z-Image-Turbo

"It wasn't a bug. It was a feature waiting to be discovered." โ€” aifeifei798

This is an experimental LoRA adapter for Tongyi-MAI/Z-Image-Turbo. It introduces a new branch of the Fragmented Training (FT) paradigm applied to Computer Vision: Dimensional Burdening.

By forcing the model to adapt to "twisted" tensor dimensions during the gradient descent phase, we achieved a "Turbo Booster" effect that enhances texture adherence and structural robustness.


📄 Model Description

Z-Image-Turbo-Booster-v1 represents a departure from standard fine-tuning. Instead of feeding the model perfectly aligned data, we discovered that forcing the optimizer to handle Dynamic Dimensional Transposition (resolving a [Channel, Batch] vs [Batch, Channel] conflict on-the-fly) creates a form of "Elastic Learning."

The "Happy Accident" (Technical Insight)

During the development of our custom training script (train_zimage_lora.py), the model was subjected to a "Dimensional Burden." The loss function was calculated only after a forced transposition of the prediction tensors.

import torch.nn.functional as F

# === The "Dimensional Burden" Logic ===
# The model predicts in [Channel, Batch, H, W] but the target is [Batch, Channel, H, W].
# Instead of fixing the dataloader, we force the gradients to flow through this transposition.

if model_pred.shape != target.shape:
    if (model_pred.shape[0] == target.shape[1] and 
        model_pred.shape[1] == target.shape[0]):
        
        # The "Burden": Force re-alignment during the forward pass
        model_pred = model_pred.transpose(0, 1)

loss = F.mse_loss(model_pred.float(), target.float(), reduction="mean")

This acts as a regularizer, preventing the model from overfitting to the memory layout and forcing it to focus on the semantic content of the latents.
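
Because the transpose only reorders elements, the numerical loss matches what a correctly aligned pairing would give; what changes is that the gradients are routed back through the transpose into the model's native layout. A minimal sketch of the idea, with made-up shapes and random tensors standing in for the real latents:

import torch
import torch.nn.functional as F

# Illustrative shapes only: batch=2, channels=4, 8x8 latents.
target = torch.randn(2, 4, 8, 8)                           # [Batch, Channel, H, W]
model_pred = torch.randn(4, 2, 8, 8, requires_grad=True)   # [Channel, Batch, H, W]

# The "Dimensional Burden": re-align the prediction instead of the dataloader.
aligned_pred = model_pred.transpose(0, 1)                  # now [Batch, Channel, H, W]
loss = F.mse_loss(aligned_pred.float(), target.float(), reduction="mean")

loss.backward()
# Gradients flow back through the transpose into the original layout.
print(loss.item(), model_pred.grad.shape)                  # grad keeps [Channel, Batch, H, W]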


🚀 Performance & Usage

This booster is designed to be loaded on top of the base Z-Image transformer. It enhances texture sharpness and generation consistency.

How to Load (Diffusers)

import torch
from diffusers import ZImagePipeline
import os

# 1. Load the pipeline
# Use bfloat16 for optimal performance on supported GPUs
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)

# ================= Load the LoRA adapter =================
# Point this at the output folder produced by your training run
lora_dir = "./feifei-zimage-lora"
lora_file = "pytorch_lora_weights.safetensors"
full_path = os.path.join(lora_dir, lora_file)

if os.path.exists(full_path):
    print(f"Loading LoRA: {full_path}")
    try:
        # adapter_name can be any label; it is only used to tag this LoRA
        pipe.load_lora_weights(lora_dir, weight_name=lora_file, adapter_name="feifei")
        print("✅ LoRA loaded successfully!")

        # 4 steps look very good; 2 steps are rough but already well-shaped, so video
        # generation may be worth trying. Hardware limits kept me from training a 2-step
        # video variant, and these strengths are only a first pass; a better ratio may exist.
        pipe.set_adapters(["feifei"], adapter_weights=[0.1])

    except Exception as e:
        print(f"❌ Failed to load LoRA: {e}")
        print("The weight keys may not match, or the file may be corrupted.")
else:
    print(f"⚠️ LoRA file not found: {full_path}")
# ==========================================================

pipe.to("cuda")

# [Optional] Attention Backend
# pipe.transformer.set_attention_backend("flash")

prompt = "jpop model in bikini at sea"

# 2. Generate Image
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=4,
    guidance_scale=0.0,
    generator=torch.Generator("cuda").manual_seed(42),
    # cross_attention_kwargs={"scale": 0.1},  # another way to control LoRA strength
).images[0]

image.save("example_lora_test.png")
print("ๅ›พๅƒๅทฒไฟๅญ˜ไธบ example_lora_test.png")

How to Train

python -m venv venv
source venv/bin/activate
git clone https://github.com/huggingface/diffusers.git
cd diffusers
pip install .
cd examples/text_to_image
pip install -r requirements.txt
wget https://hf-mirror.com/hfd/hfd.sh
chmod a+x hfd.sh
./hfd.sh aifeifei798/Z-Image-Turbo-Booster-v1
cd Z-Image-Turbo-Booster-v1
../hfd.sh Tongyi-MAI/Z-Image-Turbo
chmod +x run.sh
./run.sh
  • Recommended training steps: 200
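
Before loading the result, it can be worth confirming that run.sh actually produced a weights file and glancing at its key names. A minimal check using safetensors; the output path below is assumed to match the loading example above, so adjust it if your run writes elsewhere:

import os
from safetensors import safe_open

# Assumed output path; change it to your actual training output folder.
lora_path = "./feifei-zimage-lora/pytorch_lora_weights.safetensors"
assert os.path.exists(lora_path), f"LoRA file not found: {lora_path}"

with safe_open(lora_path, framework="pt") as f:
    keys = list(f.keys())

print(f"{len(keys)} LoRA tensors found")
print("\n".join(keys[:5]))  # key naming is the first thing to check if load_lora_weights fails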

Inference LoRA strength (the set_adapters weight) scales with the number of inference steps; see the helper sketch after this list:

  • 1 step = 0.01; each additional step adds 0.03
  • 2 steps = 0.04
  • 3 steps = 0.07
  • 4 steps = 0.10
  • 5 steps = 0.13 ...

Inference steps: >= 2
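
The pattern above is linear: 0.01 at one step plus 0.03 for each extra step. A tiny helper capturing that rule (the function name is ours, not part of this repo):

def booster_weight(num_inference_steps: int) -> float:
    """Recommended LoRA strength: 0.01 at 1 step, +0.03 per additional step."""
    return round(0.01 + 0.03 * (num_inference_steps - 1), 2)

print(booster_weight(2), booster_weight(4))  # 0.04 0.1
# e.g. pipe.set_adapters(["feifei"], adapter_weights=[booster_weight(4)])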

Dataset: see the feifei_pic folder in this repository.

Test: run test_zimage.py.


📚 Citation

@misc{aifeifei_2026,
    author       = { aifeifei },
    title        = { Z-Image-Turbo-Booster-v1 (Revision 2490b32) },
    year         = 2026,
    url          = { https://huggingface.co/aifeifei798/Z-Image-Turbo-Booster-v1 },
    doi          = { 10.57967/hf/7591 },
    publisher    = { Hugging Face }
}