---
base_model: Lightricks/LTX-Video
library_name: gguf
quantized_by: city96
tags:
- ltx-video
- text-to-video
- image-to-video
language:
- en
license: other
license_link: LICENSE.md
---

This is a direct GGUF conversion of [Lightricks/LTX-Video](https://huggingface.co/Lightricks/LTX-Video).

Because this is a quantized model, not a finetune, all of the original license terms and restrictions still apply.

The model files can be used with the [ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF) custom node.

Place the model files in `ComfyUI/models/unet`; see the GitHub readme for further install instructions.
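
As a minimal sketch of that step, a file could be fetched with `huggingface_hub` directly into the ComfyUI folder. This assumes `huggingface_hub` is installed and that ComfyUI lives in `./ComfyUI`. The repo id and file name are the ones used in the Diffusers example in this card; `unet_dir` and `download_gguf` are hypothetical helper names, not part of ComfyUI-GGUF.

```python
from pathlib import Path

# Repo id and file name as they appear in the Diffusers example in this card.
REPO_ID = "city96/LTX-Video-gguf"
FILENAME = "ltx-video-2b-v0.9-Q3_K_S.gguf"


def unet_dir(comfy_root):
    """Return the ComfyUI/models/unet folder for a given ComfyUI root."""
    return Path(comfy_root) / "models" / "unet"


def download_gguf(comfy_root="ComfyUI", filename=FILENAME):
    """Download one GGUF file into models/unet and return its local path."""
    # Imported lazily so the path helper works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    target = unet_dir(comfy_root)
    target.mkdir(parents=True, exist_ok=True)
    return hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=target)
```

Calling `download_gguf()` from the directory that contains `ComfyUI` should leave the file in `ComfyUI/models/unet`; pass a different `comfy_root` if your install lives elsewhere.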

Please refer to [this chart](https://github.com/ggerganov/llama.cpp/blob/master/examples/perplexity/README.md#llama-3-8b-scoreboard) for a basic overview of quantization types.
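
The quantization type is part of the file name, so switching quants is mostly a matter of changing one string. The sketch below assumes every file in this repository follows the pattern of `ltx-video-2b-v0.9-Q3_K_S.gguf`, the only file name shown in this card. `gguf_url` is a hypothetical helper, and the quant name passed in has to match a file that actually exists in the repository.

```python
REPO_URL = "https://huggingface.co/city96/LTX-Video-gguf"
# Assumed naming pattern, inferred from the single file name in this card.
FILE_TEMPLATE = "ltx-video-2b-v0.9-{quant}.gguf"


def gguf_url(quant):
    """Build the file URL for a quantization type such as "Q3_K_S"."""
    return f"{REPO_URL}/blob/main/{FILE_TEMPLATE.format(quant=quant)}"


print(gguf_url("Q3_K_S"))
# https://huggingface.co/city96/LTX-Video-gguf/blob/main/ltx-video-2b-v0.9-Q3_K_S.gguf
```

The returned string can be used as `ckpt_path` for `LTXVideoTransformer3DModel.from_single_file` in the Diffusers example.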

## Diffusers support

You can also use the checkpoints with the `diffusers` library.

Make sure to install `diffusers` from source:

```bash
pip install git+https://github.com/huggingface/diffusers
```

Then install `gguf`:

```bash
pip install -U gguf
```

Now we're ready to run inference:

<details>
<summary>Inference code</summary>

```py
import torch
from diffusers.utils import export_to_video
from diffusers import LTXPipeline, LTXVideoTransformer3DModel, GGUFQuantizationConfig

# Load only the transformer from the GGUF file; its weights stay quantized
# and are dequantized to bfloat16 for compute.
ckpt_path = (
    "https://huggingface.co/city96/LTX-Video-gguf/blob/main/ltx-video-2b-v0.9-Q3_K_S.gguf"
)
transformer = LTXVideoTransformer3DModel.from_single_file(
    ckpt_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
# The remaining pipeline components are loaded from the base repository.
pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
    # Seed sampling here; `from_pretrained` does not use a `generator` argument.
    generator=torch.manual_seed(0),
).frames[0]
export_to_video(video, "output_gguf_ltx.mp4", fps=24)
```

</details>
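
The `width`, `height`, and `num_frames` values in the inference code are not arbitrary. According to the upstream LTX-Video documentation, the model expects a resolution divisible by 32 and a frame count of the form 8k + 1. The check below is a small sketch along those lines: `check_ltx_shape` and `clip_seconds` are hypothetical helper names, and the constraints are an assumption taken from upstream, not from this card.

```python
def check_ltx_shape(width, height, num_frames):
    """Return True if the sizes match the assumed LTX-Video constraints."""
    # Assumption (from upstream docs): width and height divisible by 32,
    # and num_frames equal to a multiple of 8 plus 1.
    return width % 32 == 0 and height % 32 == 0 and num_frames % 8 == 1


def clip_seconds(num_frames, fps):
    """Length of the exported clip in seconds."""
    return num_frames / fps


print(check_ltx_shape(704, 480, 161))   # True
print(round(clip_seconds(161, 24), 2))  # 6.71
```

With the settings used in the example, the exported clip is therefore a little under seven seconds long at 24 fps.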