Motion RVQ (Move Reconstruction)

This model uses Residual Vector Quantization (RVQ) to reconstruct motion represented as 263-dimensional frame vectors. It is a custom PyTorch model (not a Transformers AutoModel): the weights are stored in safetensors format and are loaded into the architecture defined in rvq_model.py.

Model Summary

  • Architecture: encoder -> 4-level RVQ -> decoder
  • Input shape: (T, 263) per sequence (frame-major)
  • Training window: 100 frames (with crop/pad in dataset loader)
  • Output: reconstructed motion sequence in the same 263-dim representation
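
The residual quantization step in the middle of this pipeline can be illustrated with a minimal NumPy sketch. It is not the code from rvq_model.py: the latent size, codebook size, random codebooks, and the function names rvq_encode and rvq_decode are all assumptions chosen for illustration. Only the count of 4 levels comes from the summary above. Each level picks the code nearest to the residual left by the previous levels, and the quantized latent is the sum of the chosen codes.

```python
import numpy as np

def rvq_encode(z, codebooks):
    """Quantize latents z of shape (N, D) with a list of (K, D) codebooks.

    Returns the code indices, shape (N, num_levels), and the quantized
    latents, shape (N, D), which are the sum of the chosen codes.
    """
    residual = z.copy()
    quantized = np.zeros_like(z)
    indices = []
    for codebook in codebooks:
        # Squared Euclidean distance from every residual to every code: (N, K)
        dists = ((residual[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
        idx = dists.argmin(axis=1)
        chosen = codebook[idx]
        quantized += chosen   # accumulate the approximation
        residual -= chosen    # the next level quantizes what is left over
        indices.append(idx)
    return np.stack(indices, axis=1), quantized

def rvq_decode(indices, codebooks):
    """Rebuild quantized latents from per-level code indices."""
    return sum(cb[indices[:, level]] for level, cb in enumerate(codebooks))

# Hypothetical sizes, for illustration only (not the trained model's values).
rng = np.random.default_rng(0)
latent_dim, codebook_size, num_levels = 8, 16, 4
codebooks = [rng.normal(size=(codebook_size, latent_dim)) for _ in range(num_levels)]

z = rng.normal(size=(5, latent_dim))        # stand-in for encoder output
codes, z_q = rvq_encode(z, codebooks)       # codes: (5, 4), z_q: (5, 8)
z_q_again = rvq_decode(codes, codebooks)    # same values as z_q
```

In a trained model the codebooks are learned and the decoder consumes the quantized latent; the sketch only shows how the four levels of indices are produced and recombined.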

Repository Files

  • motion_rvq_weights.safetensors - main published checkpoint
  • config.json - model configuration metadata
  • rvq_model.py - model architecture (MotionRVQ_VAE)
  • TestRVQ.py - inference + 3-panel visualization
  • TrainRVQ.py - training script
  • rvq_humanml_dataset.py - training dataset loader
  • Mean.npy, Std.npy - normalization statistics
  • 000001.npy, 000012.npy - sample motion files

If a motion_rvq_weights.pth file is present, treat it as a legacy artifact; the code loads motion_rvq_weights.safetensors.

Install

pip install torch safetensors numpy matplotlib

Inference

Run the provided visualization script:

python TestRVQ.py

By default, TestRVQ.py runs on 000001.npy. To test another sequence, set FILE_TO_TEST in TestRVQ.py to a different file.

Minimal loading example:

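from pathlib import Path

import torch
from safetensors.torch import load_file

from rvq_model import MotionRVQ_VAE  # architecture definition shipped in this repository

base = Path(".")  # directory that contains the checkpoint
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Build the model with its default constructor arguments, then load the published weights.
model = MotionRVQ_VAE().to(device)
state_dict = load_file(str(base / "motion_rvq_weights.safetensors"), device=str(device))
model.load_state_dict(state_dict)
model.eval()  # evaluation mode for inference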

Training From Scratch

Expected layout:

rvq/
  TrainRVQ.py
  rvq_model.py
  rvq_humanml_dataset.py
  Mean.npy
  Std.npy
  new_joint_vecs/
    *.npy
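
The model summary mentions a 100-frame training window with crop/pad in the dataset loader. The sketch below shows one way such a step could look for a single (T, 263) file from new_joint_vecs/. It is not taken from rvq_humanml_dataset.py: the function name crop_or_pad, the random crop position, and zero-padding at the end are assumptions, and the real loader may behave differently.

```python
import numpy as np

WINDOW = 100  # training window length in frames, as stated in the model summary

def crop_or_pad(motion, window=WINDOW, rng=None):
    """Return a (window, D) array from a (T, D) motion sequence.

    Longer sequences are cropped (at a random offset if rng is given,
    otherwise from the start); shorter ones are zero-padded at the end.
    """
    num_frames = motion.shape[0]
    if num_frames >= window:
        start = 0 if rng is None else int(rng.integers(0, num_frames - window + 1))
        return motion[start:start + window]
    # Assumption: pad with zeros after the last frame.
    pad = np.zeros((window - num_frames, motion.shape[1]), dtype=motion.dtype)
    return np.concatenate([motion, pad], axis=0)

# Synthetic stand-ins for files loaded with np.load(...)
short_clip = np.ones((60, 263), dtype=np.float32)
long_clip = np.ones((150, 263), dtype=np.float32)

padded = crop_or_pad(short_clip)                                # (100, 263)
cropped = crop_or_pad(long_clip, rng=np.random.default_rng(0))  # (100, 263)
```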

Run training:

python TrainRVQ.py

Output checkpoint:

  • motion_rvq_weights.safetensors

Limitations

  • This model reconstructs motion vectors; it is not a text-to-motion generator.
  • Input must use the same 263-dim representation and the same normalization scheme as the training data.
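
To make the input requirement concrete, here is a minimal sketch of a shape check plus normalization round trip. It assumes a per-feature z-score, (x - mean) / std, using the statistics in Mean.npy and Std.npy; that formula, the eps guard, and the helper names check_motion, normalize, and denormalize are assumptions, so confirm the exact scheme against TestRVQ.py before relying on it. Synthetic arrays stand in for the real files.

```python
import numpy as np

FEATURE_DIM = 263  # per-frame feature size expected by the model

def check_motion(motion, feature_dim=FEATURE_DIM):
    """Raise if the array is not a (T, feature_dim) motion sequence."""
    if motion.ndim != 2 or motion.shape[1] != feature_dim:
        raise ValueError(f"expected shape (T, {feature_dim}), got {motion.shape}")
    return motion

def normalize(motion, mean, std, eps=1e-8):
    """Assumed z-score normalization; eps guards against zero std."""
    return (motion - mean) / (std + eps)

def denormalize(motion_norm, mean, std, eps=1e-8):
    """Inverse of normalize, applied to the model's reconstruction."""
    return motion_norm * (std + eps) + mean

# Synthetic stand-ins; in the repository these would come from
# np.load("Mean.npy"), np.load("Std.npy"), and np.load("000001.npy").
rng = np.random.default_rng(0)
mean = rng.normal(size=FEATURE_DIM)
std = rng.uniform(0.5, 2.0, size=FEATURE_DIM)
motion = rng.normal(size=(120, FEATURE_DIM))

motion_norm = normalize(check_motion(motion), mean, std)  # fed to the model
restored = denormalize(motion_norm, mean, std)            # back to feature space
```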