---
library_name: pytorch
tags:
- motion
- rvq
- vector-quantization
- human-motion
- safetensors
- humanML
- motion-reconstructor
license: cc-by-nc-4.0
datasets:
- Wojtekb30/HumanML3D-500ms-FPP-descriptions-CoTs-1
---

# Motion RVQ (Move Reconstruction)

This model uses Residual Vector Quantization (RVQ) to reconstruct motion represented as 263-dimensional frame vectors.

It is a custom PyTorch model (not a Transformers `AutoModel`) and is loaded from `safetensors` with `rvq_model.py`.

![image](https://cdn-uploads.huggingface.co/production/uploads/68a8a393b10e7ec7d9e0ace3/8Sb7EkktHeB1Pl8N8saAg.png)

## Model Summary

- Architecture: encoder -> 4-level RVQ -> decoder
- Input shape: `(T, 263)` per sequence (frame-major)
- Training window: 100 frames (with crop/pad in dataset loader)
- Output: reconstructed motion sequence in the same 263-dim representation

## Repository Files

- `motion_rvq_weights.safetensors` - main published checkpoint
- `config.json` - model configuration metadata
- `rvq_model.py` - model architecture (`MotionRVQ_VAE`)
- `TestRVQ.py` - inference + 3-panel visualization
- `TrainRVQ.py` - training script
- `rvq_humanml_dataset.py` - training dataset loader
- `Mean.npy`, `Std.npy` - normalization statistics
- `000001.npy`, `000012.npy` - sample motion files

`motion_rvq_weights.pth` can be treated as a legacy artifact; code uses `motion_rvq_weights.safetensors`.

## Install

```bash
pip install torch safetensors numpy matplotlib
```

## Inference

Run the provided visualization script:

```bash
python TestRVQ.py
```

By default, `TestRVQ.py` uses `000001.npy`. You can change `FILE_TO_TEST` in `TestRVQ.py` to another sequence.
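Before pointing `FILE_TO_TEST` at a different file, it can help to confirm that the array matches the `(T, 263)` frame-major layout listed in the Model Summary. The sketch below is an illustration only: `check_motion` and `FEATURE_DIM` are hypothetical names, not part of the repository, and the random stand-in array exists only so the snippet runs without any files present.

```python
import numpy as np

FEATURE_DIM = 263  # per-frame feature size stated in the Model Summary


def check_motion(motion: np.ndarray) -> int:
    """Validate a frame-major motion array and return its frame count T."""
    if motion.ndim != 2:
        raise ValueError(f"expected a 2-D array (T, {FEATURE_DIM}), got shape {motion.shape}")
    if motion.shape[1] != FEATURE_DIM:
        raise ValueError(f"expected {FEATURE_DIM} features per frame, got {motion.shape[1]}")
    if not np.isfinite(motion).all():
        raise ValueError("motion contains NaN or infinite values")
    return motion.shape[0]


# Stand-in data so the snippet runs anywhere. With the repository checked out,
# replace this line with: motion = np.load("000001.npy")
motion = np.random.default_rng(0).standard_normal((120, FEATURE_DIM)).astype(np.float32)

num_frames = check_motion(motion)
print(f"{num_frames} frames, {motion.shape[1]} features per frame")
```

A file that fails this check is probably in a different motion representation and would need converting before the model can reconstruct it.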
Minimal loading example:

```python
from pathlib import Path

import torch
from safetensors.torch import load_file

from rvq_model import MotionRVQ_VAE

base = Path(".")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = MotionRVQ_VAE().to(device)
state_dict = load_file(str(base / "motion_rvq_weights.safetensors"), device=str(device))
model.load_state_dict(state_dict)
model.eval()
```

## Training From Scratch

Expected layout:

```text
rvq/
  TrainRVQ.py
  rvq_model.py
  rvq_humanml_dataset.py
  Mean.npy
  Std.npy
  new_joint_vecs/
    *.npy
```

Run training:

```bash
python TrainRVQ.py
```

Output checkpoint:

- `motion_rvq_weights.safetensors`

## Limitations

- This model reconstructs motion vectors; it is not a text-to-motion generator.
- Input format must match the same 263-dim representation and normalization scheme used during training.
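As a rough illustration of the normalization requirement, the sketch below applies per-feature z-score normalization and fits a sequence to the 100-frame training window. Several things here are assumptions rather than facts from the repository: that `Mean.npy` and `Std.npy` are used as `(x - mean) / std`, that cropping keeps the leading frames, and that padding uses zeros. The helper names `normalize`, `denormalize`, and `crop_or_pad` are hypothetical. The loader in `rvq_humanml_dataset.py` is the authoritative reference and may do any of this differently.

```python
import numpy as np

FEATURE_DIM = 263  # per-frame feature size from the Model Summary
WINDOW = 100       # training window length from the Model Summary


def normalize(motion, mean, std, eps=1e-8):
    """Z-score each feature. Assumed scheme, not confirmed by the repository."""
    return (motion - mean) / (std + eps)


def denormalize(motion, mean, std, eps=1e-8):
    """Invert normalize() to get back to the original feature scale."""
    return motion * (std + eps) + mean


def crop_or_pad(motion, window=WINDOW):
    """Keep the first `window` frames, or zero-pad at the end (hypothetical policy)."""
    t = motion.shape[0]
    if t >= window:
        return motion[:window]
    pad = np.zeros((window - t, motion.shape[1]), dtype=motion.dtype)
    return np.concatenate([motion, pad], axis=0)


# Synthetic statistics and motion so the snippet runs without repository files.
# In practice: mean = np.load("Mean.npy"); std = np.load("Std.npy")
rng = np.random.default_rng(0)
mean = rng.standard_normal(FEATURE_DIM).astype(np.float32)
std = rng.uniform(0.5, 2.0, FEATURE_DIM).astype(np.float32)
raw = rng.standard_normal((60, FEATURE_DIM)).astype(np.float32)

x = crop_or_pad(normalize(raw, mean, std))   # model-ready window
restored = denormalize(x[:60], mean, std)    # back to the original scale

print(x.shape)
print(np.allclose(restored, raw, atol=1e-4))
```

Under these assumptions, the model's output would be passed through the same inverse transform before visualization or comparison with the raw input.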