
Track4D-360 camera-control checkpoint backup (2026-04-23, extended 2026-04-28)

Tagged copies of every benchmarked Track4D-360 camera-control checkpoint. The 1.3B family (4 files, ~2.4 GB each) was the original 2026-04-23 backup for the CLIP-F/V benchmark (doc/track4d-360/2026-04-23-clipfv-4models-288x512-benchmark.md); the 14B savefix step-2000 file was added 2026-04-28 once the multi-node FSDP-savefix run (doc/track4d-360/bugs/2026-04-25-multinode-training-desync-fixes.md) produced its first verified-correct checkpoint.

Files are renamed to carry the variant tag so the provenance of each checkpoint is legible without walking back through the training dirs.

Upload destination: all files in this directory get pushed to yslan/track4d_360 on HuggingFace, preserving the exact filename. Filename = HF file path.
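
For reference, a minimal push/pull sketch with the huggingface_hub CLI (assumes the CLI is installed and logged in with a write token; repo id and filenames are the ones above):

huggingface-cli upload yslan/track4d_360 warped_step-13000.safetensors warped_step-13000.safetensors

# pulling it back elsewhere uses the same path-in-repo:
huggingface-cli download yslan/track4d_360 warped_step-13000.safetensors --local-dir .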

| backup file | size | source | trained @ |
|---|---|---|---|
| warped_step-13000.safetensors | 2.2 GB | warped_appearance_concat_proj_mixed_real_synth_144x256x49_1p3b_2gpu/train/Wan2.1-T2V-1.3B_track4d360_warped_appearance_concat_proj_mixed_real_synth/step-13000.safetensors | 144x256 (Lyra-2 latent-fuse new architecture) |
| static13k_step-13500.safetensors | 2.4 GB | hybrid_dense_plucker_mixed_real_synth_concat_project_trainplucker_attnffn_cond_dropout_144x256x49_1p3b_2gpu/train/.../step-13500.safetensors | 144x256 (old-arch, static-only, no syn4d) |
| dynamic5k_step-5000.safetensors | 2.4 GB | hybrid_dense_plucker_mixed_real_synth_syn4d_concat_project_trainplucker_attnffn_cond_dropout_144x256x49_1p3b_2gpu/train/.../step-5000.safetensors | 144x256 (old-arch, + syn4d dynamic) |
| ismb288_3k_step-3000.safetensors | 2.4 GB | ismb_hybrid_dense_plucker_mixed_real_synth_syn4d_recam_syncam_cond_dropout_288x512x49_1p3b_16gpu/train/.../step-3000.safetensors | 288x512 native (old-arch, + syn4d + RecamMaster + SynCamMaster, Isambard-trained, synced 2026-04-23) |
| 14b_savefix_step-2000.safetensors | 22.9 GB | ismb_hybrid_dense_plucker_mixed_real_synth_syn4d_recam_syncam_savefix_cond_dropout_144x256x49_14b_16gpu_fsdp/train/Wan2.1-Fun-14B_track4d360_..._savefix_cond_dropout/step-2000.safetensors | 144x256, 14B (Wan-Fun 14B FSDP, post-savefix bug fix, 4-node Isambard, copied 2026-04-28) |

14B train launcher: bash_scripts/track4d_360/ismb/14b/sbatch/ismb_sbatch_14b_4node_144x256_fsdp_noise_commtuned_savefix_resume8800.sh

Size delta within the 1.3B family: warped is 2.2 GB vs 2.4 GB for the others. This is the architectural difference: warped has dense_in_channels=2 (geometry-only dense) vs 5 (RGB+geom) for the old-arch models, and a different track_injection_mode with different trainable subgraphs.

The 14B file's 22.9 GB is bf16 weights for the full 14B Wan-Fun DiT plus the trainable Track4D-360 adapter modules (track_adapter, track_block_injector, dense_target_control_encoder, plucker control_adapter); see the [Eval] DiT checkpoint load line printed by the eval script for the exact key inventory.
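
If you want that key inventory without running the eval script, a quick sketch (assumes the safetensors Python package is installed; the substrings grepped for are the module names above, not verified against the file):

# list adapter-related keys in the 14B backup without loading any tensor data
python -c "
from safetensors import safe_open
with safe_open('14b_savefix_step-2000.safetensors', framework='np') as f:
    for k in f.keys():
        if any(s in k for s in ('track_adapter', 'track_block_injector', 'dense_target_control_encoder', 'control_adapter')):
            print(k)
"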

Backup method: the 1.3B family was rsync'd on 2026-04-23; the 14B file was a single cp on 2026-04-28. Verified by byte-size match against the sources.
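
A hedged sketch of that check for one pair (SRC is a placeholder; substitute the full train-dir path from the table above):

SRC=/path/to/train_dir/step-13000.safetensors     # placeholder, not a real path
stat -c %s "$SRC" warped_step-13000.safetensors   # the two byte counts must match exactly

# stronger check if you have the time (hashes the full file):
# md5sum "$SRC" warped_step-13000.safetensors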


Reproducibility bundles (dataset subsets, for sharing)

Some benchmarks need a tiny slice of the full dataset roots. Those bundles live alongside the checkpoints so the whole "checkpoints + data + scripts" tree can be tarred together for a collaborator.

| bundle dir | size | bench it reproduces |
|---|---|---|
| bench_ismb288_multiframe_repro/ | ~52 GB | benchmark_ismb288_3k_multiframe_vs_zbuffer_288x512.sh (5 datasets × 5 scenes × 10 trajectories @ 288×512) |

See bench_ismb288_multiframe_repro/README.md for layout, run instructions, and what's deliberately NOT included (Wan base + VXF source). Generated by prepare_repro_data_ismb288_multiframe.sh.

To create an archive for upload, bundle data + scripts only; the checkpoint is already on HF as yslan/track4d_360/ismb288_3k_step-3000.safetensors, so there is no need to re-bundle it. The bundle is mostly PNG/EXR/safetensors, i.e. already-compressed content, so gzip is slow and gains almost nothing. Recommended: plain .tar.

cd /scratch/shared/beegfs/yushi/logs/track4d-360/backup

# Recommended: plain tar, fast (just streams bytes; PNG/EXR don't compress):
tar -cf bench_ismb288_multiframe_repro.tar bench_ismb288_multiframe_repro

# Alternative if you prefer the .tar.gz format (use parallel gzip):
# tar -c bench_ismb288_multiframe_repro | pigz -p $(nproc) > bench_ismb288_multiframe_repro.tar.gz

# Alternative: tar + zstd (best size/speed for HF if both sides have zstd):
# tar --use-compress-program='zstd -T0 -3' -cf bench_ismb288_multiframe_repro.tar.zst bench_ismb288_multiframe_repro

# Avoid: tar -czf ... (single-threaded gzip on 52 GB, ~hours, near-zero gain).
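
# (Hedged addition, not in the original notes) one way to verify the archive before deleting the
# source tree: GNU tar --compare re-reads the tar against the on-disk files and reports mismatches.
tar -df bench_ismb288_multiframe_repro.tar
# cheaper spot check (just list and count the entries):
# tar -tf bench_ismb288_multiframe_repro.tar | wc -l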

# After verifying the archive is good (and ideally after uploading to HF),
# the source tree is redundant; drop it to reclaim 52 GB:
rm -rf bench_ismb288_multiframe_repro

# Re-creating later is cheap (rsync -a, ~52 GB read from beegfs sources):
bash /scratch/shared/beegfs/yushi/Repo/geo4d_360/VideoX-Fun/bash_scripts/track4d_360/plucker/benchmark/prepare_repro_data_ismb288_multiframe.sh

How to eval

All four 1.3B checkpoints evaluate through the same master benchmark script (under the VideoX-Fun/ repo root). It handles the per-variant arg dispatch, so you don't have to think about the flags listed below.

Master benchmark (all 4 models, 5 datasets × 5 scenes × 10 trajectories @ 288x512x49)

cd /scratch/shared/beegfs/yushi/Repo/geo4d_360/VideoX-Fun
bash bash_scripts/track4d_360/plucker/benchmark_clipfv_4models_288x512_2gpu.sh

Options:

# one variant only:
MODEL=warped      bash bash_scripts/track4d_360/plucker/benchmark_clipfv_4models_288x512_2gpu.sh
MODEL=static13k   bash ...
MODEL=dynamic5k   bash ...
MODEL=ismb288_3k  bash ...

# custom GPU pair (defaults GPU0=0 GPU1=1):
GPU0=4 GPU1=5 bash ...

# point at backup ckpts instead of the live train dirs (example override):
CKPT_WARPED=/scratch/shared/beegfs/yushi/logs/track4d-360/backup/warped_step-13000.safetensors \
  bash ...

The script writes to /scratch/shared/beegfs/yushi/logs/track4d-360/benchmark/clipfv_4models_288x512_2gpu/, with summary.md aggregated by python -m track4d_360.tools.aggregate_clip_benchmark.
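
To inspect results afterwards (paths and module name from above; the aggregator's exact CLI args are not reproduced here):

less /scratch/shared/beegfs/yushi/logs/track4d-360/benchmark/clipfv_4models_288x512_2gpu/summary.md

# re-run aggregation by hand if needed (check its CLI first):
# python -m track4d_360.tools.aggregate_clip_benchmark --help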

Already-completed trajectories auto-skip on re-run (pred_rgb.mp4 existence check in each novel-traj script).
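
The check itself is just a file-existence test; a minimal sketch of the pattern (illustrative only, OUT_DIR is a hypothetical per-trajectory output directory, not the repo's actual variable name):

OUT_DIR=$1
if [ -f "${OUT_DIR}/pred_rgb.mp4" ]; then
    echo "pred_rgb.mp4 already present, skipping ${OUT_DIR}"
    exit 0
fi
# ...otherwise fall through and run the eval for this trajectory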

Single-dataset / ad-hoc invocations

If you just want to run one checkpoint against one dataset, the benchmark script dispatches to these three eval entrypoints (all under examples/wan2.1_fun/):

| dataset | script |
|---|---|
| mvs_synth, dl3dv, re10k | eval_track4d360_hybrid_dense_static_scene_novel_traj.py |
| kubric | eval_track4d360_hybrid_dense_kubric_novel_traj.py |
| syn4d | eval_track4d360_hybrid_dense_syn4d_novel_traj.py |

All three use track4d_360.shared_args as of 2026-04-23, so they accept the full warped + plucker + dense + track flag set. The exact per-variant flags are listed below; do not omit them: build_eval_pipeline reads warped_condition_mode via getattr(..., "off"), so an omitted flag on a warped checkpoint silently runs the model in the wrong architecture.
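
To make the failure mode concrete, an illustrative snippet (this is only the getattr behaviour the paragraph describes, not the repo's actual build_eval_pipeline code):

python - <<'EOF'
from argparse import Namespace
args = Namespace()                                     # --warped_condition_mode never passed
mode = getattr(args, "warped_condition_mode", "off")
print(mode)   # "off" -> the warped path is silently disabled, no error or warning
EOF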

Per-variant eval-flag recipe

Must match training; silent mismatches are the #1 source of wrong results. See doc/track4d-360/2026-04-23-clipfv-4models-288x512-benchmark.md, §2 (bug log).

warped_step-13000.safetensors:
    --track_injection_mode single
    --warped_condition_mode latent_fuse
    --warped_appearance_fusion concat_proj
    --warped_geom_only_dense
    --dense_in_channels 2

static13k_step-13500.safetensors
dynamic5k_step-5000.safetensors
ismb288_3k_step-3000.safetensors:
    --track_injection_mode per_block
    --track_injection_block_mode concat_project
    --warped_condition_mode off
    --dense_in_channels 5

Shared across all 4:

--use_plucker_camera_control
--enable_v2v_plucker_camera_control
--use_query_frame_impulse_condition
--use_dense_branch
--dense_proj_dim 32
--dense_num_residual_blocks 2
--dense_alpha_track 1.0
--track_config config/track4d_360/default_conv3d_patchify_srcdepth.yaml
--num_inference_steps 50
--cfg_scale 1.0
--sigma_shift 5.0
--seed 42

And the base-DiT init path is always weights/wan21-1p3b/diffusion_pytorch_model.safetensors via --vxf_init_checkpoint (CLAUDE.md load-order Invariant A/B: VXF init must run BEFORE LoRA wrap and is required on both scratch and resume paths).
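
Putting the recipe together for an ad-hoc run, a hedged sketch that bundles the warped-variant and shared flags so none gets dropped. How the checkpoint, dataset, and output paths are passed is not reproduced here; copy those args from benchmark_clipfv_4models_288x512_2gpu.sh:

WARPED_FLAGS="--track_injection_mode single \
  --warped_condition_mode latent_fuse \
  --warped_appearance_fusion concat_proj \
  --warped_geom_only_dense \
  --dense_in_channels 2"

SHARED_FLAGS="--use_plucker_camera_control --enable_v2v_plucker_camera_control \
  --use_query_frame_impulse_condition --use_dense_branch \
  --dense_proj_dim 32 --dense_num_residual_blocks 2 --dense_alpha_track 1.0 \
  --track_config config/track4d_360/default_conv3d_patchify_srcdepth.yaml \
  --num_inference_steps 50 --cfg_scale 1.0 --sigma_shift 5.0 --seed 42 \
  --vxf_init_checkpoint weights/wan21-1p3b/diffusion_pytorch_model.safetensors"

# python examples/wan2.1_fun/eval_track4d360_hybrid_dense_static_scene_novel_traj.py \
#   $WARPED_FLAGS $SHARED_FLAGS <checkpoint / dataset / output args from the master script>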
