# VGGT LaCT (stage 1) — slim adapter weights
These files are LaCT-block weights only (~200 MB), not a full VGGT checkpoint. They plug into the public `facebook/VGGT-1B` backbone: the DINOv2 patch embedding, frame-wise attention, and prediction heads stay at Meta's pretrained VGGT-1B weights; only the global-attention layers are replaced by LaCT-style fast-weight GLU blocks, trained with stage-1 distillation against the frozen teacher.
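For intuition only, here is a minimal sketch of the fast-weight GLU idea behind a LaCT-style block. This is a hypothetical toy, not the repo's implementation; the shapes, inner learning rate, and squared-error objective are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def fast_weight_glu_chunk(q, k, v, w1, w3, w2, lr=1e-2):
    """Toy single-chunk fast-weight update (illustrative only).

    q, k, v: (n, d) tensors for one chunk; w1, w3: (h, d) and w2: (d, h)
    are per-sequence "fast weights" forming a SwiGLU-style MLP.
    """
    w1 = w1.detach().requires_grad_(True)
    w3 = w3.detach().requires_grad_(True)
    w2 = w2.detach().requires_grad_(True)
    # One inner gradient step: push f(k) toward v (squared error stands in
    # for whatever inner objective the actual method uses).
    pred = (F.silu(k @ w1.T) * (k @ w3.T)) @ w2.T
    g1, g3, g2 = torch.autograd.grad(F.mse_loss(pred, v), (w1, w3, w2))
    w1, w3, w2 = w1 - lr * g1, w3 - lr * g3, w2 - lr * g2
    # Apply the adapted fast weights to this chunk's queries.
    out = (F.silu(q @ w1.T) * (q @ w3.T)) @ w2.T
    return out, (w1.detach(), w3.detach(), w2.detach())
```

Updating per large chunk rather than per token keeps the pattern GPU-friendly; the state carried across chunks is a few fixed-size fast-weight tensors rather than a growing key/value cache.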
Code: [github.com/Akrao9/vggt_ttt](https://github.com/Akrao9/vggt_ttt) (install `vggt` from `facebookresearch/vggt` as described in that README).
## Files

| File | Description |
|---|---|
| `vggt_ttt_lact_stage1.pt` | Stage 1 distilled LaCT state dict (`lact_state_dict()` format). Keys are prefixed with `aggregator.lact_blocks.`. |
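As a quick sanity check after downloading, the state dict can be inspected directly; a minimal sketch, assuming the file is a plain `torch.save` dict of tensors (as described under "Training summary" below):

```python
import torch

# Inspect the downloaded adapter file without building the model.
state = torch.load("vggt_ttt_lact_stage1.pt", map_location="cpu")
print(len(state), "tensors")
print(next(iter(state)))  # expected to start with "aggregator.lact_blocks."
```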
## Load (Python)

```python
import torch
from huggingface_hub import hf_hub_download

# From the vggt_ttt repo (with `vggt` installed per the upstream README):
from model.vggt_ttt import VGGT_TTT
from model.io_utils import torch_load_checkpoint

ckpt_path = hf_hub_download("akrao9/VGGT-LACT", "vggt_ttt_lact_stage1.pt")
device = "cuda"

# Backbone weights come from Meta's VGGT-1B; the LaCT blocks are loaded on top.
model = VGGT_TTT.from_pretrained("facebook/VGGT-1B", chunk_size=16).to(device).eval()
state = torch_load_checkpoint(ckpt_path, map_location=device)
model.load_lact_state_dict(state, strict=True)
```
Use a local path instead of `hf_hub_download` if you already downloaded the `.pt` file.
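After loading, inference should follow the upstream VGGT API. A sketch, assuming `VGGT_TTT` keeps VGGT's forward signature and using `load_and_preprocess_images` from the upstream `vggt` package:

```python
from vggt.utils.load_fn import load_and_preprocess_images

# Assumption: VGGT_TTT.forward matches upstream VGGT, taking (S, 3, H, W)
# preprocessed frames and returning the usual prediction dict
# (camera pose encodings, depth, world points, ...).
images = load_and_preprocess_images(["frame_000.png", "frame_001.png"]).to(device)
with torch.no_grad():
    predictions = model(images)
print(predictions.keys())
```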
## Inference CLI

From the vggt_ttt repo, after downloading this checkpoint locally:

```bash
python scripts/run_inference.py \
    --input path/to/video.mp4 --fps 2 \
    --checkpoint ./vggt_ttt_lact_stage1.pt \
    --out ./out
```

(`--checkpoint` accepts this LaCT-only dict; see `scripts/run_inference.py`.)
## Training summary

- Stage 1: distillation from frozen `facebook/VGGT-1B` (pose / depth / world points), with trainable parameters confined to the 24 LaCT blocks; `c_proj` is zero-initialized for a near-identity start (see the sketch after this list).
- Checkpoints: saved with `torch.save(model.lact_state_dict(), path)` — same tensor layout as this Hub file.
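The near-identity start is the standard zero-init trick for residual branches; a minimal illustration with hypothetical sizes (not the repo's code):

```python
import torch.nn as nn

d_hidden, d_model = 4096, 1024  # illustrative sizes

# Zero-initializing the output projection makes the new residual branch
# contribute nothing at step 0, so the student initially matches the teacher.
c_proj = nn.Linear(d_hidden, d_model)
nn.init.zeros_(c_proj.weight)
nn.init.zeros_(c_proj.bias)
```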
## Hardware / scaling

The LaCT path targets longer frame sequences, with more favorable VRAM scaling than full global attention; see the GitHub README for benchmark tables (DL3DV-style eval).
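The scaling intuition, with made-up token counts (not the repo's benchmark numbers): global attention materializes scores quadratic in the total token count, while a fast-weight state stays fixed as frames are added.

```python
patches_per_frame = 1024  # hypothetical tokens per frame
for frames in (8, 32, 128):
    tokens = frames * patches_per_frame
    # Per-head attention scores grow quadratically with frame count;
    # a fast-weight state (a few fixed-size matrices) does not grow at all.
    print(f"{frames:4d} frames -> {tokens**2:,} scores per head")
```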
## License and attribution

- This adapter repository and the training code release are under Apache 2.0 (see the project `LICENSE` / `NOTICE` on GitHub).
- VGGT-1B is subject to Meta's license and terms on its model card; you must comply with those when using the backbone.
- Method builds on VGGT and LaCT-style components as described in the upstream README.
## Citation
If you use these weights or the vggt_ttt codebase, cite the original VGGT paper/repo and credit this adapter as appropriate for your venue.