docs: fix eval pointer (GitHub README step 7 -> section 4)
README.md
CHANGED
@@ -32,7 +32,7 @@ The repo is organized into three top-level folders.
 | `ckpts/4b_stock` | 4B stock baseline (raw Qwen3.5 ViT, skips declip), checkpoint-505, 9.5G |
 | `ckpts/4b_v9_1` | 4B V9.1 (V-JEPA 2.1 video self-distill), checkpoint-505, 9.5G |
 
-Download either and feed it straight to evaluation (see the GitHub README,
+Download either and feed it straight to evaluation (see the GitHub README, section 4 — MLLM evaluation) to skip declip + S1 + S2.
 
 ## `legacy/` — historical assets (~368G)
 
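The changed README line says to download one of the two checkpoints and pass it to evaluation. A minimal sketch of the download step follows. It assumes the checkpoints live in the `xiaomoguhzz/VisionEncoder` model repo under the folder names shown in the table. The helper names `allow_patterns` and `fetch_checkpoint` and the default `local_dir` are hypothetical, not part of the repo. The evaluation command itself is documented in the GitHub README and is not reproduced here.

```python
from pathlib import Path

REPO_ID = "xiaomoguhzz/VisionEncoder"  # assumption: the repo named on this page
CHECKPOINTS = ("ckpts/4b_stock", "ckpts/4b_v9_1")  # folders listed in the README table


def allow_patterns(folder: str) -> list[str]:
    """Glob patterns that restrict a download to one checkpoint folder."""
    if folder not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint folder: {folder!r}")
    return [f"{folder}/*"]


def fetch_checkpoint(folder: str, local_dir: str = "VisionEncoder") -> Path:
    """Download one checkpoint folder (about 9.5G each) and return its local path."""
    # Imported lazily so allow_patterns() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=allow_patterns(folder),  # skips legacy/ and the other checkpoint
        local_dir=local_dir,
    )
    return Path(root) / folder
```

Calling `fetch_checkpoint("ckpts/4b_v9_1")` would fetch only that folder; filtering by pattern avoids pulling the roughly 368G `legacy/` tree.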