AR Models with FlexTok
EPFL-VILAB/FlexAR-113M-T2I • Text-to-Image • Updated Mar 11 • 3 • 1
EPFL-VILAB/FlexAR-382M-T2I • Text-to-Image • Updated Mar 11 • 3
EPFL-VILAB/FlexAR-1B-T2I • Text-to-Image • Updated Mar 11 • 5
EPFL-VILAB/FlexAR-3B-T2I • Text-to-Image • Updated Mar 11 • 27

FlexTok Tokenizers & VAEs
Flexible 1D tokenizers and VAEs from https://flextok.epfl.ch/
EPFL-VILAB/flextok_d18_d28_dfn • 3B • Updated Mar 19, 2025 • 360 • 1
EPFL-VILAB/flextok_d18_d28_in1k • 3B • Updated Mar 19, 2025 • 111
EPFL-VILAB/flextok_d18_d18_in1k • 0.9B • Updated Mar 19, 2025 • 310
EPFL-VILAB/flextok_d12_d12_in1k • 0.3B • Updated Mar 19, 2025 • 146

4M Tokenizers
Multimodal tokenizers from https://4m.epfl.ch/
EPFL-VILAB/4M_tokenizers_rgb_16k_224-448 • 0.3B • Updated Jun 14, 2024 • 3.7k • 4
EPFL-VILAB/4M_tokenizers_depth_8k_224-448 • 0.3B • Updated Jun 14, 2024 • 7.02k • 1
EPFL-VILAB/4M_tokenizers_normal_8k_224-448 • 0.3B • Updated Jun 14, 2024 • 271 • 1
EPFL-VILAB/4M_tokenizers_semseg_4k_224-448 • 0.2B • Updated Jun 14, 2024 • 505 • 1

TST Datasets & Models
Multimodal datasets and models from https://tst-vision.epfl.ch
EPFL-VILAB/TST-ProcTHOR • Viewer • Updated 10 days ago • 963k • 40
EPFL-VILAB/TST-Replica • Updated 19 days ago • 957
EPFL-VILAB/TST-Scannet-pp • Viewer • Updated 19 days ago • 733k • 524
EPFL-VILAB/TST-ProcTHOR-adapted • Updated May 11, 2025

4M Models
Multimodal models from https://4m.epfl.ch/
EPFL-VILAB/4M-7_B_CC12M • Any-to-Any • 0.4B • Updated Oct 7, 2024 • 97 • 19
EPFL-VILAB/4M-7_L_CC12M • Any-to-Any • 1B • Updated Oct 7, 2024 • 26 • 2
EPFL-VILAB/4M-7_XL_CC12M • Any-to-Any • 3B • Updated Oct 7, 2024 • 18 • 1
EPFL-VILAB/4M-7_B_COYO700M • Any-to-Any • 0.4B • Updated Oct 7, 2024 • 9 • 1

Omnidata depth & normals models
Omnidata surface normals and depth estimators
sashasax/omnidata_normal_dpt_hybrid_384 • Updated Sep 25, 2024 • 1
sashasax/omnidata_depth_dpt_hybrid_384 • Updated Sep 25, 2024 • 1
Omnidata Monocular Surface Normal Dpt Hybrid 384 • Runtime error • 3
Omnidata Monocular Depth DPT Hybrid 384 • Runtime error • 3