HTR
ByteDance/Sa2VA-4B • Image-Text-to-Text • Updated Sep 8, 2025 • 4.64k • 96
Finnish-NLP/Ahma-2-4B-Instruct • Text Generation • 4B • Updated Nov 25, 2025 • 83 • 4
black-forest-labs/FLUX.2-dev • Image-to-Image • Updated Feb 17 • 220k • 1.5k
mistralai/Mistral-Large-Instruct-2407 • Updated Jul 28, 2025 • 7.26k • 859
Computer Vision
Vision Grid Transformer for Document Layout Analysis • Paper • 2308.14978 • Published Aug 29, 2023 • 4