F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World Paper • 2603.19223 • Published 2 days ago • 23
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models Paper • 2601.21639 • Published Jan 29 • 51
F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data Paper • 2510.02294 • Published Oct 2, 2025 • 48
Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM Paper • 2503.17793 • Published Mar 22, 2025 • 24
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks Paper • 2505.16901 • Published May 22, 2025 • 48