QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Paper
• 2309.14717
• Published
• 46
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Paper
• 2310.09199
• Published
• 28
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
Paper
• 2310.08678
• Published
• 13
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Paper
• 2310.09478
• Published
• 21
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper
• 2310.11453
• Published
• 106
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Paper
• 2310.17631
• Published
• 35
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Paper
• 2309.14509
• Published
• 20
Skywork: A More Open Bilingual Foundation Model
Paper
• 2310.19341
• Published
• 6
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Paper
• 2311.09257
• Published
• 47
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
Paper
• 2307.16789
• Published
• 102
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper
• 2312.15166
• Published
• 61
MobileQuant: Mobile-friendly Quantization for On-device Language Models
Paper
• 2408.13933
• Published
• 16
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding
Paper
• 2409.03420
• Published
• 26
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
Paper
• 2409.12903
• Published
• 22
Training Language Models to Self-Correct via Reinforcement Learning
Paper
• 2409.12917
• Published
• 140
Language Models Learn to Mislead Humans via RLHF
Paper
• 2409.12822
• Published
• 11
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code
Paper
• 2410.08196
• Published
• 48
Transformer^2: Self-adaptive LLMs
Paper
• 2501.06252
• Published
• 55
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Paper
• 2505.20256
• Published
• 19
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models
Paper
• 2505.23656
• Published
• 25
SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models
Paper
• 2506.04180
• Published
• 34
MemOS: A Memory OS for AI System
Paper
• 2507.03724
• Published
• 159
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
Paper
• 2507.06181
• Published
• 45
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference
Paper
• 2508.02193
• Published
• 136
Memp: Exploring Agent Procedural Memory
Paper
• 2508.06433
• Published
• 36
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
Paper
• 2508.08221
• Published
• 50
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
Paper
• 2508.10751
• Published
• 28
AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving
Paper
• 2508.09889
• Published
• 32
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper
• 2508.06471
• Published
• 206
Efficient Code Embeddings from Code Generation Models
Paper
• 2508.21290
• Published
• 19
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights
Paper
• 2509.22944
• Published
• 80
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper
• 2510.19363
• Published
• 62
M3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark
Paper
• 2511.17729
• Published
• 17
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning
Paper
• 2512.20605
• Published
• 62
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
Paper
• 2512.19995
• Published
• 16