Collections
Collections including paper arxiv:2601.03559
-
TiDAR: Think in Diffusion, Talk in Autoregression
Paper • 2511.08923 • Published • 126
Diffusion Language Models are Super Data Learners
Paper • 2511.03276 • Published • 129
What Makes Diffusion Language Models Super Data Learners?
Paper • 2510.04071 • Published
LLaDA2.0: Scaling Up Diffusion Language Models to 100B
Paper • 2512.15745 • Published • 80
-
WorldVLA: Towards Autoregressive Action World Model
Paper • 2506.21539 • Published • 40
Fast and Simplex: 2-Simplicial Attention in Triton
Paper • 2507.02754 • Published • 25
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction
Paper • 2507.02025 • Published • 35
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
Paper • 2507.00951 • Published • 24
-
Contrastive Decoding Improves Reasoning in Large Language Models
Paper • 2309.09117 • Published • 39
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
Paper • 2310.08491 • Published • 56
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
Paper • 2411.04282 • Published • 37
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Paper • 2411.14432 • Published • 25
-
MMGR: Multi-Modal Generative Reasoning
Paper • 2512.14691 • Published • 117
KlingAvatar 2.0 Technical Report
Paper • 2512.13313 • Published • 43
SemanticGen: Video Generation in Semantic Space
Paper • 2512.20619 • Published • 92
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
Paper • 2512.16676 • Published • 212
-
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Paper • 2508.07629 • Published • 43
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Paper • 2508.07101 • Published • 14
Compressing Chain-of-Thought in LLMs via Step Entropy
Paper • 2508.03346 • Published • 8
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
Paper • 2508.08940 • Published • 27
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 195 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88