EntRGi: Entropy Aware Reward Guidance for Diffusion Language Models • Paper • 2602.05000 • Published 18 days ago
Post: Are you familiar with reverse residual connections or looping in language models? Excited to share my Looped-GPT blog post and codebase: https://github.com/sanyalsunny111/Looped-GPT. TL;DR: looping during pre-training improves generalization. The plot shows GPT-2 LMs pre-trained with 15.73B OWT tokens. P.S. This is my first post here; I have ~4 followers and zero expectations for reach.
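The looping idea mentioned in the post can be sketched in a few lines: instead of stacking distinct layers, the same weight-tied block is applied repeatedly with a residual connection, so depth grows without adding parameters. This is an illustrative toy, not the actual Looped-GPT implementation; the `block` function and its single weight matrix are stand-ins for a full transformer block.

```python
import numpy as np

def block(x, W):
    # Stand-in for one transformer block (attention + MLP):
    # a single shared linear map with a nonlinearity.
    return np.tanh(x @ W)

def looped_forward(x, W, n_loops):
    # Looping: reuse the SAME weights W for n_loops passes,
    # with a residual connection around each pass.
    # Effective depth is n_loops, parameter count is just W.
    for _ in range(n_loops):
        x = x + block(x, W)
    return x

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 8))   # one shared block's weights
x = rng.normal(size=(2, 8))              # batch of 2 token embeddings
out = looped_forward(x, W, n_loops=4)
print(out.shape)
```

Doubling `n_loops` doubles effective depth while the parameter count stays fixed at one block, which is the trade-off the post's pre-training comparison is about.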
PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models • Paper • 2502.01584 • Published Feb 3, 2025
Normalizing Flows are Capable Generative Models • Paper • 2412.06329 • Published Dec 9, 2024
What If We Recaption Billions of Web Images with LLaMA-3? • Paper • 2406.08478 • Published Jun 12, 2024
CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models • Paper • 2311.11567 • Published Nov 20, 2023
Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond • Paper • 2304.04968 • Published Apr 11, 2023
Exploiting Chain Rule and Bayes' Theorem to Compare Probability Distributions • Paper • 2012.14100 • Published Dec 28, 2020
Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs • Paper • 2202.06510 • Published Feb 14, 2022
Contrastive Attraction and Contrastive Repulsion for Representation Learning • Paper • 2105.03746 • Published May 8, 2021
DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration • Paper • 2303.06885 • Published Mar 13, 2023
ProtoTEx: Explaining Model Decisions with Prototype Tensors • Paper • 2204.05426 • Published Apr 11, 2022
The State of Human-centered NLP Technology for Fact-checking • Paper • 2301.03056 • Published Jan 8, 2023