arxiv:2603.01367

DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking

Published on Mar 10

Abstract

The DUEL framework provides exact likelihood computation for masked diffusion models that use deterministic position selection, giving them a proper perplexity, narrowing the measured gap to autoregressive models, and showing that oracle position orderings can surpass them.

AI-generated summary

Masked diffusion models (MDMs) generate text by iteratively selecting positions to unmask and then predicting tokens at those positions. Yet MDMs lack proper likelihood evaluation: the evidence lower bound (ELBO) is not only a loose bound on log-likelihood, but, as we show, is also computed under the training distribution rather than the test-time distribution. We resolve this within our DUEL framework, which unifies leading MDM sampling strategies that employ deterministic position selection. We prove that DUEL samplers admit exact likelihood computation under the test-time distribution -- giving MDMs proper likelihood, and hence proper perplexity, for the first time. This proper perplexity is the natural analogue of autoregressive perplexity and lets us revisit key questions about MDMs. MDMs are substantially better than previously thought: the MDM-autoregressive perplexity gap shrinks by up to 32% on in-domain data and 82% on zero-shot benchmarks. DUEL enables the first principled comparison of fast, parallel samplers across compute budgets -- an analysis impossible with the ELBO and unreliable with generative perplexity -- identifying a strong default method. Finally, oracle search over position orderings reveals MDMs can far surpass autoregressive models -- achieving 36.47 vs. 52.11 perplexity on AG News -- demonstrating the ceiling of MDM performance has not yet been reached.
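One way to read the exact-likelihood claim is sketched below. It is a toy illustration under stated assumptions, not the paper's implementation. If the position to unmask is a deterministic function of the current partially masked sequence, then each full sequence can be reached along only one unmasking path, so its log-likelihood is the sum of the log-probabilities assigned to the true tokens along that path. Everything named here is hypothetical: `toy_denoiser` stands in for a trained MDM, the greedy-confidence rule in `select_position` is one example of a deterministic rule, and one token is unmasked per step.

```python
import itertools
import math

MASK = None   # hypothetical marker for a masked position
VOCAB = 3     # toy vocabulary size (assumption)
LENGTH = 3    # toy sequence length (assumption)


def toy_denoiser(state):
    """Stand-in for an MDM: per-position token distributions for a partial state."""
    # Summarise the visible tokens so predictions depend on what is unmasked.
    context = sum((j + 1) * (t + 1) for j, t in enumerate(state) if t is not MASK)
    probs = []
    for pos in range(len(state)):
        logits = [math.sin(1.3 * pos + 0.7 * v + 0.5 * context) for v in range(VOCAB)]
        z = sum(math.exp(l) for l in logits)
        probs.append([math.exp(l) / z for l in logits])
    return probs


def select_position(state, probs):
    """Deterministic rule (assumed): most confident masked position, lowest index on ties."""
    masked = [i for i, t in enumerate(state) if t is MASK]
    return max(masked, key=lambda i: (max(probs[i]), -i))


def exact_log_likelihood(tokens):
    """Follow the single unmasking path that yields `tokens` and sum its log-probs."""
    state = [MASK] * len(tokens)
    total = 0.0
    for _ in range(len(tokens)):
        probs = toy_denoiser(state)
        pos = select_position(state, probs)
        total += math.log(probs[pos][tokens[pos]])
        state[pos] = tokens[pos]  # reveal the true token, as the sampler would have
    return total


def perplexity(tokens):
    """Per-token perplexity from the exact log-likelihood."""
    return math.exp(-exact_log_likelihood(tokens) / len(tokens))


if __name__ == "__main__":
    # Probabilities of all sequences should sum to 1 if this is a proper likelihood.
    total = sum(
        math.exp(exact_log_likelihood(list(seq)))
        for seq in itertools.product(range(VOCAB), repeat=LENGTH)
    )
    print("total probability mass:", total)
    print("perplexity of [0, 1, 2]:", perplexity([0, 1, 2]))
```

The enumeration in the `__main__` block is only feasible for toy sizes. It is included as a sanity check that the quantity behaves like a normalized distribution, which is what makes the resulting perplexity comparable to an autoregressive one.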
