arxiv:2012.15525

BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining

Published on Dec 31, 2020

Abstract

AI-generated summary: BANG is a pretraining model that unifies autoregressive and non-autoregressive generation through a novel architecture supporting multiple generation paradigms simultaneously.

In this paper, we propose BANG, a new pretraining model to Bridge the gap between Autoregressive (AR) and Non-autoregressive (NAR) Generation. AR and NAR generation can be viewed uniformly in terms of the extent to which previously generated tokens may be attended to, and BANG bridges the two by designing a novel model structure for large-scale pretraining. The pretrained BANG model can simultaneously support AR, NAR, and semi-NAR generation to meet different requirements. Experiments on question generation (SQuAD 1.1), summarization (XSum), and dialogue generation (PersonaChat) show that BANG improves NAR and semi-NAR performance significantly while attaining performance comparable to strong AR pretrained models. Compared with strong semi-NAR baselines, BANG achieves absolute improvements of 14.01 and 5.24 in the overall scores of SQuAD 1.1 and XSum, respectively. In addition, compared with strong NAR baselines, BANG achieves absolute improvements of 10.73, 6.39, and 5.90 in the overall scores of SQuAD 1.1, XSum, and PersonaChat, respectively.
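The abstract's unifying view, that AR, semi-NAR, and NAR decoding differ only in how much of the previously generated target sequence each position may attend to, can be illustrated with a simple visibility mask. The sketch below is a hypothetical illustration of that spectrum, not BANG's actual pretraining structure; the generation_mask helper and its block_size parameter are names assumed here for exposition.

import numpy as np

def generation_mask(seq_len: int, block_size: int) -> np.ndarray:
    # Visibility mask over previously generated target tokens.
    # mask[i, j] == 1 means target position i may attend to position j.
    #
    #   block_size == 1           -> fully AR: each token sees all earlier tokens
    #   1 < block_size < seq_len  -> semi-NAR: tokens see earlier blocks only
    #   block_size == seq_len     -> fully NAR: no target-side history visible
    #
    # NOTE: hypothetical helper for illustration; not BANG's model structure.
    mask = np.zeros((seq_len, seq_len), dtype=int)
    for i in range(seq_len):
        # Each position attends only to positions in strictly earlier blocks.
        visible_upto = (i // block_size) * block_size
        mask[i, :visible_upto] = 1
    return mask

print(generation_mask(6, 1))  # AR: strictly lower-triangular mask
print(generation_mask(6, 3))  # semi-NAR: block-wise visibility
print(generation_mask(6, 6))  # NAR: all zeros

For a length-6 sequence, block_size=1 reproduces the strictly causal AR mask, block_size=3 yields a two-block semi-NAR mask, and block_size=6 yields the all-zero NAR mask in which positions are predicted in parallel with no target-side history.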
