arxiv:1909.10642

Data Ordering Patterns for Neural Machine Translation: An Empirical Study

Published on Sep 23, 2019

Authors:

Abstract

Empirical study demonstrates that pre-fixing training data ordering based on perplexity scores from a pre-trained model yields better neural machine translation performance than random shuffling.

AI-generated summary

Recent work shows that the ordering of the training data affects model performance in Neural Machine Translation. Several approaches involving dynamic data ordering and data sharding based on curriculum learning have been analysed for their performance gains and faster convergence. In this work we empirically study several ordering approaches for the training data based on different metrics and evaluate their impact on model performance. Results from our study show that pre-fixing the ordering of the training data based on perplexity scores from a pre-trained model performs best and outperforms the default approach of randomly shuffling the training data every epoch.
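To make the contrast in the abstract concrete, the following is a minimal sketch of a pre-fixed perplexity ordering next to per-epoch random shuffling. It is not the authors' implementation. The names `fixed_order_by_perplexity`, `epochs_fixed`, `epochs_shuffled`, and the `score` callback (standing in for a pre-trained model that returns per-token natural-log probabilities for a sentence pair) are all hypothetical. The abstract does not say whether low-perplexity or high-perplexity examples come first, so the sort direction is exposed as a parameter and the ascending default is an assumption.

```python
import math
import random
from typing import Callable, Iterator, List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity from natural-log token probabilities: exp(-mean log p)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def fixed_order_by_perplexity(
    pairs: Sequence[Pair],
    score: Callable[[Pair], Sequence[float]],  # hypothetical pre-trained scorer
    ascending: bool = True,  # assumption: direction is not given in the abstract
) -> List[Pair]:
    """Score every pair once and return the pairs sorted by perplexity."""
    ppl = [perplexity(score(p)) for p in pairs]
    order = sorted(range(len(pairs)), key=lambda i: ppl[i], reverse=not ascending)
    return [pairs[i] for i in order]


def epochs_fixed(ordered: Sequence[Pair], n_epochs: int) -> Iterator[List[Pair]]:
    """Pre-fixed ordering: every epoch sees the same sequence."""
    for _ in range(n_epochs):
        yield list(ordered)


def epochs_shuffled(
    pairs: Sequence[Pair], n_epochs: int, seed: int = 0
) -> Iterator[List[Pair]]:
    """Baseline: reshuffle the training data at the start of every epoch."""
    rng = random.Random(seed)
    for _ in range(n_epochs):
        epoch = list(pairs)
        rng.shuffle(epoch)
        yield epoch
```

In this sketch the scoring pass runs once before training, so the cost of the pre-trained model is paid a single time, while the shuffled baseline needs no scoring but produces a different sequence each epoch.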

