arxiv:2404.13895

Optimal Design for Human Preference Elicitation

Published on Apr 22, 2024

Authors:

Abstract

Efficient human preference elicitation methods are developed using optimal design principles applied to item lists, with algorithms designed for both absolute and ranking feedback models and demonstrated on question-answering tasks.

AI-generated summary

Learning of preference models from human feedback has been central to recent advances in artificial intelligence. Motivated by the cost of obtaining high-quality human annotations, we study efficient human preference elicitation for learning preference models. The key idea in our work is to generalize optimal designs, an approach to computing optimal information-gathering policies, to lists of items that represent potential questions with answers. The policy is a distribution over the lists and we elicit preferences from them proportionally to their probabilities. To show the generality of our ideas, we study both absolute and ranking feedback models on items in the list. We design efficient algorithms for both and analyze them. Finally, we demonstrate that our algorithms are practical by evaluating them on existing question-answering problems.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2404.13895 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2404.13895 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2404.13895 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.