Fix chat_template crash when assistant message omits the `content` key

#3 opened by qgallouedec (HF Staff)

Fix chat_template to handle assistant messages without a content key

⚠️ This template will start crashing for every tool-calling user as soon as the next transformers release ships.

The upstream PR https://github.com/huggingface/transformers/pull/45422 normalizes message inputs by stripping content=None before rendering (None and absent are semantically identical, and content=None is exactly what the OpenAI API returns for tool-call-only messages). That normalization is correct, but it exposes a latent bug in this template: the tool_calls branch reads message['content'] directly, which raises when the key is absent.

Concretely, this code path is hit by any tool-calling pipeline (OpenAI-compatible servers, agent frameworks, function-calling demos) that produces assistant messages with tool_calls and no textual content. Today most of them happen to pass content=None explicitly and get away with it. After the transformers release, all of them break.

Repro

Today (works):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("unsloth/DeepSeek-R1-BF16")
tok.apply_chat_template(
    [
        {"role": "user", "content": "What's the weather in Paris?"},
        {"role": "assistant", "content": None, "tool_calls": [{
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city":"Paris"}'},
        }]},
    ],
    tokenize=False,
)
# renders correctly

After https://github.com/huggingface/transformers/pull/45422 (same call, same input — transformers strips content=None before rendering, so the template sees an absent key and crashes):

UndefinedError: 'dict object' has no attribute 'content'

You can reproduce the post-release behavior today by simply omitting the content key.
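The failure mode can also be isolated without loading a model. The sketch below uses jinja2 directly (transformers renders chat templates through a Jinja2 environment; `StrictUndefined` stands in here for its raise-on-undefined behavior, which is an assumption of this sketch, not the exact transformers setup):

```python
from jinja2 import Environment, StrictUndefined
from jinja2.exceptions import UndefinedError

env = Environment(undefined=StrictUndefined)
# An assistant message with tool_calls and no 'content' key at all.
message = {"role": "assistant", "tool_calls": []}

# Buggy pattern: direct subscript raises once the key is absent.
try:
    env.from_string("{{ message['content'] }}").render(message=message)
except UndefinedError as e:
    print(e)  # 'dict object' has no attribute 'content'

# Fixed pattern: .get() yields None for an absent key, so the 'is none'
# test behaves the same as with an explicit content=None.
print(env.from_string("{{ message.get('content') is none }}").render(message=message))
```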

The fix

A minimal change: `message['content'] is none` becomes `message.get('content') is none`. `.get()` returns None whether the key is absent or explicitly set to None, so both cases are handled identically.
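The equivalence is plain Python dict semantics (an illustrative check, not the template itself):

```python
# Absent key vs. explicit None: .get() treats them identically,
# while direct subscript only works when the key is present.
msg_absent = {"role": "assistant", "tool_calls": []}
msg_none = {"role": "assistant", "content": None, "tool_calls": []}

assert msg_absent.get("content") is None   # absent key -> None
assert msg_none.get("content") is None     # explicit None -> None
assert msg_none["content"] is None         # subscript works here...
# ...but msg_absent["content"] would raise KeyError.
```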

Verified against a 14-case regression suite (single-turn, multi-turn, tool flows with/without final answers, multi-system, </think> reasoning, unicode, empty content): all cases either render bit-identically to the current template or, for the previously crashing case, render correctly. Zero regressions.


Disclaimer: this PR was opened as part of a scan for repos whose chat_template is derived from (or copies) the DeepSeek-R1 template, identified by the presence of the buggy substring `message['content'] is none`. The same one-line fix is proposed wherever that pattern appears verbatim. I do not maintain this model; please review before merging.
