
Update chat_template.jinja

#12

by jp1924 - opened Mar 30

base: refs/heads/main

←

from: refs/pr/12


+25

-24

Update chat_template.jinja e2b40b29

jp1924

Mar 30

No description provided.

jp1924

Mar 30

@bigshanedogg

jp1924

Mar 31

•

edited 6 days ago

For context, see this review thread: https://github.com/huggingface/transformers/pull/44314#discussion_r3008382827
I added this logic because video handling for HCX is broken on recent versions of transformers.

If you look at HCX's chat_template.jinja, it handles image/video differently from most models.
Models like Qwen2_5_vl or VideoLLaMA3 expect multimodal items in a normalized form such as {"type": "image"} or {"type": "video"}.
HCX, however, uses {"type": "image_url", "image_url": {"url": "video.mp4"}} and {"type": "image_url", "image_url": {"url": "image.png"}}, then branches by file extension inside the template.

In newer transformers, this block rewrites image_url into image:
https://github.com/huggingface/transformers/blob/2da00a3cec88fac160d481406e7961cf59472894/src/transformers/processing_utils.py#L1792-L1799

Because of that rewrite, HCX's image_url-based branch no longer triggers correctly. As a result, even when a video is provided, it gets rendered like plain text:

>>> from transformers import AutoProcessor
>>> processor = AutoProcessor.from_pretrained("naver-hyperclovax/HyperCLOVAX-SEED-Think-32B", trust_remote_code=False)
You are using a model of type `vlm` to instantiate a model of type ``. This may be expected if you are loading a checkpoint that shares a subset of the architecture (e.g., loading a `sam2_video` checkpoint into `Sam2Model`), but is otherwise not supported and can yield errors. Please verify that the checkpoint is compatible with the model you are instantiating.
>>> video_messages = [
...     {
...         "role": "user",
...         "content": [
...             {"type": "image_url", "image_url": {"url": "https://huggingface.co/datasets/hf-internal-testing/sam2-fixtures/resolve/main/bedroom.mp4"}},
...             {"type": "text", "text": "What is shown in this video?"},
...         ],
...     }
... ]
>>> text = processor.apply_chat_template(video_messages, tokenize=False, add_generation_prompt=True)
>>> text
'<|im_start|>user\n\nWhat is shown in this video?<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n'
>>> video_messages
[{'role': 'user', 'content': [{'type': 'image', 'url': 'https://huggingface.co/datasets/hf-internal-testing/sam2-fixtures/resolve/main/bedroom.mp4'}, {'type': 'text', 'text': 'What is shown in this video?'}]}]

If it were working correctly, the rendered prompt would include the video MIME header and the video tokens (e.g. <|mime_start|>, <|video_aux_start|>, <|VIDEO_PAD|>) rather than the text-only prompt above.
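The failure mode can be reproduced without loading the model. The sketch below is only an illustration: `simulate_rewrite` mimics the before/after shape seen in the session above (it is not the library's actual code), `legacy_branch` is a hypothetical stand-in for the old extension-based template logic, and the `VIDEO_EXTS` set is an assumption, since the template's real extension list is not shown here.

```python
import os
from urllib.parse import urlparse

# Assumption: illustrative extension list, not taken from the real template.
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def simulate_rewrite(item):
    """Mimic the observed rewrite: image_url items become {"type": "image", "url": ...}."""
    if item.get("type") == "image_url":
        return {"type": "image", "url": item["image_url"]["url"]}
    return dict(item)


def legacy_branch(item):
    """Hypothetical old logic: only image_url items count as media,
    and video vs. image is decided by the file extension of the URL."""
    if item.get("type") == "text":
        return "text"
    if item.get("type") == "image_url":
        path = urlparse(item["image_url"]["url"]).path
        ext = os.path.splitext(path)[1].lower()
        return "video" if ext in VIDEO_EXTS else "image"
    return None  # unrecognized item: no media tokens would be emitted


def type_based_branch(item):
    """Standard detection: trust the "type" field of each content item."""
    kind = item.get("type")
    return kind if kind in {"text", "image", "video"} else None


legacy_item = {"type": "image_url", "image_url": {"url": "https://example.com/clip.mp4"}}
rewritten = simulate_rewrite(legacy_item)

print(legacy_branch(legacy_item))  # video
print(legacy_branch(rewritten))    # None
print(type_based_branch({"type": "video", "url": "https://example.com/clip.mp4"}))  # video
```

Once the item has been rewritten, the legacy branch no longer sees an `image_url` entry, so nothing marks it as a video. Declaring the type explicitly avoids depending on the URL at all.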

@bigshanedogg The current multimodal input contract in naver-hyperclovax/HyperCLOVAX-SEED-Think-32B is non-standard.
I opened a Hub discussion/PR to update chat_template.jinja so it aligns with current processor behavior:
https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-32B/discussions/12

@zucchini-nlp Until that template update is merged, we need to keep this compatibility code in this PR.
I'll follow up again once we have a decision on the Hub-side template PR.

bigshanedogg

12 days ago

Thank you for opening this PR! 🙏
I've reviewed the points discussed in huggingface/transformers#44314, and we'll incorporate the changes accordingly.
Just to be safe, we'll post a BC-break notice first and aim to apply the changes within a day or so afterward. I'll follow up here once the update is reflected.

Update chat_template: support standard type-based image/video detection 5a8f52b7

jp1924

4 days ago

•

edited 4 days ago

"""
Verify the updated HyperCLOVAX Vision V2 chat template.

Run:
    pip install transformers huggingface_hub
    python test_chat_template.py
"""

from huggingface_hub import hf_hub_download
from transformers import HyperCLOVAXVisionV2Processor

# Load processor and inject the updated template from the PR branch
processor = HyperCLOVAXVisionV2Processor.from_pretrained(
    "naver-hyperclovax/HyperCLOVAX-SEED-Think-32B"
)
template_path = hf_hub_download(
    repo_id="naver-hyperclovax/HyperCLOVAX-SEED-Think-32B",
    filename="chat_template.jinja",
    revision="refs/pr/12",
)
with open(template_path) as f:
    processor.tokenizer.chat_template = f.read()

apply = processor.tokenizer.apply_chat_template


def show(title, messages, **kwargs):
    print(f"\n{'─' * 60}\n[{title}]\n{'─' * 60}")
    print(apply(messages, tokenize=False, add_generation_prompt=True, **kwargs))


# 1. Text only
show("1. Text only", [
    {"role": "system", "content": "당신은 유능한 AI 어시스턴트입니다."},
    {"role": "user",   "content": "대한민국의 수도는 어디인가요?"},
])
# <|im_start|>system
# 당신은 유능한 AI 어시스턴트입니다.<|im_end|>
# <|im_start|>user
# 대한민국의 수도는 어디인가요?<|im_end|>
# <|im_start|>assistant
# <think>
#
# </think>

# 2. System as list — previously crashed, now supported
show("2. System as list (multimodal)", [
    {"role": "system", "content": [
        {"type": "text",  "text": "당신은 유능한 AI 어시스턴트입니다."},
    ]},
    {"role": "user", "content": "안녕하세요!"},
])
# <|im_start|>system
# 당신은 유능한 AI 어시스턴트입니다.<|im_end|>
# <|im_start|>user
# 안녕하세요!<|im_end|>
# <|im_start|>assistant
# <think>
#
# </think>

# 3. Image — "url" key
show("3. Image (url key)", [
    {"role": "system", "content": "당신은 유능한 AI 어시스턴트입니다."},
    {"role": "user",   "content": [
        {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"},
        {"type": "text",  "text": "이 이미지를 설명해 주세요."},
    ]},
])
# <|im_start|>system
# 당신은 유능한 AI 어시스턴트입니다.<|im_end|>
# <|im_start|>user
# <|mime_start|>{"id": "image_00", "type": "image/jpeg", "filename": "image.jpg"}<|mime_end|>
# <|image_start|><|IMAGE_PAD|><|image_end|>
# 이 이미지를 설명해 주세요.<|im_end|>
# <|im_start|>assistant
# <think>
#
# </think>

# 4. Image — "image" key (Qwen-style)
show("4. Image (image key)", [
    {"role": "system", "content": "당신은 유능한 AI 어시스턴트입니다."},
    {"role": "user",   "content": [
        {"type": "image", "image": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"},
        {"type": "text",  "text": "이 이미지를 설명해 주세요."},
    ]},
])
# <|im_start|>system
# 당신은 유능한 AI 어시스턴트입니다.<|im_end|>
# <|im_start|>user
# <|mime_start|>{"id": "image_00", "type": "image/jpeg", "filename": "image.jpg"}<|mime_end|>
# <|image_start|><|IMAGE_PAD|><|image_end|>
# 이 이미지를 설명해 주세요.<|im_end|>
# <|im_start|>assistant
# <think>
#
# </think>

# 5. Video
show("5. Video", [
    {"role": "system", "content": "당신은 유능한 AI 어시스턴트입니다."},
    {"role": "user",   "content": [
        {"type": "video", "url": "https://huggingface.co/datasets/hf-internal-testing/fixtures_videos/resolve/main/tennis.mp4"},
        {"type": "text",  "text": "이 비디오에서 무엇이 일어나고 있나요?"},
    ]},
])
# <|im_start|>system
# 당신은 유능한 AI 어시스턴트입니다.<|im_end|>
# <|im_start|>user
# <|mime_start|>{"id": "video_00", "type": "video/mp4", "filename": "video.mp4"}<|mime_end|>
# <|video_aux_start|>다음 중 video_duration은 비디오 길이 정보입니다. 참고하여 답변하세요. {"video_duration": "<|video_meta_duration|>"}<|video_aux_end|>
# <|video_start|><|VIDEO_PAD|><|video_end|>
# 이 비디오에서 무엇이 일어나고 있나요?<|im_end|>
# <|im_start|>assistant
# <think>
#
# </think>

# 6. Multi-turn
show("6. Multi-turn", [
    {"role": "system",    "content": "당신은 유능한 AI 어시스턴트입니다."},
    {"role": "user",      "content": [
        {"type": "image", "image": "https://example.com/cat.jpg"},
        {"type": "text",  "text": "이 이미지에서 무엇이 보이나요?"},
    ]},
    {"role": "assistant", "content": "고양이가 소파에 앉아 있는 모습이 보입니다."},
    {"role": "user",      "content": "귀엽네요! 고양이 품종이 무엇인가요?"},
])
# <|im_start|>system
# 당신은 유능한 AI 어시스턴트입니다.<|im_end|>
# <|im_start|>user
# <|mime_start|>{"id": "image_00", "type": "image/jpeg", "filename": "image.jpg"}<|mime_end|>
# <|image_start|><|IMAGE_PAD|><|image_end|>
# 이 이미지에서 무엇이 보이나요?<|im_end|>
# <|im_start|>assistant
# 고양이가 소파에 앉아 있는 모습이 보입니다.<|im_end|>
# <|im_start|>user
# 귀엽네요! 고양이 품종이 무엇인가요?<|im_end|>
# <|im_start|>assistant
# <think>
#
# </think>

# 7. Thinking mode
show("7. Thinking mode", [
    {"role": "user", "content": "1부터 10까지의 합을 구해 주세요."},
], thinking=True)
# <|im_start|>user
# 1부터 10까지의 합을 구해 주세요.<|im_end|>
# <|im_start|>assistant
# <think>

# 8. Tool calling
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "특정 위치의 현재 날씨를 가져옵니다.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string", "description": "도시 이름"}},
            "required": ["location"],
        },
    },
}]

show("8. Tool calling", [
    {"role": "system", "content": "당신은 유능한 AI 어시스턴트입니다."},
    {"role": "user",   "content": "서울의 현재 날씨를 알려주세요."},
], tools=tools)
# <|im_start|>system
# 당신은 유능한 AI 어시스턴트입니다.
#
# # Tools
# ...
# <tools>
# {"type": "function", "function": {"name": "get_weather", ...}}
# </tools>
# ...
# </tool_call><|im_end|>
# <|im_start|>user
# 서울의 현재 날씨를 알려주세요.<|im_end|>
# <|im_start|>assistant
# <think>
#
# </think>

# 9. Tool call → tool response
show("9. Tool call → tool response", [
    {"role": "user", "content": "서울의 현재 날씨를 알려주세요."},
    {"role": "assistant", "content": "", "tool_calls": [
        {"function": {"name": "get_weather", "arguments": {"location": "Seoul"}}}
    ]},
    {"role": "tool", "name": "get_weather", "content": '{"temperature": 22, "condition": "맑음"}'},
], tools=tools)
# <|im_start|>user
# 서울의 현재 날씨를 알려주세요.<|im_end|>
# <|im_start|>assistant
# <tool_call>get_weather
# <arg_key>location</arg_key>
# <arg_value>Seoul</arg_value>
# </tool_call><|im_end|>
# <|im_start|>tool
# <tool_response>get_weather
# {"temperature": 22, "condition": "맑음"}
# </tool_response><|im_end|>
# <|im_start|>assistant
# <think>
#
# </think>

@bigshanedogg
The revised chat template I've uploaded handles all of the cases above. Could you take a look?
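The script above prints each rendered prompt for visual comparison. As a rough sketch of how those checks could be automated, the hypothetical helper `missing_markers` below reports which expected special tokens are absent from a rendered string. The marker lists are taken from the expected outputs in the script. The `fixed` sample is abbreviated (the auxiliary text is elided), so treat it as illustrative rather than as exact template output.

```python
# Special tokens expected in a rendered prompt, per media kind
# (collected from the expected outputs shown in the script above).
EXPECTED_MARKERS = {
    "image": ["<|mime_start|>", "<|image_start|>", "<|IMAGE_PAD|>", "<|image_end|>"],
    "video": [
        "<|mime_start|>",
        "<|video_aux_start|>",
        "<|video_start|>",
        "<|VIDEO_PAD|>",
        "<|video_end|>",
    ],
}


def missing_markers(rendered, kind):
    """Return the expected markers for `kind` that do not occur in `rendered`."""
    return [marker for marker in EXPECTED_MARKERS[kind] if marker not in rendered]


# Text-only prompt produced by the broken path (copied from the earlier session).
broken = (
    "<|im_start|>user\n\nWhat is shown in this video?<|im_end|>\n"
    "<|im_start|>assistant\n<think>\n\n</think>\n\n"
)

# Abbreviated sample of a correctly rendered video turn; the aux text is elided.
fixed = (
    "<|im_start|>user\n"
    '<|mime_start|>{"id": "video_00", "type": "video/mp4", "filename": "video.mp4"}<|mime_end|>\n'
    "<|video_aux_start|>...<|video_aux_end|>\n"
    "<|video_start|><|VIDEO_PAD|><|video_end|>\n"
    "What is shown in this video?<|im_end|>\n"
)

print(missing_markers(broken, "video"))  # all five video markers are reported missing
print(missing_markers(fixed, "video"))   # []
```

In a real test, the `rendered` argument would be the string returned by `apply_chat_template(..., tokenize=False)` for each case.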


Ready to merge

This branch is ready to get merged automatically.
