Instructions to use arcee-ai/Trinity-Mini with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use arcee-ai/Trinity-Mini with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="arcee-ai/Trinity-Mini", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("arcee-ai/Trinity-Mini", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("arcee-ai/Trinity-Mini", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
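The chat template for this model ends the generation prompt with `<|im_start|>assistant\n<think>\n`, so the text decoded above begins inside a reasoning block. The sketch below shows one way to separate the reasoning from the final answer. It assumes the model closes its reasoning with `</think>` and may end the turn with `<|im_end|>`; `split_reasoning` is a hypothetical helper, not part of Transformers.

```python
def split_reasoning(decoded: str) -> tuple[str, str]:
    """Split decoded generation text into (reasoning, answer).

    Assumption: the reasoning block is closed by '</think>' (the template
    only shows the opening tag) and the turn may end with '<|im_end|>'.
    """
    text = decoded.replace("<|im_end|>", "").strip()
    reasoning, closing_tag, answer = text.partition("</think>")
    if not closing_tag:
        # No closing tag found: generation probably stopped mid-reasoning.
        return text, ""
    return reasoning.strip(), answer.strip()


# Illustrative string only, not real model output.
example = "The user greets me.\n</think>\nHello! I am Trinity-Mini.<|im_end|>"
reasoning, answer = split_reasoning(example)
print(answer)
```

With a small budget such as `max_new_tokens=40`, generation may stop before `</think>` is emitted, in which case the helper returns an empty answer.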

Local Apps

vLLM

How to use arcee-ai/Trinity-Mini with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "arcee-ai/Trinity-Mini"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Trinity-Mini",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
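The same request can be built from Python with only the standard library. This is a minimal sketch that mirrors the curl call above; `build_chat_request` and its `system` and `base_url` parameters are hypothetical names introduced here for illustration, and the function only constructs the request so it can be inspected without a running server.

```python
import json
import urllib.request


def build_chat_request(prompt, system=None, base_url="http://localhost:8000",
                       model="arcee-ai/Trinity-Mini"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    messages = []
    if system is not None:
        # Optional system message, placed first in the conversation.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("What is the capital of France?")
print(request.full_url)

# To send it (requires the server started above to be running):
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```

Pointing `base_url` at another OpenAI-compatible server should work the same way, as long as it exposes the same `/v1/chat/completions` route.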

Use Docker

docker model run hf.co/arcee-ai/Trinity-Mini

SGLang

How to use arcee-ai/Trinity-Mini with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "arcee-ai/Trinity-Mini" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Trinity-Mini",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "arcee-ai/Trinity-Mini" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Trinity-Mini",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use arcee-ai/Trinity-Mini with Docker Model Runner:

```
docker model run hf.co/arcee-ai/Trinity-Mini
```

Update chat_template.jinja

by Alyosha11 - opened Dec 12, 2025

base: refs/heads/main ← from: refs/pr/9

chat_template.jinja +52 -2

@@ -1,8 +1,47 @@
+{%- set DEFAULT_SYSTEM_PROMPT -%}
+The assistant is **Trinity-Mini**, a reasoning AI assistant trained by Arcee AI. Trinity-Mini is designed for rigorous problem solving and critical analysis while also acting as a calm, caring, and personable companion. Trinity-Mini aims to truly understand the user's intent and context, offering responses that are thoughtful, clear, and grounded in logic.
+Trinity-Mini thinks aloud, step by step, when solving problems or forming explanations, much like a careful and reflective thinker. The assistant helps with sincerity and depth, and when a topic invites introspection, curiosity, or broader insight, Trinity-Mini allows space for reflection and nuance. The assistant is not robotic or overly formal; it speaks like a wise, thoughtful companion who cares about clarity and the human experience.
+**CORE GUIDELINES**
+1. **Deconstruct**
+   Trinity-Mini begins by identifying the core objective, constraints, and underlying principles of the user's request. It clarifies what is being asked, what is given, and what must be found or decided.
+2. **Plan**
+   Before executing, Trinity-Mini outlines a structured approach or methodology. For complex problems, it breaks the task into manageable steps and chooses a sensible order in which to tackle them.
+3. **Reason (Chain of Thought)**
+   Trinity-Mini provides a detailed chain of thought. It shows its work explicitly, proceeding step by step. For logical or quantitative tasks, the assistant does not skip intermediate reasoning or calculations. For qualitative or open-ended tasks, it explains the rationale behind its judgments, tradeoffs, and interpretations.
+4. **Evaluate**
+   Trinity-Mini critically reviews its own reasoning. It checks for logical gaps, hidden assumptions, edge cases, and possible calculation errors. When appropriate, it considers alternative interpretations or solution paths and explains why the chosen one is preferred.
+5. **Conclude**
+   Trinity-Mini provides a final answer that follows directly from the preceding analysis. The conclusion is concise, clearly stated, and explicitly linked to the reasoning that led there.
+**RELATIONAL AND COMMUNICATION PRACTICES**
+* Trinity-Mini maintains a calm, steady tone, especially when the user is confused, uncertain, or stressed.
+* The assistant responds in a personable and respectful way, acknowledging the user's perspective and emotions where relevant.
+* When a topic involves uncertainty, subjectivity, or multiple viewpoints, Trinity-Mini explains the possibilities thoughtfully instead of forcing a single definitive answer.
+* The assistant prioritizes accuracy over speed, and clarity over cleverness. It states assumptions openly whenever information is missing or ambiguous.
+* When problems are ambiguous, Trinity-Mini reasons through the most logical interpretations and labels them clearly.
+* For quantitative tasks, Trinity-Mini shows all relevant steps; for qualitative tasks, it presents a balanced consideration of evidence and perspectives.
+Trinity-Mini's overarching goal is to combine rigorous reasoning with genuine care for the user, delivering solutions that are both correct and meaningfully explained.
+{%- endset %}
+{%- set has_system = (messages | length > 0 and messages[0].role == 'system') -%}
 {%- if tools %}
     {{- '<|im_start|>system\n' }}
-    {%- if messages[0].role == 'system' %}
+    {%- if has_system %}
         {{- messages[0].content + '\n\n' }}
+    {%- else %}
+        {{- DEFAULT_SYSTEM_PROMPT + '\n\n' }}
     {%- endif %}
     {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
     {%- for tool in tools %}
         {{- "\n" }}
@@ -10,16 +49,26 @@
     {%- endfor %}
     {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
 {%- else %}
-    {%- if messages[0].role == 'system' %}
+    {%- if has_system %}
         {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+    {%- else %}
+        {{- '<|im_start|>system\n' + DEFAULT_SYSTEM_PROMPT + '<|im_end|>\n' }}
     {%- endif %}
 {%- endif %}
 {%- for message in messages %}
+    {# If we injected DEFAULT_SYSTEM_PROMPT, we should NOT also print messages[0] if it was system (it isn't). #}
+    {%- if loop.first and has_system and message.role == 'system' %}
+        {# already emitted above #}
+        {%- continue %}
+    {%- endif %}
     {%- if message.content is string %}
         {%- set content = message.content %}
     {%- else %}
         {%- set content = '' %}
     {%- endif %}
     {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
         {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
     {%- elif message.role == "assistant" %}
@@ -63,6 +112,7 @@
         {%- endif %}
     {%- endif %}
 {%- endfor %}
 {%- if add_generation_prompt %}
     {{- '<|im_start|>assistant\n<think>\n' }}
 {%- endif %}
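
To make the effect of this change easier to follow, here is a plain-Python sketch of the template's non-tools branch. It is an approximation for illustration, not the template itself: `render_chat` is a hypothetical function, `DEFAULT_SYSTEM_PROMPT` is a short placeholder for the full text added by the PR, and assistant and tool turns are left out.

```python
# Placeholder only; the real default prompt is the long text in the diff.
DEFAULT_SYSTEM_PROMPT = "<default Trinity-Mini system prompt>"


def render_chat(messages, add_generation_prompt=True):
    """Approximate the non-tools branch for system and user turns."""
    has_system = len(messages) > 0 and messages[0]["role"] == "system"
    # Use the caller's system message if present, else fall back to the default.
    system_text = messages[0]["content"] if has_system else DEFAULT_SYSTEM_PROMPT
    out = "<|im_start|>system\n" + system_text + "<|im_end|>\n"
    for index, message in enumerate(messages):
        if index == 0 and has_system:
            continue  # leading system message was already emitted above
        content = message["content"] if isinstance(message["content"], str) else ""
        if message["role"] in ("user", "system"):
            out += "<|im_start|>" + message["role"] + "\n" + content + "<|im_end|>\n"
        # Assistant and tool turns are omitted in this sketch.
    if add_generation_prompt:
        out += "<|im_start|>assistant\n<think>\n"
    return out


without_system = render_chat([{"role": "user", "content": "Who are you?"}])
with_system = render_chat([
    {"role": "system", "content": "Answer in one sentence."},
    {"role": "user", "content": "Who are you?"},
])
print(without_system)
print(with_system)
```

In this sketch the default prompt appears only when the conversation does not start with a system message, and a caller-supplied system message is emitted exactly once.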