AgentIR-4B is a retriever specialized for Deep Research agents. Unlike conventional retrievers, which process queries in isolation with no awareness of the agent issuing them, AgentIR explicitly incorporates the agent's reasoning trace: it jointly embeds the trace with the query, leveraging the rich intent and contextual information expressed in the agent's reasoning.

When employed for end-to-end Deep Research on BrowseComp-Plus, AgentIR brings substantial effectiveness and efficiency gains, improving agent accuracy while reducing the number of problem-solving iterations.

Quick Usage

For a quick start with only minimal dependencies (torch and transformers):

import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "Tevatron/AgentIR-4B"
PREFIX = "Instruct: Given a user's reasoning followed by a web search query, retrieve relevant passages that answer the query while incorporating the user's reasoning\nQuery:"
QUERY = """Reasoning: Search results show some relevant info about music and Grammy. We need a composer who won a Grammy, could be from Sweden/Finland/Austria (joined 1995)? The person is known for a certain creation that is a subgenre known for euphoric finale. Which subgenre has a euphoric finale? "Progressive house"? There's a structure: Build-up, breakdown, climax, drop, euphoria. They started creating this piece in a small studio's backroom.

Query: "backroom" "studio" "early 2010s" "euphoric"
"""
DOCS = [
    "35+ Studios With Upcoming Games to Watch: Turtle Rock Studios\n\nMaking its name on the classic Left 4 Dead series of games, Turtle Rock Studios is working on an all-new co-op game called Back 4 Blood that sees you fighting through a zombie apocalypse. Sound familiar? Announced in early 2019 and being published",
    "name: Otto Knows\nimage_upright: 1.25\nbirth_name: Otto Jettman\nbirth_date: 6 05 1989\nbirth_place: Stockholm, Sweden\ngenre: Electro house, house, progressive house\noccupation: DJ, music producer, remixer\n\nOtto Jettman (born 6 May 1989), better known by his stage name Otto Knows is a Swedish DJ, producer and remixer who has had a number of hits in Sweden, Belgium and the Netherlands"
]

def embed(texts, model, tokenizer, device, is_query=False):
    # Queries get the instruction prefix; documents are embedded as-is.
    batch = tokenizer(
        [PREFIX + t if is_query else t for t in texts],
        padding=True,
        truncation=True,
        max_length=8192,
        return_tensors="pt",
    )
    batch = {k: v.to(device) for k, v in batch.items()}
    with torch.no_grad():
        hidden = model(**batch, return_dict=True).last_hidden_state
        # Last-token pooling: with left padding, position -1 is always the
        # final content token of every sequence in the batch.
        reps = hidden[:, -1]
        return torch.nn.functional.normalize(reps, p=2, dim=-1).cpu()

model = AutoModel.from_pretrained(MODEL, torch_dtype=torch.float16, device_map="auto")
device = model.device
# Left padding is required so that last-token pooling (hidden[:, -1]) works.
tokenizer = AutoTokenizer.from_pretrained(MODEL, padding_side="left")

q = embed([QUERY], model, tokenizer, device, is_query=True)[0]
docs = embed(DOCS, model, tokenizer, device)
# Since embeddings are L2-normalized, the dot product is the cosine similarity.
for doc, vec in zip(DOCS, docs):
    print(f"{torch.dot(q, vec).item():.6f}  {doc}")
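The snippet above scores two documents one dot product at a time. For retrieval over a larger corpus, scoring reduces to a single matrix product over the stacked document embeddings followed by a top-k selection. A minimal sketch (pure torch, with toy pre-normalized vectors standing in for the outputs of `embed`; the `top_k` helper is illustrative, not part of the model's API):

```python
import torch

def top_k(query_vec, doc_vecs, k=2):
    # For L2-normalized vectors, cosine similarity is just a dot product,
    # so scoring the whole corpus is one matrix-vector product.
    scores = doc_vecs @ query_vec
    vals, idx = torch.topk(scores, k=min(k, doc_vecs.size(0)))
    return list(zip(idx.tolist(), vals.tolist()))

# Toy stand-ins for embed() output (already L2-normalized).
q = torch.nn.functional.normalize(torch.tensor([1.0, 0.0, 0.0]), dim=-1)
d = torch.nn.functional.normalize(
    torch.tensor([[0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.7, 0.7, 0.0]]), dim=-1
)
for i, s in top_k(q, d, k=2):
    print(f"doc {i}: {s:.4f}")
```

In practice the document embeddings would be computed once with `embed(DOCS, ...)` and cached, so each agent query costs only one forward pass plus this matrix product.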

To run end-to-end Deep Research with AgentIR-4B, please see https://github.com/texttron/AgentIR/tree/main.

Citation

@article{chen2026AgentIR,
  title={AgentIR: Reasoning-Aware Retrieval for Deep Research Agents},
  author={Zijian Chen and Xueguang Ma and Shengyao Zhuang and Jimmy Lin and Akari Asai and Victor Zhong},
  year={2026},
  journal={arXiv preprint arXiv:2603.04384}
}
Model details: 4B parameters, F16 safetensors; finetuned from the base model Qwen/Qwen3-4B-Base.