| --- |
| license: mit |
| datasets: |
| - Jhcircle/KardiaBench |
| language: |
| - en |
| base_model: |
| - Qwen/Qwen2.5-7B-Instruct |
| pipeline_tag: question-answering |
| tags: |
| - agent |
| --- |
| |
| <h1>Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning</h1> |
|
|
| _(Accepted by WWW 2026)_ |
|
|
| [arXiv:2512.01282](https://arxiv.org/abs/2512.01282)
|  |
|
|
| ✨ Like Kardia-R1? Give us a ⭐ Star on GitHub! Your support keeps us going! [**JhCircle/Kardia-R1**](https://github.com/JhCircle/Kardia-R1) |
| --- |
|
|
| ## 🎯 Overview |
|
|
| **Kardia-R1** is a specialized 7B-parameter large language model fine-tuned from [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) for **emotional support dialogue**. Unlike standard conversational AI, Kardia-R1 employs a novel **Rubric-as-Judge Reinforcement Learning (Rubric-ERL)** framework that explicitly trains the model to: |
|
|
| 1. **Understand** user emotions through structured reasoning |
| 2. **Empathize** using validated psychological principles (affective/cognitive empathy, reflective listening) |
| 3. **Respond** with concise, personalized emotional support |
|
|
| The model generates structured outputs with four distinct reasoning stages: Understanding → Reasoning → Emotion Recognition → Response Generation. |
|
|
|
|
| ## 📝 Citation |
| ```bibtex
| @article{yuan2025kardia, |
| title={Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning}, |
| author={Yuan, Jiahao and Cui, Zhiqing and Wang, Hanqing and Gao, Yuansheng and Zhou, Yucheng and Naseem, Usman}, |
| journal={arXiv preprint arXiv:2512.01282}, |
| year={2025} |
| } |
| ``` |
|
|
| ## 🧠 Model Architecture |
|
|
| - **Base Model**: Qwen2.5-7B-Instruct |
| - **Fine-tuning Method**: Rubric-as-Judge Reinforcement Learning (Rubric-ERL) |
| - **Context Window**: 32K tokens |
| - **Special Tokens**: |
|   - `<|understanding_begin|>` / `<|understanding_end|>`
|   - `<|reasoning_begin|>` / `<|reasoning_end|>`
|   - `<|emotion_begin|>` / `<|emotion_end|>`
|   - `<|response_begin|>` / `<|response_end|>`
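The paired markers listed above make a generation easy to split into its four sections. The following is a minimal sketch, assuming the model emits each section as `<|name_begin|>...<|name_end|>` as the prompt template requests. `parse_kardia_output` is a hypothetical helper written for illustration, not part of the released code, and the sample string is made up.

```python
import re
from typing import Dict, Optional

# Section names taken from the special tokens listed above.
SECTIONS = ("understanding", "reasoning", "emotion", "response")


def parse_kardia_output(text: str) -> Dict[str, Optional[str]]:
    """Split a generation into its tagged sections (hypothetical helper).

    A section whose begin/end pair is missing maps to None.
    """
    parsed = {}
    for name in SECTIONS:
        begin = re.escape(f"<|{name}_begin|>")
        end = re.escape(f"<|{name}_end|>")
        # Non-greedy match so each section stops at its own end marker.
        match = re.search(f"{begin}(.*?){end}", text, flags=re.DOTALL)
        parsed[name] = match.group(1).strip() if match else None
    return parsed


# Made-up generation, used only to exercise the parser.
sample = (
    "<|understanding_begin|>The user feels numb and overwhelmed.<|understanding_end|>\n"
    "<|reasoning_begin|>Validate the numbness before offering comfort.<|reasoning_end|>\n"
    "<|emotion_begin|>sad<|emotion_end|>\n"
    "<|response_begin|>That numbness makes sense. I'm here with you.<|response_end|>"
)

sections = parse_kardia_output(sample)
print(sections["emotion"])
print(sections["response"])
```

Returning `None` for a missing section, rather than raising, lets the caller decide whether a partially formatted generation is still usable.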
|
|
|
|
| ## 🚀 Usage |
|
|
| ### Installation |
|
|
| ```bash |
| pip install transformers torch accelerate  # accelerate is needed for device_map="auto"
| # or
| pip install ms-swift |
| ``` |
|
|
| ### Quick Start with Transformers |
|
|
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| import torch |
| |
| model_id = "Jhcircle/Kardia-R1" |
| |
| # Load model and tokenizer |
| tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True) |
| model = AutoModelForCausalLM.from_pretrained(
|     model_id,
|     torch_dtype=torch.bfloat16,
|     device_map="auto",
|     trust_remote_code=True,
| )
| |
| # Prepare system prompt |
| system_prompt = """You are an emotional dialogue assistant and a psychological expert. Your task is to respond to the User's message in a roleplay scenario, taking into account the User's personality, emotional state, and situation. ### Role and Objective ### - Act as both a supportive therapist and an empathetic conversational partner. - Prioritize understanding the User’s feelings and providing emotional validation. - Keep the conversation natural, emotionally resonant, and aligned with the User's profile. ### Response Requirements ### - Structure your reply in 4 sections: <|understanding_begin|>: Summarize the User's message, intent, and key emotional cues. <|reasoning_begin|>: Explain your empathic rationale, considering psychological principles such as affective and cognitive empathy, emotion validation, and reflective listening. <|emotion_begin|>: Accurately reflect the User's current emotional state. <|response_begin|>: Provide a concise, natural, emotionally supportive reply (≤30 tokens), coherent and aligned with the User’s personality. - Avoid asking unnecessary questions; focus on reflecting, validating, and supporting the User. - Ensure each section is clear, concise, and well-structured. |
| ### User Profile |
| {{profile}} |
| ### Situation ### |
| {{situation}} |
| ### <|understanding_begin|>{{Concise summary of user's message, intent, and key emotional cues.}}<|understanding_end|> |
| <|reasoning_begin|>{{Brief empathic rationale using perspective-taking and emotion validation.}}<|reasoning_end|> |
| <|emotion_begin|>{{Select the most fitting emotion from: sentimental, afraid, proud, faithful, terrified, joyful, angry, sad, jealous, grateful, prepared, embarrassed, excited, annoyed, lonely, ashamed, guilty, surprised, nostalgic, confident, furious, disappointed, caring, trusting, disgusted, anticipating, anxious, hopeful, content, impressed, apprehensive, devastated}}<|emotion_end|> |
| <|response_begin|>{{Provide a concise, supportive reply (≤30 tokens) aligned with the user's personality and emotional state.}}<|response_end|> |
| """ |
| |
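| # Build the conversation. Note: {{profile}} and {{situation}} in system_prompt are
| # placeholders; replace them with the user's profile and situation before generating.
| messages = [
|     {"role": "system", "content": system_prompt},
|     {"role": "user", "content": "I don't know how to process this. Everything feels numb."},
| ]
| 
| # Tokenize with the chat template; return_dict=True yields input_ids plus attention_mask
| inputs = tokenizer.apply_chat_template(
|     messages,
|     add_generation_prompt=True,
|     return_dict=True,
|     return_tensors="pt",
| ).to(model.device)
| 
| # Greedy decoding: temperature has no effect when do_sample=False, so it is omitted
| outputs = model.generate(
|     **inputs,
|     max_new_tokens=512,
|     do_sample=False,
| )
| 
| # Decode only the newly generated tokens, keeping the section markers visible
| response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False)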
| print(response) |
| ``` |
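The system prompt in the quick start contains `{{profile}}` and `{{situation}}` placeholders. Here is a small sketch of filling them before inference, assuming plain text substitution is what the template expects. `fill_system_prompt` and the example values are hypothetical.

```python
def fill_system_prompt(template: str, profile: str, situation: str) -> str:
    """Substitute the two context placeholders (hypothetical helper).

    str.replace is used instead of str.format because str.format treats
    doubled braces as escaped literals, not as fields, and the template
    also contains brace-wrapped format hints that should stay untouched.
    """
    return (
        template
        .replace("{{profile}}", profile)
        .replace("{{situation}}", situation)
    )


# Shortened stand-in for the full system prompt shown above.
demo_template = "### User Profile\n{{profile}}\n### Situation ###\n{{situation}}\n"

# Made-up example values.
filled = fill_system_prompt(
    demo_template,
    profile="A reserved graduate student who values directness.",
    situation="Recently lost a close family member.",
)
print(filled)
```

With the real prompt, pass `fill_system_prompt(system_prompt, ...)` as the system message content instead of the raw template.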
|
|
| ### Quick Start with ms-swift
|
|
| ```python |
| import os |
| os.environ['CUDA_VISIBLE_DEVICES'] = '0' |
| |
| from swift.llm import PtEngine, RequestConfig, InferRequest, get_model_tokenizer, get_template |
| |
| model_path = "Jhcircle/Kardia-R1" |
| model_type = "qwen2_5" |
| |
| # Initialize model |
| model, tokenizer = get_model_tokenizer(model_path, model_type=model_type) |
| template = get_template(model.model_meta.template, tokenizer, default_system=None) |
| |
| # Create inference engine |
| engine = PtEngine.from_model_template(model, template, max_batch_size=2) |
| request_config = RequestConfig(max_tokens=512, temperature=0.0) |
| |
| # Prepare system prompt |
| system_prompt = """You are an emotional dialogue assistant and a psychological expert. Your task is to respond to the User's message in a roleplay scenario, taking into account the User's personality, emotional state, and situation. ### Role and Objective ### - Act as both a supportive therapist and an empathetic conversational partner. - Prioritize understanding the User’s feelings and providing emotional validation. - Keep the conversation natural, emotionally resonant, and aligned with the User's profile. ### Response Requirements ### - Structure your reply in 4 sections: <|understanding_begin|>: Summarize the User's message, intent, and key emotional cues. <|reasoning_begin|>: Explain your empathic rationale, considering psychological principles such as affective and cognitive empathy, emotion validation, and reflective listening. <|emotion_begin|>: Accurately reflect the User's current emotional state. <|response_begin|>: Provide a concise, natural, emotionally supportive reply (≤30 tokens), coherent and aligned with the User’s personality. - Avoid asking unnecessary questions; focus on reflecting, validating, and supporting the User. - Ensure each section is clear, concise, and well-structured. |
| ### User Profile |
| {{profile}} |
| ### Situation ### |
| {{situation}} |
| ### <|understanding_begin|>{{Concise summary of user's message, intent, and key emotional cues.}}<|understanding_end|> |
| <|reasoning_begin|>{{Brief empathic rationale using perspective-taking and emotion validation.}}<|reasoning_end|> |
| <|emotion_begin|>{{Select the most fitting emotion from: sentimental, afraid, proud, faithful, terrified, joyful, angry, sad, jealous, grateful, prepared, embarrassed, excited, annoyed, lonely, ashamed, guilty, surprised, nostalgic, confident, furious, disappointed, caring, trusting, disgusted, anticipating, anxious, hopeful, content, impressed, apprehensive, devastated}}<|emotion_end|> |
| <|response_begin|>{{Provide a concise, supportive reply (≤30 tokens) aligned with the user's personality and emotional state.}}<|response_end|> |
| """ |
| |
| infer_requests = [
|     InferRequest(messages=[
|         {"role": "system", "content": system_prompt},
|         {"role": "user", "content": "I feel like I'm drowning. No matter how much I study, it's never enough."},
|     ]),
| ]
| |
| # Run inference |
| resp_list = engine.infer(infer_requests, request_config) |
| print(f'Response: {resp_list[0].choices[0].message.content}') |
| ``` |
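Both quick starts return raw text. Because the prompt restricts the emotion section to a fixed list of 32 labels and caps the reply at 30 tokens, a light post-check can flag generations that drift from the format. The sketch below is an assumption-laden illustration: `check_sections` is a hypothetical helper, and it counts whitespace-separated words as a rough proxy for tokenizer tokens.

```python
from typing import List

# The 32 labels copied from the emotion list in the system prompt.
EMOTION_LABELS = frozenset(
    """sentimental afraid proud faithful terrified joyful angry sad jealous
    grateful prepared embarrassed excited annoyed lonely ashamed guilty
    surprised nostalgic confident furious disappointed caring trusting
    disgusted anticipating anxious hopeful content impressed apprehensive
    devastated""".split()
)


def check_sections(emotion: str, response: str, max_words: int = 30) -> List[str]:
    """Return a list of problems; an empty list means the output passed."""
    problems = []
    if emotion.strip().lower() not in EMOTION_LABELS:
        problems.append(f"unknown emotion label: {emotion!r}")
    # Whitespace words only approximate the prompt's 30-token budget.
    word_count = len(response.split())
    if word_count > max_words:
        problems.append(f"response too long: {word_count} words")
    return problems


print(check_sections("sad", "That sounds exhausting. You are doing more than enough."))
print(check_sections("melancholy", "word " * 40))
```

For a stricter length check, the word count could be replaced with the length of `tokenizer(response).input_ids` from the loaded tokenizer.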
|
|
|
|
| ## 🏋️ Training Details |
|
|
| - **Dataset**: [KardiaBench](https://huggingface.co/datasets/Jhcircle/KardiaBench) - A curated dataset of emotional support dialogues with rubric-based annotations |
| - **Training Method**: Rubric-as-Judge RL (Rubric-ERL) |
|   - Uses structured evaluation rubrics as reward signals
|   - Optimizes for both empathy and response quality
|   - Incorporates psychological safety constraints
| - **Compute**: Training details are available in [our paper](https://arxiv.org/abs/2512.01282)
| - **License**: MIT |
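To make the idea of "rubrics as reward signals" concrete, here is a toy sketch of collapsing per-criterion judge scores into one scalar reward. This is not the paper's reward function: the criterion names, the weights, and the weighted-average rule are all assumptions chosen for illustration, and `rubric_reward` is a hypothetical helper.

```python
from typing import Dict

# Hypothetical rubric (criterion -> weight); not taken from the paper.
RUBRIC_WEIGHTS = {"empathy": 0.5, "response_quality": 0.3, "safety": 0.2}


def rubric_reward(scores: Dict[str, float],
                  weights: Dict[str, float] = RUBRIC_WEIGHTS) -> float:
    """Weighted average of judge scores, each clipped to [0, 1]."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing rubric scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted = sum(
        weights[name] * min(max(scores[name], 0.0), 1.0) for name in weights
    )
    return weighted / total_weight


# Made-up judge scores for a single generation.
reward = rubric_reward({"empathy": 0.8, "response_quality": 0.6, "safety": 1.0})
print(round(reward, 3))
```

In an RL loop, a scalar like this would be computed per sampled response and passed to the policy-gradient update; see the paper for the rubrics and aggregation the authors actually use.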
|
|
|
|
|
|
| ## ⚠️ Limitations & Safety |
|
|
| **Important**: Kardia-R1 is designed for **emotional support and companionship**, not clinical therapy. |
|
|
| - **Not a Replacement for Professional Help**: This model cannot diagnose mental health conditions or provide clinical treatment. Users experiencing severe mental health crises should contact professional services. |
| - **Crisis Detection**: The model includes basic crisis detection patterns but may not reliably identify all emergency situations. |
| - **Bias**: As with all LLMs, outputs may reflect biases present in training data. |
| - **Consistency**: Emotional support quality may vary across different contexts and user inputs. |
|
|
| --- |
|
|
|
|
| <div align="center"> |
|
|
| **⭐ Star us on [GitHub](https://github.com/JhCircle/Kardia-R1) if you find this work helpful!** |
|
|
| </div> |
|
|