---
language:
- ar
license: apache-2.0
base_model: AISA-Framework/AISA-AR-FunctionCall-FT
tags:
- function-calling
- arabic
- tool-use
- agentic
- gemma
- reasoning
- lora
- think
datasets:
- AISA-Framework/AISA-AR-FunctionCall
pipeline_tag: text-generation
library_name: transformers
---

# AISA-AR-FunctionCall-Think

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/21Mxl67VW-RQFiXTnvheT.png" width="700"/>
</p>

**Reasoning-Augmented Arabic Structured Tool Calling**

`AISA-AR-FunctionCall-Think` is a reasoning-enhanced variant of the Arabic function-calling model introduced in the **AISA-AR-FunctionCall** framework. The model generates an intermediate reasoning trace before invoking a tool, enabling transparent decision-making for Arabic agentic systems.

This model extends [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) by introducing explicit reasoning supervision using `<think>` blocks prior to tool execution.

---

## Model Overview

| Field | Value |
|---|---|
| **Model name** | AISA-AR-FunctionCall-Think |
| **Base model** | AISA-AR-FunctionCall-FT |
| **Architecture** | Gemma 3 (FunctionGemma 270M) |
| **Training method** | LoRA reasoning fine-tuning |
| **Primary task** | Arabic reasoning-aware function calling |

The model produces outputs in the following pattern:

```
<think>
reasoning about tool selection
</think>
<start_function_call>
call:tool_name{arguments}
</end_function_call>
```
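A consumer of this pattern typically needs to separate the reasoning trace from the call itself. The following is a minimal sketch, not the authors' parser: `ParsedOutput` and `parse_output` are hypothetical names, and the regular expressions assume the markers appear exactly as written above. The end marker is matched with an optional slash in case a deployment emits `<end_function_call>` instead.

```python
import re
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    """Hypothetical container for the three parts of a model response."""
    reasoning: Optional[str]
    tool_name: Optional[str]
    raw_arguments: Optional[str]


# Reasoning block: everything between <think> and </think>.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

# Call block: "call:NAME{ARGS}" between the start and end markers.
# The "/?" accepts both "</end_function_call>" and "<end_function_call>".
_CALL_RE = re.compile(
    r"<start_function_call>\s*call:([^\s{]+)\{(.*?)\}\s*</?end_function_call>",
    re.DOTALL,
)


def parse_output(text: str) -> ParsedOutput:
    """Split a response into reasoning, tool name, and raw argument string."""
    think = _THINK_RE.search(text)
    call = _CALL_RE.search(text)
    return ParsedOutput(
        reasoning=think.group(1).strip() if think else None,
        tool_name=call.group(1) if call else None,
        raw_arguments=call.group(2) if call else None,
    )
```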

Emitting the reasoning block before the call lets the system expose the reasoning behind tool selection.

---

## Key Capabilities

- Reasoning-aware tool selection
- Explicit decision traces for tool invocation
- Improved argument extraction consistency
- Interpretable structured execution

**Supported domains:**

| Domain |
|---|
| Travel |
| Utilities |
| Islamic services |
| Weather |
| Healthcare |
| Banking & finance |
| E-commerce |
| Government services |

**Supported Arabic dialect groups:**

- Modern Standard Arabic (MSA)
- Gulf
- Egyptian
- Levantine
- Maghrebi

---

## Training Dataset

Training uses a reasoning-annotated subset of the [AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) dataset.

| Property | Value |
|---|---|
| Dataset size | ~12k reasoning-augmented samples |
| Dialect coverage | 5 Arabic dialect groups (MSA, Gulf, Egyptian, Levantine, Maghrebi) |
| Domains | 8 real-world domains |
| Tools | 27 structured tools |

---

## Training Methodology

The reasoning model is trained by augmenting assistant outputs with explicit reasoning segments.

**Training format:**

```
<think>
tool selection reasoning
</think>
<start_function_call>
call:tool{arguments}
</end_function_call>
```

At inference time, the reasoning-first format is enforced by priming the model to begin its generation with `<think>`.
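Priming can be done at the string level by appending the opening tag to the rendered prompt. The sketch below is an illustration under assumptions, not the authors' inference code: `prime_prompt` and `restore_output` are hypothetical helpers, and `rendered_prompt` is assumed to be the chat-template text ending at the assistant turn (for example, what `tokenizer.apply_chat_template(..., tokenize=False, add_generation_prompt=True)` returns in `transformers`). Because the tag is part of the prompt, it has to be re-attached to the newly generated text before parsing.

```python
THINK_OPEN = "<think>"


def prime_prompt(rendered_prompt: str) -> str:
    """Append the opening tag so decoding starts inside the reasoning block."""
    return rendered_prompt + THINK_OPEN


def restore_output(continuation: str) -> str:
    """Re-attach the primed tag to the text generated after the prompt.

    `continuation` is assumed to hold only the newly generated text,
    which will not contain the tag that was supplied in the prompt.
    """
    return THINK_OPEN + continuation
```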

**Training configuration:**

| Parameter | Value |
|---|---|
| Training type | LoRA fine-tuning |
| LoRA rank | 64 |
| Alpha | 64 |
| Dropout | 0.05 |
| Trainable parameters | ~5.36% |
| Epochs | 3 |
| Learning rate | 3e-6 |
| Effective batch size | 32 |
| Optimizer | 8-bit AdamW |
| Scheduler | Cosine |
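Two quantities follow from the table by simple arithmetic. In standard LoRA the low-rank update is scaled by alpha divided by rank, so rank 64 with alpha 64 gives a scaling factor of 1.0. The effective batch size of 32 is the product of the per-device batch size, the number of devices, and the gradient accumulation steps. The sketch below only mirrors the table values in a plain dataclass. `LoraSettings` and `accumulation_steps` are hypothetical names, and the per-device batch size in the function's arguments is an assumption, since the card does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Values copied from the training configuration table."""
    rank: int = 64
    alpha: int = 64
    dropout: float = 0.05
    epochs: int = 3
    learning_rate: float = 3e-6
    effective_batch_size: int = 32

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        return self.alpha / self.rank


def accumulation_steps(effective_batch: int, per_device_batch: int,
                       num_devices: int = 1) -> int:
    """Gradient accumulation steps needed to reach the effective batch size.

    per_device_batch and num_devices are assumptions; the card gives only
    the effective batch size.
    """
    per_step = per_device_batch * num_devices
    if per_step <= 0 or effective_batch % per_step != 0:
        raise ValueError("effective batch must be a positive multiple of per-step batch")
    return effective_batch // per_step
```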

Additional training signals include **negative tool examples** to reduce hallucinated tool calls when no tool invocation is required.

---

## Evaluation Results

The model is evaluated on a 240-sample reasoning subset under strict scoring.

### Strict Evaluation (n = 240)

| Metric | Score |
|---|---|
| Tool Call Rate | 0.992 |
| Think-Before-Call Rate | **1.000** |
| Function Name Accuracy | 0.992 |
| Argument F1 | **1.000** |
| Decision Accuracy | 0.992 |
| Hallucination Rate | **0.000** |

These results indicate that the model consistently performs reasoning before tool invocation and achieves near-perfect structured alignment within the evaluated subset.
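The card does not define how each metric is computed, so the following sketch shows one plausible set of definitions rather than the evaluation code behind the table. Everything here is an assumption: the `Example` record, the choice to score arguments as exact (key, value) pairs with micro-averaged F1, and the rule that arguments count only when the predicted tool name matches the gold name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Example:
    """Hypothetical evaluation record for one query."""
    gold_tool: Optional[str]             # None: no tool should be called
    pred_tool: Optional[str]             # None: the model made no call
    pred_has_think_first: bool = False   # <think> block precedes the call
    gold_args: Dict[str, str] = field(default_factory=dict)
    pred_args: Dict[str, str] = field(default_factory=dict)


def _ratio(numerator: float, denominator: float) -> float:
    return numerator / denominator if denominator else 0.0


def evaluate(examples: List[Example]) -> Dict[str, float]:
    called = [e for e in examples if e.pred_tool is not None]
    needs_tool = [e for e in examples if e.gold_tool is not None]

    # Micro-averaged argument F1 over exact (key, value) pairs.
    tp = fp = fn = 0
    for e in needs_tool:
        gold_pairs = set(e.gold_args.items())
        # Assumption: arguments of a wrongly named call earn no credit.
        pred_pairs = set(e.pred_args.items()) if e.pred_tool == e.gold_tool else set()
        tp += len(gold_pairs & pred_pairs)
        fp += len(pred_pairs - gold_pairs)
        fn += len(gold_pairs - pred_pairs)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)

    return {
        "think_before_call_rate": _ratio(
            sum(e.pred_has_think_first for e in called), len(called)),
        "function_name_accuracy": _ratio(
            sum(e.pred_tool == e.gold_tool for e in needs_tool), len(needs_tool)),
        "argument_f1": _ratio(2 * precision * recall, precision + recall),
        # Call vs. no-call decision matches the gold label.
        "decision_accuracy": _ratio(
            sum((e.pred_tool is None) == (e.gold_tool is None) for e in examples),
            len(examples)),
    }
```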

### Important Note on Format Validation

Standard function-call validators may classify reasoning outputs as **parse failures** because `<think>` tokens appear before the function call marker.
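One way to reuse such a validator is to remove the leading reasoning block before validation. The sketch below assumes the validator only requires the output to begin with `<start_function_call>`. `strip_reasoning` is a hypothetical helper, and `is_call_only` is a stand-in for a strict validator, not a real library function.

```python
import re

# A single <think>...</think> block at the very start of the output.
_LEADING_THINK = re.compile(r"^\s*<think>.*?</think>\s*", re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Drop a leading reasoning block so the text starts at the call marker."""
    return _LEADING_THINK.sub("", text, count=1)


def is_call_only(text: str) -> bool:
    """Stand-in for a strict validator that expects the call marker first."""
    return text.lstrip().startswith("<start_function_call>")
```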

Such parse failures do **not** indicate structural instability; they reflect a difference in serialization format. When reasoning segments are permitted, tool invocation correctness remains near-perfect.

---

## Example Usage

**User query** (English: "What is the weather in Riyadh today?"):

```
ما حالة الطقس في الرياض اليوم؟
```

**Model output** (the reasoning translates to "The user wants to know the weather in the city of Riyadh, so the get_weather tool should be used."):

```
<think>
المستخدم يريد معرفة حالة الطقس في مدينة الرياض، لذا يجب استخدام أداة get_weather.
</think>
<start_function_call>
call:get_weather{city:<escape>الرياض<escape>,days:1}
</end_function_call>
```
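To execute the call, the argument string inside the braces has to be turned into a dictionary. The following sketch handles only the flat form shown in the example, with string values wrapped in `<escape>` markers and bare values such as numbers. `parse_arguments` and `_coerce` are hypothetical helpers; nested objects and lists are not handled, and the type coercion rules are assumptions.

```python
import re
from typing import Dict, Union

Value = Union[str, int, float, bool]

# key:<escape>text<escape>  or  key:bare_value, separated by commas.
_ARG_RE = re.compile(
    r"(\w+):(?:<escape>(.*?)<escape>|([^,{}]*))(?:,|$)",
    re.DOTALL,
)


def _coerce(token: str) -> Value:
    """Best-effort typing for bare values; unknown tokens stay strings."""
    if token in ("true", "false"):
        return token == "true"
    try:
        return int(token)
    except ValueError:
        pass
    try:
        return float(token)
    except ValueError:
        return token


def parse_arguments(raw: str) -> Dict[str, Value]:
    """Parse a flat argument string into a dictionary."""
    args: Dict[str, Value] = {}
    for match in _ARG_RE.finditer(raw):
        key, escaped, bare = match.group(1), match.group(2), match.group(3)
        # Escaped values are kept verbatim as strings.
        args[key] = escaped if escaped is not None else _coerce(bare.strip())
    return args
```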

---

## Intended Use

This model is intended for:

- Research on reasoning-aware tool calling
- Interpretable agent systems
- Arabic reasoning supervision experiments
- Debugging tool selection behavior

### Production Recommendation

This model is an **exploratory research variant**. For production deployment, we recommend [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT).

---

## Related Resources

| Resource | Link |
|---|---|
| Dataset | [AISA-Framework/AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) |
| Production model | [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) |
| Model collection | [AISA Arabic FunctionCall](https://huggingface.co/collections/AISA-Framework/aisa-arabic-functioncall-datasets-and-models) |

---

## Paper

**From Language to Action in Arabic: Reliable Structured Tool Calling via Data-Centric Fine-Tuning**

*AISA Framework*

---

## AISA Framework

This model is part of the **AISA** (Agentic AI Systems Architecture) initiative for building reliable multilingual AI agents.

---

## License

[Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)