	---
	language:
	- ar
	license: apache-2.0
	base_model: AISA-Framework/AISA-AR-FunctionCall-FT
	tags:
	- function-calling
	- arabic
	- tool-use
	- agentic
	- gemma
	- reasoning
	- lora
	- think
	datasets:
	- AISA-Framework/AISA-AR-FunctionCall
	pipeline_tag: text-generation
	library_name: transformers
	---

	# AISA-AR-FunctionCall-Think

	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/21Mxl67VW-RQFiXTnvheT.png" width="700"/>
	</p>

	Reasoning-Augmented Arabic Structured Tool Calling

	`AISA-AR-FunctionCall-Think` is a reasoning-enhanced variant of the Arabic function-calling model introduced in the AISA-AR-FunctionCall framework. The model generates an intermediate reasoning trace before invoking a tool, enabling transparent decision-making for Arabic agentic systems.

	This model extends [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) by introducing explicit reasoning supervision using `<think>` blocks prior to tool execution.

	---

	## Model Overview

	| Field | Value |
	|---|---|
	| Model name | AISA-AR-FunctionCall-Think |
	| Base model | AISA-AR-FunctionCall-FT |
	| Architecture | Gemma 3 (FunctionGemma 270M) |
	| Training method | LoRA reasoning fine-tuning |
	| Primary task | Arabic reasoning-aware function calling |

	The model produces outputs in the following pattern:

	```
	<think>
	reasoning about tool selection
	</think>
	<start_function_call>
	call:tool_name{arguments}
	</end_function_call>
	```

	This allows the system to expose the reasoning behind tool selection.
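
The pattern above can be split into its parts with a small parser. The following is a minimal sketch, not part of the released tooling: the names `parse_think_output` and `ParsedOutput` are hypothetical, and the regular expression assumes exactly one `<think>` block followed by one function call. It also accepts `<end_function_call>` without the slash, which is an assumption in case a tokenizer emits that form.

```python
import re
from typing import NamedTuple, Optional

# Matches the documented layout: a <think> block, then one function call.
# The optional "/" in the closing marker is an assumption (see lead-in).
_OUTPUT_RE = re.compile(
    r"<think>\s*(?P<reasoning>.*?)\s*</think>\s*"
    r"<start_function_call>\s*call:(?P<name>\w+)\{(?P<args>.*)\}\s*"
    r"</?end_function_call>",
    re.DOTALL,
)


class ParsedOutput(NamedTuple):
    reasoning: str       # text inside <think>...</think>
    tool_name: str       # identifier after "call:"
    raw_arguments: str   # unparsed text between the outer braces


def parse_think_output(text: str) -> Optional[ParsedOutput]:
    """Return the reasoning, tool name, and raw arguments, or None if no match."""
    match = _OUTPUT_RE.search(text)
    if match is None:
        return None
    return ParsedOutput(match["reasoning"], match["name"], match["args"])
```

Returning `None` instead of raising lets a caller treat "no tool call" as a normal outcome, which matters for queries where no tool is required.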

	---

	## Key Capabilities

	- Reasoning-aware tool selection
	- Explicit decision traces for tool invocation
	- Improved argument extraction consistency
	- Interpretable structured execution

	Supported domains:

	| Domain |
	|---|
	| Travel |
	| Utilities |
	| Islamic services |
	| Weather |
	| Healthcare |
	| Banking & finance |
	| E-commerce |
	| Government services |

	Supported Arabic dialect groups:

	- Modern Standard Arabic (MSA)
	- Gulf
	- Egyptian
	- Levantine
	- Maghrebi

	---

	## Training Dataset

	Training uses a subset of the [AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) dataset with reasoning annotations.

	| Property | Value |
	|---|---|
	| Dataset size | ~12k reasoning-augmented samples |
	| Dialect coverage | 5 Arabic dialects |
	| Domains | 8 real-world domains |
	| Tools | 27 structured tools |

	---

	## Training Methodology

	The reasoning model is trained by augmenting assistant outputs with explicit reasoning segments.

	Training format:

	```
	<think>
	tool selection reasoning
	</think>
	<start_function_call>
	call:tool{arguments}
	</end_function_call>
	```

	At inference time, the reasoning format is enforced by priming the model so that its generation begins with `<think>`.
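
Priming can be done by appending the opening tag to the prompt before generation. The sketch below shows the idea only and is not the authors' inference code: `generate_with_think_prefix` and `generate_fn` are hypothetical names, and `generate_fn` is assumed to be any callable that takes a prompt string and returns only the newly generated text (for example, a thin wrapper around a `transformers` model).

```python
from typing import Callable

# Opening tag that the continuation is forced to follow.
THINK_PREFIX = "<think>\n"


def generate_with_think_prefix(prompt: str, generate_fn: Callable[[str], str]) -> str:
    """Prime generation with <think> and return the full assistant turn.

    generate_fn is a hypothetical callable: prompt in, new text out.
    """
    primed_prompt = prompt + THINK_PREFIX
    continuation = generate_fn(primed_prompt)
    # The prefix was supplied as input, so re-attach it to the returned text.
    return THINK_PREFIX + continuation
```

Re-attaching the prefix keeps the returned text in the documented `<think> ... </think>` layout, so downstream parsers see a complete block.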

	Training configuration:

	| Parameter | Value |
	|---|---|
	| Training type | LoRA fine-tuning |
	| LoRA rank | 64 |
	| Alpha | 64 |
	| Dropout | 0.05 |
	| Trainable parameters | ~5.36% |
	| Epochs | 3 |
	| Learning rate | 3e-6 |
	| Effective batch size | 32 |
	| Optimizer | 8-bit AdamW |
	| Scheduler | Cosine |
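
For reference, the table values can be collected in a small settings object. This is a sketch under stated assumptions: the class `ThinkLoraSettings` is hypothetical, and the per-device batch size and device count are not given in this card, so `accumulation_steps` only shows how an effective batch size of 32 could be reached. The `alpha / rank` scaling is the standard LoRA convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThinkLoraSettings:
    """Hyperparameters copied from the training configuration table."""
    rank: int = 64
    alpha: int = 64
    dropout: float = 0.05
    epochs: int = 3
    learning_rate: float = 3e-6
    effective_batch_size: int = 32

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank

    def accumulation_steps(self, per_device_batch_size: int, num_devices: int = 1) -> int:
        """Gradient accumulation steps needed to reach the effective batch size.

        per_device_batch_size and num_devices are assumptions, not card values.
        """
        per_step = per_device_batch_size * num_devices
        if self.effective_batch_size % per_step != 0:
            raise ValueError("effective batch size must be divisible by the per-step batch")
        return self.effective_batch_size // per_step
```

With rank and alpha both at 64, the scaling factor is 1.0, so the adapter update is applied unscaled.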

	Additional training signals include negative tool examples to reduce hallucinated tool calls when no tool invocation is required.

	---

	## Evaluation Results

	Evaluation is performed on a strict reasoning evaluation subset.

	### Strict Evaluation (n = 240)

	| Metric | Score |
	|---|---|
	| Tool Call Rate | 0.992 |
	| Think-Before-Call Rate | 1.000 |
	| Function Name Accuracy | 0.992 |
	| Argument F1 | 1.000 |
	| Decision Accuracy | 0.992 |
	| Hallucination Rate | 0.000 |

	These results indicate that the model consistently performs reasoning before tool invocation and achieves near-perfect structured alignment within the evaluated subset.
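
This card does not define the metrics formally. The sketch below shows one plausible way to compute four of them and is an assumption throughout: `EvalRecord`, `argument_f1`, and `score` are hypothetical names, Argument F1 is taken as exact-match F1 over key/value pairs, and it is averaged only over calls with the correct function name. The actual evaluation may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EvalRecord:
    expected_tool: str
    predicted_tool: Optional[str]  # None when the model produced no call
    think_before_call: bool        # True if <think> preceded the call
    expected_args: Dict[str, str]
    predicted_args: Dict[str, str]


def argument_f1(predicted: Dict[str, str], expected: Dict[str, str]) -> float:
    """Exact-match F1 over (key, value) pairs; 1.0 when both are empty."""
    pred_items, gold_items = set(predicted.items()), set(expected.items())
    if not pred_items and not gold_items:
        return 1.0
    true_positives = len(pred_items & gold_items)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_items)
    recall = true_positives / len(gold_items)
    return 2 * precision * recall / (precision + recall)


def _mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def score(records: List[EvalRecord]) -> Dict[str, float]:
    """Aggregate assumed metric definitions over a non-empty record list."""
    if not records:
        raise ValueError("records must not be empty")
    calls = [r for r in records if r.predicted_tool is not None]
    correct = [r for r in calls if r.predicted_tool == r.expected_tool]
    return {
        "tool_call_rate": len(calls) / len(records),
        "think_before_call_rate": _mean([float(r.think_before_call) for r in calls]),
        "function_name_accuracy": len(correct) / len(records),
        "argument_f1": _mean(
            [argument_f1(r.predicted_args, r.expected_args) for r in correct]
        ),
    }
```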

	### Important Note on Format Validation

	Standard function-call validators may classify reasoning outputs as parse failures because `<think>` tokens appear before the function call marker.

	This does not indicate structural instability; it reflects a difference in serialization format. When reasoning segments are permitted, tool invocation correctness remains near-perfect.
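
One way to reuse a standard validator is to remove the reasoning segment before validation. The helper below is a minimal sketch of that preprocessing step; `strip_think_block` is a hypothetical name, and it assumes `<think>` blocks are not nested.

```python
import re

# Non-greedy match so each <think>...</think> segment is removed separately.
_THINK_BLOCK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think_block(text: str) -> str:
    """Remove <think>...</think> segments so the text starts at the call marker."""
    return _THINK_BLOCK_RE.sub("", text).lstrip()
```

The stripped string can then be passed to a validator that expects `<start_function_call>` as the first marker, while the original output is kept for logging the reasoning trace.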

	---

	## Example Usage

	User query:

	```
	ما حالة الطقس في الرياض اليوم؟
	```

	Model output:

	```
	<think>
	المستخدم يريد معرفة حالة الطقس في مدينة الرياض، لذا يجب استخدام أداة get_weather.
	</think>
	<start_function_call>
	call:get_weather{city:<escape>الرياض<escape>,days:1}
	</end_function_call>
	```
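
The argument string in the example uses `<escape>` markers around string values and leaves numbers bare. The sketch below turns such a string into a Python dictionary. It is an illustration under assumptions: `parse_arguments` is a hypothetical name, only flat `key:value` pairs are handled, and nested objects or arrays are out of scope.

```python
import re
from typing import Dict, Union

ArgValue = Union[str, int, float]

# A key, then either an <escape>-delimited string or a bare value,
# terminated by a comma or the end of the input.
_ARG_RE = re.compile(
    r"(?P<key>\w+):(?:<escape>(?P<text>.*?)<escape>|(?P<bare>[^,]*))(?:,|$)",
    re.DOTALL,
)


def _coerce(value: str) -> ArgValue:
    """Convert a bare value to int or float when possible, else keep the string."""
    for caster in (int, float):
        try:
            return caster(value)
        except ValueError:
            continue
    return value


def parse_arguments(raw: str) -> Dict[str, ArgValue]:
    """Parse flat arguments such as 'city:<escape>...<escape>,days:1'."""
    arguments: Dict[str, ArgValue] = {}
    for match in _ARG_RE.finditer(raw):
        if match["text"] is not None:
            # Escaped values are always kept as strings, commas included.
            arguments[match["key"]] = match["text"]
        else:
            arguments[match["key"]] = _coerce(match["bare"].strip())
    return arguments
```

Applied to the example above, `parse_arguments("city:<escape>الرياض<escape>,days:1")` yields a dictionary with the city as a string and `days` as the integer 1.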

	---

	## Intended Use

	This model is intended for:

	- Research on reasoning-aware tool calling
	- Interpretable agent systems
	- Arabic reasoning supervision experiments
	- Debugging tool selection behavior

	### Production Recommendation

	This model is an exploratory research variant. For production deployment, we recommend using:

	[AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT)

	---

	## Related Resources

	| Resource | Link |
	|---|---|
	| Dataset | [AISA-Framework/AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) |
	| Production model | [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) |
	| Model collection | [AISA Arabic FunctionCall](https://huggingface.co/collections/AISA-Framework/aisa-arabic-functioncall-datasets-and-models) |

	---

	## Paper

	From Language to Action in Arabic: Reliable Structured Tool Calling via Data-Centric Fine-Tuning

	AISA Framework

	---

	## AISA Framework

	This model is part of the AISA (Agentic AI Systems Architecture) initiative for building reliable multilingual AI agents.

	---

	## License

	[Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)