---
datasets:
- independently-platform/tasky
language:
- en
- it
base_model:
- google/functiongemma-270m-it
library_name: transformers
---

# Tasky

## About the model
This model is a fine-tuned **function-calling assistant** for a todo/task application. It maps each user request to one of four tools and generates the tool arguments following the schema in `AI-TRAINING-TOOLS.md`.

- **Base model:** `google/functiongemma-270m-it`
- **Primary languages:** English and Italian (training prompts include light spelling errors and typos to mimic real users)
- **Task:** Structured tool selection + argument generation

## Intended Use
Use this model to translate natural language task requests into tool calls for:
- `create_tasks`
- `search_tasks`
- `update_tasks`
- `delete_tasks`

It is designed for **task/todo management** workflows and should be paired with strict validation of tool arguments before execution.

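The validation step mentioned above could be sketched roughly as follows. This is a minimal illustration, not the project's actual validator: the helper name `validate_tool_call` is hypothetical, and it assumes the model emits a JSON object with `tool_name` and a JSON-encoded `tool_arguments` string, as in the example on this card. It only checks structure; per-field checks against `AI-TRAINING-TOOLS.md` would still need to be added.

```python
import json

# The four tool names come from this model card; the rest is illustrative.
ALLOWED_TOOLS = {"create_tasks", "search_tasks", "update_tasks", "delete_tasks"}


def validate_tool_call(raw_output: str) -> tuple[str, dict]:
    """Return (tool_name, arguments), or raise ValueError if the call should not run."""
    try:
        call = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("model output must be a JSON object")

    # Reject anything that is not one of the four known tools.
    tool_name = call.get("tool_name")
    if not isinstance(tool_name, str) or tool_name not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {tool_name!r}")

    # Assumption: arguments arrive as a JSON-encoded string, so decode them separately.
    raw_args = call.get("tool_arguments")
    if not isinstance(raw_args, str):
        raise ValueError("tool_arguments must be a JSON-encoded string")
    try:
        arguments = json.loads(raw_args)
    except json.JSONDecodeError as exc:
        raise ValueError(f"tool_arguments is not valid JSON: {exc}") from exc
    if not isinstance(arguments, dict):
        raise ValueError("tool_arguments must decode to a JSON object")

    return tool_name, arguments
```

Calling the tool only after `validate_tool_call` returns keeps malformed or unexpected model output away from the task store.
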
### Example
**Input (user):**

Aggiungi un task per pagare la bolletta della luce domani mattina

(English: "Add a task to pay the electricity bill tomorrow morning")

**Expected output (model):**
```json
{
  "tool_name": "create_tasks",
  "tool_arguments": "{\"tasks\":[{\"content\":\"pagare la bolletta della luce\",\"dueDate\":\"2026-01-13T09:00:00.000Z\"}]}"
}
```

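Note that `tool_arguments` in the example is itself a JSON-encoded string, so a consumer has to decode twice. The snippet below is a small sketch of reading the example output with the Python standard library; treating `dueDate` as an ISO 8601 UTC timestamp is an assumption based on its appearance, not something the card states.

```python
import json
from datetime import datetime

# Model output copied from the example on this card.
model_output = r'''{
  "tool_name": "create_tasks",
  "tool_arguments": "{\"tasks\":[{\"content\":\"pagare la bolletta della luce\",\"dueDate\":\"2026-01-13T09:00:00.000Z\"}]}"
}'''

call = json.loads(model_output)                 # outer object
arguments = json.loads(call["tool_arguments"])  # inner, string-encoded object

task = arguments["tasks"][0]
# Swap the trailing "Z" for an explicit offset so that
# datetime.fromisoformat also accepts it before Python 3.11.
due = datetime.fromisoformat(task["dueDate"].replace("Z", "+00:00"))

print(call["tool_name"])  # create_tasks
print(task["content"])    # pagare la bolletta della luce
print(due.isoformat())    # 2026-01-13T09:00:00+00:00
```
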
## Training Data

Synthetic, bilingual tool-calling data built from the tool schema, including:

- Multiple phrasings and paraphrases
- Mixed English/Italian prompts
- Light typos and user mistakes in `user_content`
- Broad coverage of optional parameters

Splits:

- Train: 1,500 examples
- Eval: 500 examples

## Training Procedure

- Fine-tuning on synthetic tool-calling samples
- Deduplicated examples
- Balanced coverage of all tools and key parameters

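The card does not describe how deduplication or balancing was carried out. As one possible illustration, the sketch below drops repeated examples and counts examples per tool. The record layout (`user_content`, `tool_name`, `tool_arguments` as a string) and both helper names are assumptions made for this example.

```python
from collections import Counter


def dedupe(examples: list[dict]) -> list[dict]:
    """Keep the first occurrence of each (prompt, tool, arguments) triple."""
    seen = set()
    unique = []
    for ex in examples:
        # Fold case and whitespace so trivially repeated prompts collapse.
        key = (
            " ".join(ex["user_content"].lower().split()),
            ex["tool_name"],
            ex["tool_arguments"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(ex)
    return unique


def tool_counts(examples: list[dict]) -> Counter:
    """Count examples per tool to check that coverage is roughly balanced."""
    return Counter(ex["tool_name"] for ex in examples)
```

A heavily skewed `tool_counts` result would suggest generating more samples for the under-represented tools before fine-tuning.
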
## Evaluation

Reported success rate: 99.5% on the 500-example eval split, versus 0% for the base model.
A prediction counts as a success only when both the predicted tool name and the normalized JSON arguments exactly match the reference.

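The exact normalization used for scoring is not specified here. A plausible sketch, assuming normalization means parsing the argument string and re-serializing it with sorted keys and compact separators, could look like this (all function names are hypothetical):

```python
import json


def normalize_arguments(raw: str) -> str:
    """Canonical form: parse, then re-serialize with sorted keys and no extra whitespace."""
    return json.dumps(
        json.loads(raw), sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )


def is_success(predicted: dict, expected: dict) -> bool:
    """Exact match on tool name and on normalized arguments."""
    if predicted.get("tool_name") != expected.get("tool_name"):
        return False
    try:
        return normalize_arguments(predicted["tool_arguments"]) == normalize_arguments(
            expected["tool_arguments"]
        )
    except (KeyError, TypeError, json.JSONDecodeError):
        # Missing or unparseable arguments count as a failure.
        return False


def success_rate(pairs) -> float:
    """Fraction of (predicted, expected) pairs that match exactly."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no examples to score")
    return sum(is_success(p, e) for p, e in pairs) / len(pairs)
```

Under this definition, key order and whitespace differences are ignored, while any difference in values, including a different `dueDate` string, counts as a miss.
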
## Limitations

- Trained for a specific tool schema; not a general-purpose assistant.
- Outputs may include incorrect or incomplete tool arguments; validate before execution.
- Language coverage is strongest in English and Italian.
- Synthetic data may not capture all real-world user phrasing or ambiguity.