Upload fine-tuned intent detection model - 2025-06-24 17:12:41

	---
	language: en
	license: mit
	base_model: mlx-community/Qwen2.5-1.5B-Instruct-4bit
	tags:
	- mlx
	- intent-detection
	- fine-tuned
	- knowledge-management
	- ios
	library_name: mlx
	---

	# knowledgebase-intent-llm

	A fine-tuned model for intent detection in a knowledge-management iOS app.

	## Model Details

	- Base Model: mlx-community/Qwen2.5-1.5B-Instruct-4bit
	- Model Type: intent_detection
	- Format: complete_merged
	- Framework: MLX
	- Training Examples: 5000
	- Training Iterations: 100

	## Usage

	```python
	from mlx_lm import load, generate

	# Load the model
	model, tokenizer = load("hebertgo/knowledgebase-intent-llm")

	# Generate intent classification
	prompt = '''You are a helpful AI assistant for a knowledge-management app on an iPhone. Analyze the user's request and respond with JSON in this format:
	{
	  "action": "Search|Create|Clarify|Conversation",
	  "response": "User-friendly response message",
	  "contentType": "videos|bookmarks|todos",
	  "topic": "extracted topic or null"
	}

	User query: find videos about python'''

	response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
	print(response)
	```
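The prompt above hard-codes the user query. A minimal sketch of a reusable wrapper follows; `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, not part of `mlx_lm` or this repository. The alternatives are written with plain `|` separators, and the exact whitespace used during training is not stated in this card, so treat the template layout as an assumption.

```python
# Hypothetical helper: wraps the prompt template from the Usage example
# so it can be reused for any user query.
PROMPT_TEMPLATE = '''You are a helpful AI assistant for a knowledge-management app on an iPhone. Analyze the user's request and respond with JSON in this format:
{
  "action": "Search|Create|Clarify|Conversation",
  "response": "User-friendly response message",
  "contentType": "videos|bookmarks|todos",
  "topic": "extracted topic or null"
}

User query: '''


def build_prompt(query: str) -> str:
    """Return the full prompt for one user query (whitespace-trimmed)."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    return PROMPT_TEMPLATE + query
```

The result can be passed to `generate` in place of the literal prompt, for example `generate(model, tokenizer, prompt=build_prompt("find videos about python"), max_tokens=256)`.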

	## iOS Integration

	This model is designed for use in iOS apps with MLX Swift:

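	```swift
	// Assumes the MLXLLM and MLXLMCommon libraries from mlx-swift-examples.
	import MLXLLM
	import MLXLMCommon

	let config = ModelConfiguration(
	    id: "hebertgo/knowledgebase-intent-llm",
	    defaultPrompt: ""
	)

	// Downloads the weights if needed and returns a ModelContainer.
	// Must be called from an async context.
	let model = try await LLMModelFactory.shared.loadContainer(
	    configuration: config
	)
	```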

	## Training Details

	- Fine-tuning Method: LoRA with model fusion
	- Export Date: 2025-06-24T17:12:38.111419
	- Fusion Completed: True
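For readers unfamiliar with fusion, here is a toy sketch of the generic LoRA merge, `W_fused = W + scale * (B @ A)`, in plain Python. It illustrates the general technique only: the matrices, the `scale` value, and the helper names `matmul` and `fuse_lora` are made up for this example, and the card does not state the rank, scale, or how the 4-bit base weights were handled during export.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def fuse_lora(w, b, a, scale):
    """Return W + scale * (B @ A), the merged weight matrix."""
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Toy shapes: W is 2x3, B is 2x1, A is 1x3 (rank r = 1).
W = [[1, 0, 2], [0, 1, 0]]
B = [[1], [2]]
A = [[1, 1, 0]]
W_fused = fuse_lora(W, B, A, scale=2)
```

After fusion, a single matrix multiply with `W_fused` gives the same result as applying the base weight and the adapter separately, which is why a fused model needs no adapter files at inference time.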

	## Expected Outputs

	The model generates JSON responses with these action types:
	- Search: Find existing content (videos, bookmarks, todos)
	- Create: Add new content
	- Clarify: Request more information
	- Conversation: General chat responses

	Content types supported:
	- videos
	- bookmarks
	- todos
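Because the output is free-form text that is expected to contain JSON, an app will usually want to validate it before acting on it. A minimal sketch, assuming the model may wrap the JSON in extra text and may emit the string `"null"` for a missing topic; `parse_intent` and the two constant sets are hypothetical names, not part of this repository.

```python
import json

VALID_ACTIONS = {"Search", "Create", "Clarify", "Conversation"}
VALID_CONTENT_TYPES = {"videos", "bookmarks", "todos"}


def parse_intent(raw: str) -> dict:
    """Extract and validate the first JSON object found in model output."""
    start = raw.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value and ignores any trailing text.
    # Malformed JSON raises json.JSONDecodeError, a ValueError subclass.
    data, _ = json.JSONDecoder().raw_decode(raw[start:])
    if data.get("action") not in VALID_ACTIONS:
        raise ValueError(f"unexpected action: {data.get('action')!r}")
    # Assumption: contentType may be absent or null (e.g. for Conversation).
    content_type = data.get("contentType")
    if content_type is not None and content_type not in VALID_CONTENT_TYPES:
        raise ValueError(f"unexpected contentType: {content_type!r}")
    # Assumption: the model may emit the string "null" instead of JSON null.
    if data.get("topic") in (None, "null"):
        data["topic"] = None
    return data
```

Calling `parse_intent(response)` on the text returned by `generate` yields a dictionary, or raises `ValueError` so the app can fall back to a clarification prompt.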

	## Performance

	Optimized for efficient on-device inference on Apple Silicon devices using the MLX framework.