---
language: en
license: gemma
base_model: google/functiongemma-270m-it
tags:
- coreml
- apple-neural-engine
- gemma3
- function-calling
- on-device
library_name: coreml
---

# FunctionGemma-270M for Apple CoreML (ANE-optimized)

CoreML conversion of `google/functiongemma-270m-it` produced with the
[CoreML-LLM](https://github.com/john-rocky/CoreML-LLM) pipeline. Targets
iOS 26 / macOS 26.

## What's in this repo

| File | Notes |
|---|---|
| `model.mlmodelc/` | Compiled stateful decoder (fp16, 840 MB). Drop-in for `MLModel(contentsOf:)` |
| `model_config.json` | Bundle metadata (architecture, dims, function-call markers) |
| `hf_model/` | Tokenizer + chat template (function-calling format) |
| `cos_*.npy`, `sin_*.npy` | Pre-computed RoPE tables (optional) |

## ANE residency

**99.42% on Apple Neural Engine** (1893/1904 dispatched ops, verified via
`MLComputePlan` on macOS 26). The 11 CPU-only ops are unavoidable
input-boundary ops (token gather, argmax, scalar squeeze).

## Use it

Via the [CoreML-LLM Swift package](https://github.com/john-rocky/CoreML-LLM):

```swift
import CoreMLLLM

let bundleURL = try await Gemma3BundleDownloader.download(
    .functionGemma270m, into: appSupportDir)
let fg = try await FunctionGemma.load(bundleURL: bundleURL)
let text = try fg.generate(prompt: "Turn on the flashlight",
                           maxNewTokens: 64)
```

For raw Core ML usage, the model expects the same I/O contract as Gemma 3:
`input_ids (1,1) int32`, `position_ids (1,) int32`, `causal_mask (1,1,1,ctx) fp16`,
`update_mask (1,1,ctx,1) fp16`, with a stateful `kv_cache_0` MLState
(2*L, kv_heads, ctx, head_dim).
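As a rough sketch, one decode step against that contract could look like the following. This assumes the iOS 18+/macOS 15+ stateful Core ML APIs (`MLModel.makeState()`, `prediction(from:using:)`); `modelURL`, `tokenID`, `step`, and the context length are placeholders — read the real dimensions and output names from `model_config.json` and `model.modelDescription`.

```swift
import CoreML

let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine  // prefer the ANE

let model = try MLModel(contentsOf: modelURL, configuration: config)
let state = model.makeState()              // holds kv_cache_0 across steps

let ctx = 4096  // placeholder: take the actual context length from model_config.json

// Shapes per the contract above.
let inputIDs = try MLMultiArray(shape: [1, 1], dataType: .int32)
inputIDs[0] = NSNumber(value: tokenID)     // current token

let positionIDs = try MLMultiArray(shape: [1], dataType: .int32)
positionIDs[0] = NSNumber(value: step)     // current position

let causalMask = try MLMultiArray(
    shape: [1, 1, 1, NSNumber(value: ctx)], dataType: .float16)
// Fill: 0 for positions <= step, a large negative value for future positions.

let updateMask = try MLMultiArray(
    shape: [1, 1, NSNumber(value: ctx), 1], dataType: .float16)
// Fill: 1 at index `step` (the cache slot to write), 0 elsewhere.

let inputs = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIDs),
    "position_ids": MLFeatureValue(multiArray: positionIDs),
    "causal_mask": MLFeatureValue(multiArray: causalMask),
    "update_mask": MLFeatureValue(multiArray: updateMask),
])

// Stateful prediction updates the KV cache in place; argmax the logits
// output (name per model.modelDescription) for the next token.
let outputs = try model.prediction(from: inputs, using: state)
```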
## License

Inherits Google's [Gemma terms of use](https://ai.google.dev/gemma/terms).