mlboydaisuke committed 0403a07 (verified) · Parent: d6aebae · upload README.md
---
language: en
license: gemma
base_model: google/functiongemma-270m-it
tags:
- coreml
- apple-neural-engine
- gemma3
- function-calling
- on-device
library_name: coreml
---

# FunctionGemma-270M for Apple CoreML (ANE-optimized)

CoreML conversion of `google/functiongemma-270m-it`, produced with the
[CoreML-LLM](https://github.com/john-rocky/CoreML-LLM) pipeline. Targets
iOS 26 / macOS 26.

## What's in this repo

| File | Notes |
|---|---|
| `model.mlmodelc/` | Compiled stateful decoder (fp16, 840 MB). Drop-in for `MLModel(contentsOf:)` |
| `model_config.json` | Bundle metadata (architecture, dims, function-call markers) |
| `hf_model/` | Tokenizer + chat template (function-calling format) |
| `cos_*.npy`, `sin_*.npy` | Pre-computed RoPE tables (optional) |

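The compiled `model.mlmodelc` can also be loaded directly, without the Swift package. A minimal sketch; the local path is an assumption, so point it at wherever you placed the bundle:

```swift
import CoreML

// Assumed local path to this repo's compiled model directory.
let modelURL = URL(fileURLWithPath: "model.mlmodelc")

// .all lets Core ML schedule eligible ops onto the Neural Engine.
let config = MLModelConfiguration()
config.computeUnits = .all

let model = try MLModel(contentsOf: modelURL, configuration: config)
```
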
## ANE residency

**99.42% on the Apple Neural Engine** (1893 of 1904 dispatched ops, verified via
`MLComputePlan` on macOS 26). The remaining 11 CPU-only ops are unavoidable
input-boundary ops (token gather, argmax, scalar squeeze).

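You can reproduce the residency count yourself with Apple's `MLComputePlan` API (macOS 14.4+ / iOS 17.4+). A hedged sketch, assuming `model.mlmodelc` sits at a local path and the model is an ML Program with a `main` function:

```swift
import CoreML

let modelURL = URL(fileURLWithPath: "model.mlmodelc")  // assumed local path
let plan = try await MLComputePlan.load(
    contentsOf: modelURL, configuration: MLModelConfiguration())

// ML Program models expose their operations per function.
guard case .program(let program) = plan.modelStructure,
      let main = program.functions["main"] else {
    fatalError("not an ML Program")
}

// Count ops whose preferred compute device is the Neural Engine.
var ane = 0, total = 0
for op in main.block.operations {
    total += 1
    if let usage = plan.deviceUsage(for: op),
       case .neuralEngine = usage.preferred {
        ane += 1
    }
}
print("ANE residency: \(ane)/\(total)")
```
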
## Use it

Via the [CoreML-LLM Swift package](https://github.com/john-rocky/CoreML-LLM):

```swift
import CoreMLLLM

let bundleURL = try await Gemma3BundleDownloader.download(
    .functionGemma270m, into: appSupportDir)
let fg = try await FunctionGemma.load(bundleURL: bundleURL)
let text = try fg.generate(prompt: "Turn on the flashlight",
                           maxNewTokens: 64)
```

For raw Core ML usage, the model expects the same I/O contract as Gemma 3:
`input_ids (1,1) int32`, `position_ids (1,) int32`, `causal_mask (1,1,1,ctx) fp16`,
`update_mask (1,1,ctx,1) fp16`, with a stateful `kv_cache_0` MLState of shape
`(2*L, kv_heads, ctx, head_dim)`.

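A single raw decode step against that contract can be sketched as follows. The path, context length, token id, and step index are placeholder assumptions (read the real context length from `model_config.json`), and logits handling is elided:

```swift
import CoreML

let model = try MLModel(contentsOf: URL(fileURLWithPath: "model.mlmodelc"))  // assumed path
let ctx = 4096                 // assumption: take the real value from model_config.json
let step = 0                   // current decode position
let tokenID: Int32 = 2         // placeholder token id

let inputIDs = try MLMultiArray(shape: [1, 1], dataType: .int32)
inputIDs[0] = NSNumber(value: tokenID)

let positionIDs = try MLMultiArray(shape: [1], dataType: .int32)
positionIDs[0] = NSNumber(value: Int32(step))

// Causal mask: 0 for visible positions, -inf for future positions.
let causalMask = try MLMultiArray(shape: [1, 1, 1, NSNumber(value: ctx)],
                                  dataType: .float16)
for j in 0..<ctx {
    causalMask[j] = j <= step ? 0 : NSNumber(value: -Float.infinity)
}

// Update mask: 1 only at the KV-cache slot written this step.
let updateMask = try MLMultiArray(shape: [1, 1, NSNumber(value: ctx), 1],
                                  dataType: .float16)
updateMask[step] = 1

let state = model.makeState()  // holds kv_cache_0 across steps
let inputs = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": inputIDs,
    "position_ids": positionIDs,
    "causal_mask": causalMask,
    "update_mask": updateMask,
])
let output = try model.prediction(from: inputs, using: state)
// output.featureValue(for: "logits") → argmax → next token (elided)
```

Reuse the same `state` across steps so the KV cache accumulates; only `position_ids`, the masks, and the fed token change per step.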
## License

Inherits Google's [Gemma terms of use](https://ai.google.dev/gemma/terms).