Mingke977 committed on
Commit 2e2822f · verified · 1 Parent(s): d1b0e89

Add files using upload-large-folder tool

.gitattributes CHANGED
@@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+figures/joyai-logo.png filter=lfs diff=lfs merge=lfs -text
+JoyAI-LLM-Flash-IQ3_XS.gguf filter=lfs diff=lfs merge=lfs -text
+JoyAI-LLM-Flash-IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text
+JoyAI-LLM-Flash-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+JoyAI-LLM-Flash-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
JoyAI-LLM-Flash-IQ3_XS.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9057e9015838e778cf20c7e44a4b898ad49c345b179d611e5dcbcd666c3516b1
+size 20086884320
JoyAI-LLM-Flash-IQ4_XS.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:31e5aa805870a085e88684524fb36c5c6f411c5ac4eca6422c72352fc9bb23e0
+size 26411919104
JoyAI-LLM-Flash-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:144d086e5b74ab5e08b060eba9af0cc4ec2efadc68535bfaa261557f68fc5115
+size 29669242624
JoyAI-LLM-Flash-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b4a95b59201f76b3f1a26eacac541574427a993e008adc5a92e8e4cfda9e9a0b
+size 52067555072
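Each `.gguf` entry above is a Git LFS pointer that records the blob's SHA-256 (`oid`) and byte size. After downloading one of the files, it can be checked against its pointer; a minimal sketch (the helper name and chunk size are illustrative, not part of this repo):

```python
import hashlib


def verify_lfs_object(path: str, expected_oid: str, expected_size: int) -> bool:
    """Compare a downloaded file against the oid/size recorded in its LFS pointer."""
    h = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so multi-GB GGUF files are never fully in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
            size += len(chunk)
    return h.hexdigest() == expected_oid and size == expected_size
```

For example, the IQ3_XS file would be checked with `verify_lfs_object("JoyAI-LLM-Flash-IQ3_XS.gguf", "9057e901...", 20086884320)`, using the full oid from the pointer above.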
README.md ADDED
@@ -0,0 +1,376 @@
---
language:
- zh
- en
pipeline_tag: text-generation
library_name: transformers
---
<div align="center">
<picture>
<img src="figures/joyai-logo.png" width="30%" alt="JoyAI-LLM Flash">
</picture>
</div>
<hr>

<div align="center" style="line-height: 1;">
<a href="https://huggingface.co/jdopensource" target="_blank"><img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-JD-ffc107?color=ffc107&logoColor=white"/></a>
<a href="https://huggingface.co/jdopensource/JoyAI-LLM-Flash/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Modified_MIT-f5de53?&color=f5de53"/></a>
</div>

## 1. Model Introduction

JoyAI-LLM-Flash is a state-of-the-art medium-sized instruct language model with 3 billion activated parameters and 48 billion total parameters. It was pretrained on 20 trillion text tokens using the Muon optimizer, followed by large-scale supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL) across diverse environments. JoyAI-LLM-Flash achieves strong performance on frontier knowledge, reasoning, coding, and agentic tasks.

### Key Features

- **Fiber Bundle RL**: introduces fiber bundle theory into reinforcement learning through a novel optimization framework, FiberPO, designed to handle the challenges of large-scale, heterogeneous agent training and to improve stability and robustness under complex data distributions.
- **Training-Inference Collaboration**: applies the Muon optimizer with dense MTP and develops novel optimization techniques that resolve instabilities while scaling up, delivering 1.3× to 1.7× the throughput of the non-MTP version.
- **Agentic Intelligence**: designed for tool use, reasoning, and autonomous problem-solving.

## 2. Model Summary

|                                             |                          |
| :-----------------------------------------: | :----------------------: |
| **Architecture**                            | Mixture-of-Experts (MoE) |
| **Total Parameters**                        | 48B                      |
| **Activated Parameters**                    | 3B                       |
| **Number of Layers** (Dense layer included) | 40                       |
| **Number of Dense Layers**                  | 1                        |
| **Attention Hidden Dimension**              | 2048                     |
| **MoE Hidden Dimension** (per Expert)       | 768                      |
| **Number of Attention Heads**               | 32                       |
| **Number of Experts**                       | 256                      |
| **Selected Experts per Token**              | 8                        |
| **Number of Shared Experts**                | 1                        |
| **Vocabulary Size**                         | 129K                     |
| **Context Length**                          | 128K                     |
| **Attention Mechanism**                     | MLA                      |
| **Activation Function**                     | SwiGLU                   |

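As a rough cross-check of these numbers, the activated expert parameters implied by the table can be estimated from the expert dimensions. This is a back-of-the-envelope sketch only: all values come from the table, but the three-matrix SwiGLU expert layout is an assumption, and attention, the dense layer, and embeddings account for the rest of the ~3B activated budget.

```python
# Values from the model summary table.
hidden_dim = 2048          # attention hidden dimension
moe_hidden = 768           # MoE hidden dimension per expert
num_layers = 40            # total layers, 1 of which is dense
moe_layers = num_layers - 1
experts_per_token = 8 + 1  # selected experts plus one shared expert

# A SwiGLU FFN has gate, up, and down projections (assumed layout).
params_per_expert = 3 * hidden_dim * moe_hidden

activated_expert_params = experts_per_token * params_per_expert * moe_layers
print(f"{activated_expert_params / 1e9:.2f}B activated expert parameters")
```

This comes out to roughly 1.66B activated expert parameters per token, leaving the remainder of the 3B activated total for attention (MLA), the dense FFN layer, and the 129K-entry embeddings.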

## 3. Evaluation Results

| Benchmark | JoyAI-LLM Flash | Qwen3-30B-A3B-Instruct-2507 | GLM-4.7-Flash (Non-thinking) |
| :-------: | :-------------: | :-------------------------: | :--------------------------: |
| **Knowledge & Alignment** | | | |
| MMLU | **89.50** | 86.87 | 80.53 |
| MMLU-Pro | **81.02** | 73.88 | 63.62 |
| CMMLU | **87.03** | 85.88 | 75.85 |
| GPQA-Diamond | **74.43** | 68.69 | 39.90 |
| SuperGPQA | **55.00** | 52.00 | 32.00 |
| LiveBench | **72.90** | 59.70 | 43.10 |
| IFEval | **86.69** | 83.18 | 82.44 |
| AlignBench | **8.24** | 8.07 | 6.85 |
| HellaSwag | **91.79** | 89.90 | 60.84 |
| **Coding** | | | |
| HumanEval | **96.34** | 95.12 | 74.39 |
| LiveCodeBench | **65.60** | 39.71 | 27.43 |
| SciCode | **3.08/22.92** | **3.08/22.92** | 3.08/15.11 |
| **Mathematics** | | | |
| GSM8K | **95.83** | 79.83 | 81.88 |
| AIME2025 | **65.83** | 62.08 | 24.17 |
| MATH 500 | **97.10** | 89.80 | 90.90 |
| **Agentic** | | | |
| SWE-bench Verified | **60.60** | 24.44 | 51.60 |
| Tau2-Retail | **67.55** | 53.51 | 62.28 |
| Tau2-Airline | **54.00** | 32.00 | 52.00 |
| Tau2-Telecom | 79.83 | 4.39 | **88.60** |
| **Long Context** | | | |
| RULER | **95.60** | 89.66 | 56.12 |


## 4. Deployment

> [!NOTE]
> The JoyAI-LLM Flash API is available at https://docs.jdcloud.com/cn/jdaip/chat; we provide OpenAI/Anthropic-compatible endpoints.
> Currently, JoyAI-LLM-Flash-GGUF is recommended to run on the following inference engines:

* Llama.cpp
* Ollama

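When choosing among the GGUF files in this repository, a rough bits-per-weight figure can be derived from each file's byte size (recorded in the LFS pointers) and the 48B total parameter count. A quick sketch; the figures are approximate since metadata and embeddings are included in the file size:

```python
TOTAL_PARAMS = 48e9  # total parameters, from the model summary

# GGUF file sizes in bytes, as recorded in this repository's LFS pointers.
gguf_sizes = {
    "IQ3_XS": 20086884320,
    "IQ4_XS": 26411919104,
    "Q4_K_M": 29669242624,
    "Q8_0": 52067555072,
}

for name, size in gguf_sizes.items():
    bits_per_weight = size * 8 / TOTAL_PARAMS
    print(f"{name}: ~{bits_per_weight:.2f} bits/weight")
```

This yields roughly 3.3 to 8.7 bits per weight across the four quantizations, a quick way to gauge the size/quality trade-off before downloading.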

## 5. Model Usage

The demos below show how to call our official API.

For third-party deployments served with vLLM or SGLang, please note:

> [!NOTE]
> Recommended sampling parameters: `temperature=0.6`, `top_p=1.0`

### Chat Completion

The following script shows a simple chat completion request against the JoyAI-Flash API.

```python
from openai import OpenAI

# Replace IP:PORT with the address of your deployment.
client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")


def simple_chat(client: OpenAI):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "which one is bigger, 9.11 or 9.9? think carefully.",
                }
            ],
        },
    ]
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
        model=model_name, messages=messages, stream=False, max_tokens=4096
    )
    print(f"response: {response.choices[0].message.content}")


if __name__ == "__main__":
    simple_chat(client)
```


### Tool Call Completion

The following script shows a simple tool-call completion request against the JoyAI-Flash API.

```python
import json

from openai import OpenAI

# Replace IP:PORT with the address of your deployment.
client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")


def my_calculator(expression: str) -> str:
    # NOTE: eval is used only for demo purposes; never evaluate
    # untrusted expressions this way in production.
    return str(eval(expression))


def rewrite(text: str) -> str:
    return str(text)


def simple_tool_call(client: OpenAI):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "use my functions to compute the results for the equations: 6+1",
                },
            ],
        },
    ]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "my_calculator",
                "description": "A calculator that can evaluate a mathematical equation and compute its results.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "The mathematical expression to evaluate.",
                        },
                    },
                    "required": ["expression"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "rewrite",
                "description": "Rewrite a given text for improved clarity",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "text": {
                            "type": "string",
                            "description": "The input text to rewrite",
                        }
                    },
                },
            },
        },
    ]
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=1.0,
        max_tokens=1024,
        tools=tools,
        tool_choice="auto",
    )
    tool_calls = response.choices[0].message.tool_calls

    # Dispatch each requested tool so results stay aligned with tool_calls.
    results = []
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_args = json.loads(tool_call.function.arguments)
        if function_name == "my_calculator":
            results.append(my_calculator(**function_args))
        elif function_name == "rewrite":
            results.append(rewrite(**function_args))
    messages.append({"role": "assistant", "tool_calls": tool_calls})
    for tool_call, result in zip(tool_calls, results):
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call.function.name,
                "content": result,
            }
        )
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=1.0,
        max_tokens=1024,
    )
    print(response.choices[0].message.content)


if __name__ == "__main__":
    simple_tool_call(client)
```

---

## 6. License

Both the code repository and the model weights are released under the [Modified MIT License](LICENSE).
figures/joyai-logo.png ADDED

Git LFS Details

  • SHA256: 4ea9d6a20a7707ca8dc427d6dcb5db6e2489f7730d5bffea26d8db20b1c54365
  • Pointer size: 131 Bytes
  • Size of remote file: 250 kB