Improve language tag

#1
by lbourdois - opened
Files changed (1)
  1. README.md +64 -52
README.md CHANGED
@@ -1,53 +1,65 @@
- ---
- base_model:
- - deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- - Qwen/Qwen2.5-7B
- - Qwen/Qwen2.5-Math-7B
- - huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
- - Qwen/Qwen2.5-7B-Instruct
- - Qwen/Qwen2.5-Coder-7B
- - RLHFlow/Qwen2.5-7B-DPO
- library_name: transformers
- license: mit
- language:
- - en
- ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
- * [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B)
- * [Qwen/Qwen2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B)
- * [huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2](https://huggingface.co/huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2)
- * [Qwen/Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B)
- * [RLHFlow/Qwen2.5-7B-DPO](https://huggingface.co/RLHFlow/Qwen2.5-7B-DPO)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
- - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- - model: huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
- - model: Qwen/Qwen2.5-7B
- - model: Qwen/Qwen2.5-7B-Instruct
- - model: Qwen/Qwen2.5-Coder-7B
- - model: Qwen/Qwen2.5-Math-7B
- - model: RLHFlow/Qwen2.5-7B-DPO
-
- merge_method: model_stock
- base_model: Qwen/Qwen2.5-7B-Instruct
- normalize: true
- int8_mask: true
- dtype: bfloat16
  ```
+ ---
+ base_model:
+ - deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
+ - Qwen/Qwen2.5-7B
+ - Qwen/Qwen2.5-Math-7B
+ - huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
+ - Qwen/Qwen2.5-7B-Instruct
+ - Qwen/Qwen2.5-Coder-7B
+ - RLHFlow/Qwen2.5-7B-DPO
+ library_name: transformers
+ license: mit
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
+ * [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B)
+ * [Qwen/Qwen2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B)
+ * [huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2](https://huggingface.co/huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2)
+ * [Qwen/Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B)
+ * [RLHFlow/Qwen2.5-7B-DPO](https://huggingface.co/RLHFlow/Qwen2.5-7B-DPO)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+ - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
+ - model: huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
+ - model: Qwen/Qwen2.5-7B
+ - model: Qwen/Qwen2.5-7B-Instruct
+ - model: Qwen/Qwen2.5-Coder-7B
+ - model: Qwen/Qwen2.5-Math-7B
+ - model: RLHFlow/Qwen2.5-7B-DPO
+
+ merge_method: model_stock
+ base_model: Qwen/Qwen2.5-7B-Instruct
+ normalize: true
+ int8_mask: true
+ dtype: bfloat16
  ```
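
The only substantive change in this diff is the `language:` list, which goes from a single `en` entry to thirteen three-letter codes. For anyone who wants to sanity-check such a list after editing a model card, the following is a minimal Python sketch using only the standard library. The helper names `front_matter_languages` and `looks_like_three_letter_code` are hypothetical and not part of any Hub tooling, the parser handles only the simple block-list layout shown above rather than YAML in general, and the shape check does not confirm that a code is actually registered.

```python
import re


def front_matter_languages(readme_text: str) -> list[str]:
    """Return the entries listed under `language:` in a README's front matter.

    Assumes the front matter starts on the first line with `---` and that
    `language:` is followed by one `- code` item per line, as in this diff.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter at the top of the file

    langs = []
    in_language = False
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter of the front matter
            break
        if re.match(r"^language:\s*$", line):
            in_language = True
            continue
        if in_language:
            item = re.match(r"^\s*-\s*(\S+)\s*$", line)
            if item:
                langs.append(item.group(1))
            else:
                in_language = False  # a new key ends the language list
    return langs


def looks_like_three_letter_code(code: str) -> bool:
    """Shape check only: three lowercase ASCII letters, e.g. `zho` or `eng`."""
    return re.fullmatch(r"[a-z]{3}", code) is not None


if __name__ == "__main__":
    sample = "---\nlicense: mit\nlanguage:\n- zho\n- eng\n- fra\n---\n# merge\n"
    for code in front_matter_languages(sample):
        print(code, looks_like_three_letter_code(code))
```

A check like this would flag the old two-letter `en` entry while accepting every entry in the new list; validating against the actual code registry would need an external data source.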