Johnblick187 committed
Commit 5fc333f · verified · 1 Parent(s): f46564f

Update modeling_smartcoder_moe.py

Files changed (1)
  1. modeling_smartcoder_moe.py +1 -1
modeling_smartcoder_moe.py CHANGED
@@ -4,7 +4,7 @@ Custom model class for SmartCoderMoE.
  
  Architecture (from tensor inspection):
  - vocab_size: 65536, hidden: 2048, layers: 40
- - Attention: q[2048,2048], k/v[512,2048] 16 heads, 4 KV heads, head_dim=128
+ - Attention: q[2048,2048], k/v[512,2048] - 16 heads, 4 KV heads, head_dim=128
  - MLP (hybrid dense + MoE):
  dense_fc: [8192, 2048] up
  dense_proj: [2048, 8192] down
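
The attention shapes quoted in the docstring can be cross-checked with simple arithmetic. The following is a minimal sketch, not code from the model file: it assumes the weights are stored as [out_features, in_features] (the docstring does not state the layout), and the function and constant names are hypothetical.

```python
# Values copied from the docstring; all names here are illustrative only.
HIDDEN = 2048
NUM_HEADS = 16
NUM_KV_HEADS = 4
HEAD_DIM = 128


def attention_proj_shapes(hidden, num_heads, num_kv_heads, head_dim):
    """Return (q_shape, kv_shape), assuming an [out_features, in_features] layout."""
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be a multiple of num_kv_heads")
    # Every query head gets its own head_dim-wide slice of the q projection.
    q_shape = (num_heads * head_dim, hidden)
    # Key/value projections are narrower because several query heads share one KV head.
    kv_shape = (num_kv_heads * head_dim, hidden)
    return q_shape, kv_shape


q_shape, kv_shape = attention_proj_shapes(HIDDEN, NUM_HEADS, NUM_KV_HEADS, HEAD_DIM)
queries_per_kv_head = NUM_HEADS // NUM_KV_HEADS

print(q_shape, kv_shape, queries_per_kv_head)
```

Under these assumptions the computed shapes match the docstring: 16 x 128 = 2048 rows for q, 4 x 128 = 512 rows for k/v, and four query heads per KV head.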