HumorR1/policy-e2b-grpo-thinking

PEFT
Safetensors
English
vision-language
new-yorker
humor
rlhf
grpo-thinking
policy-e2b-grpo-thinking
209 MB
  • 1 contributor
History: 2 commits
Broyojo
upload E2b (grpo_thinking)
b5ae70e verified 3 days ago
  • .gitattributes               1.57 kB          upload E2b (grpo_thinking)   3 days ago
  • README.md                    1.99 kB          upload E2b (grpo_thinking)   3 days ago
  • adapter_config.json          1.17 kB          upload E2b (grpo_thinking)   3 days ago
  • adapter_model.safetensors    197 MB (xet)     upload E2b (grpo_thinking)   3 days ago
  • chat_template.jinja          5.2 kB           upload E2b (grpo_thinking)   3 days ago
  • processor_config.json        1.19 kB          upload E2b (grpo_thinking)   3 days ago
  • tokenizer.json               11.4 MB (xet)    upload E2b (grpo_thinking)   3 days ago
  • tokenizer_config.json        761 Bytes        upload E2b (grpo_thinking)   3 days ago
  • trainer_state.json           61.1 kB          upload E2b (grpo_thinking)   3 days ago
  • training_args.bin            7.38 kB (xet)    upload E2b (grpo_thinking)   3 days ago