Commit History

Update training/AgentDebuggerEnv_GRPO_Training.ipynb
a2cb0a0
verified

shashaank0707 committed on

Update: Added final improvements for hackathon
713f336

shank committed on

Fix: batch%num_generations math
2b499e7

shank committed on

CUDA returns false fixed
b8172c5

shank committed on

COMPUTE_DRIVE fix
77156dd

shank committed on

Fix: Removed BitsandBytes
bdec91d

shank committed on

Fix: Fixed again again
accb271

shank committed on

Fix: Fixed again
9864e61

shank committed on

Fix: Fixing Again
6747185

shank committed on

Fix: Fixing
18b4e8a

shank committed on

Fix: Trying to fix dependency issues
024f3c7

shank committed on

Fix: Fixed file
cb09ef1

shank committed on

fix: serialize bug_metadata as JSON to fix pyarrow mixed-type error
4668456

shank committed on

fix: upgrade bitsandbytes>=0.49.0 (triton.ops), switch to Qwen2.5-Coder-3B
a2fa47a

shank committed on

fix: torch at build time, remove mergekit (conflicts accelerate/peft/trl)
2bfaf77

shank committed on

fix: empty requirements.txt, install training deps at runtime
5d0b2d4

shank committed on

fix: remove wandb - click conflict with gradio causes resolution-too-deep
2005cd2

shank committed on

chore: normalize dataset inputs and fix mergekit dependency for TRL 0.14.0
e67270e

shank committed on

Auto-detect GPU: bfloat16+batch2+gen8 on A100, float16+batch1+gen4 on T4 — same script works on both
ea6fe4e

shank committed on

Reduce max_completion_length to 160 for T4 speed: target 1000 steps in <8hrs
9487853

shank committed on

Optimize for Kaggle P100: float16, batch=1, grad_accum=8, num_gen=4, max_completion=256, lora_r=8
73f957d

shank committed on

Fix GRPOConfig: rename max_new_tokens to max_completion_length for trl==0.14.0
8b16369

shank committed on

Align gradio version with Hugging Face Space builder2
633a3b7

shank committed on

Stabilize Space runtime: pin ML deps and disable runtime package drift
663b8db

shank committed on

Pin torch to cu121 build + use model.device instead of hardcoded cuda string
8f291e0

shank committed on

Replace unsloth with bitsandbytes+peft: fixes CUDA driver incompatibility on HF A100
c325ad7

shank committed on

Reduce training to 500 steps with tightened curriculum for A10G budget
ba8df98

shank committed on

Fix eval device selection with CUDA-safe fallback
dc8001b

shank committed on

Optimize for A100 80GB: 8 generations, batch 4, lr 2e-5, dense logging
2b1fbf3

shank committed on

Restore full 1000-step training with original curriculum
1128de1

shank committed on

Reduce training to 500 steps with tightened curriculum for A10G budget
3152fa9

shank committed on

Add Gradio training monitor and fix subprocess python path
b92ad01

shank committed on

Update: Started making changes for the hackathon
a55c81d

shank committed on