sql_env / docs /learnings /testing.md
hjerpe's picture
Upload folder using huggingface_hub
9e64e71 verified
# Learnings - Testing
- Add a focused trajectory-shape unit test that asserts the final two turns are `tool` then content-only `assistant` (no `tool_calls`) and keep a separate low-reward test proving incorrect trajectories are still filtered to `None`. *(F014)*
- Keep notebook and pipeline smoke tests aligned by asserting stable training-flow anchors (for example `before_rollouts = sample_random_baseline` appears before `run_training_with_metrics(trainer)`) whenever notebook cells are refactored. *(F015)*