| # Workspace Guide ("What lives where") |
|
|
| This is the single orientation page for new contributors. |
|
|
| ## 1) Production surface |
|
|
| Use these when you want real user-facing behavior: |
|
|
| - **Community agent/tooling** |
| - Card: `.fast-agent/tool-cards/hf_hub_community.md` |
| - Backend function tool: `.fast-agent/tool-cards/hf_api_tool.py` |
| - Focus: Hub users/orgs/discussions/collections/activity API workflows |
|
|
| - **Papers search agent/tooling** |
| - Card: `.fast-agent/tool-cards/hf_paper_search.md` |
| - Backend function tool: `.fast-agent/tool-cards/hf_papers_tool.py` |
| - Focus: `/api/daily_papers` filtering and retrieval |
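To make the papers workflow concrete, here is a minimal sketch of building a filtered `/api/daily_papers` request. The endpoint path comes from this guide, but the query parameters (`date`, `limit`) are assumptions for illustration, not confirmed behavior of the backend tool.

```python
from urllib.parse import urlencode

# Endpoint named in this guide; the query parameters below are
# assumptions for illustration only.
BASE = "https://huggingface.co/api/daily_papers"

def daily_papers_url(date=None, limit=None):
    """Build a daily-papers request URL, dropping unset filters."""
    params = {k: v for k, v in {"date": date, "limit": limit}.items()
              if v is not None}
    return BASE + ("?" + urlencode(params) if params else "")

print(daily_papers_url(date="2025-01-15"))
# https://huggingface.co/api/daily_papers?date=2025-01-15
```

The real filtering logic lives in `.fast-agent/tool-cards/hf_papers_tool.py`; this sketch only shows the shape of the request it would issue.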
|
|
| --- |
|
|
| ## 2) Eval inputs (challenge sets) |
|
|
| - `scripts/hf_hub_community_challenges.txt` |
| - `scripts/hf_hub_community_coverage_prompts.json` |
| - `scripts/tool_routing_challenges.txt` |
| - `scripts/tool_routing_expected.json` |
| - `scripts/tool_description_variants.json` |
|
|
| These are the canonical prompt sets/configs used for reproducible scoring. |
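The pairing between a challenge `.txt` file and its `_expected.json` companion can be pictured with a small self-contained sketch. The file formats here (one prompt per line, a JSON object mapping prompt to expected tool) are assumptions, so treat this as illustration rather than a spec for the real files.

```python
import json

# Hypothetical formats: one prompt per line in the .txt file, and a
# JSON object mapping each prompt to its expected tool name.
challenges_txt = """List the open discussions on a model repo
Find yesterday's daily papers about diffusion
"""
expected_json = """{
  "List the open discussions on a model repo": "hf_api_tool",
  "Find yesterday's daily papers about diffusion": "hf_papers_tool"
}"""

prompts = [line for line in challenges_txt.splitlines() if line.strip()]
expected = json.loads(expected_json)

# Join each prompt with the tool it is expected to route to.
pairs = [(p, expected[p]) for p in prompts]
for prompt, tool in pairs:
    print(f"{tool:15s} <- {prompt}")
```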
|
|
| --- |
|
|
| ## 3) Eval execution scripts |
|
|
| - `scripts/score_hf_hub_community_challenges.py` |
| - Runs + scores the community challenge pack. |
|
|
| - `scripts/score_hf_hub_community_coverage.py` |
| - Runs + scores endpoint-coverage prompts that avoid overlap with the core challenge pack. |
|
|
| - `scripts/score_tool_routing_confusion.py` |
| - Scores tool-routing quality for a single model. |
|
|
| - `scripts/run_tool_routing_batch.py` |
  - Runs the routing eval across many models + creates an aggregate summary.
|
|
| - `scripts/eval_tool_description_ab.py` |
| - A/B tests tool-description variants across models. |
|
|
| - `scripts/eval_hf_hub_prompt_ab.py` |
| - A/B compares prompt/card variants using both challenge and coverage packs, with summary plots. |
|
|
| - `scripts/plot_tool_description_eval.py` |
  - Generates plots from the A/B summary CSV.
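Conceptually, the routing scorer reduces to comparing a predicted tool against an expected tool per prompt and tallying the confusions. The sketch below shows that core loop with made-up data; the real script's inputs, metrics, and output format may differ.

```python
from collections import Counter

# Made-up predictions vs. expectations for illustration; real data
# comes from scripts/tool_routing_challenges.txt and
# scripts/tool_routing_expected.json.
expected  = ["hf_api_tool", "hf_papers_tool", "hf_api_tool", "hf_papers_tool"]
predicted = ["hf_api_tool", "hf_api_tool",    "hf_api_tool", "hf_papers_tool"]

# Count (expected, predicted) pairs to see where routing goes wrong.
confusion = Counter(zip(expected, predicted))
accuracy = sum(e == p for e, p in zip(expected, predicted)) / len(expected)

print(f"accuracy: {accuracy:.2f}")  # accuracy: 0.75
for (exp, pred), n in sorted(confusion.items()):
    print(f"expected={exp} predicted={pred} n={n}")
```

Off-diagonal entries of `confusion` (where `exp != pred`) are the routing mistakes the eval is designed to surface.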
|
|
| --- |
|
|
| ## 4) Eval outputs (results) |
|
|
| - Community challenge reports: |
| - `docs/hf_hub_community_challenge_report.md` |
| - `docs/hf_hub_community_challenge_report.json` |
|
|
| - Tool routing results: |
| - `docs/tool_routing_eval/` |
|
|
| - Tool description A/B outputs: |
| - `docs/tool_description_eval/` |
|
|
| --- |
|
|
| ## 5) Instructions / context docs |
|
|
| - `docs/hf_hub_community_challenge_pack.md` |
| - `docs/tool_description_eval_setup.md` |
| - `docs/tool_description_eval/tool_description_interpretation.md` |
| - `bench.md` |
|
|
| --- |
|
|
| ## 6) Suggested newcomer workflow |
|
|
| 1. Read this file + top-level `README.md`. |
| 2. Run one production query for each agent. |
| 3. Run one scoring script (community or routing). |
| 4. Inspect generated markdown report in `docs/`. |
| 5. Only then edit tool cards or script logic. |


| --- |
|
|
| ## 7) Results at a glance |
|
|
| - `docs/RESULTS.md` is the index page for all generated reports and plots. |
|
|